git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers
@ 2022-09-13 19:25 Matthew John Cheetham via GitGitGadget
  2022-09-13 19:25 ` [PATCH 1/8] wincred: ignore unknown lines (do not die) Matthew John Cheetham via GitGitGadget
                   ` (10 more replies)
  0 siblings, 11 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-09-13 19:25 UTC (permalink / raw)
  To: git; +Cc: Matthew John Cheetham

Hello! I have an RFC to update the existing credential helper design in
order to allow for some new scenarios, and future evolution of auth methods
that Git hosts may wish to provide. I outline the background, summary of
changes and some challenges below. I also attach a series of patches to
illustrate the design proposal.

One missing element from the patches are extensive tests of the new
behaviour. It appears existing tests focus either on the credential helper
protocol/format, or rely on testing basic authentication only via an Apache
webserver. In order to have a full end to end test coverage of these new
features it make be that we need a more comprehensive test bed to mock these
more nuanced authentication methods. I lean on the experts on the list for
advice here.


Background
==========

Git uses a variety of protocols [1]: local, Smart HTTP, Dumb HTTP, SSH, and
Git. Here I focus on the Smart HTTP protocol, and attempt to enhance the
authentication capabilities of this protocol to address limitations (see
below).

The Smart HTTP protocol in Git supports a few different types of HTTP
authentication - Basic and Digest (RFC 2617) [2], and Negotiate (RFC 2478)
[3]. Git uses a extensible model where credential helpers can provide
credentials for protocols [4]. Several helpers support alternatives such as
OAuth authentication (RFC 6749) [5], but this is typically done as an
extension. For example, a helper might use basic auth and set the password
to an OAuth Bearer access token. Git uses standard input and output to
communicate with credential helpers.

After a HTTP 401 response, Git would call a credential helper with the
following over standard input:

protocol=https
host=example.com


And then a credential helper would return over standard output:

protocol=https
host=example.com
username=bob@id.example.com
password=<BEARER-TOKEN>


Git then the following request to the remote, including the standard HTTP
Authorization header (RFC 7235 Section 4.2) [6]:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: Basic base64(bob@id.example.com:<BEARER-TOKEN>)


Credential helpers are encouraged (see gitcredentials.txt) to return the
minimum information necessary.


Limitations
===========

Because this credential model was built mostly for password based
authentication systems, it's somewhat limited. In particular:

 1. To generate valid credentials, additional information about the request
    (or indeed the requestee and their device) may be required. For example,
    OAuth is based around scopes. A scope, like "git.read", might be
    required to read data from the remote. However, the remote cannot tell
    the credential helper what scope is required for this request.

 2. This system is not fully extensible. Each time a new type of
    authentication (like OAuth Bearer) is invented, Git needs updates before
    credential helpers can take advantage of it (or leverage a new
    capability in libcurl).


Goals
=====

 * As a user with multiple federated cloud identities:
   
   * Reach out to a remote and have my credential helper automatically
     prompt me for the correct identity.
   * Leverage existing authentication systems built-in to many operating
     systems and devices to boost security and reduce reliance on passwords.

 * As a Git host and/or cloud identity provider:
   
   * Leverage newest identity standards, enhancements, and threat
     mitigations - all without updating Git.
   * Enforce security policies (like requiring two-factor authentication)
     dynamically.
   * Allow integration with third party standard based identity providers in
     enterprises allowing customers to have a single plane of control for
     critical identities with access to source code.


Design Principles
=================

 * Use the existing infrastructure. Git credential helpers are an
   already-working model.
 * Follow widely-adopted time-proven open standards, avoid net new ideas in
   the authentication space.
 * Minimize knowledge of authentication in Git; maintain modularity and
   extensibility.


Proposed Changes
================

 1. Teach Git to read HTTP response headers, specifically the standard
    WWW-Authenticate (RFC 7235 Section 4.1) headers.

 2. Teach Git to include extra information about HTTP responses that require
    authentication when calling credential helpers. Specifically the
    WWW-Authenticate header information.
    
    Because the extra information forms an ordered list, and the existing
    credential helper I/O format only provides for simple key=value pairs,
    we introduce a new convention for transmitting an ordered list of
    values. Key names that are suffixed with a C-style array syntax should
    have values considered to form an order list, i.e. key[n]=value, where n
    is a zero based index of the values.
    
    For the WWW-Authenticate header values we opt to use the key wwwauth[n].

 3. Teach Git to specify authentication schemes other than Basic in
    subsequent HTTP requests based on credential helper responses.


Handling the WWW-Authenticate header in detail
==============================================

RFC 6750 [7] envisions that OAuth Bearer resource servers would give
responses that include WWW-Authenticate headers, for example:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


Specifically, a WWW-Authenticate header consists of a scheme and arbitrary
attributes, depending on the scheme. This pattern enables generic OAuth or
OpenID Connect [8] authorities. Note that it is possible to have several
WWW-Authenticate challenges in a response.

First Git attempts to make a request, unauthenticated, which fails with a
401 response and includes WWW-Authenticate header(s).

Next, Git invokes a credential helper which may prompt the user. If the user
approves, a credential helper can generate a token (or any auth challenge
response) to be used for that request.

For example: with a remote that supports bearer tokens from an OpenID
Connect [8] authority, a credential helper can use OpenID Connect's
Discovery [9] and Dynamic Client Registration [9] to register a client and
make a request with the correct permissions to access the remote. In this
manner, a user can be dynamically sent to the right federated identity
provider for a remote without any up-front configuration or manual
processes.

Following from the principle of keeping authentication knowledge in Git to a
minimum, we modify Git to add all WWW-Authenticate values to the credential
helper call.

Git sends over standard input:

protocol=https
host=example.com
wwwauth[0]=Bearer realm="login.example", scope="git.readwrite"
wwwauth[1]=Basic realm="login.example"


A credential helper that understands the extra wwwauth[n] property can
decide on the "best" or correct authentication scheme, generate credentials
for the request, and interact with the user.

The credential helper would then return over standard output:

protocol=https
host=example.com
path=foo.git
username=bob@identity.example
password=<BEARER-TOKEN>


Note that WWW-Authenticate supports multiple challenges, either in one
header:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"


or in multiple headers:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


These have equivalent meaning (RFC 2616 Section 4.2 [11]). To simplify the
implementation, Git will not merge or split up any of these WWW-Authenticate
headers, and instead pass each header line as one credential helper
property. The credential helper is responsible for splitting, merging, and
otherwise parsing these header values.

An alternative option to sending the header fields individually would be to
merge the header values in to one key=value property, for example:

...
wwwauth=Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"



Future flexibility
==================

By allowing the credential helpers decide the best authentication scheme, we
can allow the remote Git server to both offer new schemes (or remove old
ones) that enlightened credential helpers could take immediate advantage of,
and to use credentials that are much more tightly scoped and bound to the
specific request.

For example imagine a new "FooBar" authentication scheme that is surfaced in
the following response:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: FooBar realm="login.example", algs="ES256 PS256"


With support for arbitrary authentication schemes, Git would call credential
helpers with the following over standard input:

protocol=https
host=example.com
wwwauth[0]=FooBar realm="login.example", algs="ES256 PS256", nonce="abc123"


And then an enlightened credential helper would return over standard output:

protocol=https
host=example.com
authtype=FooBar
username=bob@id.example.com
password=<FooBar credential>


Git would be expected to attach this authorization header to the next
request:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: FooBar <FooBar credential>



Should Git not control the set of authentication schemes?
=========================================================

One concern that the reader may have regarding these changes is in allowing
helpers to select the authentication mechanism to use, it may be possible
that a weaker form of authentication is used.

Take for example a Git remote server that responds with the following
authentication schemes:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Negotiate ...
WWW-Authenticate: Basic ...


Today Git (and libcurl) prefer to Negotiate over Basic authentication [12].
If a helper responded with authtype=basic Git would now be using a "less
secure" mechanism.

The reason we still propose the credential helper decide on the
authentication scheme is that Git is not the best placed entity to decide
what type of authentication should be used for a particular request (see
Design Principle 3).

OAuth Bearer tokens are often bundled in Basic Authorization headers [13],
but given that the tokens are/can be short-lived and have a highly scoped
set of permissions, this solution could be argued as being more secure than
something like NTLM [14]. Similarly, the user may wish to be consulted on
selecting a particular user account, or directly selecting an authentication
mechanism for a request that otherwise they would not be able to use.

Also, as new authentication protocols appear Git does not need to be
modified or updated for the user to take advantage of them; the credential
helpers take on the responsibility of learning and selecting the "best"
option.


Why not SSH?
============

There's nothing wrong with SSH. However, Git's Smart HTTP transport is
widely used, often with OAuth Bearer tokens. Git's Smart HTTP transport
sometimes requires less client setup than SSH transport, and works in
environments when SSH ports may be blocked. As long as Git supports HTTP
transport, it should support common and popular HTTP authentication methods.


References
==========

 * [1] Git on the Server - The Protocols
   https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols

 * [2] HTTP Authentication: Basic and Digest Access Authentication
   https://datatracker.ietf.org/doc/html/rfc2617

 * [3] The Simple and Protected GSS-API Negotiation Mechanism
   https://datatracker.ietf.org/doc/html/rfc2478

 * [4] Git Credentials - Custom Helpers
   https://git-scm.com/docs/gitcredentials#_custom_helpers

 * [5] The OAuth 2.0 Authorization Framework
   https://datatracker.ietf.org/doc/html/rfc6749

 * [6] Hypertext Transfer Protocol (HTTP/1.1): Authentication
   https://datatracker.ietf.org/doc/html/rfc7235

 * [7] The OAuth 2.0 Authorization Framework: Bearer Token Usage
   https://datatracker.ietf.org/doc/html/rfc6750

 * [8] OpenID Connect Core 1.0
   https://openid.net/specs/openid-connect-core-1_0.html

 * [9] OpenID Connect Discovery 1.0
   https://openid.net/specs/openid-connect-discovery-1_0.html

 * [10] OpenID Connect Dynamic Client Registration 1.0
   https://openid.net/specs/openid-connect-registration-1_0.html

 * [11] Hypertext Transfer Protocol (HTTP/1.1)
   https://datatracker.ietf.org/doc/html/rfc2616

 * [12] libcurl http.c pickoneauth Function
   https://github.com/curl/curl/blob/c495dcd02e885fc3f35164b1c3c5f72fa4b60c46/lib/http.c#L381-L416

 * [13] Git Credential Manager GitHub Host Provider (using PAT as password)
   https://github.com/GitCredentialManager/git-credential-manager/blob/f77b766f6875b90251249f2aa1702b921309cf00/src/shared/GitHub/GitHubHostProvider.cs#L157

 * [14] NT LAN Manager (NTLM) Authentication Protocol
   https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-nlmp/b38c36ed-2804-4868-a9ff-8dd3182128e4

Matthew John Cheetham (8):
  wincred: ignore unknown lines (do not die)
  netrc: ignore unknown lines (do not die)
  osxkeychain: clarify that we ignore unknown lines
  http: read HTTP WWW-Authenticate response headers
  credential: add WWW-Authenticate header to cred requests
  http: store all request headers on active_request_slot
  http: move proactive auth to first slot creation
  http: set specific auth scheme depending on credential

 Documentation/git-credential.txt              |  18 ++
 .../netrc/git-credential-netrc.perl           |   5 +-
 .../osxkeychain/git-credential-osxkeychain.c  |   5 +
 .../wincred/git-credential-wincred.c          |   7 +-
 credential.c                                  |  18 ++
 credential.h                                  |  11 +
 git-curl-compat.h                             |   7 +
 http-push.c                                   | 103 ++++-----
 http-walker.c                                 |   2 +-
 http.c                                        | 199 +++++++++++++-----
 http.h                                        |   4 +-
 remote-curl.c                                 |  36 ++--
 t/lib-httpd/apache.conf                       |  13 ++
 t/t5551-http-fetch-smart.sh                   |  46 ++++
 14 files changed, 335 insertions(+), 139 deletions(-)


base-commit: dd3f6c4cae7e3b15ce984dce8593ff7569650e24
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1352%2Fmjcheetham%2Femu-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1352/mjcheetham/emu-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1352
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 171+ messages in thread

* [PATCH 1/8] wincred: ignore unknown lines (do not die)
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
@ 2022-09-13 19:25 ` Matthew John Cheetham via GitGitGadget
  2022-09-13 19:25 ` [PATCH 2/8] netrc: " Matthew John Cheetham via GitGitGadget
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-09-13 19:25 UTC (permalink / raw)
  To: git; +Cc: Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

It is the expectation that credential helpers be liberal in what they
accept and conservative in what they return, to allow for future growth
and evolution of the protocol/interaction.

All of the other helpers (store, cache, osxkeychain, libsecret,
gnome-keyring) except `netrc` currently ignore any credential lines
that are not recognised, whereas the Windows helper (wincred) instead
dies.

Fix the discrepancy and ignore unknown lines in the wincred helper.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 contrib/credential/wincred/git-credential-wincred.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/contrib/credential/wincred/git-credential-wincred.c b/contrib/credential/wincred/git-credential-wincred.c
index 5091048f9c6..ead6e267c78 100644
--- a/contrib/credential/wincred/git-credential-wincred.c
+++ b/contrib/credential/wincred/git-credential-wincred.c
@@ -278,8 +278,11 @@ static void read_credential(void)
 			wusername = utf8_to_utf16_dup(v);
 		} else if (!strcmp(buf, "password"))
 			password = utf8_to_utf16_dup(v);
-		else
-			die("unrecognized input");
+		/*
+		 * Ignore other lines; we don't know what they mean, but
+		 * this future-proofs us when later versions of git do
+		 * learn new lines, and the helpers are updated to match.
+		 */
 	}
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH 2/8] netrc: ignore unknown lines (do not die)
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
  2022-09-13 19:25 ` [PATCH 1/8] wincred: ignore unknown lines (do not die) Matthew John Cheetham via GitGitGadget
@ 2022-09-13 19:25 ` Matthew John Cheetham via GitGitGadget
  2022-09-13 19:25 ` [PATCH 3/8] osxkeychain: clarify that we ignore unknown lines Matthew John Cheetham via GitGitGadget
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-09-13 19:25 UTC (permalink / raw)
  To: git; +Cc: Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Contrary to the documentation on credential helpers, as well as the help
text for git-credential-netrc itself, this helper will `die` when
presented with an unknown property/attribute/token.

Correct the behaviour here by skipping and ignoring any tokens that are
unknown. This means all helpers in the tree are consistent and ignore
any unknown credential properties/attributes.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 contrib/credential/netrc/git-credential-netrc.perl | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/contrib/credential/netrc/git-credential-netrc.perl b/contrib/credential/netrc/git-credential-netrc.perl
index bc57cc65884..9fb998ae090 100755
--- a/contrib/credential/netrc/git-credential-netrc.perl
+++ b/contrib/credential/netrc/git-credential-netrc.perl
@@ -356,7 +356,10 @@ sub read_credential_data_from_stdin {
 		next unless m/^([^=]+)=(.+)/;
 
 		my ($token, $value) = ($1, $2);
-		die "Unknown search token $token" unless exists $q{$token};
+
+		# skip any unknown tokens
+		next unless exists $q{$token};
+
 		$q{$token} = $value;
 		log_debug("We were given search token $token and value $value");
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH 3/8] osxkeychain: clarify that we ignore unknown lines
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
  2022-09-13 19:25 ` [PATCH 1/8] wincred: ignore unknown lines (do not die) Matthew John Cheetham via GitGitGadget
  2022-09-13 19:25 ` [PATCH 2/8] netrc: " Matthew John Cheetham via GitGitGadget
@ 2022-09-13 19:25 ` Matthew John Cheetham via GitGitGadget
  2022-09-19 16:12   ` Derrick Stolee
  2022-09-13 19:25 ` [PATCH 4/8] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-09-13 19:25 UTC (permalink / raw)
  To: git; +Cc: Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Like in all the other credential helpers, the osxkeychain helper
ignores unknown credential lines.

Add a comment (a la the other helpers) to make it clear and explicit
that this is the desired behaviour.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 contrib/credential/osxkeychain/git-credential-osxkeychain.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/contrib/credential/osxkeychain/git-credential-osxkeychain.c b/contrib/credential/osxkeychain/git-credential-osxkeychain.c
index bf77748d602..e29cc28779d 100644
--- a/contrib/credential/osxkeychain/git-credential-osxkeychain.c
+++ b/contrib/credential/osxkeychain/git-credential-osxkeychain.c
@@ -159,6 +159,11 @@ static void read_credential(void)
 			username = xstrdup(v);
 		else if (!strcmp(buf, "password"))
 			password = xstrdup(v);
+		/*
+		 * Ignore other lines; we don't know what they mean, but
+		 * this future-proofs us when later versions of git do
+		 * learn new lines, and the helpers are updated to match.
+		 */
 	}
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH 4/8] http: read HTTP WWW-Authenticate response headers
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-09-13 19:25 ` [PATCH 3/8] osxkeychain: clarify that we ignore unknown lines Matthew John Cheetham via GitGitGadget
@ 2022-09-13 19:25 ` Matthew John Cheetham via GitGitGadget
  2022-09-19 16:21   ` Derrick Stolee
  2022-09-13 19:25 ` [PATCH 5/8] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-09-13 19:25 UTC (permalink / raw)
  To: git; +Cc: Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Read and store the HTTP WWW-Authenticate response headers made for
a particular request.

This will allow us to pass important authentication challenge
information to credential helpers or others that would otherwise have
been lost.

According to RFC2616 Section 4.2 [1], header field names are not
case-sensitive meaning when collecting multiple values for the same
field name, we can just use the case of the first observed instance of
each field name and no normalisation is required.

libcurl only provides us with the ability to read all headers recieved
for a particular request, including any intermediate redirect requests
or proxies. The lines returned by libcurl include HTTP status lines
delinating any intermediate requests such as "HTTP/1.1 200". We use
these lines to reset the strvec of WWW-Authenticate header values as
we encounter them in order to only capture the final response headers.

The collection of all header values matching the WWW-Authenticate
header is complicated by the fact that it is legal for header fields to
be continued over multiple lines, but libcurl only gives us one line at
a time.

In the future [2] we may be able to leverage functions to read headers
from libcurl itself, but as of today we must do this ourselves.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-4.2
[2] https://daniel.haxx.se/blog/2022/03/22/a-headers-api-for-libcurl/

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 credential.c |  1 +
 credential.h | 10 +++++++
 http.c       | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 88 insertions(+)

diff --git a/credential.c b/credential.c
index f6389a50684..897b4679333 100644
--- a/credential.c
+++ b/credential.c
@@ -22,6 +22,7 @@ void credential_clear(struct credential *c)
 	free(c->username);
 	free(c->password);
 	string_list_clear(&c->helpers, 0);
+	strvec_clear(&c->wwwauth_headers);
 
 	credential_init(c);
 }
diff --git a/credential.h b/credential.h
index f430e77fea4..6a9d4e3de07 100644
--- a/credential.h
+++ b/credential.h
@@ -2,6 +2,7 @@
 #define CREDENTIAL_H
 
 #include "string-list.h"
+#include "strvec.h"
 
 /**
  * The credentials API provides an abstracted way of gathering username and
@@ -115,6 +116,14 @@ struct credential {
 	 */
 	struct string_list helpers;
 
+	/**
+	 * A `strvec` of WWW-Authenticate header values. Each string
+	 * is the value of a WWW-Authenticate header in an HTTP response,
+	 * in the order they were received in the response.
+	 */
+	struct strvec wwwauth_headers;
+	unsigned header_is_last_match:1;
+
 	unsigned approved:1,
 		 configured:1,
 		 quit:1,
@@ -130,6 +139,7 @@ struct credential {
 
 #define CREDENTIAL_INIT { \
 	.helpers = STRING_LIST_INIT_DUP, \
+	.wwwauth_headers = STRVEC_INIT, \
 }
 
 /* Initialize a credential structure, setting all fields to empty. */
diff --git a/http.c b/http.c
index 5d0502f51fd..091321af98e 100644
--- a/http.c
+++ b/http.c
@@ -183,6 +183,81 @@ size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 	return nmemb;
 }
 
+static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
+{
+	size_t size = eltsize * nmemb;
+	struct strvec *values = &http_auth.wwwauth_headers;
+	struct strbuf buf = STRBUF_INIT;
+	const char *val;
+	const char *z = NULL;
+
+	/*
+	 * Header lines may not come NULL-terminated from libcurl so we must
+	 * limit all scans to the maximum length of the header line, or leverage
+	 * strbufs for all operations.
+	 *
+	 * In addition, it is possible that header values can be split over
+	 * multiple lines as per RFC 2616 (even though this has since been
+	 * deprecated in RFC 7230). A continuation header field value is
+	 * identified as starting with a space or horizontal tab.
+	 *
+	 * The formal definition of a header field as given in RFC 2616 is:
+	 *
+	 *   message-header = field-name ":" [ field-value ]
+	 *   field-name     = token
+	 *   field-value    = *( field-content | LWS )
+	 *   field-content  = <the OCTETs making up the field-value
+	 *                    and consisting of either *TEXT or combinations
+	 *                    of token, separators, and quoted-string>
+	 */
+
+	strbuf_add(&buf, ptr, size);
+
+	/* Strip the CRLF that should be present at the end of each field */
+	strbuf_trim_trailing_newline(&buf);
+
+	/* Start of a new WWW-Authenticate header */
+	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
+		while (isspace(*val)) val++;
+
+		strvec_push(values, val);
+		http_auth.header_is_last_match = 1;
+		goto exit;
+	}
+
+	/*
+	 * This line could be a continuation of the previously matched header
+	 * field. If this is the case then we should append this value to the
+	 * end of the previously consumed value.
+	 */
+	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
+		const char **v = values->v + values->nr - 1;
+		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
+
+		free((void*)*v);
+		*v = append;
+
+		goto exit;
+	}
+
+	/* This is the start of a new header we don't care about */
+	http_auth.header_is_last_match = 0;
+
+	/*
+	 * If this is a HTTP status line and not a header field, this signals
+	 * a different HTTP response. libcurl writes all the output of all
+	 * response headers of all responses, including redirects.
+	 * We only care about the last HTTP request response's headers so clear
+	 * the existing array.
+	 */
+	if (skip_iprefix(buf.buf, "http/", &z))
+		strvec_clear(values);
+
+exit:
+	strbuf_release(&buf);
+	return size;
+}
+
 size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
 {
 	return nmemb;
@@ -1829,6 +1904,8 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);
+
 	accept_language = http_get_accept_language_header();
 
 	if (accept_language)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH 5/8] credential: add WWW-Authenticate header to cred requests
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                   ` (3 preceding siblings ...)
  2022-09-13 19:25 ` [PATCH 4/8] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
@ 2022-09-13 19:25 ` Matthew John Cheetham via GitGitGadget
  2022-09-19 16:33   ` Derrick Stolee
  2022-09-13 19:25 ` [PATCH 6/8] http: store all request headers on active_request_slot Matthew John Cheetham via GitGitGadget
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-09-13 19:25 UTC (permalink / raw)
  To: git; +Cc: Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add the value of the WWW-Authenticate response header to credential
requests. Credential helpers that understand and support HTTP
authentication and authorization can use this standard header (RFC 2616
Section 14.47 [1]) to generate valid credentials.

WWW-Authenticate headers can contain information pertaining to the
authority, authentication mechanism, or extra parameters/scopes that are
required.

The current I/O format for credential helpers only allows for unique
names for properties/attributes, so in order to transmit multiple header
values (with a specific order) we introduce a new convention whereby a
C-style array syntax is used in the property name to denote multiple
ordered values for the same property.

In this case we send multiple `wwwauth[n]` properties where `n` is a
zero-indexed number, reflecting the order the WWW-Authenticate headers
appeared in the HTTP response.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Documentation/git-credential.txt |  9 +++++++
 credential.c                     | 12 +++++++++
 t/lib-httpd/apache.conf          | 13 +++++++++
 t/t5551-http-fetch-smart.sh      | 46 ++++++++++++++++++++++++++++++++
 4 files changed, 80 insertions(+)

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index f18673017f5..7d4a788c63d 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -151,6 +151,15 @@ Git understands the following attributes:
 	were read (e.g., `url=https://example.com` would behave as if
 	`protocol=https` and `host=example.com` had been provided). This
 	can help callers avoid parsing URLs themselves.
+
+`wwwauth[n]`::
+
+	When an HTTP response is received that includes one or more
+	'WWW-Authenticate' authentication headers, these can be passed to Git
+	(and subsequent credential helpers) with these attributes.
+	Each 'WWW-Authenticate' header value should be passed as a separate
+	attribute 'wwwauth[n]' where 'n' is the zero-indexed order the headers
+	appear in the HTTP response.
 +
 Note that specifying a protocol is mandatory and if the URL
 doesn't specify a hostname (e.g., "cert:///path/to/file") the
diff --git a/credential.c b/credential.c
index 897b4679333..4ad40323fc7 100644
--- a/credential.c
+++ b/credential.c
@@ -263,6 +263,17 @@ static void credential_write_item(FILE *fp, const char *key, const char *value,
 	fprintf(fp, "%s=%s\n", key, value);
 }
 
+static void credential_write_strvec(FILE *fp, const char *key,
+				    const struct strvec *vec)
+{
+	int i = 0;
+	for (; i < vec->nr; i++) {
+		const char *full_key = xstrfmt("%s[%d]", key, i);
+		credential_write_item(fp, full_key, vec->v[i], 0);
+		free((void*)full_key);
+	}
+}
+
 void credential_write(const struct credential *c, FILE *fp)
 {
 	credential_write_item(fp, "protocol", c->protocol, 1);
@@ -270,6 +281,7 @@ void credential_write(const struct credential *c, FILE *fp)
 	credential_write_item(fp, "path", c->path, 0);
 	credential_write_item(fp, "username", c->username, 0);
 	credential_write_item(fp, "password", c->password, 0);
+	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);
 }
 
 static int run_credential_helper(struct credential *c,
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 497b9b9d927..fe118d76f98 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -235,6 +235,19 @@ SSLEngine On
 	Require valid-user
 </LocationMatch>
 
+# Advertise two additional auth methods above "Basic".
+# Neither of them actually work but serve test cases showing these
+# additional auth headers are consumed correctly.
+<Location /auth-wwwauth/>
+	AuthType Basic
+	AuthName "git-auth"
+	AuthUserFile passwd
+	Require valid-user
+	SetEnvIf Authorization "^\S+" authz
+	Header always add WWW-Authenticate "Bearer authority=https://login.example.com" env=!authz
+	Header always add WWW-Authenticate "FooAuth foo=bar baz=1" env=!authz
+</Location>
+
 RewriteCond %{QUERY_STRING} service=git-receive-pack [OR]
 RewriteCond %{REQUEST_URI} /git-receive-pack$
 RewriteRule ^/half-auth-complete/ - [E=AUTHREQUIRED:yes]
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 6a38294a476..c99d8e253df 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -564,6 +564,52 @@ test_expect_success 'http auth forgets bogus credentials' '
 	expect_askpass both user@host
 '
 
+test_expect_success 'http auth sends www-auth headers to credential helper' '
+	write_script git-credential-tee <<-\EOF &&
+		cmd=$1
+		teefile=credential-$cmd
+		if [ -f "$teefile" ]; then
+			rm $teefile
+		fi
+		(
+			while read line;
+			do
+				if [ -z "$line" ]; then
+					exit 0
+				fi
+				echo "$line" >> $teefile
+				echo $line
+			done
+		) | git credential-store $cmd
+	EOF
+
+	cat >expected-get <<-EOF &&
+	protocol=http
+	host=127.0.0.1:5551
+	wwwauth[0]=Bearer authority=https://login.example.com
+	wwwauth[1]=FooAuth foo=bar baz=1
+	wwwauth[2]=Basic realm="git-auth"
+	EOF
+
+	cat >expected-store <<-EOF &&
+	protocol=http
+	host=127.0.0.1:5551
+	username=user@host
+	password=pass@host
+	EOF
+
+	rm -f .git-credentials &&
+	test_config credential.helper tee &&
+	set_askpass user@host pass@host &&
+	(
+		PATH="$PWD:$PATH" &&
+		git ls-remote "$HTTPD_URL/auth-wwwauth/smart/repo.git"
+	) &&
+	expect_askpass both user@host &&
+	test_cmp expected-get credential-get &&
+	test_cmp expected-store credential-store
+'
+
 test_expect_success 'client falls back from v2 to v0 to match server' '
 	GIT_TRACE_PACKET=$PWD/trace \
 	GIT_TEST_PROTOCOL_VERSION=2 \
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH 6/8] http: store all request headers on active_request_slot
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                   ` (4 preceding siblings ...)
  2022-09-13 19:25 ` [PATCH 5/8] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
@ 2022-09-13 19:25 ` Matthew John Cheetham via GitGitGadget
  2022-09-13 19:25 ` [PATCH 7/8] http: move proactive auth to first slot creation Matthew John Cheetham via GitGitGadget
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-09-13 19:25 UTC (permalink / raw)
  To: git; +Cc: Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Once a list of headers has been set on the curl handle, it is not
possible to recover that `struct curl_slist` instance to add or modify
headers.

In future commits we will want to modify the set of request headers in
response to an authentication challenge/401 response from the server,
with information provided by a credential helper.

There are a number of different places where curl is used for an HTTP
request, and they do not have a common handling of request headers.
However, given that they all do call the `start_active_slot()` function,
either directly or indirectly via `run_slot()` or `run_one_slot()`, we
use this as the point to set the `CURLOPT_HTTPHEADER` option just
before the request is made.

We collect all request headers in a `struct curl_slist` on the
`struct active_request_slot` that is obtained from a call to
`get_active_slot(int)`. This function now takes a single argument to
define if the initial set of headers on the slot should include the
"Pragma: no-cache" header, along with all extra headers specified via
`http.extraHeader` config values.

The active request slot obtained from `get_active_slot(int)` will always
contain a fresh set of default headers and any headers set in previous
usages of this slot will be freed.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 http-push.c   | 103 ++++++++++++++++++++++----------------------------
 http-walker.c |   2 +-
 http.c        |  82 ++++++++++++++++++----------------------
 http.h        |   4 +-
 remote-curl.c |  36 +++++++++---------
 5 files changed, 101 insertions(+), 126 deletions(-)

diff --git a/http-push.c b/http-push.c
index 5f4340a36e6..2b40959b376 100644
--- a/http-push.c
+++ b/http-push.c
@@ -211,29 +211,29 @@ static void curl_setup_http(CURL *curl, const char *url,
 	curl_easy_setopt(curl, CURLOPT_UPLOAD, 1);
 }
 
-static struct curl_slist *get_dav_token_headers(struct remote_lock *lock, enum dav_header_flag options)
+static struct curl_slist *append_dav_token_headers(struct curl_slist *headers,
+	struct remote_lock *lock, enum dav_header_flag options)
 {
 	struct strbuf buf = STRBUF_INIT;
-	struct curl_slist *dav_headers = http_copy_default_headers();
 
 	if (options & DAV_HEADER_IF) {
 		strbuf_addf(&buf, "If: (<%s>)", lock->token);
-		dav_headers = curl_slist_append(dav_headers, buf.buf);
+		headers = curl_slist_append(headers, buf.buf);
 		strbuf_reset(&buf);
 	}
 	if (options & DAV_HEADER_LOCK) {
 		strbuf_addf(&buf, "Lock-Token: <%s>", lock->token);
-		dav_headers = curl_slist_append(dav_headers, buf.buf);
+		headers = curl_slist_append(headers, buf.buf);
 		strbuf_reset(&buf);
 	}
 	if (options & DAV_HEADER_TIMEOUT) {
 		strbuf_addf(&buf, "Timeout: Second-%ld", lock->timeout);
-		dav_headers = curl_slist_append(dav_headers, buf.buf);
+		headers = curl_slist_append(headers, buf.buf);
 		strbuf_reset(&buf);
 	}
 	strbuf_release(&buf);
 
-	return dav_headers;
+	return headers;
 }
 
 static void finish_request(struct transfer_request *request);
@@ -281,7 +281,7 @@ static void start_mkcol(struct transfer_request *request)
 
 	request->url = get_remote_object_url(repo->url, hex, 1);
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_response;
 	slot->callback_data = request;
 	curl_setup_http_get(slot->curl, request->url, DAV_MKCOL);
@@ -399,7 +399,7 @@ static void start_put(struct transfer_request *request)
 	strbuf_add(&buf, request->lock->tmpfile_suffix, the_hash_algo->hexsz + 1);
 	request->url = strbuf_detach(&buf, NULL);
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_response;
 	slot->callback_data = request;
 	curl_setup_http(slot->curl, request->url, DAV_PUT,
@@ -417,15 +417,13 @@ static void start_put(struct transfer_request *request)
 static void start_move(struct transfer_request *request)
 {
 	struct active_request_slot *slot;
-	struct curl_slist *dav_headers = http_copy_default_headers();
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_response;
 	slot->callback_data = request;
 	curl_setup_http_get(slot->curl, request->url, DAV_MOVE);
-	dav_headers = curl_slist_append(dav_headers, request->dest);
-	dav_headers = curl_slist_append(dav_headers, "Overwrite: T");
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
+	slot->headers = curl_slist_append(slot->headers, request->dest);
+	slot->headers = curl_slist_append(slot->headers, "Overwrite: T");
 
 	if (start_active_slot(slot)) {
 		request->slot = slot;
@@ -440,17 +438,16 @@ static int refresh_lock(struct remote_lock *lock)
 {
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *dav_headers;
 	int rc = 0;
 
 	lock->refreshing = 1;
 
-	dav_headers = get_dav_token_headers(lock, DAV_HEADER_IF | DAV_HEADER_TIMEOUT);
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = append_dav_token_headers(slot->headers, lock,
+		DAV_HEADER_IF | DAV_HEADER_TIMEOUT);
+
 	curl_setup_http_get(slot->curl, lock->url, DAV_LOCK);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -464,7 +461,6 @@ static int refresh_lock(struct remote_lock *lock)
 	}
 
 	lock->refreshing = 0;
-	curl_slist_free_all(dav_headers);
 
 	return rc;
 }
@@ -838,7 +834,6 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	char *ep;
 	char timeout_header[25];
 	struct remote_lock *lock = NULL;
-	struct curl_slist *dav_headers = http_copy_default_headers();
 	struct xml_ctx ctx;
 	char *escaped;
 
@@ -849,7 +844,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	while (ep) {
 		char saved_character = ep[1];
 		ep[1] = '\0';
-		slot = get_active_slot();
+		slot = get_active_slot(0);
 		slot->results = &results;
 		curl_setup_http_get(slot->curl, url, DAV_MKCOL);
 		if (start_active_slot(slot)) {
@@ -875,14 +870,15 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	strbuf_addf(&out_buffer.buf, LOCK_REQUEST, escaped);
 	free(escaped);
 
+	slot = get_active_slot(0);
+	slot->results = &results;
+
 	xsnprintf(timeout_header, sizeof(timeout_header), "Timeout: Second-%ld", timeout);
-	dav_headers = curl_slist_append(dav_headers, timeout_header);
-	dav_headers = curl_slist_append(dav_headers, "Content-Type: text/xml");
+	slot->headers = curl_slist_append(slot->headers, timeout_header);
+	slot->headers = curl_slist_append(slot->headers,
+		"Content-Type: text/xml");
 
-	slot = get_active_slot();
-	slot->results = &results;
 	curl_setup_http(slot->curl, url, DAV_LOCK, &out_buffer, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	CALLOC_ARRAY(lock, 1);
@@ -921,7 +917,6 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 		fprintf(stderr, "Unable to start LOCK request\n");
 	}
 
-	curl_slist_free_all(dav_headers);
 	strbuf_release(&out_buffer.buf);
 	strbuf_release(&in_buffer);
 
@@ -945,15 +940,14 @@ static int unlock_remote(struct remote_lock *lock)
 	struct active_request_slot *slot;
 	struct slot_results results;
 	struct remote_lock *prev = repo->locks;
-	struct curl_slist *dav_headers;
 	int rc = 0;
 
-	dav_headers = get_dav_token_headers(lock, DAV_HEADER_LOCK);
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = append_dav_token_headers(slot->headers, lock,
+		DAV_HEADER_LOCK);
+
 	curl_setup_http_get(slot->curl, lock->url, DAV_UNLOCK);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -966,8 +960,6 @@ static int unlock_remote(struct remote_lock *lock)
 		fprintf(stderr, "Unable to start UNLOCK request\n");
 	}
 
-	curl_slist_free_all(dav_headers);
-
 	if (repo->locks == lock) {
 		repo->locks = lock->next;
 	} else {
@@ -1121,7 +1113,6 @@ static void remote_ls(const char *path, int flags,
 	struct slot_results results;
 	struct strbuf in_buffer = STRBUF_INIT;
 	struct buffer out_buffer = { STRBUF_INIT, 0 };
-	struct curl_slist *dav_headers = http_copy_default_headers();
 	struct xml_ctx ctx;
 	struct remote_ls_ctx ls;
 
@@ -1134,14 +1125,14 @@ static void remote_ls(const char *path, int flags,
 
 	strbuf_addstr(&out_buffer.buf, PROPFIND_ALL_REQUEST);
 
-	dav_headers = curl_slist_append(dav_headers, "Depth: 1");
-	dav_headers = curl_slist_append(dav_headers, "Content-Type: text/xml");
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = curl_slist_append(slot->headers, "Depth: 1");
+	slot->headers = curl_slist_append(slot->headers,
+		"Content-Type: text/xml");
+
 	curl_setup_http(slot->curl, url, DAV_PROPFIND,
 			&out_buffer, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	if (start_active_slot(slot)) {
@@ -1177,7 +1168,6 @@ static void remote_ls(const char *path, int flags,
 	free(url);
 	strbuf_release(&out_buffer.buf);
 	strbuf_release(&in_buffer);
-	curl_slist_free_all(dav_headers);
 }
 
 static void get_remote_object_list(unsigned char parent)
@@ -1199,7 +1189,6 @@ static int locking_available(void)
 	struct slot_results results;
 	struct strbuf in_buffer = STRBUF_INIT;
 	struct buffer out_buffer = { STRBUF_INIT, 0 };
-	struct curl_slist *dav_headers = http_copy_default_headers();
 	struct xml_ctx ctx;
 	int lock_flags = 0;
 	char *escaped;
@@ -1208,14 +1197,14 @@ static int locking_available(void)
 	strbuf_addf(&out_buffer.buf, PROPFIND_SUPPORTEDLOCK_REQUEST, escaped);
 	free(escaped);
 
-	dav_headers = curl_slist_append(dav_headers, "Depth: 0");
-	dav_headers = curl_slist_append(dav_headers, "Content-Type: text/xml");
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = curl_slist_append(slot->headers, "Depth: 0");
+	slot->headers = curl_slist_append(slot->headers,
+		"Content-Type: text/xml");
+
 	curl_setup_http(slot->curl, repo->url, DAV_PROPFIND,
 			&out_buffer, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	if (start_active_slot(slot)) {
@@ -1257,7 +1246,6 @@ static int locking_available(void)
 
 	strbuf_release(&out_buffer.buf);
 	strbuf_release(&in_buffer);
-	curl_slist_free_all(dav_headers);
 
 	return lock_flags;
 }
@@ -1374,17 +1362,16 @@ static int update_remote(const struct object_id *oid, struct remote_lock *lock)
 	struct active_request_slot *slot;
 	struct slot_results results;
 	struct buffer out_buffer = { STRBUF_INIT, 0 };
-	struct curl_slist *dav_headers;
-
-	dav_headers = get_dav_token_headers(lock, DAV_HEADER_IF);
 
 	strbuf_addf(&out_buffer.buf, "%s\n", oid_to_hex(oid));
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = append_dav_token_headers(slot->headers, lock,
+		DAV_HEADER_IF);
+
 	curl_setup_http(slot->curl, lock->url, DAV_PUT,
 			&out_buffer, fwrite_null);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -1486,18 +1473,18 @@ static void update_remote_info_refs(struct remote_lock *lock)
 	struct buffer buffer = { STRBUF_INIT, 0 };
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *dav_headers;
 
 	remote_ls("refs/", (PROCESS_FILES | RECURSIVE),
 		  add_remote_info_ref, &buffer.buf);
 	if (!aborted) {
-		dav_headers = get_dav_token_headers(lock, DAV_HEADER_IF);
 
-		slot = get_active_slot();
+		slot = get_active_slot(0);
 		slot->results = &results;
+		slot->headers = append_dav_token_headers(slot->headers, lock,
+			DAV_HEADER_IF);
+
 		curl_setup_http(slot->curl, lock->url, DAV_PUT,
 				&buffer, fwrite_null);
-		curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 		if (start_active_slot(slot)) {
 			run_active_slot(slot);
@@ -1652,7 +1639,7 @@ static int delete_remote_branch(const char *pattern, int force)
 	if (dry_run)
 		return 0;
 	url = xstrfmt("%s%s", repo->url, remote_ref->name);
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
 	curl_setup_http_get(slot->curl, url, DAV_DELETE);
 	if (start_active_slot(slot)) {
diff --git a/http-walker.c b/http-walker.c
index b8f0f98ae14..8747de2fcdb 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -373,7 +373,7 @@ static void fetch_alternates(struct walker *walker, const char *base)
 	 * Use a callback to process the result, since another request
 	 * may fail and need to have alternates loaded before continuing
 	 */
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_alternates_response;
 	alt_req.walker = walker;
 	slot->callback_data = &alt_req;
diff --git a/http.c b/http.c
index 091321af98e..42616f746b1 100644
--- a/http.c
+++ b/http.c
@@ -124,8 +124,6 @@ static unsigned long empty_auth_useless =
 	| CURLAUTH_DIGEST_IE
 	| CURLAUTH_DIGEST;
 
-static struct curl_slist *pragma_header;
-static struct curl_slist *no_pragma_header;
 static struct string_list extra_http_headers = STRING_LIST_INIT_DUP;
 
 static struct curl_slist *host_resolutions;
@@ -1132,11 +1130,6 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
-	pragma_header = curl_slist_append(http_copy_default_headers(),
-		"Pragma: no-cache");
-	no_pragma_header = curl_slist_append(http_copy_default_headers(),
-		"Pragma:");
-
 	{
 		char *http_max_requests = getenv("GIT_HTTP_MAX_REQUESTS");
 		if (http_max_requests)
@@ -1198,6 +1191,8 @@ void http_cleanup(void)
 
 	while (slot != NULL) {
 		struct active_request_slot *next = slot->next;
+		if (slot->headers)
+			curl_slist_free_all(slot->headers);
 		if (slot->curl) {
 			xmulti_remove_handle(slot);
 			curl_easy_cleanup(slot->curl);
@@ -1214,12 +1209,6 @@ void http_cleanup(void)
 
 	string_list_clear(&extra_http_headers, 0);
 
-	curl_slist_free_all(pragma_header);
-	pragma_header = NULL;
-
-	curl_slist_free_all(no_pragma_header);
-	no_pragma_header = NULL;
-
 	curl_slist_free_all(host_resolutions);
 	host_resolutions = NULL;
 
@@ -1254,7 +1243,18 @@ void http_cleanup(void)
 	FREE_AND_NULL(cached_accept_language);
 }
 
-struct active_request_slot *get_active_slot(void)
+static struct curl_slist *http_copy_default_headers(void)
+{
+	struct curl_slist *headers = NULL;
+	const struct string_list_item *item;
+
+	for_each_string_list_item(item, &extra_http_headers)
+		headers = curl_slist_append(headers, item->string);
+
+	return headers;
+}
+
+struct active_request_slot *get_active_slot(int no_pragma_header)
 {
 	struct active_request_slot *slot = active_queue_head;
 	struct active_request_slot *newslot;
@@ -1276,6 +1276,7 @@ struct active_request_slot *get_active_slot(void)
 		newslot->curl = NULL;
 		newslot->in_use = 0;
 		newslot->next = NULL;
+		newslot->headers = NULL;
 
 		slot = active_queue_head;
 		if (!slot) {
@@ -1293,6 +1294,15 @@ struct active_request_slot *get_active_slot(void)
 		curl_session_count++;
 	}
 
+	if (slot->headers)
+		curl_slist_free_all(slot->headers);
+
+	slot->headers = http_copy_default_headers();
+
+	if (!no_pragma_header)
+		slot->headers = curl_slist_append(slot->headers,
+			"Pragma: no-cache");
+
 	active_requests++;
 	slot->in_use = 1;
 	slot->results = NULL;
@@ -1302,7 +1312,6 @@ struct active_request_slot *get_active_slot(void)
 	curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file);
 	if (curl_save_cookies)
 		curl_easy_setopt(slot->curl, CURLOPT_COOKIEJAR, curl_cookie_file);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, pragma_header);
 	curl_easy_setopt(slot->curl, CURLOPT_RESOLVE, host_resolutions);
 	curl_easy_setopt(slot->curl, CURLOPT_ERRORBUFFER, curl_errorstr);
 	curl_easy_setopt(slot->curl, CURLOPT_CUSTOMREQUEST, NULL);
@@ -1334,9 +1343,12 @@ struct active_request_slot *get_active_slot(void)
 
 int start_active_slot(struct active_request_slot *slot)
 {
-	CURLMcode curlm_result = curl_multi_add_handle(curlm, slot->curl);
+	CURLMcode curlm_result;
 	int num_transfers;
 
+	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, slot->headers);
+	curlm_result = curl_multi_add_handle(curlm, slot->curl);
+
 	if (curlm_result != CURLM_OK &&
 	    curlm_result != CURLM_CALL_MULTI_PERFORM) {
 		warning("curl_multi_add_handle failed: %s",
@@ -1651,17 +1663,6 @@ int run_one_slot(struct active_request_slot *slot,
 	return handle_curl_result(results);
 }
 
-struct curl_slist *http_copy_default_headers(void)
-{
-	struct curl_slist *headers = NULL;
-	const struct string_list_item *item;
-
-	for_each_string_list_item(item, &extra_http_headers)
-		headers = curl_slist_append(headers, item->string);
-
-	return headers;
-}
-
 static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf)
 {
 	char *ptr;
@@ -1879,12 +1880,11 @@ static int http_request(const char *url,
 {
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *headers = http_copy_default_headers();
-	struct strbuf buf = STRBUF_INIT;
+	int no_cache = options && options->no_cache;
 	const char *accept_language;
 	int ret;
 
-	slot = get_active_slot();
+	slot = get_active_slot(!no_cache);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
 
 	if (!result) {
@@ -1909,27 +1909,23 @@ static int http_request(const char *url,
 	accept_language = http_get_accept_language_header();
 
 	if (accept_language)
-		headers = curl_slist_append(headers, accept_language);
+		slot->headers = curl_slist_append(slot->headers,
+			accept_language);
 
-	strbuf_addstr(&buf, "Pragma:");
-	if (options && options->no_cache)
-		strbuf_addstr(&buf, " no-cache");
 	if (options && options->initial_request &&
 	    http_follow_config == HTTP_FOLLOW_INITIAL)
 		curl_easy_setopt(slot->curl, CURLOPT_FOLLOWLOCATION, 1);
 
-	headers = curl_slist_append(headers, buf.buf);
-
 	/* Add additional headers here */
 	if (options && options->extra_headers) {
 		const struct string_list_item *item;
 		for_each_string_list_item(item, options->extra_headers) {
-			headers = curl_slist_append(headers, item->string);
+			slot->headers = curl_slist_append(slot->headers,
+				item->string);
 		}
 	}
 
 	curl_easy_setopt(slot->curl, CURLOPT_URL, url);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
@@ -1947,9 +1943,6 @@ static int http_request(const char *url,
 		curlinfo_strbuf(slot->curl, CURLINFO_EFFECTIVE_URL,
 				options->effective_url);
 
-	curl_slist_free_all(headers);
-	strbuf_release(&buf);
-
 	return ret;
 }
 
@@ -2310,12 +2303,10 @@ struct http_pack_request *new_direct_http_pack_request(
 		goto abort;
 	}
 
-	preq->slot = get_active_slot();
+	preq->slot = get_active_slot(1);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEDATA, preq->packfile);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_URL, preq->url);
-	curl_easy_setopt(preq->slot->curl, CURLOPT_HTTPHEADER,
-		no_pragma_header);
 
 	/*
 	 * If there is data present from a previous transfer attempt,
@@ -2480,14 +2471,13 @@ struct http_object_request *new_http_object_request(const char *base_url,
 		}
 	}
 
-	freq->slot = get_active_slot();
+	freq->slot = get_active_slot(1);
 
 	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEDATA, freq);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_FAILONERROR, 0);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite_sha1_file);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_ERRORBUFFER, freq->errorstr);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_URL, freq->url);
-	curl_easy_setopt(freq->slot->curl, CURLOPT_HTTPHEADER, no_pragma_header);
 
 	/*
 	 * If we have successfully processed data from a previous fetch
diff --git a/http.h b/http.h
index 3c94c479100..a304cc408b2 100644
--- a/http.h
+++ b/http.h
@@ -22,6 +22,7 @@ struct slot_results {
 struct active_request_slot {
 	CURL *curl;
 	int in_use;
+	struct curl_slist *headers;
 	CURLcode curl_result;
 	long http_code;
 	int *finished;
@@ -43,7 +44,7 @@ size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf);
 curlioerr ioctl_buffer(CURL *handle, int cmd, void *clientp);
 
 /* Slot lifecycle functions */
-struct active_request_slot *get_active_slot(void);
+struct active_request_slot *get_active_slot(int no_pragma_header);
 int start_active_slot(struct active_request_slot *slot);
 void run_active_slot(struct active_request_slot *slot);
 void finish_all_active_slots(void);
@@ -64,7 +65,6 @@ void step_active_slots(void);
 void http_init(struct remote *remote, const char *url,
 	       int proactive_auth);
 void http_cleanup(void);
-struct curl_slist *http_copy_default_headers(void);
 
 extern long int git_curl_ipresolve;
 extern int active_requests;
diff --git a/remote-curl.c b/remote-curl.c
index 72dfb8fb86a..edbd4504beb 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -847,14 +847,13 @@ static int run_slot(struct active_request_slot *slot,
 static int probe_rpc(struct rpc_state *rpc, struct slot_results *results)
 {
 	struct active_request_slot *slot;
-	struct curl_slist *headers = http_copy_default_headers();
 	struct strbuf buf = STRBUF_INIT;
 	int err;
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 
-	headers = curl_slist_append(headers, rpc->hdr_content_type);
-	headers = curl_slist_append(headers, rpc->hdr_accept);
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_content_type);
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_accept);
 
 	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
 	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
@@ -862,13 +861,11 @@ static int probe_rpc(struct rpc_state *rpc, struct slot_results *results)
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, NULL);
 	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, "0000");
 	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, 4);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite_buffer);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &buf);
 
 	err = run_slot(slot, results);
 
-	curl_slist_free_all(headers);
 	strbuf_release(&buf);
 	return err;
 }
@@ -888,7 +885,6 @@ static curl_off_t xcurl_off_t(size_t len)
 static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_received)
 {
 	struct active_request_slot *slot;
-	struct curl_slist *headers = http_copy_default_headers();
 	int use_gzip = rpc->gzip_request;
 	char *gzip_body = NULL;
 	size_t gzip_size = 0;
@@ -930,21 +926,23 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece
 			needs_100_continue = 1;
 	}
 
-	headers = curl_slist_append(headers, rpc->hdr_content_type);
-	headers = curl_slist_append(headers, rpc->hdr_accept);
-	headers = curl_slist_append(headers, needs_100_continue ?
+retry:
+	slot = get_active_slot(0);
+
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_content_type);
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_accept);
+	slot->headers = curl_slist_append(slot->headers, needs_100_continue ?
 		"Expect: 100-continue" : "Expect:");
 
 	/* Add Accept-Language header */
 	if (rpc->hdr_accept_language)
-		headers = curl_slist_append(headers, rpc->hdr_accept_language);
+		slot->headers = curl_slist_append(slot->headers,
+			rpc->hdr_accept_language);
 
 	/* Add the extra Git-Protocol header */
 	if (rpc->protocol_header)
-		headers = curl_slist_append(headers, rpc->protocol_header);
-
-retry:
-	slot = get_active_slot();
+		slot->headers = curl_slist_append(slot->headers,
+			rpc->protocol_header);
 
 	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
 	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
@@ -955,7 +953,8 @@ retry:
 		/* The request body is large and the size cannot be predicted.
 		 * We must use chunked encoding to send it.
 		 */
-		headers = curl_slist_append(headers, "Transfer-Encoding: chunked");
+		slot->headers = curl_slist_append(slot->headers,
+			"Transfer-Encoding: chunked");
 		rpc->initial_buffer = 1;
 		curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, rpc_out);
 		curl_easy_setopt(slot->curl, CURLOPT_INFILE, rpc);
@@ -1002,7 +1001,8 @@ retry:
 
 		gzip_size = stream.total_out;
 
-		headers = curl_slist_append(headers, "Content-Encoding: gzip");
+		slot->headers = curl_slist_append(slot->headers,
+			"Content-Encoding: gzip");
 		curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, gzip_body);
 		curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE_LARGE, xcurl_off_t(gzip_size));
 
@@ -1025,7 +1025,6 @@ retry:
 		}
 	}
 
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, rpc_in);
 	rpc_in_data.rpc = rpc;
 	rpc_in_data.slot = slot;
@@ -1055,7 +1054,6 @@ retry:
 	if (stateless_connect)
 		packet_response_end(rpc->in);
 
-	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH 7/8] http: move proactive auth to first slot creation
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                   ` (5 preceding siblings ...)
  2022-09-13 19:25 ` [PATCH 6/8] http: store all request headers on active_request_slot Matthew John Cheetham via GitGitGadget
@ 2022-09-13 19:25 ` Matthew John Cheetham via GitGitGadget
  2022-09-13 19:25 ` [PATCH 8/8] http: set specific auth scheme depending on credential Matthew John Cheetham via GitGitGadget
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-09-13 19:25 UTC (permalink / raw)
  To: git; +Cc: Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Rather than proactively seek credentials to authenticate a request at
`http_init()` time, do it when the first `active_request_slot` is
created.

Because credential helpers may modify the headers used for a request we
can only auth when a slot is created (when we can first start to gather
request headers).

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 http.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/http.c b/http.c
index 42616f746b1..8e107ff19b8 100644
--- a/http.c
+++ b/http.c
@@ -514,18 +514,18 @@ static int curl_empty_auth_enabled(void)
 	return 0;
 }
 
-static void init_curl_http_auth(CURL *result)
+static void init_curl_http_auth(struct active_request_slot *slot)
 {
 	if (!http_auth.username || !*http_auth.username) {
 		if (curl_empty_auth_enabled())
-			curl_easy_setopt(result, CURLOPT_USERPWD, ":");
+			curl_easy_setopt(slot->curl, CURLOPT_USERPWD, ":");
 		return;
 	}
 
 	credential_fill(&http_auth);
 
-	curl_easy_setopt(result, CURLOPT_USERNAME, http_auth.username);
-	curl_easy_setopt(result, CURLOPT_PASSWORD, http_auth.password);
+	curl_easy_setopt(slot->curl, CURLOPT_USERNAME, http_auth.username);
+	curl_easy_setopt(slot->curl, CURLOPT_PASSWORD, http_auth.password);
 }
 
 /* *var must be free-able */
@@ -900,9 +900,6 @@ static CURL *get_curl_handle(void)
 #endif
 	}
 
-	if (http_proactive_auth)
-		init_curl_http_auth(result);
-
 	if (getenv("GIT_SSL_VERSION"))
 		ssl_version = getenv("GIT_SSL_VERSION");
 	if (ssl_version && *ssl_version) {
@@ -1259,6 +1256,7 @@ struct active_request_slot *get_active_slot(int no_pragma_header)
 	struct active_request_slot *slot = active_queue_head;
 	struct active_request_slot *newslot;
 
+	int proactive_auth = 0;
 	int num_transfers;
 
 	/* Wait for a slot to open up if the queue is full */
@@ -1281,6 +1279,9 @@ struct active_request_slot *get_active_slot(int no_pragma_header)
 		slot = active_queue_head;
 		if (!slot) {
 			active_queue_head = newslot;
+
+			/* Auth first slot if asked for proactive auth */
+			proactive_auth = http_proactive_auth;
 		} else {
 			while (slot->next != NULL)
 				slot = slot->next;
@@ -1335,8 +1336,9 @@ struct active_request_slot *get_active_slot(int no_pragma_header)
 
 	curl_easy_setopt(slot->curl, CURLOPT_IPRESOLVE, git_curl_ipresolve);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, http_auth_methods);
-	if (http_auth.password || curl_empty_auth_enabled())
-		init_curl_http_auth(slot->curl);
+
+	if (http_auth.password || curl_empty_auth_enabled() || proactive_auth)
+		init_curl_http_auth(slot);
 
 	return slot;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH 8/8] http: set specific auth scheme depending on credential
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                   ` (6 preceding siblings ...)
  2022-09-13 19:25 ` [PATCH 7/8] http: move proactive auth to first slot creation Matthew John Cheetham via GitGitGadget
@ 2022-09-13 19:25 ` Matthew John Cheetham via GitGitGadget
  2022-09-19 16:42   ` Derrick Stolee
  2022-09-19 16:08 ` [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Derrick Stolee
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-09-13 19:25 UTC (permalink / raw)
  To: git; +Cc: Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Introduce a new credential field `authtype` that can be used by
credential helpers to indicate the type of the credential or
authentication mechanism to use for a request.

Modify http.c to now specify the correct authentication scheme or
credential type when authenticating the curl handle. If the new
`authtype` field in the credential structure is `NULL` or "Basic" then
use the existing username/password options. If the field is "Bearer"
then use the OAuth bearer token curl option. Otherwise, the `authtype`
field is the authentication scheme and the `password` field is the
raw, unencoded value.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Documentation/git-credential.txt |  9 +++++++++
 credential.c                     |  5 +++++
 credential.h                     |  1 +
 git-curl-compat.h                |  7 +++++++
 http.c                           | 24 +++++++++++++++++++++---
 5 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index 7d4a788c63d..3b6ef6f4906 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -152,6 +152,15 @@ Git understands the following attributes:
 	`protocol=https` and `host=example.com` had been provided). This
 	can help callers avoid parsing URLs themselves.
 
+`authtype`::
+
+	Indicates the type of authentication scheme used. If this is not
+	present the default is "Basic".
+	Known values include "Basic", "Digest", and "Bearer".
+	If an unknown value is provided, this is taken as the authentication
+	scheme for the `Authorization` header, and the `password` field is
+	used as the raw unencoded authorization parameters of the same header.
+
 `wwwauth[n]`::
 
 	When an HTTP response is received that includes one or more
diff --git a/credential.c b/credential.c
index 4ad40323fc7..9d4a0f3fd51 100644
--- a/credential.c
+++ b/credential.c
@@ -21,6 +21,7 @@ void credential_clear(struct credential *c)
 	free(c->path);
 	free(c->username);
 	free(c->password);
+	free(c->authtype);
 	string_list_clear(&c->helpers, 0);
 	strvec_clear(&c->wwwauth_headers);
 
@@ -235,6 +236,9 @@ int credential_read(struct credential *c, FILE *fp)
 		} else if (!strcmp(key, "path")) {
 			free(c->path);
 			c->path = xstrdup(value);
+		} else if (!strcmp(key, "authtype")) {
+			free(c->authtype);
+			c->authtype = xstrdup(value);
 		} else if (!strcmp(key, "url")) {
 			credential_from_url(c, value);
 		} else if (!strcmp(key, "quit")) {
@@ -281,6 +285,7 @@ void credential_write(const struct credential *c, FILE *fp)
 	credential_write_item(fp, "path", c->path, 0);
 	credential_write_item(fp, "username", c->username, 0);
 	credential_write_item(fp, "password", c->password, 0);
+	credential_write_item(fp, "authtype", c->authtype, 0);
 	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);
 }
 
diff --git a/credential.h b/credential.h
index 6a9d4e3de07..a6572aacf1d 100644
--- a/credential.h
+++ b/credential.h
@@ -135,6 +135,7 @@ struct credential {
 	char *protocol;
 	char *host;
 	char *path;
+	char *authtype;
 };
 
 #define CREDENTIAL_INIT { \
diff --git a/git-curl-compat.h b/git-curl-compat.h
index 56a83b6bbd8..74732500a9f 100644
--- a/git-curl-compat.h
+++ b/git-curl-compat.h
@@ -126,4 +126,11 @@
 #define GIT_CURL_HAVE_CURLSSLSET_NO_BACKENDS
 #endif
 
+/**
+ * CURLAUTH_BEARER was added in 7.61.0, released in July 2018.
+ */
+#if LIBCURL_VERSION_NUM >= 0x073D00
+#define GIT_CURL_HAVE_CURLAUTH_BEARER
+#endif
+
 #endif
diff --git a/http.c b/http.c
index 8e107ff19b8..d8913b2c641 100644
--- a/http.c
+++ b/http.c
@@ -516,7 +516,8 @@ static int curl_empty_auth_enabled(void)
 
 static void init_curl_http_auth(struct active_request_slot *slot)
 {
-	if (!http_auth.username || !*http_auth.username) {
+	if (!http_auth.authtype &&
+		(!http_auth.username || !*http_auth.username)) {
 		if (curl_empty_auth_enabled())
 			curl_easy_setopt(slot->curl, CURLOPT_USERPWD, ":");
 		return;
@@ -524,8 +525,25 @@ static void init_curl_http_auth(struct active_request_slot *slot)
 
 	credential_fill(&http_auth);
 
-	curl_easy_setopt(slot->curl, CURLOPT_USERNAME, http_auth.username);
-	curl_easy_setopt(slot->curl, CURLOPT_PASSWORD, http_auth.password);
+	if (!http_auth.authtype || !strcasecmp(http_auth.authtype, "basic")
+				|| !strcasecmp(http_auth.authtype, "digest")) {
+		curl_easy_setopt(slot->curl, CURLOPT_USERNAME,
+			http_auth.username);
+		curl_easy_setopt(slot->curl, CURLOPT_PASSWORD,
+			http_auth.password);
+#ifdef GIT_CURL_HAVE_CURLAUTH_BEARER
+	} else if (!strcasecmp(http_auth.authtype, "bearer")) {
+		curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, CURLAUTH_BEARER);
+		curl_easy_setopt(slot->curl, CURLOPT_XOAUTH2_BEARER,
+			http_auth.password);
+#endif
+	} else {
+		struct strbuf auth = STRBUF_INIT;
+		strbuf_addf(&auth, "Authorization: %s %s",
+			http_auth.authtype, http_auth.password);
+		slot->headers = curl_slist_append(slot->headers, auth.buf);
+		strbuf_release(&auth);
+	}
 }
 
 /* *var must be free-able */
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 171+ messages in thread

* Re: [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                   ` (7 preceding siblings ...)
  2022-09-13 19:25 ` [PATCH 8/8] http: set specific auth scheme depending on credential Matthew John Cheetham via GitGitGadget
@ 2022-09-19 16:08 ` Derrick Stolee
  2022-09-19 16:44   ` Derrick Stolee
  2022-09-21 22:19   ` Matthew John Cheetham
  2022-09-19 23:36 ` Lessley Dennington
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
  10 siblings, 2 replies; 171+ messages in thread
From: Derrick Stolee @ 2022-09-19 16:08 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git; +Cc: Matthew John Cheetham

On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:
> Hello! I have an RFC to update the existing credential helper design in
> order to allow for some new scenarios, and future evolution of auth methods
> that Git hosts may wish to provide. I outline the background, summary of
> changes and some challenges below. I also attach a series of patches to
> illustrate the design proposal.

It's unfortunate that we didn't get to talk about this during the
contributor summit, but it is super-technical and worth looking closely
at all the details. 

> One missing element from the patches are extensive tests of the new
> behaviour. It appears existing tests focus either on the credential helper
> protocol/format, or rely on testing basic authentication only via an Apache
> webserver. In order to have a full end to end test coverage of these new
> features it make be that we need a more comprehensive test bed to mock these
> more nuanced authentication methods. I lean on the experts on the list for
> advice here.

The microsoft/git fork has a feature (the GVFS Protocol) that requires a
custom HTTP server as a test helper. We might need a similar test helper
to return these WWW-Authenticate headers and check the full request list
from Git matches the spec. Doing that while also executing the proper Git
commands to serve the HTTP bodies is hopefully not too large. It might be
nice to adapt such a helper to replace the need for a full Apache install
in our test suite, but that's an independent concern from this RFC.

> Limitations
> ===========
> 
> Because this credential model was built mostly for password based
> authentication systems, it's somewhat limited. In particular:
> 
>  1. To generate valid credentials, additional information about the request
>     (or indeed the requestee and their device) may be required. For example,
>     OAuth is based around scopes. A scope, like "git.read", might be
>     required to read data from the remote. However, the remote cannot tell
>     the credential helper what scope is required for this request.
> 
>  2. This system is not fully extensible. Each time a new type of
>     authentication (like OAuth Bearer) is invented, Git needs updates before
>     credential helpers can take advantage of it (or leverage a new
>     capability in libcurl).
> 
> 
> Goals
> =====
> 
>  * As a user with multiple federated cloud identities:

I'm not sure if you mentioned it anywhere else, but this is specifically
for cases where a user might have multiple identities _on the same host
by DNS name_. The credential.useHttpPath config option might seem like it
could help here, but the credential helper might pick the wrong identity
that is the most-recent login. Either this workflow will require the user
to re-login with every new URL or the fetches/clones will fail when the
guess is wrong and the user would need to learn how to log into that other
identity.

Please correct me if I'm wrong about any of this, but the details of your
goals make it clear that the workflow will be greatly improved:

>    * Reach out to a remote and have my credential helper automatically
>      prompt me for the correct identity.
>    * Leverage existing authentication systems built-in to many operating
>      systems and devices to boost security and reduce reliance on passwords.
> 
>  * As a Git host and/or cloud identity provider:
>    
>    * Leverage newest identity standards, enhancements, and threat
>      mitigations - all without updating Git.
>    * Enforce security policies (like requiring two-factor authentication)
>      dynamically.
>    * Allow integration with third party standard based identity providers in
>      enterprises allowing customers to have a single plane of control for
>      critical identities with access to source code.

I had a question with this part of your proposal:

>     Because the extra information forms an ordered list, and the existing
>     credential helper I/O format only provides for simple key=value pairs,
>     we introduce a new convention for transmitting an ordered list of
>     values. Key names that are suffixed with a C-style array syntax should
>     have values considered to form an order list, i.e. key[n]=value, where n
>     is a zero based index of the values.
>     
>     For the WWW-Authenticate header values we opt to use the key wwwauth[n].
...
> Git sends over standard input:
> 
> protocol=https
> host=example.com
> wwwauth[0]=Bearer realm="login.example", scope="git.readwrite"
> wwwauth[1]=Basic realm="login.example"

The important part here is that we provide a way to specify a multi-valued
key as opposed to a "last one wins" key, right?

Using empty braces (wwwauth[]) would suffice to indicate this, right? That
allows us to not care about the values inside the braces. The biggest
issues I see with a value in the braces are:

1. What if it isn't an integer?
2. What if we are missing a value?
3. What if they come out of order?

Without a value inside, then the order in which they appear provides
implicit indices in their multi-valued list.

Other than that, I support this idea and will start looking at the code
now.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 3/8] osxkeychain: clarify that we ignore unknown lines
  2022-09-13 19:25 ` [PATCH 3/8] osxkeychain: clarify that we ignore unknown lines Matthew John Cheetham via GitGitGadget
@ 2022-09-19 16:12   ` Derrick Stolee
  2022-09-21 22:48     ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Derrick Stolee @ 2022-09-19 16:12 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Matthew John Cheetham, Matthew John Cheetham

On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
> Like in all the other credential helpers, the osxkeychain helper
> ignores unknown credential lines.
> 
> Add a comment (a la the other helpers) to make it clear and explicit
> that this is the desired behaviour.

I recommend that these first three patches be submitted for full
review and merging, since they seem important independent of this
RFC.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 4/8] http: read HTTP WWW-Authenticate response headers
  2022-09-13 19:25 ` [PATCH 4/8] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
@ 2022-09-19 16:21   ` Derrick Stolee
  2022-09-21 22:24     ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Derrick Stolee @ 2022-09-19 16:21 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Matthew John Cheetham, Matthew John Cheetham

On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:

> +	/**
> +	 * A `strvec` of WWW-Authenticate header values. Each string
> +	 * is the value of a WWW-Authenticate header in an HTTP response,
> +	 * in the order they were received in the response.
> +	 */
> +	struct strvec wwwauth_headers;

I like this careful documentation.

> +	unsigned header_is_last_match:1;

But then this member is unclear how it is attached. It could use its
own "for internal use" comment if we don't want to describe it in full
detail here.

> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
> +{
> +	size_t size = eltsize * nmemb;
> +	struct strvec *values = &http_auth.wwwauth_headers;
> +	struct strbuf buf = STRBUF_INIT;
> +	const char *val;
> +	const char *z = NULL;
> +
> +	/*
> +	 * Header lines may not come NULL-terminated from libcurl so we must
> +	 * limit all scans to the maximum length of the header line, or leverage
> +	 * strbufs for all operations.
> +	 *
> +	 * In addition, it is possible that header values can be split over
> +	 * multiple lines as per RFC 2616 (even though this has since been
> +	 * deprecated in RFC 7230). A continuation header field value is
> +	 * identified as starting with a space or horizontal tab.
> +	 *
> +	 * The formal definition of a header field as given in RFC 2616 is:
> +	 *
> +	 *   message-header = field-name ":" [ field-value ]
> +	 *   field-name     = token
> +	 *   field-value    = *( field-content | LWS )
> +	 *   field-content  = <the OCTETs making up the field-value
> +	 *                    and consisting of either *TEXT or combinations
> +	 *                    of token, separators, and quoted-string>
> +	 */
> +
> +	strbuf_add(&buf, ptr, size);
> +
> +	/* Strip the CRLF that should be present at the end of each field */

Is it really a CRLF? Or just an LF?

> +	strbuf_trim_trailing_newline(&buf);

Thankfully, this will trim an LF _or_ CR/LF pair, so either way would be fine.

> +	/* Start of a new WWW-Authenticate header */
> +	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
> +		while (isspace(*val)) val++;

Break the "val++;" to its own line:

		while (isspace(*val))
			val++;

While we are here, do we need to be careful about the end of the string at
this point? Is it possible that the server will send all spaces up until the
maximum header size (as mentioned in the message)?

> +
> +		strvec_push(values, val);
> +		http_auth.header_is_last_match = 1;
> +		goto exit;
> +	}
> +
> +	/*
> +	 * This line could be a continuation of the previously matched header
> +	 * field. If this is the case then we should append this value to the
> +	 * end of the previously consumed value.
> +	 */
> +	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
> +		const char **v = values->v + values->nr - 1;

I suppose we expect leading spaces as critical to this header, right?

> +		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);

We might have better luck using a strbuf, initializing it with the expected
size and using strbuf_add() to append the strings. Maybe I'm just prematurely
optimizing, though.

> +
> +		free((void*)*v);
> +		*v = append;
> +
> +		goto exit;
> +	}
> +
> +	/* This is the start of a new header we don't care about */
> +	http_auth.header_is_last_match = 0;
> +
> +	/*
> +	 * If this is a HTTP status line and not a header field, this signals
> +	 * a different HTTP response. libcurl writes all the output of all
> +	 * response headers of all responses, including redirects.
> +	 * We only care about the last HTTP request response's headers so clear
> +	 * the existing array.
> +	 */
> +	if (skip_iprefix(buf.buf, "http/", &z))
> +		strvec_clear(values);
> +
> +exit:
> +	strbuf_release(&buf);
> +	return size;
> +}
> +
>  size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
>  {
>  	return nmemb;
> @@ -1829,6 +1904,8 @@ static int http_request(const char *url,
>  					 fwrite_buffer);
>  	}
>  
> +	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);

Nice integration point!

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 5/8] credential: add WWW-Authenticate header to cred requests
  2022-09-13 19:25 ` [PATCH 5/8] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
@ 2022-09-19 16:33   ` Derrick Stolee
  2022-09-21 22:20     ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Derrick Stolee @ 2022-09-19 16:33 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Matthew John Cheetham, Matthew John Cheetham

On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>

> In this case we send multiple `wwwauth[n]` properties where `n` is a
> zero-indexed number, reflecting the order the WWW-Authenticate headers
> appeared in the HTTP response.
> @@ -151,6 +151,15 @@ Git understands the following attributes:
>  	were read (e.g., `url=https://example.com` would behave as if
>  	`protocol=https` and `host=example.com` had been provided). This
>  	can help callers avoid parsing URLs themselves.
> +
> +`wwwauth[n]`::
> +
> +	When an HTTP response is received that includes one or more
> +	'WWW-Authenticate' authentication headers, these can be passed to Git
> +	(and subsequent credential helpers) with these attributes.
> +	Each 'WWW-Authenticate' header value should be passed as a separate
> +	attribute 'wwwauth[n]' where 'n' is the zero-indexed order the headers
> +	appear in the HTTP response.
>  +
>  Note that specifying a protocol is mandatory and if the URL
>  doesn't specify a hostname (e.g., "cert:///path/to/file") the

This "+" means that this paragraph should be connected to the previous
one, so it seems that you've inserted your new value in the middle of
the `url` key. You'll want to move yours to be after those two connected
paragraphs. Your diff hunk should look like this:

--- >8 ---

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index f18673017f..127ae29be3 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -160,6 +160,15 @@ empty string.
 Components which are missing from the URL (e.g., there is no
 username in the example above) will be left unset.
 
+`wwwauth[n]`::
+
+	When an HTTP response is received that includes one or more
+	'WWW-Authenticate' authentication headers, these can be passed to Git
+	(and subsequent credential helpers) with these attributes.
+	Each 'WWW-Authenticate' header value should be passed as a separate
+	attribute 'wwwauth[n]' where 'n' is the zero-indexed order the headers
+	appear in the HTTP response.
+
 GIT
 ---
 Part of the linkgit:git[1] suite


--- >8 ---

> diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
> index 497b9b9d927..fe118d76f98 100644
> --- a/t/lib-httpd/apache.conf
> +++ b/t/lib-httpd/apache.conf
> @@ -235,6 +235,19 @@ SSLEngine On
>  	Require valid-user
>  </LocationMatch>
>  
> +# Advertise two additional auth methods above "Basic".
> +# Neither of them actually work but serve test cases showing these
> +# additional auth headers are consumed correctly.
> +<Location /auth-wwwauth/>
> +	AuthType Basic
> +	AuthName "git-auth"
> +	AuthUserFile passwd
> +	Require valid-user
> +	SetEnvIf Authorization "^\S+" authz
> +	Header always add WWW-Authenticate "Bearer authority=https://login.example.com" env=!authz
> +	Header always add WWW-Authenticate "FooAuth foo=bar baz=1" env=!authz
> +</Location>
> +

This is cool that you've figured out how to make our Apache tests
add these headers! Maybe we won't need that extra test helper like
I thought (unless we want to confirm the second request sends the
right information).

> +test_expect_success 'http auth sends www-auth headers to credential helper' '
> +	write_script git-credential-tee <<-\EOF &&
> +		cmd=$1
> +		teefile=credential-$cmd
> +		if [ -f "$teefile" ]; then

I think we prefer using "test" over the braces (and linebreak
before then) like this:

		if test -n "$teefile"
		then

> +			rm $teefile
> +		fi

Alternatively, you could always run "rm -f $teefile" for
simplicity.

> +		(
> +			while read line;
> +			do
> +				if [ -z "$line" ]; then
> +					exit 0
> +				fi
> +				echo "$line" >> $teefile
> +				echo $line
> +			done
> +		) | git credential-store $cmd

Since I'm not sure, I'll ask the question: do we need the sub-shell
here, or could we pipe directly off of the "done"? Like this:

		while read line;
		do
			if [ -z "$line" ]; then
				exit 0
			fi
			echo "$line" >> $teefile
			echo $line
		done | git credential-store $cmd

> +	EOF


> +	cat >expected-get <<-EOF &&
> +	protocol=http
> +	host=127.0.0.1:5551
> +	wwwauth[0]=Bearer authority=https://login.example.com
> +	wwwauth[1]=FooAuth foo=bar baz=1
> +	wwwauth[2]=Basic realm="git-auth"
> +	EOF
> +
> +	cat >expected-store <<-EOF &&
> +	protocol=http
> +	host=127.0.0.1:5551
> +	username=user@host
> +	password=pass@host
> +	EOF
> +
> +	rm -f .git-credentials &&
> +	test_config credential.helper tee &&
> +	set_askpass user@host pass@host &&
> +	(
> +		PATH="$PWD:$PATH" &&
> +		git ls-remote "$HTTPD_URL/auth-wwwauth/smart/repo.git"
> +	) &&
> +	expect_askpass both user@host &&
> +	test_cmp expected-get credential-get &&
> +	test_cmp expected-store credential-store

Elegant check for both calls.

Thanks,
-Stolee

^ permalink raw reply related	[flat|nested] 171+ messages in thread

* Re: [PATCH 8/8] http: set specific auth scheme depending on credential
  2022-09-13 19:25 ` [PATCH 8/8] http: set specific auth scheme depending on credential Matthew John Cheetham via GitGitGadget
@ 2022-09-19 16:42   ` Derrick Stolee
  0 siblings, 0 replies; 171+ messages in thread
From: Derrick Stolee @ 2022-09-19 16:42 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Matthew John Cheetham, Matthew John Cheetham

On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
> Introduce a new credential field `authtype` that can be used by
> credential helpers to indicate the type of the credential or
> authentication mechanism to use for a request.
> 
> Modify http.c to now specify the correct authentication scheme or
> credential type when authenticating the curl handle. If the new
> `authtype` field in the credential structure is `NULL` or "Basic" then
> use the existing username/password options. If the field is "Bearer"
> then use the OAuth bearer token curl option. Otherwise, the `authtype`
> field is the authentication scheme and the `password` field is the
> raw, unencoded value.


> @@ -524,8 +525,25 @@ static void init_curl_http_auth(struct active_request_slot *slot)
>  
>  	credential_fill(&http_auth);
>  
> -	curl_easy_setopt(slot->curl, CURLOPT_USERNAME, http_auth.username);
> -	curl_easy_setopt(slot->curl, CURLOPT_PASSWORD, http_auth.password);
> +	if (!http_auth.authtype || !strcasecmp(http_auth.authtype, "basic")
> +				|| !strcasecmp(http_auth.authtype, "digest")) {
> +		curl_easy_setopt(slot->curl, CURLOPT_USERNAME,
> +			http_auth.username);
> +		curl_easy_setopt(slot->curl, CURLOPT_PASSWORD,
> +			http_auth.password);
> +#ifdef GIT_CURL_HAVE_CURLAUTH_BEARER
> +	} else if (!strcasecmp(http_auth.authtype, "bearer")) {
> +		curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, CURLAUTH_BEARER);
> +		curl_easy_setopt(slot->curl, CURLOPT_XOAUTH2_BEARER,
> +			http_auth.password);
> +#endif
> +	} else {
> +		struct strbuf auth = STRBUF_INIT;
> +		strbuf_addf(&auth, "Authorization: %s %s",
> +			http_auth.authtype, http_auth.password);
> +		slot->headers = curl_slist_append(slot->headers, auth.buf);
> +		strbuf_release(&auth);
> +	}
>  }

It would be good to have a test here, and the only way I can think
to add it would be to modify one of the test credential helpers to
indicate that OAuth is being used.

The test would somehow need to be careful about the curl version,
though, and I'm not sure if we have prior work for writing prereqs
based on the linked curl version.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers
  2022-09-19 16:08 ` [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Derrick Stolee
@ 2022-09-19 16:44   ` Derrick Stolee
  2022-09-21 22:19   ` Matthew John Cheetham
  1 sibling, 0 replies; 171+ messages in thread
From: Derrick Stolee @ 2022-09-19 16:44 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git; +Cc: Matthew John Cheetham

On 9/19/2022 12:08 PM, Derrick Stolee wrote:
> On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:

>> protocol=https
>> host=example.com
>> wwwauth[0]=Bearer realm="login.example", scope="git.readwrite"
>> wwwauth[1]=Basic realm="login.example"
> 
> The important part here is that we provide a way to specify a multi-valued
> key as opposed to a "last one wins" key, right?
> 
> Using empty braces (wwwauth[]) would suffice to indicate this, right? That
> allows us to not care about the values inside the braces. The biggest
> issues I see with a value in the braces are:
> 
> 1. What if it isn't an integer?
> 2. What if we are missing a value?
> 3. What if they come out of order?
> 
> Without a value inside, then the order in which they appear provides
> implicit indices in their multi-valued list.

After looking at the code, it would not be difficult at all to make this
change in-place for these patches. But I won't push too hard if there is
some reason to keep the index values.
 
> Other than that, I support this idea and will start looking at the code
> now.

I took a look and provided feedback as I could. Patches 6 and 7 eluded
me only because I'm so unfamiliar with the http.c code and don't have
time to learn it today.

I mentioned that patches 1-3 could easily be picked up as a topic while
the rest of the series is considered carefully.

I tried to add some mentions of testing, but you've already tested more
than I expected, by adding the headers to the Apache output.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                   ` (8 preceding siblings ...)
  2022-09-19 16:08 ` [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Derrick Stolee
@ 2022-09-19 23:36 ` Lessley Dennington
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
  10 siblings, 0 replies; 171+ messages in thread
From: Lessley Dennington @ 2022-09-19 23:36 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git; +Cc: Matthew John Cheetham

This is a really exciting idea! Based on your patches, it seems to be a
great opportunity to add extensibility and flexibility to the credential
helper model without huge disruptions to the codebase. Well done!

On 9/13/22 12:25 PM, Matthew John Cheetham via GitGitGadget wrote:
>   3. Teach Git to specify authentication schemes other than Basic in
>      subsequent HTTP requests based on credential helper responses.
> 
This!! Yes!!
> 
> ...
> wwwauth=Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"
> 
I think sending the fields individually (as you describe in this doc and
implement in your patches) is the right call. In my opinion, it's more
legible, consistent with the remote response, and aligns with your goal of
minimizing authentication-related actions in Git.

Best,

Lessley

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers
  2022-09-19 16:08 ` [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Derrick Stolee
  2022-09-19 16:44   ` Derrick Stolee
@ 2022-09-21 22:19   ` Matthew John Cheetham
  1 sibling, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2022-09-21 22:19 UTC (permalink / raw)
  To: Derrick Stolee, Matthew John Cheetham via GitGitGadget, git

On 2022-09-19 09:08, Derrick Stolee wrote:
> On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:
>> Hello! I have an RFC to update the existing credential helper design in
>> order to allow for some new scenarios, and future evolution of auth methods
>> that Git hosts may wish to provide. I outline the background, summary of
>> changes and some challenges below. I also attach a series of patches to
>> illustrate the design proposal.
> 
> It's unfortunate that we didn't get to talk about this during the
> contributor summit, but it is super-technical and worth looking closely
> at all the details. 
> 
>> One missing element from the patches are extensive tests of the new
>> behaviour. It appears existing tests focus either on the credential helper
>> protocol/format, or rely on testing basic authentication only via an Apache
>> webserver. In order to have a full end to end test coverage of these new
>> features it make be that we need a more comprehensive test bed to mock these
>> more nuanced authentication methods. I lean on the experts on the list for
>> advice here.
> 
> The microsoft/git fork has a feature (the GVFS Protocol) that requires a
> custom HTTP server as a test helper. We might need a similar test helper
> to return these WWW-Authenticate headers and check the full request list
> from Git matches the spec. Doing that while also executing the proper Git
> commands to serve the HTTP bodies is hopefully not too large. It might be
> nice to adapt such a helper to replace the need for a full Apache install
> in our test suite, but that's an independent concern from this RFC.

That's a good reference and possible solution to the testing question, and
definitely something I can look at adding. I just wanted another pair of
eyes and thoughts on any other options that I may have been missing in the
existing testing repertoire, before embarking on writing such a test helper.

>> Limitations
>> ===========
>>
>> Because this credential model was built mostly for password based
>> authentication systems, it's somewhat limited. In particular:
>>
>>  1. To generate valid credentials, additional information about the request
>>     (or indeed the requestee and their device) may be required. For example,
>>     OAuth is based around scopes. A scope, like "git.read", might be
>>     required to read data from the remote. However, the remote cannot tell
>>     the credential helper what scope is required for this request.
>>
>>  2. This system is not fully extensible. Each time a new type of
>>     authentication (like OAuth Bearer) is invented, Git needs updates before
>>     credential helpers can take advantage of it (or leverage a new
>>     capability in libcurl).
>>
>>
>> Goals
>> =====
>>
>>  * As a user with multiple federated cloud identities:
> 
> I'm not sure if you mentioned it anywhere else, but this is specifically
> for cases where a user might have multiple identities _on the same host
> by DNS name_. The credential.useHttpPath config option might seem like it
> could help here, but the credential helper might pick the wrong identity
> that is the most-recent login. Either this workflow will require the user
> to re-login with every new URL or the fetches/clones will fail when the
> guess is wrong and the user would need to learn how to log into that other
> identity.
> 
> Please correct me if I'm wrong about any of this, but the details of your
> goals make it clear that the workflow will be greatly improved:

Such a scenario where multiple identities may be available for the same DNS
hostname would indeed be improved (with an appropriately enlightened
credential helper of course). As you mentioned, credential.useHttpPath can
also be used to workaround such a situation, but that just creates another
problem in that users need to provide the same set of credentials for each
repository with a full remote URL path that use the same identity.

By providing information about the auth challenge (including parameters
like authority or realm if present) would allow credential helpers select
or filter known identities and credentials automatically, avoiding user
input.

>>    * Reach out to a remote and have my credential helper automatically
>>      prompt me for the correct identity.
>>    * Leverage existing authentication systems built-in to many operating
>>      systems and devices to boost security and reduce reliance on passwords.
>>
>>  * As a Git host and/or cloud identity provider:
>>    
>>    * Leverage newest identity standards, enhancements, and threat
>>      mitigations - all without updating Git.
>>    * Enforce security policies (like requiring two-factor authentication)
>>      dynamically.
>>    * Allow integration with third party standard based identity providers in
>>      enterprises allowing customers to have a single plane of control for
>>      critical identities with access to source code.
> 
> I had a question with this part of your proposal:
> 
>>     Because the extra information forms an ordered list, and the existing
>>     credential helper I/O format only provides for simple key=value pairs,
>>     we introduce a new convention for transmitting an ordered list of
>>     values. Key names that are suffixed with a C-style array syntax should
>>     have values considered to form an order list, i.e. key[n]=value, where n
>>     is a zero based index of the values.
>>     
>>     For the WWW-Authenticate header values we opt to use the key wwwauth[n].
> ...
>> Git sends over standard input:
>>
>> protocol=https
>> host=example.com
>> wwwauth[0]=Bearer realm="login.example", scope="git.readwrite"
>> wwwauth[1]=Basic realm="login.example"
> 
> The important part here is that we provide a way to specify a multi-valued
> key as opposed to a "last one wins" key, right?
> 
> Using empty braces (wwwauth[]) would suffice to indicate this, right? That
> allows us to not care about the values inside the braces. The biggest
> issues I see with a value in the braces are:
> 
> 1. What if it isn't an integer?
> 2. What if we are missing a value?
> 3. What if they come out of order?
> 
> Without a value inside, then the order in which they appear provides
> implicit indices in their multi-valued list.
> 
> Other than that, I support this idea and will start looking at the code
> now.

There are two important things this extension to the I/O format provides:
1) multi-valued keys, and 2) ordering to the multiple values.

You are correct that dropping the integer index still means we still meet
requirement 1, and implicitly meet requirement 2. In this proposal I was
just being explicit in the ordering - it's not something I'm overly
attached to however, and may indeed make parsing or identifiying these
multi-valued keys easier on the credential helper side of things.

> Thanks,
> -Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 5/8] credential: add WWW-Authenticate header to cred requests
  2022-09-19 16:33   ` Derrick Stolee
@ 2022-09-21 22:20     ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2022-09-21 22:20 UTC (permalink / raw)
  To: Derrick Stolee, Matthew John Cheetham via GitGitGadget, git

On 2022-09-19 09:33, Derrick Stolee wrote:
> On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
>> In this case we send multiple `wwwauth[n]` properties where `n` is a
>> zero-indexed number, reflecting the order the WWW-Authenticate headers
>> appeared in the HTTP response.
>> @@ -151,6 +151,15 @@ Git understands the following attributes:
>>  	were read (e.g., `url=https://example.com` would behave as if
>>  	`protocol=https` and `host=example.com` had been provided). This
>>  	can help callers avoid parsing URLs themselves.
>> +
>> +`wwwauth[n]`::
>> +
>> +	When an HTTP response is received that includes one or more
>> +	'WWW-Authenticate' authentication headers, these can be passed to Git
>> +	(and subsequent credential helpers) with these attributes.
>> +	Each 'WWW-Authenticate' header value should be passed as a separate
>> +	attribute 'wwwauth[n]' where 'n' is the zero-indexed order the headers
>> +	appear in the HTTP response.
>>  +
>>  Note that specifying a protocol is mandatory and if the URL
>>  doesn't specify a hostname (e.g., "cert:///path/to/file") the
> 
> This "+" means that this paragraph should be connected to the previous
> one, so it seems that you've inserted your new value in the middle of
> the `url` key. You'll want to move yours to be after those two connected
> paragraphs. Your diff hunk should look like this:
> 
> --- >8 ---
> 
> diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
> index f18673017f..127ae29be3 100644
> --- a/Documentation/git-credential.txt
> +++ b/Documentation/git-credential.txt
> @@ -160,6 +160,15 @@ empty string.
>  Components which are missing from the URL (e.g., there is no
>  username in the example above) will be left unset.
>  
> +`wwwauth[n]`::
> +
> +	When an HTTP response is received that includes one or more
> +	'WWW-Authenticate' authentication headers, these can be passed to Git
> +	(and subsequent credential helpers) with these attributes.
> +	Each 'WWW-Authenticate' header value should be passed as a separate
> +	attribute 'wwwauth[n]' where 'n' is the zero-indexed order the headers
> +	appear in the HTTP response.
> +
>  GIT
>  ---
>  Part of the linkgit:git[1] suite
> 
> 
> --- >8 ---

Thanks for catching!

>> diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
>> index 497b9b9d927..fe118d76f98 100644
>> --- a/t/lib-httpd/apache.conf
>> +++ b/t/lib-httpd/apache.conf
>> @@ -235,6 +235,19 @@ SSLEngine On
>>  	Require valid-user
>>  </LocationMatch>
>>  
>> +# Advertise two additional auth methods above "Basic".
>> +# Neither of them actually work but serve test cases showing these
>> +# additional auth headers are consumed correctly.
>> +<Location /auth-wwwauth/>
>> +	AuthType Basic
>> +	AuthName "git-auth"
>> +	AuthUserFile passwd
>> +	Require valid-user
>> +	SetEnvIf Authorization "^\S+" authz
>> +	Header always add WWW-Authenticate "Bearer authority=https://login.example.com" env=!authz
>> +	Header always add WWW-Authenticate "FooAuth foo=bar baz=1" env=!authz
>> +</Location>
>> +
> 
> This is cool that you've figured out how to make our Apache tests
> add these headers! Maybe we won't need that extra test helper like
> I thought (unless we want to confirm the second request sends the
> right information).

This will exercise the new header parsing and passing the info to the helper
but will indeed not test the response. I feel like a test helper would be
beneficial still.. what I've done here doesn't feel 100% clean or complete.

>> +test_expect_success 'http auth sends www-auth headers to credential helper' '
>> +	write_script git-credential-tee <<-\EOF &&
>> +		cmd=$1
>> +		teefile=credential-$cmd
>> +		if [ -f "$teefile" ]; then
> 
> I think we prefer using "test" over the braces (and linebreak
> before then) like this:
> 
> 		if test -n "$teefile"
> 		then
> 
>> +			rm $teefile
>> +		fi
> 
> Alternatively, you could always run "rm -f $teefile" for
> simplicity.
I like simple :-)

>> +		(
>> +			while read line;
>> +			do
>> +				if [ -z "$line" ]; then
>> +					exit 0
>> +				fi
>> +				echo "$line" >> $teefile
>> +				echo $line
>> +			done
>> +		) | git credential-store $cmd
> 
> Since I'm not sure, I'll ask the question: do we need the sub-shell
> here, or could we pipe directly off of the "done"? Like this:
> 
> 		while read line;
> 		do
> 			if [ -z "$line" ]; then
> 				exit 0
> 			fi
> 			echo "$line" >> $teefile
> 			echo $line
> 		done | git credential-store $cmd

That we can.. I will update in next iteration.

>> +	EOF
> 
> 
>> +	cat >expected-get <<-EOF &&
>> +	protocol=http
>> +	host=127.0.0.1:5551
>> +	wwwauth[0]=Bearer authority=https://login.example.com
>> +	wwwauth[1]=FooAuth foo=bar baz=1
>> +	wwwauth[2]=Basic realm="git-auth"
>> +	EOF
>> +
>> +	cat >expected-store <<-EOF &&
>> +	protocol=http
>> +	host=127.0.0.1:5551
>> +	username=user@host
>> +	password=pass@host
>> +	EOF
>> +
>> +	rm -f .git-credentials &&
>> +	test_config credential.helper tee &&
>> +	set_askpass user@host pass@host &&
>> +	(
>> +		PATH="$PWD:$PATH" &&
>> +		git ls-remote "$HTTPD_URL/auth-wwwauth/smart/repo.git"
>> +	) &&
>> +	expect_askpass both user@host &&
>> +	test_cmp expected-get credential-get &&
>> +	test_cmp expected-store credential-store
> 
> Elegant check for both calls.
> 
> Thanks,
> -Stolee

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 4/8] http: read HTTP WWW-Authenticate response headers
  2022-09-19 16:21   ` Derrick Stolee
@ 2022-09-21 22:24     ` Matthew John Cheetham
  2022-09-26 14:13       ` Derrick Stolee
  0 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham @ 2022-09-21 22:24 UTC (permalink / raw)
  To: Derrick Stolee, Matthew John Cheetham via GitGitGadget, git

On 2022-09-19 09:21, Derrick Stolee wrote:
> On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:
> 
>> +	/**
>> +	 * A `strvec` of WWW-Authenticate header values. Each string
>> +	 * is the value of a WWW-Authenticate header in an HTTP response,
>> +	 * in the order they were received in the response.
>> +	 */
>> +	struct strvec wwwauth_headers;
> 
> I like this careful documentation.
> 
>> +	unsigned header_is_last_match:1;
> 
> But then this member is unclear how it is attached. It could use its
> own "for internal use" comment if we don't want to describe it in full
> detail here.

A fair point. I will update in a future iteration.

>> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
>> +{
>> +	size_t size = eltsize * nmemb;
>> +	struct strvec *values = &http_auth.wwwauth_headers;
>> +	struct strbuf buf = STRBUF_INIT;
>> +	const char *val;
>> +	const char *z = NULL;
>> +
>> +	/*
>> +	 * Header lines may not come NULL-terminated from libcurl so we must
>> +	 * limit all scans to the maximum length of the header line, or leverage
>> +	 * strbufs for all operations.
>> +	 *
>> +	 * In addition, it is possible that header values can be split over
>> +	 * multiple lines as per RFC 2616 (even though this has since been
>> +	 * deprecated in RFC 7230). A continuation header field value is
>> +	 * identified as starting with a space or horizontal tab.
>> +	 *
>> +	 * The formal definition of a header field as given in RFC 2616 is:
>> +	 *
>> +	 *   message-header = field-name ":" [ field-value ]
>> +	 *   field-name     = token
>> +	 *   field-value    = *( field-content | LWS )
>> +	 *   field-content  = <the OCTETs making up the field-value
>> +	 *                    and consisting of either *TEXT or combinations
>> +	 *                    of token, separators, and quoted-string>
>> +	 */
>> +
>> +	strbuf_add(&buf, ptr, size);
>> +
>> +	/* Strip the CRLF that should be present at the end of each field */
> 
> Is it really a CRLF? Or just an LF?

It is indeed an CRLF, agnostic of platform. HTTP defines CRLF as the
end-of-line marker for all entities other than the body.

See RFC 2616 section 2.2: https://www.rfc-editor.org/rfc/rfc2616#section-2.2

>> +	strbuf_trim_trailing_newline(&buf);
> 
> Thankfully, this will trim an LF _or_ CR/LF pair, so either way would be fine.
> 
>> +	/* Start of a new WWW-Authenticate header */
>> +	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
>> +		while (isspace(*val)) val++;
> 
> Break the "val++;" to its own line:
> 
> 		while (isspace(*val))
> 			val++;

Sure! Sorry I missed this one.

> While we are here, do we need to be careful about the end of the string at
> this point? Is it possible that the server will send all spaces up until the
> maximum header size (as mentioned in the message)?
> 
>> +
>> +		strvec_push(values, val);
>> +		http_auth.header_is_last_match = 1;
>> +		goto exit;
>> +	}
>> +
>> +	/*
>> +	 * This line could be a continuation of the previously matched header
>> +	 * field. If this is the case then we should append this value to the
>> +	 * end of the previously consumed value.
>> +	 */
>> +	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
>> +		const char **v = values->v + values->nr - 1;
> 
> I suppose we expect leading spaces as critical to this header, right?

Leading (and trailing) spaces are not part of the header value.

From RFC 2616 section 2.2 regarding header field values:

"All linear white space, including folding, has the same semantics as SP.
A recipient MAY replace any linear white space with a single SP before
interpreting the field value or forwarding the message downstream."

>> +		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
> 
> We might have better luck using a strbuf, initializing it with the expected
> size and using strbuf_add() to append the strings. Maybe I'm just prematurely
> optimizing, though.

This code path is used to re-join/fold a header value continuation, which is
pretty rare in the wild (if at all with modern web servers).

>> +
>> +		free((void*)*v);
>> +		*v = append;
>> +
>> +		goto exit;
>> +	}
>> +
>> +	/* This is the start of a new header we don't care about */
>> +	http_auth.header_is_last_match = 0;
>> +
>> +	/*
>> +	 * If this is a HTTP status line and not a header field, this signals
>> +	 * a different HTTP response. libcurl writes all the output of all
>> +	 * response headers of all responses, including redirects.
>> +	 * We only care about the last HTTP request response's headers so clear
>> +	 * the existing array.
>> +	 */
>> +	if (skip_iprefix(buf.buf, "http/", &z))
>> +		strvec_clear(values);
>> +
>> +exit:
>> +	strbuf_release(&buf);
>> +	return size;
>> +}
>> +
>>  size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
>>  {
>>  	return nmemb;
>> @@ -1829,6 +1904,8 @@ static int http_request(const char *url,
>>  					 fwrite_buffer);
>>  	}
>>  
>> +	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);
> 
> Nice integration point!
> 
> Thanks,
> -Stolee

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 3/8] osxkeychain: clarify that we ignore unknown lines
  2022-09-19 16:12   ` Derrick Stolee
@ 2022-09-21 22:48     ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2022-09-21 22:48 UTC (permalink / raw)
  To: Derrick Stolee, Matthew John Cheetham via GitGitGadget, git

On 2022-09-19 09:12, Derrick Stolee wrote:
> On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
>>
>> Like in all the other credential helpers, the osxkeychain helper
>> ignores unknown credential lines.
>>
>> Add a comment (a la the other helpers) to make it clear and explicit
>> that this is the desired behaviour.
> 
> I recommend that these first three patches be submitted for full
> review and merging, since they seem important independent of this
> RFC.
> 
> Thanks,
> -Stolee

That's a fair point. I will submit these independently.

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH 4/8] http: read HTTP WWW-Authenticate response headers
  2022-09-21 22:24     ` Matthew John Cheetham
@ 2022-09-26 14:13       ` Derrick Stolee
  0 siblings, 0 replies; 171+ messages in thread
From: Derrick Stolee @ 2022-09-26 14:13 UTC (permalink / raw)
  To: Matthew John Cheetham, Matthew John Cheetham via GitGitGadget, git

On 9/21/2022 6:24 PM, Matthew John Cheetham wrote:
> On 2022-09-19 09:21, Derrick Stolee wrote:
>> On 9/13/2022 3:25 PM, Matthew John Cheetham via GitGitGadget wrote:

>>> +
>>> +		strvec_push(values, val);
>>> +		http_auth.header_is_last_match = 1;
>>> +		goto exit;
>>> +	}
>>> +
>>> +	/*
>>> +	 * This line could be a continuation of the previously matched header
>>> +	 * field. If this is the case then we should append this value to the
>>> +	 * end of the previously consumed value.
>>> +	 */
>>> +	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
>>> +		const char **v = values->v + values->nr - 1;
>>
>> I suppose we expect leading spaces as critical to this header, right?
> 
> Leading (and trailing) spaces are not part of the header value.
> 
> From RFC 2616 section 2.2 regarding header field values:
> 
> "All linear white space, including folding, has the same semantics as SP.
> A recipient MAY replace any linear white space with a single SP before
> interpreting the field value or forwarding the message downstream."
> 
>>> +		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
>>
>> We might have better luck using a strbuf, initializing it with the expected
>> size and using strbuf_add() to append the strings. Maybe I'm just prematurely
>> optimizing, though.
> 
> This code path is used to re-join/fold a header value continuation, which is
> pretty rare in the wild (if at all with modern web servers).

I think the point is that I noticed that you removed the leading whitespace
in a header's first line, but additional whitespace after this first space
will be included in the concatenated content of the header value.

As long as that is the intention, then I'm happy here.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* [PATCH v2 0/6] Enhance credential helper protocol to include auth headers
  2022-09-13 19:25 [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                   ` (9 preceding siblings ...)
  2022-09-19 23:36 ` Lessley Dennington
@ 2022-10-21 17:07 ` Matthew John Cheetham via GitGitGadget
  2022-10-21 17:07   ` [PATCH v2 1/6] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
                     ` (7 more replies)
  10 siblings, 8 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-10-21 17:07 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	Matthew John Cheetham

Following from my original RFC submission [0], this submission is considered
ready for full review. This patch series is now based on top of current
master (9c32cfb49c60fa8173b9666db02efe3b45a8522f) that includes my now
separately submitted patches [1] to fix up the other credential helpers'
behaviour.

In this patch series I update the existing credential helper design in order
to allow for some new scenarios, and future evolution of auth methods that
Git hosts may wish to provide. I outline the background, summary of changes
and some challenges below.

Testing these new additions, I introduce a new test helper test-http-server
that acts as a frontend to git-http-backend; a mini HTTP server based
heavily on git-daemon, with simple authentication configurable by command
line args.


Background
==========

Git uses a variety of protocols [2]: local, Smart HTTP, Dumb HTTP, SSH, and
Git. Here I focus on the Smart HTTP protocol, and attempt to enhance the
authentication capabilities of this protocol to address limitations (see
below).

The Smart HTTP protocol in Git supports a few different types of HTTP
authentication - Basic and Digest (RFC 2617) [3], and Negotiate (RFC 2478)
[4]. Git uses a extensible model where credential helpers can provide
credentials for protocols [5]. Several helpers support alternatives such as
OAuth authentication (RFC 6749) [6], but this is typically done as an
extension. For example, a helper might use basic auth and set the password
to an OAuth Bearer access token. Git uses standard input and output to
communicate with credential helpers.

After a HTTP 401 response, Git would call a credential helper with the
following over standard input:

protocol=https
host=example.com


And then a credential helper would return over standard output:

protocol=https
host=example.com
username=bob@id.example.com
password=<BEARER-TOKEN>


Git then the following request to the remote, including the standard HTTP
Authorization header (RFC 7235 Section 4.2) [7]:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: Basic base64(bob@id.example.com:<BEARER-TOKEN>)


Credential helpers are encouraged (see gitcredentials.txt) to return the
minimum information necessary.


Limitations
===========

Because this credential model was built mostly for password based
authentication systems, it's somewhat limited. In particular:

 1. To generate valid credentials, additional information about the request
    (or indeed the requestee and their device) may be required. For example,
    OAuth is based around scopes. A scope, like "git.read", might be
    required to read data from the remote. However, the remote cannot tell
    the credential helper what scope is required for this request.

 2. This system is not fully extensible. Each time a new type of
    authentication (like OAuth Bearer) is invented, Git needs updates before
    credential helpers can take advantage of it (or leverage a new
    capability in libcurl).


Goals
=====

 * As a user with multiple federated cloud identities:
   
   * Reach out to a remote and have my credential helper automatically
     prompt me for the correct identity.
   * Allow credential helpers to differentiate between different authorities
     or authentication/authorization challenge types, even from the same DNS
     hostname (and without needing to use credential.useHttpPath).
   * Leverage existing authentication systems built-in to many operating
     systems and devices to boost security and reduce reliance on passwords.

 * As a Git host and/or cloud identity provider:
   
   * Leverage newest identity standards, enhancements, and threat
     mitigations - all without updating Git.
   * Enforce security policies (like requiring two-factor authentication)
     dynamically.
   * Allow integration with third party standard based identity providers in
     enterprises allowing customers to have a single plane of control for
     critical identities with access to source code.


Design Principles
=================

 * Use the existing infrastructure. Git credential helpers are an
   already-working model.
 * Follow widely-adopted time-proven open standards, avoid net new ideas in
   the authentication space.
 * Minimize knowledge of authentication in Git; maintain modularity and
   extensibility.


Proposed Changes
================

 1. Teach Git to read HTTP response headers, specifically the standard
    WWW-Authenticate (RFC 7235 Section 4.1) headers.

 2. Teach Git to include extra information about HTTP responses that require
    authentication when calling credential helpers. Specifically the
    WWW-Authenticate header information.
    
    Because the extra information forms an ordered list, and the existing
    credential helper I/O format only provides for simple key=value pairs,
    we introduce a new convention for transmitting an ordered list of
    values. Key names that are suffixed with a C-style array syntax should
    have values considered to form an order list, i.e. key[]=value, where
    the order of the key=value pairs in the stream specifies the order.
    
    For the WWW-Authenticate header values we opt to use the key wwwauth[].

 3. Teach Git to specify authentication schemes other than Basic in
    subsequent HTTP requests based on credential helper responses.


Handling the WWW-Authenticate header in detail
==============================================

RFC 6750 [8] envisions that OAuth Bearer resource servers would give
responses that include WWW-Authenticate headers, for example:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


Specifically, a WWW-Authenticate header consists of a scheme and arbitrary
attributes, depending on the scheme. This pattern enables generic OAuth or
OpenID Connect [9] authorities. Note that it is possible to have several
WWW-Authenticate challenges in a response.

First Git attempts to make a request, unauthenticated, which fails with a
401 response and includes WWW-Authenticate header(s).

Next, Git invokes a credential helper which may prompt the user. If the user
approves, a credential helper can generate a token (or any auth challenge
response) to be used for that request.

For example: with a remote that supports bearer tokens from an OpenID
Connect [9] authority, a credential helper can use OpenID Connect's
Discovery [10] and Dynamic Client Registration [11] to register a client and
make a request with the correct permissions to access the remote. In this
manner, a user can be dynamically sent to the right federated identity
provider for a remote without any up-front configuration or manual
processes.

Following from the principle of keeping authentication knowledge in Git to a
minimum, we modify Git to add all WWW-Authenticate values to the credential
helper call.

Git sends over standard input:

protocol=https
host=example.com
wwwauth[]=Bearer realm="login.example", scope="git.readwrite"
wwwauth[]=Basic realm="login.example"


A credential helper that understands the extra wwwauth[n] property can
decide on the "best" or correct authentication scheme, generate credentials
for the request, and interact with the user.

The credential helper would then return over standard output:

protocol=https
host=example.com
path=foo.git
username=bob@identity.example
password=<BEARER-TOKEN>


Note that WWW-Authenticate supports multiple challenges, either in one
header:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"


or in multiple headers:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


These have equivalent meaning (RFC 2616 Section 4.2 [12]). To simplify the
implementation, Git will not merge or split up any of these WWW-Authenticate
headers, and instead pass each header line as one credential helper
property. The credential helper is responsible for splitting, merging, and
otherwise parsing these header values.

An alternative option to sending the header fields individually would be to
merge the header values in to one key=value property, for example:

...
wwwauth=Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"



Future flexibility
==================

By allowing the credential helpers decide the best authentication scheme, we
can allow the remote Git server to both offer new schemes (or remove old
ones) that enlightened credential helpers could take immediate advantage of,
and to use credentials that are much more tightly scoped and bound to the
specific request.

For example imagine a new "FooBar" authentication scheme that is surfaced in
the following response:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: FooBar realm="login.example", algs="ES256 PS256"


With support for arbitrary authentication schemes, Git would call credential
helpers with the following over standard input:

protocol=https
host=example.com
wwwauth[]=FooBar realm="login.example", algs="ES256 PS256", nonce="abc123"


And then an enlightened credential helper would return over standard output:

protocol=https
host=example.com
authtype=FooBar
username=bob@id.example.com
password=<FooBar credential>


Git would be expected to attach this authorization header to the next
request:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: FooBar <FooBar credential>



Should Git not control the set of authentication schemes?
=========================================================

One concern that the reader may have regarding these changes is in allowing
helpers to select the authentication mechanism to use, it may be possible
that a weaker form of authentication is used.

Take for example a Git remote server that responds with the following
authentication schemes:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Negotiate ...
WWW-Authenticate: Basic ...


Today Git (and libcurl) prefer to Negotiate over Basic authentication [13].
If a helper responded with authtype=basic Git would now be using a "less
secure" mechanism.

The reason we still propose the credential helper decide on the
authentication scheme is that Git is not the best placed entity to decide
what type of authentication should be used for a particular request (see
Design Principle 3).

OAuth Bearer tokens are often bundled in Basic Authorization headers [14],
but given that the tokens are/can be short-lived and have a highly scoped
set of permissions, this solution could be argued as being more secure than
something like NTLM [15]. Similarly, the user may wish to be consulted on
selecting a particular user account, or directly selecting an authentication
mechanism for a request that otherwise they would not be able to use.

Also, as new authentication protocols appear Git does not need to be
modified or updated for the user to take advantage of them; the credential
helpers take on the responsibility of learning and selecting the "best"
option.


Why not SSH?
============

There's nothing wrong with SSH. However, Git's Smart HTTP transport is
widely used, often with OAuth Bearer tokens. Git's Smart HTTP transport
sometimes requires less client setup than SSH transport, and works in
environments when SSH ports may be blocked. As long as Git supports HTTP
transport, it should support common and popular HTTP authentication methods.


References
==========

 * [0] [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth
   headers
   https://lore.kernel.org/git/pull.1352.git.1663097156.gitgitgadget@gmail.com/

 * [1] [PATCH 0/3] Correct credential helper discrepancies handling input
   https://lore.kernel.org/git/pull.1363.git.1663865974.gitgitgadget@gmail.com/

 * [2] Git on the Server - The Protocols
   https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols

 * [3] HTTP Authentication: Basic and Digest Access Authentication
   https://datatracker.ietf.org/doc/html/rfc2617

 * [4] The Simple and Protected GSS-API Negotiation Mechanism
   https://datatracker.ietf.org/doc/html/rfc2478

 * [5] Git Credentials - Custom Helpers
   https://git-scm.com/docs/gitcredentials#_custom_helpers

 * [6] The OAuth 2.0 Authorization Framework
   https://datatracker.ietf.org/doc/html/rfc6749

 * [7] Hypertext Transfer Protocol (HTTP/1.1): Authentication
   https://datatracker.ietf.org/doc/html/rfc7235

 * [8] The OAuth 2.0 Authorization Framework: Bearer Token Usage
   https://datatracker.ietf.org/doc/html/rfc6750

 * [9] OpenID Connect Core 1.0
   https://openid.net/specs/openid-connect-core-1_0.html

 * [10] OpenID Connect Discovery 1.0
   https://openid.net/specs/openid-connect-discovery-1_0.html

 * [11] OpenID Connect Dynamic Client Registration 1.0
   https://openid.net/specs/openid-connect-registration-1_0.html

 * [12] Hypertext Transfer Protocol (HTTP/1.1)
   https://datatracker.ietf.org/doc/html/rfc2616

 * [13] libcurl http.c pickoneauth Function
   https://github.com/curl/curl/blob/c495dcd02e885fc3f35164b1c3c5f72fa4b60c46/lib/http.c#L381-L416

 * [14] Git Credential Manager GitHub Host Provider (using PAT as password)
   https://github.com/GitCredentialManager/git-credential-manager/blob/f77b766f6875b90251249f2aa1702b921309cf00/src/shared/GitHub/GitHubHostProvider.cs#L157

 * [15] NT LAN Manager (NTLM) Authentication Protocol
   https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-nlmp/b38c36ed-2804-4868-a9ff-8dd3182128e4


Updates from RFC
================

 * Submitted first three patches as separate submission:
   https://lore.kernel.org/git/pull.1363.git.1663865974.gitgitgadget@gmail.com/

 * Various style fixes and updates to- and addition of comments.

 * Drop the explicit integer index in new 'array' style credential helper
   attrbiutes ("key[n]=value" becomes just "key[]=value").

 * Added test helper; a mini HTTP server, and several tests.

Matthew John Cheetham (6):
  http: read HTTP WWW-Authenticate response headers
  credential: add WWW-Authenticate header to cred requests
  http: store all request headers on active_request_slot
  http: move proactive auth to first slot creation
  http: set specific auth scheme depending on credential
  t5556-http-auth: add test for HTTP auth hdr logic

 Documentation/git-credential.txt          |   18 +
 Makefile                                  |    2 +
 contrib/buildsystems/CMakeLists.txt       |   13 +
 credential.c                              |   18 +
 credential.h                              |   16 +
 git-curl-compat.h                         |   10 +
 http-push.c                               |  103 +-
 http-walker.c                             |    2 +-
 http.c                                    |  200 +++-
 http.h                                    |    4 +-
 remote-curl.c                             |   36 +-
 t/helper/.gitignore                       |    1 +
 t/helper/test-credential-helper-replay.sh |   14 +
 t/helper/test-http-server.c               | 1134 +++++++++++++++++++++
 t/t5556-http-auth.sh                      |  260 +++++
 15 files changed, 1695 insertions(+), 136 deletions(-)
 create mode 100755 t/helper/test-credential-helper-replay.sh
 create mode 100644 t/helper/test-http-server.c
 create mode 100755 t/t5556-http-auth.sh


base-commit: 9c32cfb49c60fa8173b9666db02efe3b45a8522f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1352%2Fmjcheetham%2Femu-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1352/mjcheetham/emu-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1352

Range-diff vs v1:

 1:  6426f9c3954 < -:  ----------- wincred: ignore unknown lines (do not die)
 2:  ae5c1bfc092 < -:  ----------- netrc: ignore unknown lines (do not die)
 3:  2ece562a595 < -:  ----------- osxkeychain: clarify that we ignore unknown lines
 4:  78e66d56605 ! 1:  f297c78f60a http: read HTTP WWW-Authenticate response headers
     @@ credential.h: struct credential {
      +	 * in the order they were received in the response.
      +	 */
      +	struct strvec wwwauth_headers;
     ++
     ++	/**
     ++	 * Internal use only. Used to keep track of split header fields
     ++	 * in order to fold multiple lines into one value.
     ++	 */
      +	unsigned header_is_last_match:1;
      +
       	unsigned approved:1,
     @@ http.c: size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buff
      +
      +	/* Start of a new WWW-Authenticate header */
      +	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
     -+		while (isspace(*val)) val++;
     ++		while (isspace(*val))
     ++			val++;
      +
      +		strvec_push(values, val);
      +		http_auth.header_is_last_match = 1;
 5:  936545004b8 ! 2:  0838d992744 credential: add WWW-Authenticate header to cred requests
     @@ Commit message
          Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
      
       ## Documentation/git-credential.txt ##
     -@@ Documentation/git-credential.txt: Git understands the following attributes:
     - 	were read (e.g., `url=https://example.com` would behave as if
     - 	`protocol=https` and `host=example.com` had been provided). This
     - 	can help callers avoid parsing URLs themselves.
     -+
     -+`wwwauth[n]`::
     +@@ Documentation/git-credential.txt: empty string.
     + Components which are missing from the URL (e.g., there is no
     + username in the example above) will be left unset.
     + 
     ++`wwwauth[]`::
      +
      +	When an HTTP response is received that includes one or more
      +	'WWW-Authenticate' authentication headers, these can be passed to Git
      +	(and subsequent credential helpers) with these attributes.
      +	Each 'WWW-Authenticate' header value should be passed as a separate
     -+	attribute 'wwwauth[n]' where 'n' is the zero-indexed order the headers
     -+	appear in the HTTP response.
     - +
     - Note that specifying a protocol is mandatory and if the URL
     - doesn't specify a hostname (e.g., "cert:///path/to/file") the
     ++	attribute 'wwwauth[]' where the order of the attributes is the same
     ++	as they appear in the HTTP response.
     ++
     + GIT
     + ---
     + Part of the linkgit:git[1] suite
      
       ## credential.c ##
      @@ credential.c: static void credential_write_item(FILE *fp, const char *key, const char *value,
     @@ credential.c: static void credential_write_item(FILE *fp, const char *key, const
      +				    const struct strvec *vec)
      +{
      +	int i = 0;
     ++	const char *full_key = xstrfmt("%s[]", key);
      +	for (; i < vec->nr; i++) {
     -+		const char *full_key = xstrfmt("%s[%d]", key, i);
      +		credential_write_item(fp, full_key, vec->v[i], 0);
     -+		free((void*)full_key);
      +	}
     ++	free((void*)full_key);
      +}
      +
       void credential_write(const struct credential *c, FILE *fp)
     @@ credential.c: void credential_write(const struct credential *c, FILE *fp)
       }
       
       static int run_credential_helper(struct credential *c,
     -
     - ## t/lib-httpd/apache.conf ##
     -@@ t/lib-httpd/apache.conf: SSLEngine On
     - 	Require valid-user
     - </LocationMatch>
     - 
     -+# Advertise two additional auth methods above "Basic".
     -+# Neither of them actually work but serve test cases showing these
     -+# additional auth headers are consumed correctly.
     -+<Location /auth-wwwauth/>
     -+	AuthType Basic
     -+	AuthName "git-auth"
     -+	AuthUserFile passwd
     -+	Require valid-user
     -+	SetEnvIf Authorization "^\S+" authz
     -+	Header always add WWW-Authenticate "Bearer authority=https://login.example.com" env=!authz
     -+	Header always add WWW-Authenticate "FooAuth foo=bar baz=1" env=!authz
     -+</Location>
     -+
     - RewriteCond %{QUERY_STRING} service=git-receive-pack [OR]
     - RewriteCond %{REQUEST_URI} /git-receive-pack$
     - RewriteRule ^/half-auth-complete/ - [E=AUTHREQUIRED:yes]
     -
     - ## t/t5551-http-fetch-smart.sh ##
     -@@ t/t5551-http-fetch-smart.sh: test_expect_success 'http auth forgets bogus credentials' '
     - 	expect_askpass both user@host
     - '
     - 
     -+test_expect_success 'http auth sends www-auth headers to credential helper' '
     -+	write_script git-credential-tee <<-\EOF &&
     -+		cmd=$1
     -+		teefile=credential-$cmd
     -+		if [ -f "$teefile" ]; then
     -+			rm $teefile
     -+		fi
     -+		(
     -+			while read line;
     -+			do
     -+				if [ -z "$line" ]; then
     -+					exit 0
     -+				fi
     -+				echo "$line" >> $teefile
     -+				echo $line
     -+			done
     -+		) | git credential-store $cmd
     -+	EOF
     -+
     -+	cat >expected-get <<-EOF &&
     -+	protocol=http
     -+	host=127.0.0.1:5551
     -+	wwwauth[0]=Bearer authority=https://login.example.com
     -+	wwwauth[1]=FooAuth foo=bar baz=1
     -+	wwwauth[2]=Basic realm="git-auth"
     -+	EOF
     -+
     -+	cat >expected-store <<-EOF &&
     -+	protocol=http
     -+	host=127.0.0.1:5551
     -+	username=user@host
     -+	password=pass@host
     -+	EOF
     -+
     -+	rm -f .git-credentials &&
     -+	test_config credential.helper tee &&
     -+	set_askpass user@host pass@host &&
     -+	(
     -+		PATH="$PWD:$PATH" &&
     -+		git ls-remote "$HTTPD_URL/auth-wwwauth/smart/repo.git"
     -+	) &&
     -+	expect_askpass both user@host &&
     -+	test_cmp expected-get credential-get &&
     -+	test_cmp expected-store credential-store
     -+'
     -+
     - test_expect_success 'client falls back from v2 to v0 to match server' '
     - 	GIT_TRACE_PACKET=$PWD/trace \
     - 	GIT_TEST_PROTOCOL_VERSION=2 \
 6:  20843e2051e = 3:  c62fef65f46 http: store all request headers on active_request_slot
 7:  cae7180bc37 = 4:  a790c01f9f2 http: move proactive auth to first slot creation
 8:  7f827067f55 ! 5:  b0b7cd7ee5e http: set specific auth scheme depending on credential
     @@ Commit message
          Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
      
       ## Documentation/git-credential.txt ##
     -@@ Documentation/git-credential.txt: Git understands the following attributes:
     - 	`protocol=https` and `host=example.com` had been provided). This
     - 	can help callers avoid parsing URLs themselves.
     +@@ Documentation/git-credential.txt: username in the example above) will be left unset.
     + 	attribute 'wwwauth[]' where the order of the attributes is the same
     + 	as they appear in the HTTP response.
       
      +`authtype`::
      +
     @@ Documentation/git-credential.txt: Git understands the following attributes:
      +	scheme for the `Authorization` header, and the `password` field is
      +	used as the raw unencoded authorization parameters of the same header.
      +
     - `wwwauth[n]`::
     - 
     - 	When an HTTP response is received that includes one or more
     + GIT
     + ---
     + Part of the linkgit:git[1] suite
      
       ## credential.c ##
      @@ credential.c: void credential_clear(struct credential *c)
     @@ git-curl-compat.h
       
      +/**
      + * CURLAUTH_BEARER was added in 7.61.0, released in July 2018.
     ++ * However, only 7.69.0 fixes a bug where Bearer headers were not
     ++ * actually sent with reused connections on subsequent transfers
     ++ * (curl/curl@dea17b519dc1).
      + */
     -+#if LIBCURL_VERSION_NUM >= 0x073D00
     ++#if LIBCURL_VERSION_NUM >= 0x074500
      +#define GIT_CURL_HAVE_CURLAUTH_BEARER
      +#endif
      +
 -:  ----------- > 6:  f3f13ed8c82 t5556-http-auth: add test for HTTP auth hdr logic

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 171+ messages in thread

* [PATCH v2 1/6] http: read HTTP WWW-Authenticate response headers
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
@ 2022-10-21 17:07   ` Matthew John Cheetham via GitGitGadget
  2022-10-21 17:07   ` [PATCH v2 2/6] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-10-21 17:07 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Read and store the HTTP WWW-Authenticate response headers made for
a particular request.

This will allow us to pass important authentication challenge
information to credential helpers or others that would otherwise have
been lost.

According to RFC2616 Section 4.2 [1], header field names are not
case-sensitive meaning when collecting multiple values for the same
field name, we can just use the case of the first observed instance of
each field name and no normalisation is required.

libcurl only provides us with the ability to read all headers recieved
for a particular request, including any intermediate redirect requests
or proxies. The lines returned by libcurl include HTTP status lines
delinating any intermediate requests such as "HTTP/1.1 200". We use
these lines to reset the strvec of WWW-Authenticate header values as
we encounter them in order to only capture the final response headers.

The collection of all header values matching the WWW-Authenticate
header is complicated by the fact that it is legal for header fields to
be continued over multiple lines, but libcurl only gives us one line at
a time.

In the future [2] we may be able to leverage functions to read headers
from libcurl itself, but as of today we must do this ourselves.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-4.2
[2] https://daniel.haxx.se/blog/2022/03/22/a-headers-api-for-libcurl/

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 credential.c |  1 +
 credential.h | 15 ++++++++++
 http.c       | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+)

diff --git a/credential.c b/credential.c
index f6389a50684..897b4679333 100644
--- a/credential.c
+++ b/credential.c
@@ -22,6 +22,7 @@ void credential_clear(struct credential *c)
 	free(c->username);
 	free(c->password);
 	string_list_clear(&c->helpers, 0);
+	strvec_clear(&c->wwwauth_headers);
 
 	credential_init(c);
 }
diff --git a/credential.h b/credential.h
index f430e77fea4..6f2e5bc610b 100644
--- a/credential.h
+++ b/credential.h
@@ -2,6 +2,7 @@
 #define CREDENTIAL_H
 
 #include "string-list.h"
+#include "strvec.h"
 
 /**
  * The credentials API provides an abstracted way of gathering username and
@@ -115,6 +116,19 @@ struct credential {
 	 */
 	struct string_list helpers;
 
+	/**
+	 * A `strvec` of WWW-Authenticate header values. Each string
+	 * is the value of a WWW-Authenticate header in an HTTP response,
+	 * in the order they were received in the response.
+	 */
+	struct strvec wwwauth_headers;
+
+	/**
+	 * Internal use only. Used to keep track of split header fields
+	 * in order to fold multiple lines into one value.
+	 */
+	unsigned header_is_last_match:1;
+
 	unsigned approved:1,
 		 configured:1,
 		 quit:1,
@@ -130,6 +144,7 @@ struct credential {
 
 #define CREDENTIAL_INIT { \
 	.helpers = STRING_LIST_INIT_DUP, \
+	.wwwauth_headers = STRVEC_INIT, \
 }
 
 /* Initialize a credential structure, setting all fields to empty. */
diff --git a/http.c b/http.c
index 5d0502f51fd..03d43d352e7 100644
--- a/http.c
+++ b/http.c
@@ -183,6 +183,82 @@ size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 	return nmemb;
 }
 
+static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
+{
+	size_t size = eltsize * nmemb;
+	struct strvec *values = &http_auth.wwwauth_headers;
+	struct strbuf buf = STRBUF_INIT;
+	const char *val;
+	const char *z = NULL;
+
+	/*
+	 * Header lines may not come NULL-terminated from libcurl so we must
+	 * limit all scans to the maximum length of the header line, or leverage
+	 * strbufs for all operations.
+	 *
+	 * In addition, it is possible that header values can be split over
+	 * multiple lines as per RFC 2616 (even though this has since been
+	 * deprecated in RFC 7230). A continuation header field value is
+	 * identified as starting with a space or horizontal tab.
+	 *
+	 * The formal definition of a header field as given in RFC 2616 is:
+	 *
+	 *   message-header = field-name ":" [ field-value ]
+	 *   field-name     = token
+	 *   field-value    = *( field-content | LWS )
+	 *   field-content  = <the OCTETs making up the field-value
+	 *                    and consisting of either *TEXT or combinations
+	 *                    of token, separators, and quoted-string>
+	 */
+
+	strbuf_add(&buf, ptr, size);
+
+	/* Strip the CRLF that should be present at the end of each field */
+	strbuf_trim_trailing_newline(&buf);
+
+	/* Start of a new WWW-Authenticate header */
+	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
+		while (isspace(*val))
+			val++;
+
+		strvec_push(values, val);
+		http_auth.header_is_last_match = 1;
+		goto exit;
+	}
+
+	/*
+	 * This line could be a continuation of the previously matched header
+	 * field. If this is the case then we should append this value to the
+	 * end of the previously consumed value.
+	 */
+	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
+		const char **v = values->v + values->nr - 1;
+		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
+
+		free((void*)*v);
+		*v = append;
+
+		goto exit;
+	}
+
+	/* This is the start of a new header we don't care about */
+	http_auth.header_is_last_match = 0;
+
+	/*
+	 * If this is a HTTP status line and not a header field, this signals
+	 * a different HTTP response. libcurl writes all the output of all
+	 * response headers of all responses, including redirects.
+	 * We only care about the last HTTP request response's headers so clear
+	 * the existing array.
+	 */
+	if (skip_iprefix(buf.buf, "http/", &z))
+		strvec_clear(values);
+
+exit:
+	strbuf_release(&buf);
+	return size;
+}
+
 size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
 {
 	return nmemb;
@@ -1829,6 +1905,8 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);
+
 	accept_language = http_get_accept_language_header();
 
 	if (accept_language)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v2 2/6] credential: add WWW-Authenticate header to cred requests
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
  2022-10-21 17:07   ` [PATCH v2 1/6] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
@ 2022-10-21 17:07   ` Matthew John Cheetham via GitGitGadget
  2022-10-28 18:22     ` Jeff Hostetler
  2022-10-21 17:08   ` [PATCH v2 3/6] http: store all request headers on active_request_slot Matthew John Cheetham via GitGitGadget
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-10-21 17:07 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add the value of the WWW-Authenticate response header to credential
requests. Credential helpers that understand and support HTTP
authentication and authorization can use this standard header (RFC 2616
Section 14.47 [1]) to generate valid credentials.

WWW-Authenticate headers can contain information pertaining to the
authority, authentication mechanism, or extra parameters/scopes that are
required.

The current I/O format for credential helpers only allows for unique
names for properties/attributes, so in order to transmit multiple header
values (with a specific order) we introduce a new convention whereby a
C-style array syntax is used in the property name to denote multiple
ordered values for the same property.

In this case we send multiple `wwwauth[n]` properties where `n` is a
zero-indexed number, reflecting the order the WWW-Authenticate headers
appeared in the HTTP response.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Documentation/git-credential.txt |  9 +++++++++
 credential.c                     | 12 ++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index f18673017f5..0ff3cbc25b9 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -160,6 +160,15 @@ empty string.
 Components which are missing from the URL (e.g., there is no
 username in the example above) will be left unset.
 
+`wwwauth[]`::
+
+	When an HTTP response is received that includes one or more
+	'WWW-Authenticate' authentication headers, these can be passed to Git
+	(and subsequent credential helpers) with these attributes.
+	Each 'WWW-Authenticate' header value should be passed as a separate
+	attribute 'wwwauth[]' where the order of the attributes is the same
+	as they appear in the HTTP response.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/credential.c b/credential.c
index 897b4679333..8a3ad6c0ae2 100644
--- a/credential.c
+++ b/credential.c
@@ -263,6 +263,17 @@ static void credential_write_item(FILE *fp, const char *key, const char *value,
 	fprintf(fp, "%s=%s\n", key, value);
 }
 
+static void credential_write_strvec(FILE *fp, const char *key,
+				    const struct strvec *vec)
+{
+	int i = 0;
+	const char *full_key = xstrfmt("%s[]", key);
+	for (; i < vec->nr; i++) {
+		credential_write_item(fp, full_key, vec->v[i], 0);
+	}
+	free((void*)full_key);
+}
+
 void credential_write(const struct credential *c, FILE *fp)
 {
 	credential_write_item(fp, "protocol", c->protocol, 1);
@@ -270,6 +281,7 @@ void credential_write(const struct credential *c, FILE *fp)
 	credential_write_item(fp, "path", c->path, 0);
 	credential_write_item(fp, "username", c->username, 0);
 	credential_write_item(fp, "password", c->password, 0);
+	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);
 }
 
 static int run_credential_helper(struct credential *c,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v2 3/6] http: store all request headers on active_request_slot
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
  2022-10-21 17:07   ` [PATCH v2 1/6] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
  2022-10-21 17:07   ` [PATCH v2 2/6] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
@ 2022-10-21 17:08   ` Matthew John Cheetham via GitGitGadget
  2022-10-21 17:08   ` [PATCH v2 4/6] http: move proactive auth to first slot creation Matthew John Cheetham via GitGitGadget
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-10-21 17:08 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Once a list of headers has been set on the curl handle, it is not
possible to recover that `struct curl_slist` instance to add or modify
headers.

In future commits we will want to modify the set of request headers in
response to an authentication challenge/401 response from the server,
with information provided by a credential helper.

There are a number of different places where curl is used for an HTTP
request, and they do not have a common handling of request headers.
However, given that they all do call the `start_active_slot()` function,
either directly or indirectly via `run_slot()` or `run_one_slot()`, we
use this as the point to set the `CURLOPT_HTTPHEADER` option just
before the request is made.

We collect all request headers in a `struct curl_slist` on the
`struct active_request_slot` that is obtained from a call to
`get_active_slot(int)`. This function now takes a single argument to
define if the initial set of headers on the slot should include the
"Pragma: no-cache" header, along with all extra headers specified via
`http.extraHeader` config values.

The active request slot obtained from `get_active_slot(int)` will always
contain a fresh set of default headers and any headers set in previous
usages of this slot will be freed.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 http-push.c   | 103 ++++++++++++++++++++++----------------------------
 http-walker.c |   2 +-
 http.c        |  82 ++++++++++++++++++----------------------
 http.h        |   4 +-
 remote-curl.c |  36 +++++++++---------
 5 files changed, 101 insertions(+), 126 deletions(-)

diff --git a/http-push.c b/http-push.c
index 5f4340a36e6..2b40959b376 100644
--- a/http-push.c
+++ b/http-push.c
@@ -211,29 +211,29 @@ static void curl_setup_http(CURL *curl, const char *url,
 	curl_easy_setopt(curl, CURLOPT_UPLOAD, 1);
 }
 
-static struct curl_slist *get_dav_token_headers(struct remote_lock *lock, enum dav_header_flag options)
+static struct curl_slist *append_dav_token_headers(struct curl_slist *headers,
+	struct remote_lock *lock, enum dav_header_flag options)
 {
 	struct strbuf buf = STRBUF_INIT;
-	struct curl_slist *dav_headers = http_copy_default_headers();
 
 	if (options & DAV_HEADER_IF) {
 		strbuf_addf(&buf, "If: (<%s>)", lock->token);
-		dav_headers = curl_slist_append(dav_headers, buf.buf);
+		headers = curl_slist_append(headers, buf.buf);
 		strbuf_reset(&buf);
 	}
 	if (options & DAV_HEADER_LOCK) {
 		strbuf_addf(&buf, "Lock-Token: <%s>", lock->token);
-		dav_headers = curl_slist_append(dav_headers, buf.buf);
+		headers = curl_slist_append(headers, buf.buf);
 		strbuf_reset(&buf);
 	}
 	if (options & DAV_HEADER_TIMEOUT) {
 		strbuf_addf(&buf, "Timeout: Second-%ld", lock->timeout);
-		dav_headers = curl_slist_append(dav_headers, buf.buf);
+		headers = curl_slist_append(headers, buf.buf);
 		strbuf_reset(&buf);
 	}
 	strbuf_release(&buf);
 
-	return dav_headers;
+	return headers;
 }
 
 static void finish_request(struct transfer_request *request);
@@ -281,7 +281,7 @@ static void start_mkcol(struct transfer_request *request)
 
 	request->url = get_remote_object_url(repo->url, hex, 1);
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_response;
 	slot->callback_data = request;
 	curl_setup_http_get(slot->curl, request->url, DAV_MKCOL);
@@ -399,7 +399,7 @@ static void start_put(struct transfer_request *request)
 	strbuf_add(&buf, request->lock->tmpfile_suffix, the_hash_algo->hexsz + 1);
 	request->url = strbuf_detach(&buf, NULL);
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_response;
 	slot->callback_data = request;
 	curl_setup_http(slot->curl, request->url, DAV_PUT,
@@ -417,15 +417,13 @@ static void start_put(struct transfer_request *request)
 static void start_move(struct transfer_request *request)
 {
 	struct active_request_slot *slot;
-	struct curl_slist *dav_headers = http_copy_default_headers();
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_response;
 	slot->callback_data = request;
 	curl_setup_http_get(slot->curl, request->url, DAV_MOVE);
-	dav_headers = curl_slist_append(dav_headers, request->dest);
-	dav_headers = curl_slist_append(dav_headers, "Overwrite: T");
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
+	slot->headers = curl_slist_append(slot->headers, request->dest);
+	slot->headers = curl_slist_append(slot->headers, "Overwrite: T");
 
 	if (start_active_slot(slot)) {
 		request->slot = slot;
@@ -440,17 +438,16 @@ static int refresh_lock(struct remote_lock *lock)
 {
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *dav_headers;
 	int rc = 0;
 
 	lock->refreshing = 1;
 
-	dav_headers = get_dav_token_headers(lock, DAV_HEADER_IF | DAV_HEADER_TIMEOUT);
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = append_dav_token_headers(slot->headers, lock,
+		DAV_HEADER_IF | DAV_HEADER_TIMEOUT);
+
 	curl_setup_http_get(slot->curl, lock->url, DAV_LOCK);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -464,7 +461,6 @@ static int refresh_lock(struct remote_lock *lock)
 	}
 
 	lock->refreshing = 0;
-	curl_slist_free_all(dav_headers);
 
 	return rc;
 }
@@ -838,7 +834,6 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	char *ep;
 	char timeout_header[25];
 	struct remote_lock *lock = NULL;
-	struct curl_slist *dav_headers = http_copy_default_headers();
 	struct xml_ctx ctx;
 	char *escaped;
 
@@ -849,7 +844,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	while (ep) {
 		char saved_character = ep[1];
 		ep[1] = '\0';
-		slot = get_active_slot();
+		slot = get_active_slot(0);
 		slot->results = &results;
 		curl_setup_http_get(slot->curl, url, DAV_MKCOL);
 		if (start_active_slot(slot)) {
@@ -875,14 +870,15 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	strbuf_addf(&out_buffer.buf, LOCK_REQUEST, escaped);
 	free(escaped);
 
+	slot = get_active_slot(0);
+	slot->results = &results;
+
 	xsnprintf(timeout_header, sizeof(timeout_header), "Timeout: Second-%ld", timeout);
-	dav_headers = curl_slist_append(dav_headers, timeout_header);
-	dav_headers = curl_slist_append(dav_headers, "Content-Type: text/xml");
+	slot->headers = curl_slist_append(slot->headers, timeout_header);
+	slot->headers = curl_slist_append(slot->headers,
+		"Content-Type: text/xml");
 
-	slot = get_active_slot();
-	slot->results = &results;
 	curl_setup_http(slot->curl, url, DAV_LOCK, &out_buffer, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	CALLOC_ARRAY(lock, 1);
@@ -921,7 +917,6 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 		fprintf(stderr, "Unable to start LOCK request\n");
 	}
 
-	curl_slist_free_all(dav_headers);
 	strbuf_release(&out_buffer.buf);
 	strbuf_release(&in_buffer);
 
@@ -945,15 +940,14 @@ static int unlock_remote(struct remote_lock *lock)
 	struct active_request_slot *slot;
 	struct slot_results results;
 	struct remote_lock *prev = repo->locks;
-	struct curl_slist *dav_headers;
 	int rc = 0;
 
-	dav_headers = get_dav_token_headers(lock, DAV_HEADER_LOCK);
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = append_dav_token_headers(slot->headers, lock,
+		DAV_HEADER_LOCK);
+
 	curl_setup_http_get(slot->curl, lock->url, DAV_UNLOCK);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -966,8 +960,6 @@ static int unlock_remote(struct remote_lock *lock)
 		fprintf(stderr, "Unable to start UNLOCK request\n");
 	}
 
-	curl_slist_free_all(dav_headers);
-
 	if (repo->locks == lock) {
 		repo->locks = lock->next;
 	} else {
@@ -1121,7 +1113,6 @@ static void remote_ls(const char *path, int flags,
 	struct slot_results results;
 	struct strbuf in_buffer = STRBUF_INIT;
 	struct buffer out_buffer = { STRBUF_INIT, 0 };
-	struct curl_slist *dav_headers = http_copy_default_headers();
 	struct xml_ctx ctx;
 	struct remote_ls_ctx ls;
 
@@ -1134,14 +1125,14 @@ static void remote_ls(const char *path, int flags,
 
 	strbuf_addstr(&out_buffer.buf, PROPFIND_ALL_REQUEST);
 
-	dav_headers = curl_slist_append(dav_headers, "Depth: 1");
-	dav_headers = curl_slist_append(dav_headers, "Content-Type: text/xml");
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = curl_slist_append(slot->headers, "Depth: 1");
+	slot->headers = curl_slist_append(slot->headers,
+		"Content-Type: text/xml");
+
 	curl_setup_http(slot->curl, url, DAV_PROPFIND,
 			&out_buffer, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	if (start_active_slot(slot)) {
@@ -1177,7 +1168,6 @@ static void remote_ls(const char *path, int flags,
 	free(url);
 	strbuf_release(&out_buffer.buf);
 	strbuf_release(&in_buffer);
-	curl_slist_free_all(dav_headers);
 }
 
 static void get_remote_object_list(unsigned char parent)
@@ -1199,7 +1189,6 @@ static int locking_available(void)
 	struct slot_results results;
 	struct strbuf in_buffer = STRBUF_INIT;
 	struct buffer out_buffer = { STRBUF_INIT, 0 };
-	struct curl_slist *dav_headers = http_copy_default_headers();
 	struct xml_ctx ctx;
 	int lock_flags = 0;
 	char *escaped;
@@ -1208,14 +1197,14 @@ static int locking_available(void)
 	strbuf_addf(&out_buffer.buf, PROPFIND_SUPPORTEDLOCK_REQUEST, escaped);
 	free(escaped);
 
-	dav_headers = curl_slist_append(dav_headers, "Depth: 0");
-	dav_headers = curl_slist_append(dav_headers, "Content-Type: text/xml");
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = curl_slist_append(slot->headers, "Depth: 0");
+	slot->headers = curl_slist_append(slot->headers,
+		"Content-Type: text/xml");
+
 	curl_setup_http(slot->curl, repo->url, DAV_PROPFIND,
 			&out_buffer, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	if (start_active_slot(slot)) {
@@ -1257,7 +1246,6 @@ static int locking_available(void)
 
 	strbuf_release(&out_buffer.buf);
 	strbuf_release(&in_buffer);
-	curl_slist_free_all(dav_headers);
 
 	return lock_flags;
 }
@@ -1374,17 +1362,16 @@ static int update_remote(const struct object_id *oid, struct remote_lock *lock)
 	struct active_request_slot *slot;
 	struct slot_results results;
 	struct buffer out_buffer = { STRBUF_INIT, 0 };
-	struct curl_slist *dav_headers;
-
-	dav_headers = get_dav_token_headers(lock, DAV_HEADER_IF);
 
 	strbuf_addf(&out_buffer.buf, "%s\n", oid_to_hex(oid));
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = append_dav_token_headers(slot->headers, lock,
+		DAV_HEADER_IF);
+
 	curl_setup_http(slot->curl, lock->url, DAV_PUT,
 			&out_buffer, fwrite_null);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -1486,18 +1473,18 @@ static void update_remote_info_refs(struct remote_lock *lock)
 	struct buffer buffer = { STRBUF_INIT, 0 };
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *dav_headers;
 
 	remote_ls("refs/", (PROCESS_FILES | RECURSIVE),
 		  add_remote_info_ref, &buffer.buf);
 	if (!aborted) {
-		dav_headers = get_dav_token_headers(lock, DAV_HEADER_IF);
 
-		slot = get_active_slot();
+		slot = get_active_slot(0);
 		slot->results = &results;
+		slot->headers = append_dav_token_headers(slot->headers, lock,
+			DAV_HEADER_IF);
+
 		curl_setup_http(slot->curl, lock->url, DAV_PUT,
 				&buffer, fwrite_null);
-		curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 		if (start_active_slot(slot)) {
 			run_active_slot(slot);
@@ -1652,7 +1639,7 @@ static int delete_remote_branch(const char *pattern, int force)
 	if (dry_run)
 		return 0;
 	url = xstrfmt("%s%s", repo->url, remote_ref->name);
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
 	curl_setup_http_get(slot->curl, url, DAV_DELETE);
 	if (start_active_slot(slot)) {
diff --git a/http-walker.c b/http-walker.c
index b8f0f98ae14..8747de2fcdb 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -373,7 +373,7 @@ static void fetch_alternates(struct walker *walker, const char *base)
 	 * Use a callback to process the result, since another request
 	 * may fail and need to have alternates loaded before continuing
 	 */
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_alternates_response;
 	alt_req.walker = walker;
 	slot->callback_data = &alt_req;
diff --git a/http.c b/http.c
index 03d43d352e7..f2ebb17c8c4 100644
--- a/http.c
+++ b/http.c
@@ -124,8 +124,6 @@ static unsigned long empty_auth_useless =
 	| CURLAUTH_DIGEST_IE
 	| CURLAUTH_DIGEST;
 
-static struct curl_slist *pragma_header;
-static struct curl_slist *no_pragma_header;
 static struct string_list extra_http_headers = STRING_LIST_INIT_DUP;
 
 static struct curl_slist *host_resolutions;
@@ -1133,11 +1131,6 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
-	pragma_header = curl_slist_append(http_copy_default_headers(),
-		"Pragma: no-cache");
-	no_pragma_header = curl_slist_append(http_copy_default_headers(),
-		"Pragma:");
-
 	{
 		char *http_max_requests = getenv("GIT_HTTP_MAX_REQUESTS");
 		if (http_max_requests)
@@ -1199,6 +1192,8 @@ void http_cleanup(void)
 
 	while (slot != NULL) {
 		struct active_request_slot *next = slot->next;
+		if (slot->headers)
+			curl_slist_free_all(slot->headers);
 		if (slot->curl) {
 			xmulti_remove_handle(slot);
 			curl_easy_cleanup(slot->curl);
@@ -1215,12 +1210,6 @@ void http_cleanup(void)
 
 	string_list_clear(&extra_http_headers, 0);
 
-	curl_slist_free_all(pragma_header);
-	pragma_header = NULL;
-
-	curl_slist_free_all(no_pragma_header);
-	no_pragma_header = NULL;
-
 	curl_slist_free_all(host_resolutions);
 	host_resolutions = NULL;
 
@@ -1255,7 +1244,18 @@ void http_cleanup(void)
 	FREE_AND_NULL(cached_accept_language);
 }
 
-struct active_request_slot *get_active_slot(void)
+static struct curl_slist *http_copy_default_headers(void)
+{
+	struct curl_slist *headers = NULL;
+	const struct string_list_item *item;
+
+	for_each_string_list_item(item, &extra_http_headers)
+		headers = curl_slist_append(headers, item->string);
+
+	return headers;
+}
+
+struct active_request_slot *get_active_slot(int no_pragma_header)
 {
 	struct active_request_slot *slot = active_queue_head;
 	struct active_request_slot *newslot;
@@ -1277,6 +1277,7 @@ struct active_request_slot *get_active_slot(void)
 		newslot->curl = NULL;
 		newslot->in_use = 0;
 		newslot->next = NULL;
+		newslot->headers = NULL;
 
 		slot = active_queue_head;
 		if (!slot) {
@@ -1294,6 +1295,15 @@ struct active_request_slot *get_active_slot(void)
 		curl_session_count++;
 	}
 
+	if (slot->headers)
+		curl_slist_free_all(slot->headers);
+
+	slot->headers = http_copy_default_headers();
+
+	if (!no_pragma_header)
+		slot->headers = curl_slist_append(slot->headers,
+			"Pragma: no-cache");
+
 	active_requests++;
 	slot->in_use = 1;
 	slot->results = NULL;
@@ -1303,7 +1313,6 @@ struct active_request_slot *get_active_slot(void)
 	curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file);
 	if (curl_save_cookies)
 		curl_easy_setopt(slot->curl, CURLOPT_COOKIEJAR, curl_cookie_file);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, pragma_header);
 	curl_easy_setopt(slot->curl, CURLOPT_RESOLVE, host_resolutions);
 	curl_easy_setopt(slot->curl, CURLOPT_ERRORBUFFER, curl_errorstr);
 	curl_easy_setopt(slot->curl, CURLOPT_CUSTOMREQUEST, NULL);
@@ -1335,9 +1344,12 @@ struct active_request_slot *get_active_slot(void)
 
 int start_active_slot(struct active_request_slot *slot)
 {
-	CURLMcode curlm_result = curl_multi_add_handle(curlm, slot->curl);
+	CURLMcode curlm_result;
 	int num_transfers;
 
+	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, slot->headers);
+	curlm_result = curl_multi_add_handle(curlm, slot->curl);
+
 	if (curlm_result != CURLM_OK &&
 	    curlm_result != CURLM_CALL_MULTI_PERFORM) {
 		warning("curl_multi_add_handle failed: %s",
@@ -1652,17 +1664,6 @@ int run_one_slot(struct active_request_slot *slot,
 	return handle_curl_result(results);
 }
 
-struct curl_slist *http_copy_default_headers(void)
-{
-	struct curl_slist *headers = NULL;
-	const struct string_list_item *item;
-
-	for_each_string_list_item(item, &extra_http_headers)
-		headers = curl_slist_append(headers, item->string);
-
-	return headers;
-}
-
 static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf)
 {
 	char *ptr;
@@ -1880,12 +1881,11 @@ static int http_request(const char *url,
 {
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *headers = http_copy_default_headers();
-	struct strbuf buf = STRBUF_INIT;
+	int no_cache = options && options->no_cache;
 	const char *accept_language;
 	int ret;
 
-	slot = get_active_slot();
+	slot = get_active_slot(!no_cache);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
 
 	if (!result) {
@@ -1910,27 +1910,23 @@ static int http_request(const char *url,
 	accept_language = http_get_accept_language_header();
 
 	if (accept_language)
-		headers = curl_slist_append(headers, accept_language);
+		slot->headers = curl_slist_append(slot->headers,
+			accept_language);
 
-	strbuf_addstr(&buf, "Pragma:");
-	if (options && options->no_cache)
-		strbuf_addstr(&buf, " no-cache");
 	if (options && options->initial_request &&
 	    http_follow_config == HTTP_FOLLOW_INITIAL)
 		curl_easy_setopt(slot->curl, CURLOPT_FOLLOWLOCATION, 1);
 
-	headers = curl_slist_append(headers, buf.buf);
-
 	/* Add additional headers here */
 	if (options && options->extra_headers) {
 		const struct string_list_item *item;
 		for_each_string_list_item(item, options->extra_headers) {
-			headers = curl_slist_append(headers, item->string);
+			slot->headers = curl_slist_append(slot->headers,
+				item->string);
 		}
 	}
 
 	curl_easy_setopt(slot->curl, CURLOPT_URL, url);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
@@ -1948,9 +1944,6 @@ static int http_request(const char *url,
 		curlinfo_strbuf(slot->curl, CURLINFO_EFFECTIVE_URL,
 				options->effective_url);
 
-	curl_slist_free_all(headers);
-	strbuf_release(&buf);
-
 	return ret;
 }
 
@@ -2311,12 +2304,10 @@ struct http_pack_request *new_direct_http_pack_request(
 		goto abort;
 	}
 
-	preq->slot = get_active_slot();
+	preq->slot = get_active_slot(1);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEDATA, preq->packfile);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_URL, preq->url);
-	curl_easy_setopt(preq->slot->curl, CURLOPT_HTTPHEADER,
-		no_pragma_header);
 
 	/*
 	 * If there is data present from a previous transfer attempt,
@@ -2481,14 +2472,13 @@ struct http_object_request *new_http_object_request(const char *base_url,
 		}
 	}
 
-	freq->slot = get_active_slot();
+	freq->slot = get_active_slot(1);
 
 	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEDATA, freq);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_FAILONERROR, 0);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite_sha1_file);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_ERRORBUFFER, freq->errorstr);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_URL, freq->url);
-	curl_easy_setopt(freq->slot->curl, CURLOPT_HTTPHEADER, no_pragma_header);
 
 	/*
 	 * If we have successfully processed data from a previous fetch
diff --git a/http.h b/http.h
index 3c94c479100..a304cc408b2 100644
--- a/http.h
+++ b/http.h
@@ -22,6 +22,7 @@ struct slot_results {
 struct active_request_slot {
 	CURL *curl;
 	int in_use;
+	struct curl_slist *headers;
 	CURLcode curl_result;
 	long http_code;
 	int *finished;
@@ -43,7 +44,7 @@ size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf);
 curlioerr ioctl_buffer(CURL *handle, int cmd, void *clientp);
 
 /* Slot lifecycle functions */
-struct active_request_slot *get_active_slot(void);
+struct active_request_slot *get_active_slot(int no_pragma_header);
 int start_active_slot(struct active_request_slot *slot);
 void run_active_slot(struct active_request_slot *slot);
 void finish_all_active_slots(void);
@@ -64,7 +65,6 @@ void step_active_slots(void);
 void http_init(struct remote *remote, const char *url,
 	       int proactive_auth);
 void http_cleanup(void);
-struct curl_slist *http_copy_default_headers(void);
 
 extern long int git_curl_ipresolve;
 extern int active_requests;
diff --git a/remote-curl.c b/remote-curl.c
index 72dfb8fb86a..edbd4504beb 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -847,14 +847,13 @@ static int run_slot(struct active_request_slot *slot,
 static int probe_rpc(struct rpc_state *rpc, struct slot_results *results)
 {
 	struct active_request_slot *slot;
-	struct curl_slist *headers = http_copy_default_headers();
 	struct strbuf buf = STRBUF_INIT;
 	int err;
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 
-	headers = curl_slist_append(headers, rpc->hdr_content_type);
-	headers = curl_slist_append(headers, rpc->hdr_accept);
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_content_type);
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_accept);
 
 	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
 	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
@@ -862,13 +861,11 @@ static int probe_rpc(struct rpc_state *rpc, struct slot_results *results)
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, NULL);
 	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, "0000");
 	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, 4);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite_buffer);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &buf);
 
 	err = run_slot(slot, results);
 
-	curl_slist_free_all(headers);
 	strbuf_release(&buf);
 	return err;
 }
@@ -888,7 +885,6 @@ static curl_off_t xcurl_off_t(size_t len)
 static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_received)
 {
 	struct active_request_slot *slot;
-	struct curl_slist *headers = http_copy_default_headers();
 	int use_gzip = rpc->gzip_request;
 	char *gzip_body = NULL;
 	size_t gzip_size = 0;
@@ -930,21 +926,23 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece
 			needs_100_continue = 1;
 	}
 
-	headers = curl_slist_append(headers, rpc->hdr_content_type);
-	headers = curl_slist_append(headers, rpc->hdr_accept);
-	headers = curl_slist_append(headers, needs_100_continue ?
+retry:
+	slot = get_active_slot(0);
+
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_content_type);
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_accept);
+	slot->headers = curl_slist_append(slot->headers, needs_100_continue ?
 		"Expect: 100-continue" : "Expect:");
 
 	/* Add Accept-Language header */
 	if (rpc->hdr_accept_language)
-		headers = curl_slist_append(headers, rpc->hdr_accept_language);
+		slot->headers = curl_slist_append(slot->headers,
+			rpc->hdr_accept_language);
 
 	/* Add the extra Git-Protocol header */
 	if (rpc->protocol_header)
-		headers = curl_slist_append(headers, rpc->protocol_header);
-
-retry:
-	slot = get_active_slot();
+		slot->headers = curl_slist_append(slot->headers,
+			rpc->protocol_header);
 
 	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
 	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
@@ -955,7 +953,8 @@ retry:
 		/* The request body is large and the size cannot be predicted.
 		 * We must use chunked encoding to send it.
 		 */
-		headers = curl_slist_append(headers, "Transfer-Encoding: chunked");
+		slot->headers = curl_slist_append(slot->headers,
+			"Transfer-Encoding: chunked");
 		rpc->initial_buffer = 1;
 		curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, rpc_out);
 		curl_easy_setopt(slot->curl, CURLOPT_INFILE, rpc);
@@ -1002,7 +1001,8 @@ retry:
 
 		gzip_size = stream.total_out;
 
-		headers = curl_slist_append(headers, "Content-Encoding: gzip");
+		slot->headers = curl_slist_append(slot->headers,
+			"Content-Encoding: gzip");
 		curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, gzip_body);
 		curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE_LARGE, xcurl_off_t(gzip_size));
 
@@ -1025,7 +1025,6 @@ retry:
 		}
 	}
 
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, rpc_in);
 	rpc_in_data.rpc = rpc;
 	rpc_in_data.slot = slot;
@@ -1055,7 +1054,6 @@ retry:
 	if (stateless_connect)
 		packet_response_end(rpc->in);
 
-	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v2 4/6] http: move proactive auth to first slot creation
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-10-21 17:08   ` [PATCH v2 3/6] http: store all request headers on active_request_slot Matthew John Cheetham via GitGitGadget
@ 2022-10-21 17:08   ` Matthew John Cheetham via GitGitGadget
  2022-10-21 17:08   ` [PATCH v2 5/6] http: set specific auth scheme depending on credential Matthew John Cheetham via GitGitGadget
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-10-21 17:08 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Rather than proactively seek credentials to authenticate a request at
`http_init()` time, do it when the first `active_request_slot` is
created.

Because credential helpers may modify the headers used for a request we
can only auth when a slot is created (when we can first start to gather
request headers).

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 http.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/http.c b/http.c
index f2ebb17c8c4..17b47195d22 100644
--- a/http.c
+++ b/http.c
@@ -515,18 +515,18 @@ static int curl_empty_auth_enabled(void)
 	return 0;
 }
 
-static void init_curl_http_auth(CURL *result)
+static void init_curl_http_auth(struct active_request_slot *slot)
 {
 	if (!http_auth.username || !*http_auth.username) {
 		if (curl_empty_auth_enabled())
-			curl_easy_setopt(result, CURLOPT_USERPWD, ":");
+			curl_easy_setopt(slot->curl, CURLOPT_USERPWD, ":");
 		return;
 	}
 
 	credential_fill(&http_auth);
 
-	curl_easy_setopt(result, CURLOPT_USERNAME, http_auth.username);
-	curl_easy_setopt(result, CURLOPT_PASSWORD, http_auth.password);
+	curl_easy_setopt(slot->curl, CURLOPT_USERNAME, http_auth.username);
+	curl_easy_setopt(slot->curl, CURLOPT_PASSWORD, http_auth.password);
 }
 
 /* *var must be free-able */
@@ -901,9 +901,6 @@ static CURL *get_curl_handle(void)
 #endif
 	}
 
-	if (http_proactive_auth)
-		init_curl_http_auth(result);
-
 	if (getenv("GIT_SSL_VERSION"))
 		ssl_version = getenv("GIT_SSL_VERSION");
 	if (ssl_version && *ssl_version) {
@@ -1260,6 +1257,7 @@ struct active_request_slot *get_active_slot(int no_pragma_header)
 	struct active_request_slot *slot = active_queue_head;
 	struct active_request_slot *newslot;
 
+	int proactive_auth = 0;
 	int num_transfers;
 
 	/* Wait for a slot to open up if the queue is full */
@@ -1282,6 +1280,9 @@ struct active_request_slot *get_active_slot(int no_pragma_header)
 		slot = active_queue_head;
 		if (!slot) {
 			active_queue_head = newslot;
+
+			/* Auth first slot if asked for proactive auth */
+			proactive_auth = http_proactive_auth;
 		} else {
 			while (slot->next != NULL)
 				slot = slot->next;
@@ -1336,8 +1337,9 @@ struct active_request_slot *get_active_slot(int no_pragma_header)
 
 	curl_easy_setopt(slot->curl, CURLOPT_IPRESOLVE, git_curl_ipresolve);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, http_auth_methods);
-	if (http_auth.password || curl_empty_auth_enabled())
-		init_curl_http_auth(slot->curl);
+
+	if (http_auth.password || curl_empty_auth_enabled() || proactive_auth)
+		init_curl_http_auth(slot);
 
 	return slot;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v2 5/6] http: set specific auth scheme depending on credential
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
                     ` (3 preceding siblings ...)
  2022-10-21 17:08   ` [PATCH v2 4/6] http: move proactive auth to first slot creation Matthew John Cheetham via GitGitGadget
@ 2022-10-21 17:08   ` Matthew John Cheetham via GitGitGadget
  2022-10-21 17:08   ` [PATCH v2 6/6] t5556-http-auth: add test for HTTP auth hdr logic Matthew John Cheetham via GitGitGadget
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-10-21 17:08 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Introduce a new credential field `authtype` that can be used by
credential helpers to indicate the type of the credential or
authentication mechanism to use for a request.

Modify http.c to now specify the correct authentication scheme or
credential type when authenticating the curl handle. If the new
`authtype` field in the credential structure is `NULL` or "Basic" then
use the existing username/password options. If the field is "Bearer"
then use the OAuth bearer token curl option. Otherwise, the `authtype`
field is the authentication scheme and the `password` field is the
raw, unencoded value.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Documentation/git-credential.txt |  9 +++++++++
 credential.c                     |  5 +++++
 credential.h                     |  1 +
 git-curl-compat.h                | 10 ++++++++++
 http.c                           | 24 +++++++++++++++++++++---
 5 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index 0ff3cbc25b9..82ade09b5e9 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -169,6 +169,15 @@ username in the example above) will be left unset.
 	attribute 'wwwauth[]' where the order of the attributes is the same
 	as they appear in the HTTP response.
 
+`authtype`::
+
+	Indicates the type of authentication scheme used. If this is not
+	present the default is "Basic".
+	Known values include "Basic", "Digest", and "Bearer".
+	If an unknown value is provided, this is taken as the authentication
+	scheme for the `Authorization` header, and the `password` field is
+	used as the raw unencoded authorization parameters of the same header.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/credential.c b/credential.c
index 8a3ad6c0ae2..a556f9f375a 100644
--- a/credential.c
+++ b/credential.c
@@ -21,6 +21,7 @@ void credential_clear(struct credential *c)
 	free(c->path);
 	free(c->username);
 	free(c->password);
+	free(c->authtype);
 	string_list_clear(&c->helpers, 0);
 	strvec_clear(&c->wwwauth_headers);
 
@@ -235,6 +236,9 @@ int credential_read(struct credential *c, FILE *fp)
 		} else if (!strcmp(key, "path")) {
 			free(c->path);
 			c->path = xstrdup(value);
+		} else if (!strcmp(key, "authtype")) {
+			free(c->authtype);
+			c->authtype = xstrdup(value);
 		} else if (!strcmp(key, "url")) {
 			credential_from_url(c, value);
 		} else if (!strcmp(key, "quit")) {
@@ -281,6 +285,7 @@ void credential_write(const struct credential *c, FILE *fp)
 	credential_write_item(fp, "path", c->path, 0);
 	credential_write_item(fp, "username", c->username, 0);
 	credential_write_item(fp, "password", c->password, 0);
+	credential_write_item(fp, "authtype", c->authtype, 0);
 	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);
 }
 
diff --git a/credential.h b/credential.h
index 6f2e5bc610b..8d580b054d0 100644
--- a/credential.h
+++ b/credential.h
@@ -140,6 +140,7 @@ struct credential {
 	char *protocol;
 	char *host;
 	char *path;
+	char *authtype;
 };
 
 #define CREDENTIAL_INIT { \
diff --git a/git-curl-compat.h b/git-curl-compat.h
index 56a83b6bbd8..839049f6dfe 100644
--- a/git-curl-compat.h
+++ b/git-curl-compat.h
@@ -126,4 +126,14 @@
 #define GIT_CURL_HAVE_CURLSSLSET_NO_BACKENDS
 #endif
 
+/**
+ * CURLAUTH_BEARER was added in 7.61.0, released in July 2018.
+ * However, only 7.69.0 fixes a bug where Bearer headers were not
+ * actually sent with reused connections on subsequent transfers
+ * (curl/curl@dea17b519dc1).
+ */
+#if LIBCURL_VERSION_NUM >= 0x074500
+#define GIT_CURL_HAVE_CURLAUTH_BEARER
+#endif
+
 #endif
diff --git a/http.c b/http.c
index 17b47195d22..ac620bcbf0c 100644
--- a/http.c
+++ b/http.c
@@ -517,7 +517,8 @@ static int curl_empty_auth_enabled(void)
 
 static void init_curl_http_auth(struct active_request_slot *slot)
 {
-	if (!http_auth.username || !*http_auth.username) {
+	if (!http_auth.authtype &&
+		(!http_auth.username || !*http_auth.username)) {
 		if (curl_empty_auth_enabled())
 			curl_easy_setopt(slot->curl, CURLOPT_USERPWD, ":");
 		return;
@@ -525,8 +526,25 @@ static void init_curl_http_auth(struct active_request_slot *slot)
 
 	credential_fill(&http_auth);
 
-	curl_easy_setopt(slot->curl, CURLOPT_USERNAME, http_auth.username);
-	curl_easy_setopt(slot->curl, CURLOPT_PASSWORD, http_auth.password);
+	if (!http_auth.authtype || !strcasecmp(http_auth.authtype, "basic")
+				|| !strcasecmp(http_auth.authtype, "digest")) {
+		curl_easy_setopt(slot->curl, CURLOPT_USERNAME,
+			http_auth.username);
+		curl_easy_setopt(slot->curl, CURLOPT_PASSWORD,
+			http_auth.password);
+#ifdef GIT_CURL_HAVE_CURLAUTH_BEARER
+	} else if (!strcasecmp(http_auth.authtype, "bearer")) {
+		curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, CURLAUTH_BEARER);
+		curl_easy_setopt(slot->curl, CURLOPT_XOAUTH2_BEARER,
+			http_auth.password);
+#endif
+	} else {
+		struct strbuf auth = STRBUF_INIT;
+		strbuf_addf(&auth, "Authorization: %s %s",
+			http_auth.authtype, http_auth.password);
+		slot->headers = curl_slist_append(slot->headers, auth.buf);
+		strbuf_release(&auth);
+	}
 }
 
 /* *var must be free-able */
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v2 6/6] t5556-http-auth: add test for HTTP auth hdr logic
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
                     ` (4 preceding siblings ...)
  2022-10-21 17:08   ` [PATCH v2 5/6] http: set specific auth scheme depending on credential Matthew John Cheetham via GitGitGadget
@ 2022-10-21 17:08   ` Matthew John Cheetham via GitGitGadget
  2022-10-28 15:08     ` Derrick Stolee
  2022-10-25  2:26   ` git-credential.txt M Hickford
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
  7 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-10-21 17:08 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	Matthew John Cheetham, Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add a series of tests to exercise the HTTP authentication header parsing
and the interop with credential helpers. Credential helpers can respond
to requests that contain WWW-Authenticate information with the ability
to select the response Authenticate header scheme.

Introduce a mini HTTP server helper that provides a frontend for the
git-http-backend, with support for arbitrary authentication schemes.
The test-http-server is based heavily on the git-daemon, and forwards
all successfully authenticated requests to the http-backend.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Makefile                                  |    2 +
 contrib/buildsystems/CMakeLists.txt       |   13 +
 t/helper/.gitignore                       |    1 +
 t/helper/test-credential-helper-replay.sh |   14 +
 t/helper/test-http-server.c               | 1134 +++++++++++++++++++++
 t/t5556-http-auth.sh                      |  260 +++++
 6 files changed, 1424 insertions(+)
 create mode 100755 t/helper/test-credential-helper-replay.sh
 create mode 100644 t/helper/test-http-server.c
 create mode 100755 t/t5556-http-auth.sh

diff --git a/Makefile b/Makefile
index d93ad956e58..39b130f711d 100644
--- a/Makefile
+++ b/Makefile
@@ -1500,6 +1500,8 @@ else
 	endif
 	BASIC_CFLAGS += $(CURL_CFLAGS)
 
+	TEST_PROGRAMS_NEED_X += test-http-server
+
 	REMOTE_CURL_PRIMARY = git-remote-http$X
 	REMOTE_CURL_ALIASES = git-remote-https$X git-remote-ftp$X git-remote-ftps$X
 	REMOTE_CURL_NAMES = $(REMOTE_CURL_PRIMARY) $(REMOTE_CURL_ALIASES)
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 787738e6fa3..45251695ce0 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -989,6 +989,19 @@ set(wrapper_scripts
 set(wrapper_test_scripts
 	test-fake-ssh test-tool)
 
+if(CURL_FOUND)
+       list(APPEND wrapper_test_scripts test-http-server)
+
+       add_executable(test-http-server ${CMAKE_SOURCE_DIR}/t/helper/test-http-server.c)
+       target_link_libraries(test-http-server common-main)
+
+       if(MSVC)
+               set_target_properties(test-http-server
+                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
+               set_target_properties(test-http-server
+                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
+       endif()
+endif()
 
 foreach(script ${wrapper_scripts})
 	file(STRINGS ${CMAKE_SOURCE_DIR}/wrap-for-bin.sh content NEWLINE_CONSUME)
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 8c2ddcce95f..1a94ab6eed5 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -1,2 +1,3 @@
 /test-tool
 /test-fake-ssh
+test-http-server
diff --git a/t/helper/test-credential-helper-replay.sh b/t/helper/test-credential-helper-replay.sh
new file mode 100755
index 00000000000..03e5e63dad6
--- /dev/null
+++ b/t/helper/test-credential-helper-replay.sh
@@ -0,0 +1,14 @@
+cmd=$1
+teefile=$cmd-actual.cred
+catfile=$cmd-response.cred
+rm -f $teefile
+while read line;
+do
+	if test -z "$line"; then
+		break;
+	fi
+	echo "$line" >> $teefile
+done
+if test "$cmd" = "get"; then
+	cat $catfile
+fi
diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
new file mode 100644
index 00000000000..92139c04c90
--- /dev/null
+++ b/t/helper/test-http-server.c
@@ -0,0 +1,1134 @@
+#include "config.h"
+#include "run-command.h"
+#include "strbuf.h"
+#include "string-list.h"
+#include "trace2.h"
+#include "version.h"
+#include "dir.h"
+#include "date.h"
+
+#define TR2_CAT "test-http-server"
+
+static const char *pid_file;
+static int verbose;
+static int reuseaddr;
+
+static const char test_http_auth_usage[] =
+"http-server [--verbose]\n"
+"           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
+"           [--reuseaddr] [--pid-file=<file>]\n"
+"           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
+"           [--anonymous-allowed]\n"
+"           [--auth=<scheme>[:<params>] [--auth-token=<scheme>:<token>]]*\n"
+;
+
+/* Timeout, and initial timeout */
+static unsigned int timeout;
+static unsigned int init_timeout;
+
+static void logreport(const char *label, const char *err, va_list params)
+{
+	struct strbuf msg = STRBUF_INIT;
+
+	strbuf_addf(&msg, "[%"PRIuMAX"] %s: ", (uintmax_t)getpid(), label);
+	strbuf_vaddf(&msg, err, params);
+	strbuf_addch(&msg, '\n');
+
+	fwrite(msg.buf, sizeof(char), msg.len, stderr);
+	fflush(stderr);
+
+	strbuf_release(&msg);
+}
+
+__attribute__((format (printf, 1, 2)))
+static void logerror(const char *err, ...)
+{
+	va_list params;
+	va_start(params, err);
+	logreport("error", err, params);
+	va_end(params);
+}
+
+__attribute__((format (printf, 1, 2)))
+static void loginfo(const char *err, ...)
+{
+	va_list params;
+	if (!verbose)
+		return;
+	va_start(params, err);
+	logreport("info", err, params);
+	va_end(params);
+}
+
+static void set_keep_alive(int sockfd)
+{
+	int ka = 1;
+
+	if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &ka, sizeof(ka)) < 0) {
+		if (errno != ENOTSOCK)
+			logerror("unable to set SO_KEEPALIVE on socket: %s",
+				strerror(errno));
+	}
+}
+
+//////////////////////////////////////////////////////////////////
+// The code in this section is used by "worker" instances to service
+// a single connection from a client.  The worker talks to the client
+// on 0 and 1.
+//////////////////////////////////////////////////////////////////
+
+enum worker_result {
+	/*
+	 * Operation successful.
+	 * Caller *might* keep the socket open and allow keep-alive.
+	 */
+	WR_OK       = 0,
+	/*
+	 * Various errors while processing the request and/or the response.
+	 * Close the socket and clean up.
+	 * Exit child-process with non-zero status.
+	 */
+	WR_IO_ERROR = 1<<0,
+	/*
+	 * Close the socket and clean up.  Does not imply an error.
+	 */
+	WR_HANGUP   = 1<<1,
+
+	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
+};
+
+/*
+ * Fields from a parsed HTTP request.
+ */
+struct req {
+	struct strbuf start_line;
+
+	const char *method;
+	const char *http_version;
+
+	struct strbuf uri_path;
+	struct strbuf query_args;
+
+	struct string_list header_list;
+	const char *content_type;
+	ssize_t content_length;
+};
+
+#define REQ__INIT { \
+	.start_line = STRBUF_INIT, \
+	.uri_path = STRBUF_INIT, \
+	.query_args = STRBUF_INIT, \
+	.header_list = STRING_LIST_INIT_NODUP, \
+	.content_type = NULL, \
+	.content_length = -1 \
+	}
+
+static void req__release(struct req *req)
+{
+	strbuf_release(&req->start_line);
+
+	strbuf_release(&req->uri_path);
+	strbuf_release(&req->query_args);
+
+	string_list_clear(&req->header_list, 0);
+}
+
+static enum worker_result send_http_error(
+	int fd,
+	int http_code, const char *http_code_name,
+	int retry_after_seconds, struct string_list *response_headers,
+	enum worker_result wr_in)
+{
+	struct strbuf response_header = STRBUF_INIT;
+	struct strbuf response_content = STRBUF_INIT;
+	struct string_list_item *h;
+	enum worker_result wr;
+
+	strbuf_addf(&response_content, "Error: %d %s\r\n",
+		    http_code, http_code_name);
+	if (retry_after_seconds > 0)
+		strbuf_addf(&response_content, "Retry-After: %d\r\n",
+			    retry_after_seconds);
+
+	strbuf_addf  (&response_header, "HTTP/1.1 %d %s\r\n", http_code, http_code_name);
+	strbuf_addstr(&response_header, "Cache-Control: private\r\n");
+	strbuf_addstr(&response_header,	"Content-Type: text/plain\r\n");
+	strbuf_addf  (&response_header,	"Content-Length: %d\r\n", (int)response_content.len);
+	if (retry_after_seconds > 0)
+		strbuf_addf  (&response_header, "Retry-After: %d\r\n", retry_after_seconds);
+	strbuf_addf(  &response_header,	"Server: test-http-server/%s\r\n", git_version_string);
+	strbuf_addf(  &response_header, "Date: %s\r\n", show_date(time(NULL), 0, DATE_MODE(RFC2822)));
+	if (response_headers)
+		for_each_string_list_item(h, response_headers)
+			strbuf_addf(&response_header, "%s\r\n", h->string);
+	strbuf_addstr(&response_header, "\r\n");
+
+	if (write_in_full(fd, response_header.buf, response_header.len) < 0) {
+		logerror("unable to write response header");
+		wr = WR_IO_ERROR;
+		goto done;
+	}
+
+	if (write_in_full(fd, response_content.buf, response_content.len) < 0) {
+		logerror("unable to write response content body");
+		wr = WR_IO_ERROR;
+		goto done;
+	}
+
+	wr = wr_in;
+
+done:
+	strbuf_release(&response_header);
+	strbuf_release(&response_content);
+
+	return wr;
+}
+
+/*
+ * Read the HTTP request up to the start of the optional message-body.
+ * We do this byte-by-byte because we have keep-alive turned on and
+ * cannot rely on an EOF.
+ *
+ * https://tools.ietf.org/html/rfc7230
+ *
+ * We cannot call die() here because our caller needs to properly
+ * respond to the client and/or close the socket before this
+ * child exits so that the client doesn't get a connection reset
+ * by peer error.
+ */
+static enum worker_result req__read(struct req *req, int fd)
+{
+	struct strbuf h = STRBUF_INIT;
+	struct string_list start_line_fields = STRING_LIST_INIT_DUP;
+	int nr_start_line_fields;
+	const char *uri_target;
+	const char *query;
+	char *hp;
+	const char *hv;
+
+	enum worker_result result = WR_OK;
+
+	/*
+	 * Read line 0 of the request and split it into component parts:
+	 *
+	 *    <method> SP <uri-target> SP <HTTP-version> CRLF
+	 *
+	 */
+	if (strbuf_getwholeline_fd(&req->start_line, fd, '\n') == EOF) {
+		result = WR_OK | WR_HANGUP;
+		goto done;
+	}
+
+	strbuf_trim_trailing_newline(&req->start_line);
+
+	nr_start_line_fields = string_list_split(&start_line_fields,
+						 req->start_line.buf,
+						 ' ', -1);
+	if (nr_start_line_fields != 3) {
+		logerror("could not parse request start-line '%s'",
+			 req->start_line.buf);
+		result = WR_IO_ERROR;
+		goto done;
+	}
+
+	req->method = xstrdup(start_line_fields.items[0].string);
+	req->http_version = xstrdup(start_line_fields.items[2].string);
+
+	uri_target = start_line_fields.items[1].string;
+
+	if (strcmp(req->http_version, "HTTP/1.1")) {
+		logerror("unsupported version '%s' (expecting HTTP/1.1)",
+			 req->http_version);
+		result = WR_IO_ERROR;
+		goto done;
+	}
+
+	query = strchr(uri_target, '?');
+
+	if (query) {
+		strbuf_add(&req->uri_path, uri_target, (query - uri_target));
+		strbuf_trim_trailing_dir_sep(&req->uri_path);
+		strbuf_addstr(&req->query_args, query + 1);
+	} else {
+		strbuf_addstr(&req->uri_path, uri_target);
+		strbuf_trim_trailing_dir_sep(&req->uri_path);
+	}
+
+	/*
+	 * Read the set of HTTP headers into a string-list.
+	 */
+	while (1) {
+		if (strbuf_getwholeline_fd(&h, fd, '\n') == EOF)
+			goto done;
+		strbuf_trim_trailing_newline(&h);
+
+		if (!h.len)
+			goto done; /* a blank line ends the header */
+
+		hp = strbuf_detach(&h, NULL);
+		string_list_append(&req->header_list, hp);
+
+		/* store common request headers separately */
+		if (skip_prefix(hp, "Content-Type: ", &hv)) {
+			req->content_type = hv;
+		} else if (skip_prefix(hp, "Content-Length: ", &hv)) {
+			req->content_length = strtol(hv, &hp, 10);
+		}
+	}
+
+	/*
+	 * We do not attempt to read the <message-body>, if it exists.
+	 * We let our caller read/chunk it in as appropriate.
+	 */
+
+done:
+	string_list_clear(&start_line_fields, 0);
+
+	/*
+	 * This is useful for debugging the request, but very noisy.
+	 */
+	if (trace2_is_enabled()) {
+		struct string_list_item *item;
+		trace2_printf("%s: %s", TR2_CAT, req->start_line.buf);
+		trace2_printf("%s: hver: %s", TR2_CAT, req->http_version);
+		trace2_printf("%s: hmth: %s", TR2_CAT, req->method);
+		trace2_printf("%s: path: %s", TR2_CAT, req->uri_path.buf);
+		trace2_printf("%s: qury: %s", TR2_CAT, req->query_args.buf);
+		if (req->content_length >= 0)
+			trace2_printf("%s: clen: %d", TR2_CAT, req->content_length);
+		if (req->content_type)
+			trace2_printf("%s: ctyp: %s", TR2_CAT, req->content_type);
+		for_each_string_list_item(item, &req->header_list)
+			trace2_printf("%s: hdrs: %s", TR2_CAT, item->string);
+	}
+
+	return result;
+}
+
+static int is_git_request(struct req *req)
+{
+	static regex_t *smart_http_regex;
+	static int initialized;
+
+	if (!initialized) {
+		smart_http_regex = xmalloc(sizeof(*smart_http_regex));
+		if (regcomp(smart_http_regex, "^/(HEAD|info/refs|"
+			    "objects/info/[^/]+|git-(upload|receive)-pack)$",
+			    REG_EXTENDED)) {
+			warning("could not compile smart HTTP regex");
+			smart_http_regex = NULL;
+		}
+		initialized = 1;
+	}
+
+	return smart_http_regex &&
+		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
+}
+
+static enum worker_result do__git(struct req *req, const char *user)
+{
+	const char *ok = "HTTP/1.1 200 OK\r\n";
+	struct child_process cp = CHILD_PROCESS_INIT;
+	int res;
+
+	if (write(1, ok, strlen(ok)) < 0)
+		return error(_("could not send '%s'"), ok);
+
+	if (user)
+		strvec_pushf(&cp.env, "REMOTE_USER=%s", user);
+
+	strvec_pushf(&cp.env, "REQUEST_METHOD=%s", req->method);
+	strvec_pushf(&cp.env, "PATH_TRANSLATED=%s",
+			req->uri_path.buf);
+	strvec_push(&cp.env, "SERVER_PROTOCOL=HTTP/1.1");
+	if (req->query_args.len)
+		strvec_pushf(&cp.env, "QUERY_STRING=%s",
+				req->query_args.buf);
+	if (req->content_type)
+		strvec_pushf(&cp.env, "CONTENT_TYPE=%s",
+				req->content_type);
+	if (req->content_length >= 0)
+		strvec_pushf(&cp.env, "CONTENT_LENGTH=%" PRIdMAX,
+				(intmax_t)req->content_length);
+	cp.git_cmd = 1;
+	strvec_push(&cp.args, "http-backend");
+	res = run_command(&cp);
+	close(1);
+	close(0);
+	return !!res;
+}
+
+enum auth_result {
+	AUTH_UNKNOWN = 0,
+	AUTH_DENY = 1,
+	AUTH_ALLOW = 2,
+};
+
+struct auth_module {
+	const char *scheme;
+	const char *challenge_params;
+	struct string_list *tokens;
+};
+
+static int allow_anonymous;
+static struct auth_module **auth_modules = NULL;
+static size_t auth_modules_nr = 0;
+static size_t auth_modules_alloc = 0;
+
+static struct auth_module *get_auth_module(struct strbuf *scheme)
+{
+	int i;
+	struct auth_module *mod;
+	for (i = 0; i < auth_modules_nr; i++) {
+		mod = auth_modules[i];
+		if (!strcasecmp(mod->scheme, scheme->buf))
+			return mod;
+	}
+
+	return NULL;
+}
+
+static void add_auth_module(struct auth_module *mod)
+{
+	ALLOC_GROW(auth_modules, auth_modules_nr + 1, auth_modules_alloc);
+	auth_modules[auth_modules_nr++] = mod;
+}
+
+static int is_authed(struct req *req, const char **user, enum worker_result *wr)
+{
+	enum auth_result result = AUTH_UNKNOWN;
+	struct string_list hdrs = STRING_LIST_INIT_NODUP;
+	struct auth_module *mod;
+
+	struct string_list_item *hdr;
+	struct string_list_item *token;
+	const char *v;
+	struct strbuf **split = NULL;
+	int i;
+	char *challenge;
+
+	/* ask all auth modules to validate the request */
+	for_each_string_list_item(hdr, &req->header_list) {
+		if (skip_iprefix(hdr->string, "Authorization: ", &v)) {
+			split = strbuf_split_str(v, ' ', 2);
+			if (!split[0] || !split[1]) continue;
+
+			// trim trailing space ' '
+			strbuf_setlen(split[0], split[0]->len - 1);
+
+			mod = get_auth_module(split[0]);
+			if (mod) {
+
+				for_each_string_list_item(token, mod->tokens) {
+					if (!strcmp(split[1]->buf, token->string)) {
+						result = AUTH_ALLOW;
+						goto done;
+					}
+				}
+
+				if (result != AUTH_UNKNOWN)
+					goto done;
+			}
+		}
+	}
+
+done:
+	switch (result) {
+	case AUTH_ALLOW:
+		trace2_printf("%s: auth '%s' ALLOW", TR2_CAT, mod->scheme);
+		*user = "VALID_TEST_USER";
+		*wr = WR_OK;
+		break;
+
+	case AUTH_DENY:
+		trace2_printf("%s: auth '%s' DENY", TR2_CAT, mod->scheme);
+		/* fall-through */
+
+	case AUTH_UNKNOWN:
+		if (allow_anonymous)
+			break;
+		for (i = 0; i < auth_modules_nr; i++) {
+			mod = auth_modules[i];
+			if (mod->challenge_params)
+				challenge = xstrfmt("WWW-Authenticate: %s %s",
+						    mod->scheme,
+						    mod->challenge_params);
+			else
+				challenge = xstrfmt("WWW-Authenticate: %s",
+						    mod->scheme);
+			string_list_append(&hdrs, challenge);
+		}
+		*wr = send_http_error(1, 401, "Unauthorized", -1, &hdrs, *wr);
+	}
+
+	strbuf_list_free(split);
+	string_list_clear(&hdrs, 0);
+
+	return result == AUTH_ALLOW ||
+	      (result == AUTH_UNKNOWN && allow_anonymous);
+}
+
+static enum worker_result dispatch(struct req *req)
+{
+	enum worker_result wr = WR_OK;
+	const char *user = NULL;
+
+	if (!is_authed(req, &user, &wr))
+		return wr;
+
+	if (is_git_request(req))
+		return do__git(req, user);
+
+	return send_http_error(1, 501, "Not Implemented", -1, NULL,
+			       WR_OK | WR_HANGUP);
+}
+
+static enum worker_result worker(void)
+{
+	struct req req = REQ__INIT;
+	char *client_addr = getenv("REMOTE_ADDR");
+	char *client_port = getenv("REMOTE_PORT");
+	enum worker_result wr = WR_OK;
+
+	if (client_addr)
+		loginfo("Connection from %s:%s", client_addr, client_port);
+
+	set_keep_alive(0);
+
+	while (1) {
+		req__release(&req);
+
+		alarm(init_timeout ? init_timeout : timeout);
+		wr = req__read(&req, 0);
+		alarm(0);
+
+		if (wr & WR_STOP_THE_MUSIC)
+			break;
+
+		wr = dispatch(&req);
+		if (wr & WR_STOP_THE_MUSIC)
+			break;
+	}
+
+	close(0);
+	close(1);
+
+	return !!(wr & WR_IO_ERROR);
+}
+
+//////////////////////////////////////////////////////////////////
+// This section contains the listener and child-process management
+// code used by the primary instance to accept incoming connections
+// and dispatch them to async child process "worker" instances.
+//////////////////////////////////////////////////////////////////
+
+static int addrcmp(const struct sockaddr_storage *s1,
+		   const struct sockaddr_storage *s2)
+{
+	const struct sockaddr *sa1 = (const struct sockaddr*) s1;
+	const struct sockaddr *sa2 = (const struct sockaddr*) s2;
+
+	if (sa1->sa_family != sa2->sa_family)
+		return sa1->sa_family - sa2->sa_family;
+	if (sa1->sa_family == AF_INET)
+		return memcmp(&((struct sockaddr_in *)s1)->sin_addr,
+		    &((struct sockaddr_in *)s2)->sin_addr,
+		    sizeof(struct in_addr));
+#ifndef NO_IPV6
+	if (sa1->sa_family == AF_INET6)
+		return memcmp(&((struct sockaddr_in6 *)s1)->sin6_addr,
+		    &((struct sockaddr_in6 *)s2)->sin6_addr,
+		    sizeof(struct in6_addr));
+#endif
+	return 0;
+}
+
+static int max_connections = 32;
+
+static unsigned int live_children;
+
+static struct child {
+	struct child *next;
+	struct child_process cld;
+	struct sockaddr_storage address;
+} *firstborn;
+
+static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen)
+{
+	struct child *newborn, **cradle;
+
+	newborn = xcalloc(1, sizeof(*newborn));
+	live_children++;
+	memcpy(&newborn->cld, cld, sizeof(*cld));
+	memcpy(&newborn->address, addr, addrlen);
+	for (cradle = &firstborn; *cradle; cradle = &(*cradle)->next)
+		if (!addrcmp(&(*cradle)->address, &newborn->address))
+			break;
+	newborn->next = *cradle;
+	*cradle = newborn;
+}
+
+/*
+ * This gets called if the number of connections grows
+ * past "max_connections".
+ *
+ * We kill the newest connection from a duplicate IP.
+ */
+static void kill_some_child(void)
+{
+	const struct child *blanket, *next;
+
+	if (!(blanket = firstborn))
+		return;
+
+	for (; (next = blanket->next); blanket = next)
+		if (!addrcmp(&blanket->address, &next->address)) {
+			kill(blanket->cld.pid, SIGTERM);
+			break;
+		}
+}
+
+static void check_dead_children(void)
+{
+	int status;
+	pid_t pid;
+
+	struct child **cradle, *blanket;
+	for (cradle = &firstborn; (blanket = *cradle);)
+		if ((pid = waitpid(blanket->cld.pid, &status, WNOHANG)) > 1) {
+			const char *dead = "";
+			if (status)
+				dead = " (with error)";
+			loginfo("[%"PRIuMAX"] Disconnected%s", (uintmax_t)pid, dead);
+
+			/* remove the child */
+			*cradle = blanket->next;
+			live_children--;
+			child_process_clear(&blanket->cld);
+			free(blanket);
+		} else
+			cradle = &blanket->next;
+}
+
+static struct strvec cld_argv = STRVEC_INIT;
+static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
+{
+	struct child_process cld = CHILD_PROCESS_INIT;
+
+	if (max_connections && live_children >= max_connections) {
+		kill_some_child();
+		sleep(1);  /* give it some time to die */
+		check_dead_children();
+		if (live_children >= max_connections) {
+			close(incoming);
+			logerror("Too many children, dropping connection");
+			return;
+		}
+	}
+
+	if (addr->sa_family == AF_INET) {
+		char buf[128] = "";
+		struct sockaddr_in *sin_addr = (void *) addr;
+		inet_ntop(addr->sa_family, &sin_addr->sin_addr, buf, sizeof(buf));
+		strvec_pushf(&cld.env, "REMOTE_ADDR=%s", buf);
+		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
+				 ntohs(sin_addr->sin_port));
+#ifndef NO_IPV6
+	} else if (addr->sa_family == AF_INET6) {
+		char buf[128] = "";
+		struct sockaddr_in6 *sin6_addr = (void *) addr;
+		inet_ntop(AF_INET6, &sin6_addr->sin6_addr, buf, sizeof(buf));
+		strvec_pushf(&cld.env, "REMOTE_ADDR=[%s]", buf);
+		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
+				 ntohs(sin6_addr->sin6_port));
+#endif
+	}
+
+	strvec_pushv(&cld.args, cld_argv.v);
+	cld.in = incoming;
+	cld.out = dup(incoming);
+
+	if (cld.out < 0)
+		logerror("could not dup() `incoming`");
+	else if (start_command(&cld))
+		logerror("unable to fork");
+	else
+		add_child(&cld, addr, addrlen);
+}
+
+static void child_handler(int signo)
+{
+	/*
+	 * Otherwise empty handler because systemcalls will get interrupted
+	 * upon signal receipt
+	 * SysV needs the handler to be rearmed
+	 */
+	signal(SIGCHLD, child_handler);
+}
+
+static int set_reuse_addr(int sockfd)
+{
+	int on = 1;
+
+	if (!reuseaddr)
+		return 0;
+	return setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR,
+			  &on, sizeof(on));
+}
+
+struct socketlist {
+	int *list;
+	size_t nr;
+	size_t alloc;
+};
+
+static const char *ip2str(int family, struct sockaddr *sin, socklen_t len)
+{
+#ifdef NO_IPV6
+	static char ip[INET_ADDRSTRLEN];
+#else
+	static char ip[INET6_ADDRSTRLEN];
+#endif
+
+	switch (family) {
+#ifndef NO_IPV6
+	case AF_INET6:
+		inet_ntop(family, &((struct sockaddr_in6*)sin)->sin6_addr, ip, len);
+		break;
+#endif
+	case AF_INET:
+		inet_ntop(family, &((struct sockaddr_in*)sin)->sin_addr, ip, len);
+		break;
+	default:
+		xsnprintf(ip, sizeof(ip), "<unknown>");
+	}
+	return ip;
+}
+
+#ifndef NO_IPV6
+
+static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
+{
+	int socknum = 0;
+	char pbuf[NI_MAXSERV];
+	struct addrinfo hints, *ai0, *ai;
+	int gai;
+	long flags;
+
+	xsnprintf(pbuf, sizeof(pbuf), "%d", listen_port);
+	memset(&hints, 0, sizeof(hints));
+	hints.ai_family = AF_UNSPEC;
+	hints.ai_socktype = SOCK_STREAM;
+	hints.ai_protocol = IPPROTO_TCP;
+	hints.ai_flags = AI_PASSIVE;
+
+	gai = getaddrinfo(listen_addr, pbuf, &hints, &ai0);
+	if (gai) {
+		logerror("getaddrinfo() for %s failed: %s", listen_addr, gai_strerror(gai));
+		return 0;
+	}
+
+	for (ai = ai0; ai; ai = ai->ai_next) {
+		int sockfd;
+
+		sockfd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
+		if (sockfd < 0)
+			continue;
+		if (sockfd >= FD_SETSIZE) {
+			logerror("Socket descriptor too large");
+			close(sockfd);
+			continue;
+		}
+
+#ifdef IPV6_V6ONLY
+		if (ai->ai_family == AF_INET6) {
+			int on = 1;
+			setsockopt(sockfd, IPPROTO_IPV6, IPV6_V6ONLY,
+				   &on, sizeof(on));
+			/* Note: error is not fatal */
+		}
+#endif
+
+		if (set_reuse_addr(sockfd)) {
+			logerror("Could not set SO_REUSEADDR: %s", strerror(errno));
+			close(sockfd);
+			continue;
+		}
+
+		set_keep_alive(sockfd);
+
+		if (bind(sockfd, ai->ai_addr, ai->ai_addrlen) < 0) {
+			logerror("Could not bind to %s: %s",
+				 ip2str(ai->ai_family, ai->ai_addr, ai->ai_addrlen),
+				 strerror(errno));
+			close(sockfd);
+			continue;	/* not fatal */
+		}
+		if (listen(sockfd, 5) < 0) {
+			logerror("Could not listen to %s: %s",
+				 ip2str(ai->ai_family, ai->ai_addr, ai->ai_addrlen),
+				 strerror(errno));
+			close(sockfd);
+			continue;	/* not fatal */
+		}
+
+		flags = fcntl(sockfd, F_GETFD, 0);
+		if (flags >= 0)
+			fcntl(sockfd, F_SETFD, flags | FD_CLOEXEC);
+
+		ALLOC_GROW(socklist->list, socklist->nr + 1, socklist->alloc);
+		socklist->list[socklist->nr++] = sockfd;
+		socknum++;
+	}
+
+	freeaddrinfo(ai0);
+
+	return socknum;
+}
+
+#else /* NO_IPV6 */
+
+static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
+{
+	struct sockaddr_in sin;
+	int sockfd;
+	long flags;
+
+	memset(&sin, 0, sizeof sin);
+	sin.sin_family = AF_INET;
+	sin.sin_port = htons(listen_port);
+
+	if (listen_addr) {
+		/* Well, host better be an IP address here. */
+		if (inet_pton(AF_INET, listen_addr, &sin.sin_addr.s_addr) <= 0)
+			return 0;
+	} else {
+		sin.sin_addr.s_addr = htonl(INADDR_ANY);
+	}
+
+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
+	if (sockfd < 0)
+		return 0;
+
+	if (set_reuse_addr(sockfd)) {
+		logerror("Could not set SO_REUSEADDR: %s", strerror(errno));
+		close(sockfd);
+		return 0;
+	}
+
+	set_keep_alive(sockfd);
+
+	if ( bind(sockfd, (struct sockaddr *)&sin, sizeof sin) < 0 ) {
+		logerror("Could not bind to %s: %s",
+			 ip2str(AF_INET, (struct sockaddr *)&sin, sizeof(sin)),
+			 strerror(errno));
+		close(sockfd);
+		return 0;
+	}
+
+	if (listen(sockfd, 5) < 0) {
+		logerror("Could not listen to %s: %s",
+			 ip2str(AF_INET, (struct sockaddr *)&sin, sizeof(sin)),
+			 strerror(errno));
+		close(sockfd);
+		return 0;
+	}
+
+	flags = fcntl(sockfd, F_GETFD, 0);
+	if (flags >= 0)
+		fcntl(sockfd, F_SETFD, flags | FD_CLOEXEC);
+
+	ALLOC_GROW(socklist->list, socklist->nr + 1, socklist->alloc);
+	socklist->list[socklist->nr++] = sockfd;
+	return 1;
+}
+
+#endif
+
+static void socksetup(struct string_list *listen_addr, int listen_port, struct socketlist *socklist)
+{
+	if (!listen_addr->nr)
+		setup_named_sock("127.0.0.1", listen_port, socklist);
+	else {
+		int i, socknum;
+		for (i = 0; i < listen_addr->nr; i++) {
+			socknum = setup_named_sock(listen_addr->items[i].string,
+						   listen_port, socklist);
+
+			if (socknum == 0)
+				logerror("unable to allocate any listen sockets for host %s on port %u",
+					 listen_addr->items[i].string, listen_port);
+		}
+	}
+}
+
+static int service_loop(struct socketlist *socklist)
+{
+	struct pollfd *pfd;
+	int i;
+
+	CALLOC_ARRAY(pfd, socklist->nr);
+
+	for (i = 0; i < socklist->nr; i++) {
+		pfd[i].fd = socklist->list[i];
+		pfd[i].events = POLLIN;
+	}
+
+	signal(SIGCHLD, child_handler);
+
+	for (;;) {
+		int i;
+		int nr_ready;
+		int timeout = (pid_file ? 100 : -1);
+
+		check_dead_children();
+
+		nr_ready = poll(pfd, socklist->nr, timeout);
+		if (nr_ready < 0) {
+			if (errno != EINTR) {
+				logerror("Poll failed, resuming: %s",
+				      strerror(errno));
+				sleep(1);
+			}
+			continue;
+		}
+		else if (nr_ready == 0) {
+			/*
+			 * If we have a pid_file, then we watch it.
+			 * If someone deletes it, we shutdown the service.
+			 * The shell scripts in the test suite will use this.
+			 */
+			if (!pid_file || file_exists(pid_file))
+				continue;
+			goto shutdown;
+		}
+
+		for (i = 0; i < socklist->nr; i++) {
+			if (pfd[i].revents & POLLIN) {
+				union {
+					struct sockaddr sa;
+					struct sockaddr_in sai;
+#ifndef NO_IPV6
+					struct sockaddr_in6 sai6;
+#endif
+				} ss;
+				socklen_t sslen = sizeof(ss);
+				int incoming = accept(pfd[i].fd, &ss.sa, &sslen);
+				if (incoming < 0) {
+					switch (errno) {
+					case EAGAIN:
+					case EINTR:
+					case ECONNABORTED:
+						continue;
+					default:
+						die_errno("accept returned");
+					}
+				}
+				handle(incoming, &ss.sa, sslen);
+			}
+		}
+	}
+
+shutdown:
+	loginfo("Starting graceful shutdown (pid-file gone)");
+	for (i = 0; i < socklist->nr; i++)
+		close(socklist->list[i]);
+
+	return 0;
+}
+
+static int serve(struct string_list *listen_addr, int listen_port)
+{
+	struct socketlist socklist = { NULL, 0, 0 };
+
+	socksetup(listen_addr, listen_port, &socklist);
+	if (socklist.nr == 0)
+		die("unable to allocate any listen sockets on port %u",
+		    listen_port);
+
+	loginfo("Ready to rumble");
+
+	/*
+	 * Wait to create the pid-file until we've setup the sockets
+	 * and are open for business.
+	 */
+	if (pid_file)
+		write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
+
+	return service_loop(&socklist);
+}
+
+//////////////////////////////////////////////////////////////////
+// This section is executed by both the primary instance and all
+// worker instances.  So, yes, each child-process re-parses the
+// command line argument and re-discovers how it should behave.
+//////////////////////////////////////////////////////////////////
+
+int cmd_main(int argc, const char **argv)
+{
+	int listen_port = 0;
+	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
+	int worker_mode = 0;
+	int i;
+	struct auth_module *mod = NULL;
+
+	trace2_cmd_name("test-http-server");
+	setup_git_directory_gently(NULL);
+
+	for (i = 1; i < argc; i++) {
+		const char *arg = argv[i];
+		const char *v;
+
+		if (skip_prefix(arg, "--listen=", &v)) {
+			string_list_append(&listen_addr, xstrdup_tolower(v));
+			continue;
+		}
+		if (skip_prefix(arg, "--port=", &v)) {
+			char *end;
+			unsigned long n;
+			n = strtoul(v, &end, 0);
+			if (*v && !*end) {
+				listen_port = n;
+				continue;
+			}
+		}
+		if (!strcmp(arg, "--worker")) {
+			worker_mode = 1;
+			trace2_cmd_mode("worker");
+			continue;
+		}
+		if (!strcmp(arg, "--verbose")) {
+			verbose = 1;
+			continue;
+		}
+		if (skip_prefix(arg, "--timeout=", &v)) {
+			timeout = atoi(v);
+			continue;
+		}
+		if (skip_prefix(arg, "--init-timeout=", &v)) {
+			init_timeout = atoi(v);
+			continue;
+		}
+		if (skip_prefix(arg, "--max-connections=", &v)) {
+			max_connections = atoi(v);
+			if (max_connections < 0)
+				max_connections = 0; /* unlimited */
+			continue;
+		}
+		if (!strcmp(arg, "--reuseaddr")) {
+			reuseaddr = 1;
+			continue;
+		}
+		if (skip_prefix(arg, "--pid-file=", &v)) {
+			pid_file = v;
+			continue;
+		}
+		if (skip_prefix(arg, "--allow-anonymous", &v)) {
+			allow_anonymous = 1;
+			continue;
+		}
+		if (skip_prefix(arg, "--auth=", &v)) {
+			struct strbuf **p = strbuf_split_str(v, ':', 2);
+
+			if (!p[0]) {
+				error("invalid argument '%s'", v);
+				usage(test_http_auth_usage);
+			}
+
+			// trim trailing ':'
+			if (p[1])
+				strbuf_setlen(p[0], p[0]->len - 1);
+
+			if (get_auth_module(p[0])) {
+				error("duplicate auth scheme '%s'\n", p[0]->buf);
+				usage(test_http_auth_usage);
+			}
+
+			mod = xmalloc(sizeof(struct auth_module));
+			mod->scheme = xstrdup(p[0]->buf);
+			mod->challenge_params = p[1] ? xstrdup(p[1]->buf) : NULL;
+			mod->tokens = xmalloc(sizeof(struct string_list));
+			string_list_init_dup(mod->tokens);
+
+			add_auth_module(mod);
+
+			strbuf_list_free(p);
+			continue;
+		}
+		if (skip_prefix(arg, "--auth-token=", &v)) {
+			struct strbuf **p = strbuf_split_str(v, ':', 2);
+			if (!p[0]) {
+				error("invalid argument '%s'", v);
+				usage(test_http_auth_usage);
+			}
+
+			if (!p[1]) {
+				error("missing token value '%s'\n", v);
+				usage(test_http_auth_usage);
+			}
+
+			// trim trailing ':'
+			strbuf_setlen(p[0], p[0]->len - 1);
+
+			mod = get_auth_module(p[0]);
+			if (!mod) {
+				error("auth scheme not defined '%s'\n", p[0]->buf);
+				usage(test_http_auth_usage);
+			}
+
+			string_list_append(mod->tokens, p[1]->buf);
+			strbuf_list_free(p);
+			continue;
+		}
+
+		fprintf(stderr, "error: unknown argument '%s'\n", arg);
+		usage(test_http_auth_usage);
+	}
+
+	/* avoid splitting a message in the middle */
+	setvbuf(stderr, NULL, _IOFBF, 4096);
+
+	if (listen_port == 0)
+		listen_port = DEFAULT_GIT_PORT;
+
+	/*
+	 * If no --listen=<addr> args are given, the setup_named_sock()
+	 * code will use receive a NULL address and set INADDR_ANY.
+	 * This exposes both internal and external interfaces on the
+	 * port.
+	 *
+	 * Disallow that and default to the internal-use-only loopback
+	 * address.
+	 */
+	if (!listen_addr.nr)
+		string_list_append(&listen_addr, "127.0.0.1");
+
+	/*
+	 * worker_mode is set in our own child process instances
+	 * (that are bound to a connected socket from a client).
+	 */
+	if (worker_mode)
+		return worker();
+
+	/*
+	 * `cld_argv` is a bit of a clever hack. The top-level instance
+	 * of test-http-server does the normal bind/listen/accept stuff.
+	 * For each incoming socket, the top-level process spawns
+	 * a child instance of test-http-server *WITH* the additional
+	 * `--worker` argument. This causes the child to set `worker_mode`
+	 * and immediately call `worker()` using the connected socket (and
+	 * without the usual need for fork() or threads).
+	 *
+	 * The magic here is made possible because `cld_argv` is static
+	 * and handle() (called by service_loop()) knows about it.
+	 */
+	strvec_push(&cld_argv, argv[0]);
+	strvec_push(&cld_argv, "--worker");
+	for (i = 1; i < argc; ++i)
+		strvec_push(&cld_argv, argv[i]);
+
+	/*
+	 * Setup primary instance to listen for connections.
+	 */
+	return serve(&listen_addr, listen_port);
+}
diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
new file mode 100755
index 00000000000..43f1791a0fe
--- /dev/null
+++ b/t/t5556-http-auth.sh
@@ -0,0 +1,260 @@
+#!/bin/sh
+
+test_description='test http auth header and credential helper interop'
+
+. ./test-lib.sh
+
+test_set_port GIT_TEST_HTTP_PROTOCOL_PORT
+
+# Setup a repository
+#
+REPO_DIR="$(pwd)"/repo
+
+# Setup some lookback URLs where test-http-server will be listening.
+# We will spawn it directly inside the repo directory, so we avoid
+# any need to configure directory mappings etc - we only serve this
+# repository from the root '/' of the server.
+#
+HOST_PORT=127.0.0.1:$GIT_TEST_HTTP_PROTOCOL_PORT
+ORIGIN_URL=http://$HOST_PORT/
+
+# The pid-file is created by test-http-server when it starts.
+# The server will shutdown if/when we delete it (this is easier than
+# killing it by PID).
+#
+PID_FILE="$(pwd)"/pid-file.pid
+SERVER_LOG="$(pwd)"/OUT.server.log
+
+PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
+CREDENTIAL_HELPER="$GIT_BUILD_DIR/t/helper/test-credential-helper-replay.sh" \
+	&& export CREDENTIAL_HELPER
+
+test_expect_success 'setup repos' '
+	test_create_repo "$REPO_DIR" &&
+	git -C "$REPO_DIR" branch -M main
+'
+
+stop_http_server () {
+	if ! test -f "$PID_FILE"
+	then
+		return 0
+	fi
+	#
+	# The server will shutdown automatically when we delete the pid-file.
+	#
+	rm -f "$PID_FILE"
+	#
+	# Give it a few seconds to shutdown (mainly to completely release the
+	# port before the next test start another instance and it attempts to
+	# bind to it).
+	#
+	for k in 0 1 2 3 4
+	do
+		if grep -q "Starting graceful shutdown" "$SERVER_LOG"
+		then
+			return 0
+		fi
+		sleep 1
+	done
+
+	echo "stop_http_server: timeout waiting for server shutdown"
+	return 1
+}
+
+start_http_server () {
+	#
+	# Launch our server into the background in repo_dir.
+	#
+	(
+		cd "$REPO_DIR"
+		test-http-server --verbose \
+			--listen=127.0.0.1 \
+			--port=$GIT_TEST_HTTP_PROTOCOL_PORT \
+			--reuseaddr \
+			--pid-file="$PID_FILE" \
+			"$@" \
+			2>"$SERVER_LOG" &
+	)
+	#
+	# Give it a few seconds to get started.
+	#
+	for k in 0 1 2 3 4
+	do
+		if test -f "$PID_FILE"
+		then
+			return 0
+		fi
+		sleep 1
+	done
+
+	echo "start_http_server: timeout waiting for server startup"
+	return 1
+}
+
+per_test_cleanup () {
+	stop_http_server &&
+	rm -f OUT.* &&
+	rm -f *.cred
+}
+
+test_expect_success 'http auth anonymous no challenge' '
+	test_when_finished "per_test_cleanup" &&
+	start_http_server --allow-anonymous &&
+
+	# Attempt to read from a protected repository
+	git ls-remote $ORIGIN_URL
+'
+
+test_expect_success 'http auth www-auth headers to credential helper bearer valid' '
+	test_when_finished "per_test_cleanup" &&
+	start_http_server \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=bearer:secret-token &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >store-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-token
+	authtype=bearer
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-token
+	authtype=bearer
+	EOF
+
+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp store-expected.cred store-actual.cred
+'
+
+test_expect_success 'http auth www-auth headers to credential helper basic valid' '
+	test_when_finished "per_test_cleanup" &&
+	# base64("alice:secret-passwd")
+	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
+	export USERPASS64 &&
+
+	start_http_server \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=basic:$USERPASS64 &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >store-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-passwd
+	authtype=basic
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-passwd
+	authtype=basic
+	EOF
+
+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp store-expected.cred store-actual.cred
+'
+
+test_expect_success 'http auth www-auth headers to credential helper custom scheme' '
+	test_when_finished "per_test_cleanup" &&
+	start_http_server \
+		--auth=foobar:alg=test\ widget=1 \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=foobar:SECRET-FOOBAR-VALUE &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=foobar alg=test widget=1
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >store-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=SECRET-FOOBAR-VALUE
+	authtype=foobar
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=SECRET-FOOBAR-VALUE
+	authtype=foobar
+	EOF
+
+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp store-expected.cred store-actual.cred
+'
+
+test_expect_success 'http auth www-auth headers to credential helper invalid' '
+	test_when_finished "per_test_cleanup" &&
+	start_http_server \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=bearer:secret-token &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >erase-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=invalid-token
+	authtype=bearer
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=invalid-token
+	authtype=bearer
+	EOF
+
+	test_must_fail git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp erase-expected.cred erase-actual.cred
+'
+
+test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 171+ messages in thread

* git-credential.txt
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
                     ` (5 preceding siblings ...)
  2022-10-21 17:08   ` [PATCH v2 6/6] t5556-http-auth: add test for HTTP auth hdr logic Matthew John Cheetham via GitGitGadget
@ 2022-10-25  2:26   ` M Hickford
  2022-10-25 20:49     ` git-credential.txt Matthew John Cheetham
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
  7 siblings, 1 reply; 171+ messages in thread
From: M Hickford @ 2022-10-25  2:26 UTC (permalink / raw)
  To: gitgitgadget
  Cc: derrickstolee, git, lessleydennington, mjcheetham, mjcheetham

Reading git-credential.txt, I'm not quite clear:

1. Are the new wwwauth[] and authtype attributes populated by Git and passed to helpers? Or vice versa?
2. Should a storage helper store these attributes? If so, must the values be treated as confidential?

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: git-credential.txt
  2022-10-25  2:26   ` git-credential.txt M Hickford
@ 2022-10-25 20:49     ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2022-10-25 20:49 UTC (permalink / raw)
  To: M Hickford, gitgitgadget
  Cc: derrickstolee, git, lessleydennington, mjcheetham

On 2022-10-24 19:26, M Hickford wrote:
> Reading git-credential.txt, I'm not quite clear:
> 
> 1. Are the new wwwauth[] and authtype attributes populated by Git and passed to helpers? Or vice versa?

The wwwauth[] attribute is from Git -> helper, and the authtype attribute is
from helper -> Git. I can update the doc to make this more explicit.

> 2. Should a storage helper store these attributes? If so, must the values be treated as confidential?

Good question. A simple credential helper may wish to inspect these headers only
to differentiate the different authentication schemes available (basic, bearer,
etc) and return a credential of the correct/available type (and include an
`authtype` attribute in the response).

However it's unlikely such a helper would need to store the wwwauth[] values
as verbatim unless it can directly understand the parameters of the challenges.
The addition of this attribute is for credential helpers to gain more context
about the auth challenge from the remote.

For example, a helper may receive a bearer challenge including minimum required
OAuth scopes and an authentication authority:

wwwauth[]=Bearer authority=login.example.com/oauth scopes="code_rw userinfo_read"

Using these extra parameters the helper can try and locate an existing stored
credential that satisfies the request.

Such an enlightened helper would need to query stored credentials looking for
matching metadata including the authority, and a bearer token that has at least
the minimum required scopes (but could have a superset).

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v2 6/6] t5556-http-auth: add test for HTTP auth hdr logic
  2022-10-21 17:08   ` [PATCH v2 6/6] t5556-http-auth: add test for HTTP auth hdr logic Matthew John Cheetham via GitGitGadget
@ 2022-10-28 15:08     ` Derrick Stolee
  2022-10-28 19:14       ` Jeff Hostetler
  2022-11-01 23:59       ` Matthew John Cheetham
  0 siblings, 2 replies; 171+ messages in thread
From: Derrick Stolee @ 2022-10-28 15:08 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Lessley Dennington, Matthew John Cheetham, Matthew John Cheetham

On 10/21/22 1:08 PM, Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>

> @@ -1500,6 +1500,8 @@ else
>  	endif
>  	BASIC_CFLAGS += $(CURL_CFLAGS)
>  
> +	TEST_PROGRAMS_NEED_X += test-http-server
> +
>  	REMOTE_CURL_PRIMARY = git-remote-http$X
>  	REMOTE_CURL_ALIASES = git-remote-https$X git-remote-ftp$X git-remote-ftps$X
>  	REMOTE_CURL_NAMES = $(REMOTE_CURL_PRIMARY) $(REMOTE_CURL_ALIASES)

This hunk is in the "else" block of "ifdef NO_CURL",
so this makes sense for why TEST_PROGRAMS_NEED_X is
augmented here, away from other instances.

> diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
> index 787738e6fa3..45251695ce0 100644
> --- a/contrib/buildsystems/CMakeLists.txt
> +++ b/contrib/buildsystems/CMakeLists.txt
> @@ -989,6 +989,19 @@ set(wrapper_scripts
>  set(wrapper_test_scripts
>  	test-fake-ssh test-tool)
>  
> +if(CURL_FOUND)
> +       list(APPEND wrapper_test_scripts test-http-server)
> +
> +       add_executable(test-http-server ${CMAKE_SOURCE_DIR}/t/helper/test-http-server.c)
> +       target_link_libraries(test-http-server common-main)
> +
> +       if(MSVC)
> +               set_target_properties(test-http-server
> +                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
> +               set_target_properties(test-http-server
> +                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
> +       endif()
> +endif()

And this file has the pattern of many "if(CURL_FOUND)"
blocks with isolated purposes, so it makes sense to
have this be an isolated change instead of grouped with
a different case.

> diff --git a/t/helper/.gitignore b/t/helper/.gitignore
> index 8c2ddcce95f..1a94ab6eed5 100644
> --- a/t/helper/.gitignore
> +++ b/t/helper/.gitignore
> @@ -1,2 +1,3 @@
>  /test-tool
>  /test-fake-ssh
> +test-http-server

Should this start with a "/" like the other entries?

> diff --git a/t/helper/test-credential-helper-replay.sh b/t/helper/test-credential-helper-replay.sh
> new file mode 100755
> index 00000000000..03e5e63dad6
> --- /dev/null
> +++ b/t/helper/test-credential-helper-replay.sh
> @@ -0,0 +1,14 @@
> +cmd=$1
> +teefile=$cmd-actual.cred
> +catfile=$cmd-response.cred
> +rm -f $teefile
> +while read line;
> +do
> +	if test -z "$line"; then
> +		break;
> +	fi
> +	echo "$line" >> $teefile
> +done
> +if test "$cmd" = "get"; then
> +	cat $catfile
> +fi

Should this be a helper method within another script, such
as t/lib-credential.sh or t/lib-httpd.sh? The read over
stdin will still work, as in this example:

read_chunk() {
	while read line; do
		case "$line" in
		--) break ;;
		*) echo "$line" ;;
		esac
	done
}

> diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c

> @@ -0,0 +1,1134 @@
> +#include "config.h"
> +#include "run-command.h"
> +#include "strbuf.h"
> +#include "string-list.h"
> +#include "trace2.h"
> +#include "version.h"
> +#include "dir.h"
> +#include "date.h"
> +
> +#define TR2_CAT "test-http-server"
> +
> +static const char *pid_file;
> +static int verbose;
> +static int reuseaddr;
> +
> +static const char test_http_auth_usage[] =
> +"http-server [--verbose]\n"
> +"           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
> +"           [--reuseaddr] [--pid-file=<file>]\n"
> +"           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
> +"           [--anonymous-allowed]\n"
> +"           [--auth=<scheme>[:<params>] [--auth-token=<scheme>:<token>]]*\n"
> +;

These are a lot of options to implement all at once. They are probably
simple enough, but depending on the implementation and tests, it might
be helpful to split this patch into smaller ones that introduce these
options along with the tests that exercise each. That will help
verify that they are being tested properly instead of needing to track
back and forth across the patch for each one.

> +
> +/* Timeout, and initial timeout */
> +static unsigned int timeout;
> +static unsigned int init_timeout;
> +
> +static void logreport(const char *label, const char *err, va_list params)
> +{
> +	struct strbuf msg = STRBUF_INIT;
> +
> +	strbuf_addf(&msg, "[%"PRIuMAX"] %s: ", (uintmax_t)getpid(), label);
> +	strbuf_vaddf(&msg, err, params);
> +	strbuf_addch(&msg, '\n');
> +
> +	fwrite(msg.buf, sizeof(char), msg.len, stderr);
> +	fflush(stderr);
> +
> +	strbuf_release(&msg);
> +}
> +
> +__attribute__((format (printf, 1, 2)))
> +static void logerror(const char *err, ...)
> +{
> +	va_list params;
> +	va_start(params, err);
> +	logreport("error", err, params);
> +	va_end(params);
> +}
> +
> +__attribute__((format (printf, 1, 2)))
> +static void loginfo(const char *err, ...)
> +{
> +	va_list params;
> +	if (!verbose)
> +		return;
> +	va_start(params, err);
> +	logreport("info", err, params);
> +	va_end(params);
> +}

I wonder how much of this we need or is just a nice thing. I would
err on the side of making things as simple as possible, but being
able to debug this test server may be important based on your
experience.

> +static void set_keep_alive(int sockfd)
> +{
> +	int ka = 1;
> +
> +	if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &ka, sizeof(ka)) < 0) {
> +		if (errno != ENOTSOCK)
> +			logerror("unable to set SO_KEEPALIVE on socket: %s",
> +				strerror(errno));
> +	}
> +}
> +
> +//////////////////////////////////////////////////////////////////
> +// The code in this section is used by "worker" instances to service
> +// a single connection from a client.  The worker talks to the client
> +// on 0 and 1.
> +//////////////////////////////////////////////////////////////////

Use /* */ style comments. You can repeat the asterisks to get a
similar visual block.

> +
> +enum worker_result {
> +	/*
> +	 * Operation successful.
> +	 * Caller *might* keep the socket open and allow keep-alive.
> +	 */
> +	WR_OK       = 0,
> +	/*
> +	 * Various errors while processing the request and/or the response.
> +	 * Close the socket and clean up.
> +	 * Exit child-process with non-zero status.
> +	 */
> +	WR_IO_ERROR = 1<<0,
> +	/*
> +	 * Close the socket and clean up.  Does not imply an error.
> +	 */
> +	WR_HANGUP   = 1<<1,

nit: add a whitespace line between an item and the next
item's comment.

> +
> +	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
> +};

(I read, but have no comments on the http-server boilerplate.)

> +
> +enum auth_result {
> +	AUTH_UNKNOWN = 0,
> +	AUTH_DENY = 1,
> +	AUTH_ALLOW = 2,
> +};
> +
> +struct auth_module {
> +	const char *scheme;
> +	const char *challenge_params;

Later, I notice that you set challenge_params using an
xstrdup() so this shouldn't be const and you should
free it in any freeing code.

> +	struct string_list *tokens;
> +};
> +
> +static int allow_anonymous;
> +static struct auth_module **auth_modules = NULL;
> +static size_t auth_modules_nr = 0;
> +static size_t auth_modules_alloc = 0;

So, we are setting up a number of potential auth modules,
each of which has a scheme to match a request to the module,
and a list of tokens that would be considered worthy of the
AUTH_ALLOW result. Otherwise, if the scheme matches but no
token matches, we get AUTH_DENY. Finally, if no scheme matches
we get AUTH_UNKNOWN.

This concept might be worth a comment here around the data
structures before we get into how that is implemented.

> +static struct auth_module *get_auth_module(struct strbuf *scheme)
> +{
> +	int i;
> +	struct auth_module *mod;
> +	for (i = 0; i < auth_modules_nr; i++) {
> +		mod = auth_modules[i];
> +		if (!strcasecmp(mod->scheme, scheme->buf))
> +			return mod;
> +	}
> +
> +	return NULL;
> +}

Matching the input scheme against the list of modules.

Only complaint: there is no reason that 'scheme' needs t
be a strbuf, but could be a 'const char *' here.

> +static void add_auth_module(struct auth_module *mod)
> +{
> +	ALLOC_GROW(auth_modules, auth_modules_nr + 1, auth_modules_alloc);
> +	auth_modules[auth_modules_nr++] = mod;
> +}

nit: this could be located earlier, next to the list
definition, or delayed until it is needed. That would
allow get_auth_module() to be closer to its first use.

> +static int is_authed(struct req *req, const char **user, enum worker_result *wr)
> +{
> +	enum auth_result result = AUTH_UNKNOWN;
> +	struct string_list hdrs = STRING_LIST_INIT_NODUP;
> +	struct auth_module *mod;
> +
> +	struct string_list_item *hdr;
> +	struct string_list_item *token;
> +	const char *v;
> +	struct strbuf **split = NULL;
> +	int i;
> +	char *challenge;
> +
> +	/* ask all auth modules to validate the request */
> +	for_each_string_list_item(hdr, &req->header_list) {
> +		if (skip_iprefix(hdr->string, "Authorization: ", &v)) {
> +			split = strbuf_split_str(v, ' ', 2);
> +			if (!split[0] || !split[1]) continue;

For each valid request header...

> +			// trim trailing space ' '
> +			strbuf_setlen(split[0], split[0]->len - 1);
> +
> +			mod = get_auth_module(split[0]);
> +			if (mod) {

...get an appropriate module, if it exists...

> +
> +				for_each_string_list_item(token, mod->tokens) {
> +					if (!strcmp(split[1]->buf, token->string)) {
> +						result = AUTH_ALLOW;
> +						goto done;
> +					}
> +				}
> +
> +				if (result != AUTH_UNKNOWN)
> +					goto done;

...and report if we find a valid token.

Here, it seems I was wrong in my expectation of AUTH_DENY:
if a matching module exists but no token exists in that
module, then we keep searching other modules. 

> +			}
> +		}
> +	}
> +
> +done:
> +	switch (result) {
> +	case AUTH_ALLOW:
> +		trace2_printf("%s: auth '%s' ALLOW", TR2_CAT, mod->scheme);
> +		*user = "VALID_TEST_USER";
> +		*wr = WR_OK;
> +		break;
> +
> +	case AUTH_DENY:
> +		trace2_printf("%s: auth '%s' DENY", TR2_CAT, mod->scheme);
> +		/* fall-through */

I'm not sure that I see a case where this is possible. Maybe
we should have a 'result = AUTH_DENY' at the start of the
"if (mod)" block, followed by a 'goto done' in all cases
instead of "if (result != AUTH_UNKNOWN)"?

> +	case AUTH_UNKNOWN:
> +		if (allow_anonymous)
> +			break;

If we do not require auth, then we want to continue if there
is no matching authentication.

> +		for (i = 0; i < auth_modules_nr; i++) {
> +			mod = auth_modules[i];
> +			if (mod->challenge_params)
> +				challenge = xstrfmt("WWW-Authenticate: %s %s",
> +						    mod->scheme,
> +						    mod->challenge_params);
> +			else
> +				challenge = xstrfmt("WWW-Authenticate: %s",
> +						    mod->scheme);
> +			string_list_append(&hdrs, challenge);
> +		}
> +		*wr = send_http_error(1, 401, "Unauthorized", -1, &hdrs, *wr);

However, here is the critical piece about how servers will
start to act with the new WWW-Authenticate header usage in
the Git credential helper interface. This will be critical
in the testing for Git to retry the credential helper while
passing these authentications schemes from the installed
modules.

> +	}
> +
> +	strbuf_list_free(split);
> +	string_list_clear(&hdrs, 0);
> +
> +	return result == AUTH_ALLOW ||
> +	      (result == AUTH_UNKNOWN && allow_anonymous);

Did it work? Or did it not need to work? I'm interested to
investigate the case that the client sent an authentication
header that matches a module but doesn't match any tokens,
but we allow anonymous access, anyway. Is that a 400? Or
is that a 401?

> +static enum worker_result dispatch(struct req *req)
> +{
> +	enum worker_result wr = WR_OK;
> +	const char *user = NULL;
> +
> +	if (!is_authed(req, &user, &wr))
> +		return wr;

If we are not authed, send the 401 response.

> +	if (is_git_request(req))
> +		return do__git(req, user);

If we are authed, then pass through to the Git response.

> +	return send_http_error(1, 501, "Not Implemented", -1, NULL,
> +			       WR_OK | WR_HANGUP);

If the Git request fails, we don't care. This is a test.
Just pass a 500-level error and the client will barf,
letting us know that something went wrong.

> +static void kill_some_child(void)

> +static void check_dead_children(void)

These technically sound methods have unfortunate names.
Using something like "connection" over "child" might
alleviate some of the horror. (I initially wanted to
suggest "subprocess" but you compare live_children to
max_connections in the next method, so connection seemed
appropriate.)

> +static struct strvec cld_argv = STRVEC_INIT;
> +static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
> +{
> +	struct child_process cld = CHILD_PROCESS_INIT;
> +
> +	if (max_connections && live_children >= max_connections) {
> +		kill_some_child();
> +		sleep(1);  /* give it some time to die */
> +		check_dead_children();
> +		if (live_children >= max_connections) {
> +			close(incoming);
> +			logerror("Too many children, dropping connection");
> +			return;
> +		}
> +	}

Do we anticipate exercising concurrent requests in our
tests? Perhaps it's not worth putting a cap on the
connection count so we can keep the test helpers simple.

> +	if (addr->sa_family == AF_INET) {
> +		char buf[128] = "";
> +		struct sockaddr_in *sin_addr = (void *) addr;
> +		inet_ntop(addr->sa_family, &sin_addr->sin_addr, buf, sizeof(buf));
> +		strvec_pushf(&cld.env, "REMOTE_ADDR=%s", buf);
> +		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
> +				 ntohs(sin_addr->sin_port));
> +#ifndef NO_IPV6
> +	} else if (addr->sa_family == AF_INET6) {
> +		char buf[128] = "";
> +		struct sockaddr_in6 *sin6_addr = (void *) addr;
> +		inet_ntop(AF_INET6, &sin6_addr->sin6_addr, buf, sizeof(buf));
> +		strvec_pushf(&cld.env, "REMOTE_ADDR=[%s]", buf);
> +		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
> +				 ntohs(sin6_addr->sin6_port));
> +#endif
> +	}
> +
> +	strvec_pushv(&cld.args, cld_argv.v);
> +	cld.in = incoming;
> +	cld.out = dup(incoming);
> +
> +	if (cld.out < 0)
> +		logerror("could not dup() `incoming`");
> +	else if (start_command(&cld))
> +		logerror("unable to fork");
> +	else
> +		add_child(&cld, addr, addrlen);
> +}
> +

I scanned the socket creation code, but my eyes were
glazing over. I'm definitely in the camp of "if it works,
that's enough for our tests." If we start to rely on this
test harness in more places, we can improve any shortcomings
as they arise.

> +//////////////////////////////////////////////////////////////////
> +// This section is executed by both the primary instance and all
> +// worker instances.  So, yes, each child-process re-parses the
> +// command line argument and re-discovers how it should behave.
> +//////////////////////////////////////////////////////////////////
> +
> +int cmd_main(int argc, const char **argv)
> +{
> +	int listen_port = 0;
> +	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
> +	int worker_mode = 0;
> +	int i;
> +	struct auth_module *mod = NULL;
> +
> +	trace2_cmd_name("test-http-server");
> +	setup_git_directory_gently(NULL);
> +
> +	for (i = 1; i < argc; i++) {
> +		const char *arg = argv[i];
> +		const char *v;
> +
> +		if (skip_prefix(arg, "--listen=", &v)) {
> +			string_list_append(&listen_addr, xstrdup_tolower(v));
> +			continue;
> +		}
> +		if (skip_prefix(arg, "--port=", &v)) {
> +			char *end;
> +			unsigned long n;
> +			n = strtoul(v, &end, 0);
> +			if (*v && !*end) {
> +				listen_port = n;
> +				continue;
> +			}
> +		}
> +		if (!strcmp(arg, "--worker")) {
> +			worker_mode = 1;
> +			trace2_cmd_mode("worker");
> +			continue;
> +		}
> +		if (!strcmp(arg, "--verbose")) {
> +			verbose = 1;
> +			continue;
> +		}
> +		if (skip_prefix(arg, "--timeout=", &v)) {
> +			timeout = atoi(v);
> +			continue;
> +		}
> +		if (skip_prefix(arg, "--init-timeout=", &v)) {
> +			init_timeout = atoi(v);
> +			continue;
> +		}
> +		if (skip_prefix(arg, "--max-connections=", &v)) {
> +			max_connections = atoi(v);
> +			if (max_connections < 0)
> +				max_connections = 0; /* unlimited */
> +			continue;
> +		}
> +		if (!strcmp(arg, "--reuseaddr")) {
> +			reuseaddr = 1;
> +			continue;
> +		}
> +		if (skip_prefix(arg, "--pid-file=", &v)) {
> +			pid_file = v;
> +			continue;
> +		}

ok, most of these arguments are actually about the per-connection
subprocesses.

> +		if (skip_prefix(arg, "--allow-anonymous", &v)) {
> +			allow_anonymous = 1;
> +			continue;
> +		}

Here is how we choose to allo anonymous access.

> +		if (skip_prefix(arg, "--auth=", &v)) {
> +			struct strbuf **p = strbuf_split_str(v, ':', 2);
> +
> +			if (!p[0]) {
> +				error("invalid argument '%s'", v);
> +				usage(test_http_auth_usage);
> +			}
> +
> +			// trim trailing ':'
> +			if (p[1])
> +				strbuf_setlen(p[0], p[0]->len - 1);
> +
> +			if (get_auth_module(p[0])) {
> +				error("duplicate auth scheme '%s'\n", p[0]->buf);
> +				usage(test_http_auth_usage);
> +			}
> +
> +			mod = xmalloc(sizeof(struct auth_module));
> +			mod->scheme = xstrdup(p[0]->buf);
> +			mod->challenge_params = p[1] ? xstrdup(p[1]->buf) : NULL;

Here, you xstrdup() into a 'const char *', but you are really
passing ownership so it shouldn't be conts.

> +			mod->tokens = xmalloc(sizeof(struct string_list));

nit: this could also be "CALLOC_ARRAY(mod->tokens, 1);"

> +			string_list_init_dup(mod->tokens);
> +
> +			add_auth_module(mod);
> +
> +			strbuf_list_free(p);
> +			continue;

Ok, we gain the auth schemes from the command line.

> +		}
> +		if (skip_prefix(arg, "--auth-token=", &v)) {
> +			struct strbuf **p = strbuf_split_str(v, ':', 2);
> +			if (!p[0]) {
> +				error("invalid argument '%s'", v);
> +				usage(test_http_auth_usage);
> +			}
> +
> +			if (!p[1]) {
> +				error("missing token value '%s'\n", v);
> +				usage(test_http_auth_usage);
> +			}
> +
> +			// trim trailing ':'

Use /* */ (Aside: I'm surprised we don't have a build option in
DEVELOPER=1 that catches the use of these comments.)

> +			strbuf_setlen(p[0], p[0]->len - 1);
> +
> +			mod = get_auth_module(p[0]);
> +			if (!mod) {
> +				error("auth scheme not defined '%s'\n", p[0]->buf);
> +				usage(test_http_auth_usage);
> +			}
> +
> +			string_list_append(mod->tokens, p[1]->buf);
> +			strbuf_list_free(p);
> +			continue;
> +		}

And the token lists. It is important that the scheme is added
before any token is added.

> +		fprintf(stderr, "error: unknown argument '%s'\n", arg);
> +		usage(test_http_auth_usage);
> +	}
> +
> +	/* avoid splitting a message in the middle */
> +	setvbuf(stderr, NULL, _IOFBF, 4096);
> +
> +	if (listen_port == 0)
> +		listen_port = DEFAULT_GIT_PORT;
> +
> +	/*
> +	 * If no --listen=<addr> args are given, the setup_named_sock()
> +	 * code will use receive a NULL address and set INADDR_ANY.
> +	 * This exposes both internal and external interfaces on the
> +	 * port.
> +	 *
> +	 * Disallow that and default to the internal-use-only loopback
> +	 * address.
> +	 */
> +	if (!listen_addr.nr)
> +		string_list_append(&listen_addr, "127.0.0.1");
> +
> +	/*
> +	 * worker_mode is set in our own child process instances
> +	 * (that are bound to a connected socket from a client).
> +	 */
> +	if (worker_mode)
> +		return worker();
> +
> +	/*
> +	 * `cld_argv` is a bit of a clever hack. The top-level instance
> +	 * of test-http-server does the normal bind/listen/accept stuff.
> +	 * For each incoming socket, the top-level process spawns
> +	 * a child instance of test-http-server *WITH* the additional
> +	 * `--worker` argument. This causes the child to set `worker_mode`
> +	 * and immediately call `worker()` using the connected socket (and
> +	 * without the usual need for fork() or threads).
> +	 *
> +	 * The magic here is made possible because `cld_argv` is static
> +	 * and handle() (called by service_loop()) knows about it.
> +	 */
> +	strvec_push(&cld_argv, argv[0]);
> +	strvec_push(&cld_argv, "--worker");
> +	for (i = 1; i < argc; ++i)
> +		strvec_push(&cld_argv, argv[i]);
> +
> +	/*
> +	 * Setup primary instance to listen for connections.
> +	 */
> +	return serve(&listen_addr, listen_port);
> +}

And complete the thing with some boilerplate.

This was a lot to read, and the interesting bits are all mixed in
with the http server code, which is less interesting to what we
are trying to accomplish. It would be beneficial to split this
into one or two patches before we actually introduce the tests.

The most important thing that I think would be helpful is to
isolate all the authentication behavior into its own patch so
we can see how those connections from the command-line arguments
affect the behavior of the server responses.

I think ideally we would have the following split:

 1. All server boilerblate. All requests 500 not-implemented.

 2. Add Git fall-through with no authentication. Add the tests
    that are intended to allow anonymous auth.

 3. Add authentication data structures read from command-line,
    but not processed at all in the logic.

 4. Act on the authentication data structures to alter the
    requests. Add the tests that use these authentication
    schemes.

I could easily see a case for combining 1&2 as well as 3&4,
for slightly larger but more completely-testable changes at
every step.

From what I read, I don't think there is much to change in
the end result of the code, but it definitely was hard to read
the important things when surrounded by many lines of
boilerplate.

> diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh

I'm going to pause here and come back to the test script in
a separate reply.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v2 2/6] credential: add WWW-Authenticate header to cred requests
  2022-10-21 17:07   ` [PATCH v2 2/6] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
@ 2022-10-28 18:22     ` Jeff Hostetler
  2022-11-01 23:07       ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Jeff Hostetler @ 2022-10-28 18:22 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	Matthew John Cheetham



On 10/21/22 1:07 PM, Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
> Add the value of the WWW-Authenticate response header to credential
> requests. Credential helpers that understand and support HTTP
> authentication and authorization can use this standard header (RFC 2616
> Section 14.47 [1]) to generate valid credentials.
> 
> WWW-Authenticate headers can contain information pertaining to the
> authority, authentication mechanism, or extra parameters/scopes that are
> required.
> 
> The current I/O format for credential helpers only allows for unique
> names for properties/attributes, so in order to transmit multiple header
> values (with a specific order) we introduce a new convention whereby a
> C-style array syntax is used in the property name to denote multiple
> ordered values for the same property.
> 
> In this case we send multiple `wwwauth[n]` properties where `n` is a
 > zero-indexed number, reflecting the order the WWW-Authenticate headers
 > appeared in the HTTP response.

Here (and maybe in the cover letter) you mention `wwwauth[n]` and `n`...
> +`wwwauth[]`::
> +
> +	When an HTTP response is received that includes one or more
> +	'WWW-Authenticate' authentication headers, these can be passed to Git
> +	(and subsequent credential helpers) with these attributes.
> +	Each 'WWW-Authenticate' header value should be passed as a separate
> +	attribute 'wwwauth[]' where the order of the attributes is the same
> +	as they appear in the HTTP response.

...but here you don't include the `n`.

[...]
> +static void credential_write_strvec(FILE *fp, const char *key,
> +				    const struct strvec *vec)
> +{
> +	int i = 0;
> +	const char *full_key = xstrfmt("%s[]", key);

...nor here.

Jeff

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v2 6/6] t5556-http-auth: add test for HTTP auth hdr logic
  2022-10-28 15:08     ` Derrick Stolee
@ 2022-10-28 19:14       ` Jeff Hostetler
  2022-11-01 23:14         ` Matthew John Cheetham
  2022-11-01 23:59       ` Matthew John Cheetham
  1 sibling, 1 reply; 171+ messages in thread
From: Jeff Hostetler @ 2022-10-28 19:14 UTC (permalink / raw)
  To: Derrick Stolee, Matthew John Cheetham via GitGitGadget, git
  Cc: Lessley Dennington, Matthew John Cheetham, Matthew John Cheetham



On 10/28/22 11:08 AM, Derrick Stolee wrote:
> }
> 
>> diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
> 
>> @@ -0,0 +1,1134 @@
>> +#include "config.h"
>> +#include "run-command.h"
>> +#include "strbuf.h"
>> +#include "string-list.h"
>> +#include "trace2.h"
>> +#include "version.h"
>> +#include "dir.h"
>> +#include "date.h"
>> +
>> +#define TR2_CAT "test-http-server"
>> +
>> +static const char *pid_file;
>> +static int verbose;
>> +static int reuseaddr;
>> +
>> +static const char test_http_auth_usage[] =
>> +"http-server [--verbose]\n"
>> +"           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
>> +"           [--reuseaddr] [--pid-file=<file>]\n"
>> +"           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
>> +"           [--anonymous-allowed]\n"
>> +"           [--auth=<scheme>[:<params>] [--auth-token=<scheme>:<token>]]*\n"
>> +;
> 
> These are a lot of options to implement all at once. They are probably
> simple enough, but depending on the implementation and tests, it might
> be helpful to split this patch into smaller ones that introduce these
> options along with the tests that exercise each. That will help
> verify that they are being tested properly instead of needing to track
> back and forth across the patch for each one.

how many of these options were inherited from test-gvfs-protocol or
from upstream git-daemon?  If most came from git-daemon, it's probably
easier to see that this was a cut-n-paste from it if it comes over in
one commit, since all of the OPT_ processing, usage(), and static global
state vars will come over together I would think -- rather than to build
up the arg parsing bit by bit.  More on this in a minute...


>> +
>> +/* Timeout, and initial timeout */
>> +static unsigned int timeout;
>> +static unsigned int init_timeout;
>> +
>> +static void logreport(const char *label, const char *err, va_list params)
>> +{
>> +	struct strbuf msg = STRBUF_INIT;
>> +
>> +	strbuf_addf(&msg, "[%"PRIuMAX"] %s: ", (uintmax_t)getpid(), label);
>> +	strbuf_vaddf(&msg, err, params);
>> +	strbuf_addch(&msg, '\n');
>> +
>> +	fwrite(msg.buf, sizeof(char), msg.len, stderr);
>> +	fflush(stderr);
>> +
>> +	strbuf_release(&msg);
>> +}
>> +
>> +__attribute__((format (printf, 1, 2)))
>> +static void logerror(const char *err, ...)
>> +{
>> +	va_list params;
>> +	va_start(params, err);
>> +	logreport("error", err, params);
>> +	va_end(params);
>> +}
>> +
>> +__attribute__((format (printf, 1, 2)))
>> +static void loginfo(const char *err, ...)
>> +{
>> +	va_list params;
>> +	if (!verbose)
>> +		return;
>> +	va_start(params, err);
>> +	logreport("info", err, params);
>> +	va_end(params);
>> +}

...Maybe it would be easier to see/diff this large new test server
if we copied `daemon.c` into this source file in 1 commit and then
converted it to what you have now in 1 commit -- so that only new
code shows up here.  For example, all of the above logreport, logerror,
and loginfo routines would show up as new in the copy commit, but not
in the edit commit.  However, that may lead to too much noise when
you actually get into the meat of the auth changes, maybe.


> I wonder how much of this we need or is just a nice thing. I would
> err on the side of making things as simple as possible, but being
> able to debug this test server may be important based on your
> experience.

i'd vote to keep it.

[...]
>> +static void kill_some_child(void)
> 
>> +static void check_dead_children(void)
> 
> These technically sound methods have unfortunate names.
> Using something like "connection" over "child" might
> alleviate some of the horror. (I initially wanted to
> suggest "subprocess" but you compare live_children to
> max_connections in the next method, so connection seemed
> appropriate.)

These names were inherited from `daemon.c` IIRC. I wouldn't change
them since it'll just introduce noise when diffing.  Especially,
if we do the copy commit first.


[...]
>> +static struct strvec cld_argv = STRVEC_INIT;
>> +static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
>> +{
>> +	struct child_process cld = CHILD_PROCESS_INIT;
>> +
>> +	if (max_connections && live_children >= max_connections) {
>> +		kill_some_child();
>> +		sleep(1);  /* give it some time to die */
>> +		check_dead_children();
>> +		if (live_children >= max_connections) {
>> +			close(incoming);
>> +			logerror("Too many children, dropping connection");
>> +			return;
>> +		}
>> +	}
> 
> Do we anticipate exercising concurrent requests in our
> tests? Perhaps it's not worth putting a cap on the
> connection count so we can keep the test helpers simple.

again, this code was inherited from `daemon.c`, so we could leave it.

[...]
>> +			mod = xmalloc(sizeof(struct auth_module));
>> +			mod->scheme = xstrdup(p[0]->buf);
>> +			mod->challenge_params = p[1] ? xstrdup(p[1]->buf) : NULL;
> 
> Here, you xstrdup() into a 'const char *', but you are really
> passing ownership so it shouldn't be conts.

There is a strbuf_detach() that will let you steal the buffer from the
strbuf if that would help.


[...]
> This was a lot to read, and the interesting bits are all mixed in
> with the http server code, which is less interesting to what we
> are trying to accomplish. It would be beneficial to split this
> into one or two patches before we actually introduce the tests.

agreed. it is big, but it does make sense.  perhaps doing the
copy daemon.c commit and then see how this commit diffs from it
would make it more manageable. (not sure, but worth a try.)

[...]
>  From what I read, I don't think there is much to change in
> the end result of the code, but it definitely was hard to read
> the important things when surrounded by many lines of
> boilerplate.

agreed. i think the end result is good.

Thanks
Jeff



^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v2 2/6] credential: add WWW-Authenticate header to cred requests
  2022-10-28 18:22     ` Jeff Hostetler
@ 2022-11-01 23:07       ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2022-11-01 23:07 UTC (permalink / raw)
  To: Jeff Hostetler, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham

On 2022-10-28 11:22, Jeff Hostetler wrote:
> On 10/21/22 1:07 PM, Matthew John Cheetham via GitGitGadget wrote:
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
>>
>> Add the value of the WWW-Authenticate response header to credential
>> requests. Credential helpers that understand and support HTTP
>> authentication and authorization can use this standard header (RFC 2616
>> Section 14.47 [1]) to generate valid credentials.
>>
>> WWW-Authenticate headers can contain information pertaining to the
>> authority, authentication mechanism, or extra parameters/scopes that are
>> required.
>>
>> The current I/O format for credential helpers only allows for unique
>> names for properties/attributes, so in order to transmit multiple header
>> values (with a specific order) we introduce a new convention whereby a
>> C-style array syntax is used in the property name to denote multiple
>> ordered values for the same property.
>>
>> In this case we send multiple `wwwauth[n]` properties where `n` is a
>> zero-indexed number, reflecting the order the WWW-Authenticate headers
>> appeared in the HTTP response.
> 
> Here (and maybe in the cover letter) you mention `wwwauth[n]` and `n`...
>> +`wwwauth[]`::
>> +
>> +    When an HTTP response is received that includes one or more
>> +    'WWW-Authenticate' authentication headers, these can be passed to Git
>> +    (and subsequent credential helpers) with these attributes.
>> +    Each 'WWW-Authenticate' header value should be passed as a separate
>> +    attribute 'wwwauth[]' where the order of the attributes is the same
>> +    as they appear in the HTTP response.
> 
> ...but here you don't include the `n`.
> 
> [...]
>> +static void credential_write_strvec(FILE *fp, const char *key,
>> +                    const struct strvec *vec)
>> +{
>> +    int i = 0;
>> +    const char *full_key = xstrfmt("%s[]", key);
> 
> ...nor here.
> 
Ah. This is an oversight in my v2 rebasing! Will fix in v3.

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v2 6/6] t5556-http-auth: add test for HTTP auth hdr logic
  2022-10-28 19:14       ` Jeff Hostetler
@ 2022-11-01 23:14         ` Matthew John Cheetham
  2022-11-02 14:38           ` Derrick Stolee
  0 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham @ 2022-11-01 23:14 UTC (permalink / raw)
  To: Jeff Hostetler, Derrick Stolee,
	Matthew John Cheetham via GitGitGadget, git
  Cc: Lessley Dennington, Matthew John Cheetham

On 2022-10-28 12:14, Jeff Hostetler wrote:
>
>
> On 10/28/22 11:08 AM, Derrick Stolee wrote:
>> }
>>
>>> diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
>>
>>> @@ -0,0 +1,1134 @@
>>> +#include "config.h"
>>> +#include "run-command.h"
>>> +#include "strbuf.h"
>>> +#include "string-list.h"
>>> +#include "trace2.h"
>>> +#include "version.h"
>>> +#include "dir.h"
>>> +#include "date.h"
>>> +
>>> +#define TR2_CAT "test-http-server"
>>> +
>>> +static const char *pid_file;
>>> +static int verbose;
>>> +static int reuseaddr;
>>> +
>>> +static const char test_http_auth_usage[] =
>>> +"http-server [--verbose]\n"
>>> +"           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
>>> +"           [--reuseaddr] [--pid-file=<file>]\n"
>>> +"           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
>>> +"           [--anonymous-allowed]\n"
>>> +"           [--auth=<scheme>[:<params>] [--auth-token=<scheme>:<token>]]*\n"
>>> +;
>>
>> These are a lot of options to implement all at once. They are probably
>> simple enough, but depending on the implementation and tests, it might
>> be helpful to split this patch into smaller ones that introduce these
>> options along with the tests that exercise each. That will help
>> verify that they are being tested properly instead of needing to track
>> back and forth across the patch for each one.
>
> how many of these options were inherited from test-gvfs-protocol or
> from upstream git-daemon?  If most came from git-daemon, it's probably
> easier to see that this was a cut-n-paste from it if it comes over in
> one commit, since all of the OPT_ processing, usage(), and static global
> state vars will come over together I would think -- rather than to build
> up the arg parsing bit by bit.  More on this in a minute...
>

Only --anonymous-allowed, --auth and --auth-token are added over git-daemon.

>
>>> +
>>> +/* Timeout, and initial timeout */
>>> +static unsigned int timeout;
>>> +static unsigned int init_timeout;
>>> +
>>> +static void logreport(const char *label, const char *err, va_list params)
>>> +{
>>> +    struct strbuf msg = STRBUF_INIT;
>>> +
>>> +    strbuf_addf(&msg, "[%"PRIuMAX"] %s: ", (uintmax_t)getpid(), label);
>>> +    strbuf_vaddf(&msg, err, params);
>>> +    strbuf_addch(&msg, '\n');
>>> +
>>> +    fwrite(msg.buf, sizeof(char), msg.len, stderr);
>>> +    fflush(stderr);
>>> +
>>> +    strbuf_release(&msg);
>>> +}
>>> +
>>> +__attribute__((format (printf, 1, 2)))
>>> +static void logerror(const char *err, ...)
>>> +{
>>> +    va_list params;
>>> +    va_start(params, err);
>>> +    logreport("error", err, params);
>>> +    va_end(params);
>>> +}
>>> +
>>> +__attribute__((format (printf, 1, 2)))
>>> +static void loginfo(const char *err, ...)
>>> +{
>>> +    va_list params;
>>> +    if (!verbose)
>>> +        return;
>>> +    va_start(params, err);
>>> +    logreport("info", err, params);
>>> +    va_end(params);
>>> +}
>
> ...Maybe it would be easier to see/diff this large new test server
> if we copied `daemon.c` into this source file in 1 commit and then
> converted it to what you have now in 1 commit -- so that only new
> code shows up here.  For example, all of the above logreport, logerror,
> and loginfo routines would show up as new in the copy commit, but not
> in the edit commit.  However, that may lead to too much noise when
> you actually get into the meat of the auth changes, maybe.

I take from git-daemon and the test-gvfs-protocol helper from microsoft/git
fork, but then also delete lots of not required pieces too just as much as
I have added. Copying git-daemon.c, to then delete, and then add feels like
lots of noise.

>> I wonder how much of this we need or is just a nice thing. I would
>> err on the side of making things as simple as possible, but being
>> able to debug this test server may be important based on your
>> experience.
>
> i'd vote to keep it.
>
> [...]
>>> +static void kill_some_child(void)
>>
>>> +static void check_dead_children(void)
>>
>> These technically sound methods have unfortunate names.
>> Using something like "connection" over "child" might
>> alleviate some of the horror. (I initially wanted to
>> suggest "subprocess" but you compare live_children to
>> max_connections in the next method, so connection seemed
>> appropriate.)
>
> These names were inherited from `daemon.c` IIRC. I wouldn't change
> them since it'll just introduce noise when diffing.  Especially,
> if we do the copy commit first.

Indeed. These functions are untouched from daemon.c. I do plan to split
this mega-patch up however in to a single 'add the boilerplate' based on
git-daemon patch, then add the extra pieces like HTTP request parsing and
the auth pieces in a v3.

> [...]
>>> +static struct strvec cld_argv = STRVEC_INIT;
>>> +static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
>>> +{
>>> +    struct child_process cld = CHILD_PROCESS_INIT;
>>> +
>>> +    if (max_connections && live_children >= max_connections) {
>>> +        kill_some_child();
>>> +        sleep(1);  /* give it some time to die */
>>> +        check_dead_children();
>>> +        if (live_children >= max_connections) {
>>> +            close(incoming);
>>> +            logerror("Too many children, dropping connection");
>>> +            return;
>>> +        }
>>> +    }
>>
>> Do we anticipate exercising concurrent requests in our
>> tests? Perhaps it's not worth putting a cap on the
>> connection count so we can keep the test helpers simple.
>
> again, this code was inherited from `daemon.c`, so we could leave it.
>
> [...]
>>> +            mod = xmalloc(sizeof(struct auth_module));
>>> +            mod->scheme = xstrdup(p[0]->buf);
>>> +            mod->challenge_params = p[1] ? xstrdup(p[1]->buf) : NULL;
>>
>> Here, you xstrdup() into a 'const char *', but you are really
>> passing ownership so it shouldn't be conts.
>
> There is a strbuf_detach() that will let you steal the buffer from the
> strbuf if that would help.

Will update in v3 to drop the const.

> [...]
>> This was a lot to read, and the interesting bits are all mixed in
>> with the http server code, which is less interesting to what we
>> are trying to accomplish. It would be beneficial to split this
>> into one or two patches before we actually introduce the tests.
>
> agreed. it is big, but it does make sense.  perhaps doing the
> copy daemon.c commit and then see how this commit diffs from it
> would make it more manageable. (not sure, but worth a try.)
>
> [...]
>>  From what I read, I don't think there is much to change in
>> the end result of the code, but it definitely was hard to read
>> the important things when surrounded by many lines of
>> boilerplate.
>
> agreed. i think the end result is good.
>
> Thanks
> Jeff
>
>

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v2 6/6] t5556-http-auth: add test for HTTP auth hdr logic
  2022-10-28 15:08     ` Derrick Stolee
  2022-10-28 19:14       ` Jeff Hostetler
@ 2022-11-01 23:59       ` Matthew John Cheetham
  1 sibling, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2022-11-01 23:59 UTC (permalink / raw)
  To: Derrick Stolee, Matthew John Cheetham via GitGitGadget, git
  Cc: Lessley Dennington, Matthew John Cheetham

On 2022-10-28 08:08, Derrick Stolee wrote:
> On 10/21/22 1:08 PM, Matthew John Cheetham via GitGitGadget wrote:
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
>> @@ -1500,6 +1500,8 @@ else
>>  	endif
>>  	BASIC_CFLAGS += $(CURL_CFLAGS)
>>  
>> +	TEST_PROGRAMS_NEED_X += test-http-server
>> +
>>  	REMOTE_CURL_PRIMARY = git-remote-http$X
>>  	REMOTE_CURL_ALIASES = git-remote-https$X git-remote-ftp$X git-remote-ftps$X
>>  	REMOTE_CURL_NAMES = $(REMOTE_CURL_PRIMARY) $(REMOTE_CURL_ALIASES)
> 
> This hunk is in the "else" block of "ifdef NO_CURL",
> so this makes sense for why TEST_PROGRAMS_NEED_X is
> augmented here, away from other instances.
> 
>> diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
>> index 787738e6fa3..45251695ce0 100644
>> --- a/contrib/buildsystems/CMakeLists.txt
>> +++ b/contrib/buildsystems/CMakeLists.txt
>> @@ -989,6 +989,19 @@ set(wrapper_scripts
>>  set(wrapper_test_scripts
>>  	test-fake-ssh test-tool)
>>  
>> +if(CURL_FOUND)
>> +       list(APPEND wrapper_test_scripts test-http-server)
>> +
>> +       add_executable(test-http-server ${CMAKE_SOURCE_DIR}/t/helper/test-http-server.c)
>> +       target_link_libraries(test-http-server common-main)
>> +
>> +       if(MSVC)
>> +               set_target_properties(test-http-server
>> +                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
>> +               set_target_properties(test-http-server
>> +                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
>> +       endif()
>> +endif()
> 
> And this file has the pattern of many "if(CURL_FOUND)"
> blocks with isolated purposes, so it makes sense to
> have this be an isolated change instead of grouped with
> a different case.
> 
>> diff --git a/t/helper/.gitignore b/t/helper/.gitignore
>> index 8c2ddcce95f..1a94ab6eed5 100644
>> --- a/t/helper/.gitignore
>> +++ b/t/helper/.gitignore
>> @@ -1,2 +1,3 @@
>>  /test-tool
>>  /test-fake-ssh
>> +test-http-server
> 
> Should this start with a "/" like the other entries?

That it probably should! Will update.

>> diff --git a/t/helper/test-credential-helper-replay.sh b/t/helper/test-credential-helper-replay.sh
>> new file mode 100755
>> index 00000000000..03e5e63dad6
>> --- /dev/null
>> +++ b/t/helper/test-credential-helper-replay.sh
>> @@ -0,0 +1,14 @@
>> +cmd=$1
>> +teefile=$cmd-actual.cred
>> +catfile=$cmd-response.cred
>> +rm -f $teefile
>> +while read line;
>> +do
>> +	if test -z "$line"; then
>> +		break;
>> +	fi
>> +	echo "$line" >> $teefile
>> +done
>> +if test "$cmd" = "get"; then
>> +	cat $catfile
>> +fi
> 
> Should this be a helper method within another script, such
> as t/lib-credential.sh or t/lib-httpd.sh? The read over
> stdin will still work, as in this example:
> 
> read_chunk() {
> 	while read line; do
> 		case "$line" in
> 		--) break ;;
> 		*) echo "$line" ;;
> 		esac
> 	done
> }

This script file is used as a credential helper that is invoked by Git.
We specify that Git should use this credential helper in the tests using
the -c option:

  CREDENTIAL_HELPER="$GIT_BUILD_DIR/t/helper/test-credential-helper-replay.sh" \
	  && export CREDENTIAL_HELPER
..
   git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&


Would extracting a read_chunk() function to one of the lib-* test scripts
be worth it given we already need another entry script anyway?

What other scripts would be calling read_chunk()?


>> diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
> 
>> @@ -0,0 +1,1134 @@
>> +#include "config.h"
>> +#include "run-command.h"
>> +#include "strbuf.h"
>> +#include "string-list.h"
>> +#include "trace2.h"
>> +#include "version.h"
>> +#include "dir.h"
>> +#include "date.h"
>> +
>> +#define TR2_CAT "test-http-server"
>> +
>> +static const char *pid_file;
>> +static int verbose;
>> +static int reuseaddr;
>> +
>> +static const char test_http_auth_usage[] =
>> +"http-server [--verbose]\n"
>> +"           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
>> +"           [--reuseaddr] [--pid-file=<file>]\n"
>> +"           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
>> +"           [--anonymous-allowed]\n"
>> +"           [--auth=<scheme>[:<params>] [--auth-token=<scheme>:<token>]]*\n"
>> +;
> 
> These are a lot of options to implement all at once. They are probably
> simple enough, but depending on the implementation and tests, it might
> be helpful to split this patch into smaller ones that introduce these
> options along with the tests that exercise each. That will help
> verify that they are being tested properly instead of needing to track
> back and forth across the patch for each one.

I plan to split this patch in to several in a v3.

>> +
>> +/* Timeout, and initial timeout */
>> +static unsigned int timeout;
>> +static unsigned int init_timeout;
>> +
>> +static void logreport(const char *label, const char *err, va_list params)
>> +{
>> +	struct strbuf msg = STRBUF_INIT;
>> +
>> +	strbuf_addf(&msg, "[%"PRIuMAX"] %s: ", (uintmax_t)getpid(), label);
>> +	strbuf_vaddf(&msg, err, params);
>> +	strbuf_addch(&msg, '\n');
>> +
>> +	fwrite(msg.buf, sizeof(char), msg.len, stderr);
>> +	fflush(stderr);
>> +
>> +	strbuf_release(&msg);
>> +}
>> +
>> +__attribute__((format (printf, 1, 2)))
>> +static void logerror(const char *err, ...)
>> +{
>> +	va_list params;
>> +	va_start(params, err);
>> +	logreport("error", err, params);
>> +	va_end(params);
>> +}
>> +
>> +__attribute__((format (printf, 1, 2)))
>> +static void loginfo(const char *err, ...)
>> +{
>> +	va_list params;
>> +	if (!verbose)
>> +		return;
>> +	va_start(params, err);
>> +	logreport("info", err, params);
>> +	va_end(params);
>> +}
> 
> I wonder how much of this we need or is just a nice thing. I would
> err on the side of making things as simple as possible, but being
> able to debug this test server may be important based on your
> experience.

These are useful to debug failures. Plus they also come from my copy
from daemon.c, so didn't want to touch/delete too much from that
starting point.

>> +static void set_keep_alive(int sockfd)
>> +{
>> +	int ka = 1;
>> +
>> +	if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &ka, sizeof(ka)) < 0) {
>> +		if (errno != ENOTSOCK)
>> +			logerror("unable to set SO_KEEPALIVE on socket: %s",
>> +				strerror(errno));
>> +	}
>> +}
>> +
>> +//////////////////////////////////////////////////////////////////
>> +// The code in this section is used by "worker" instances to service
>> +// a single connection from a client.  The worker talks to the client
>> +// on 0 and 1.
>> +//////////////////////////////////////////////////////////////////
> 
> Use /* */ style comments. You can repeat the asterisks to get a
> similar visual block.

Yep!

>> +
>> +enum worker_result {
>> +	/*
>> +	 * Operation successful.
>> +	 * Caller *might* keep the socket open and allow keep-alive.
>> +	 */
>> +	WR_OK       = 0,
>> +	/*
>> +	 * Various errors while processing the request and/or the response.
>> +	 * Close the socket and clean up.
>> +	 * Exit child-process with non-zero status.
>> +	 */
>> +	WR_IO_ERROR = 1<<0,
>> +	/*
>> +	 * Close the socket and clean up.  Does not imply an error.
>> +	 */
>> +	WR_HANGUP   = 1<<1,
> 
> nit: add a whitespace line between an item and the next
> item's comment.

Sure

>> +
>> +	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
>> +};
> 
> (I read, but have no comments on the http-server boilerplate.)
> 
>> +
>> +enum auth_result {
>> +	AUTH_UNKNOWN = 0,
>> +	AUTH_DENY = 1,
>> +	AUTH_ALLOW = 2,
>> +};
>> +
>> +struct auth_module {
>> +	const char *scheme;
>> +	const char *challenge_params;
> 
> Later, I notice that you set challenge_params using an
> xstrdup() so this shouldn't be const and you should
> free it in any freeing code.

One question on this suggestion.. where would be appropriate to
free said char*? We need them for the lifetime of the process,
and they never grown in number beyond initial allocation from
parsing command line args.

I could move to stack alloc these in `cmd_main` and instead pass
a pointer to the `auth_modules` and count down through every
serve/handle etc function, rather than rely on them being global?

Thoughts or preferences?

>> +	struct string_list *tokens;
>> +};
>> +
>> +static int allow_anonymous;
>> +static struct auth_module **auth_modules = NULL;
>> +static size_t auth_modules_nr = 0;
>> +static size_t auth_modules_alloc = 0;
> 
> So, we are setting up a number of potential auth modules,
> each of which has a scheme to match a request to the module,
> and a list of tokens that would be considered worthy of the
> AUTH_ALLOW result. Otherwise, if the scheme matches but no
> token matches, we get AUTH_DENY. Finally, if no scheme matches
> we get AUTH_UNKNOWN.
> 
> This concept might be worth a comment here around the data
> structures before we get into how that is implemented.
> 
>> +static struct auth_module *get_auth_module(struct strbuf *scheme)
>> +{
>> +	int i;
>> +	struct auth_module *mod;
>> +	for (i = 0; i < auth_modules_nr; i++) {
>> +		mod = auth_modules[i];
>> +		if (!strcasecmp(mod->scheme, scheme->buf))
>> +			return mod;
>> +	}
>> +
>> +	return NULL;
>> +}
> 
> Matching the input scheme against the list of modules.
> 
> Only complaint: there is no reason that 'scheme' needs t
> be a strbuf, but could be a 'const char *' here.

True.

>> +static void add_auth_module(struct auth_module *mod)
>> +{
>> +	ALLOC_GROW(auth_modules, auth_modules_nr + 1, auth_modules_alloc);
>> +	auth_modules[auth_modules_nr++] = mod;
>> +}
> 
> nit: this could be located earlier, next to the list
> definition, or delayed until it is needed. That would
> allow get_auth_module() to be closer to its first use.

Not sure I follow.. are you saying I should move `add_auth_module`
to earlier in the file?

>> +static int is_authed(struct req *req, const char **user, enum worker_result *wr)
>> +{
>> +	enum auth_result result = AUTH_UNKNOWN;
>> +	struct string_list hdrs = STRING_LIST_INIT_NODUP;
>> +	struct auth_module *mod;
>> +
>> +	struct string_list_item *hdr;
>> +	struct string_list_item *token;
>> +	const char *v;
>> +	struct strbuf **split = NULL;
>> +	int i;
>> +	char *challenge;
>> +
>> +	/* ask all auth modules to validate the request */
>> +	for_each_string_list_item(hdr, &req->header_list) {
>> +		if (skip_iprefix(hdr->string, "Authorization: ", &v)) {
>> +			split = strbuf_split_str(v, ' ', 2);
>> +			if (!split[0] || !split[1]) continue;
> 
> For each valid request header...
> 
>> +			// trim trailing space ' '
>> +			strbuf_setlen(split[0], split[0]->len - 1);
>> +
>> +			mod = get_auth_module(split[0]);
>> +			if (mod) {
> 
> ...get an appropriate module, if it exists...
> 
>> +
>> +				for_each_string_list_item(token, mod->tokens) {
>> +					if (!strcmp(split[1]->buf, token->string)) {
>> +						result = AUTH_ALLOW;
>> +						goto done;
>> +					}
>> +				}
>> +
>> +				if (result != AUTH_UNKNOWN)
>> +					goto done;
> 
> ...and report if we find a valid token.
> 
> Here, it seems I was wrong in my expectation of AUTH_DENY:
> if a matching module exists but no token exists in that
> module, then we keep searching other modules. 

AUTH_DENY denies a request immediately and stops searching other modules.
AUTH_ALLOW approves the request and stops looking at other modules.
AUTH_UNKNOWN means this module didn't match or 'decide' to reject, so keep
looking/asking other modules.

After reading you review, I think it may be better to change this to
more closely match your expectations (and how typical servers behave):

Return AUTH_ALLOW if we find a matching valid token for the module.
If we match a module and do NOT find a token, then return AUTH_DENY.
Otherwise return AUTH_UNKNOWN - this means the user provided some auth
mechanism we don't understand, or no auth at all.

>> +			}
>> +		}
>> +	}
>> +
>> +done:
>> +	switch (result) {
>> +	case AUTH_ALLOW:
>> +		trace2_printf("%s: auth '%s' ALLOW", TR2_CAT, mod->scheme);
>> +		*user = "VALID_TEST_USER";
>> +		*wr = WR_OK;
>> +		break;
>> +
>> +	case AUTH_DENY:
>> +		trace2_printf("%s: auth '%s' DENY", TR2_CAT, mod->scheme);
>> +		/* fall-through */
> 
> I'm not sure that I see a case where this is possible. Maybe
> we should have a 'result = AUTH_DENY' at the start of the
> "if (mod)" block, followed by a 'goto done' in all cases
> instead of "if (result != AUTH_UNKNOWN)"?

In this version, you're correct.. AUTH_DENY is never returned.
This tri-state response from an auth module is an oversight from an earlier
local version - sorry for the confusion here, and thanks for catching!
I will update in a v3 to match sane expectations.

>> +	case AUTH_UNKNOWN:
>> +		if (allow_anonymous)
>> +			break;
> 
> If we do not require auth, then we want to continue if there
> is no matching authentication.
> 
>> +		for (i = 0; i < auth_modules_nr; i++) {
>> +			mod = auth_modules[i];
>> +			if (mod->challenge_params)
>> +				challenge = xstrfmt("WWW-Authenticate: %s %s",
>> +						    mod->scheme,
>> +						    mod->challenge_params);
>> +			else
>> +				challenge = xstrfmt("WWW-Authenticate: %s",
>> +						    mod->scheme);
>> +			string_list_append(&hdrs, challenge);
>> +		}
>> +		*wr = send_http_error(1, 401, "Unauthorized", -1, &hdrs, *wr);
> 
> However, here is the critical piece about how servers will
> start to act with the new WWW-Authenticate header usage in
> the Git credential helper interface. This will be critical
> in the testing for Git to retry the credential helper while
> passing these authentications schemes from the installed
> modules.
> 
>> +	}
>> +
>> +	strbuf_list_free(split);
>> +	string_list_clear(&hdrs, 0);
>> +
>> +	return result == AUTH_ALLOW ||
>> +	      (result == AUTH_UNKNOWN && allow_anonymous);
> 
> Did it work? Or did it not need to work? I'm interested to
> investigate the case that the client sent an authentication
> header that matches a module but doesn't match any tokens,
> but we allow anonymous access, anyway. Is that a 400? Or
> is that a 401?

It should probably be a 401 as the credentials are understood, but
are just 'bad'.

>> +static enum worker_result dispatch(struct req *req)
>> +{
>> +	enum worker_result wr = WR_OK;
>> +	const char *user = NULL;
>> +
>> +	if (!is_authed(req, &user, &wr))
>> +		return wr;
> 
> If we are not authed, send the 401 response.
> 
>> +	if (is_git_request(req))
>> +		return do__git(req, user);
> 
> If we are authed, then pass through to the Git response.
> 
>> +	return send_http_error(1, 501, "Not Implemented", -1, NULL,
>> +			       WR_OK | WR_HANGUP);
> 
> If the Git request fails, we don't care. This is a test.
> Just pass a 500-level error and the client will barf,
> letting us know that something went wrong.

Correct assessment!

>> +static void kill_some_child(void)
> 
>> +static void check_dead_children(void)
> 
> These technically sound methods have unfortunate names.
> Using something like "connection" over "child" might
> alleviate some of the horror. (I initially wanted to
> suggest "subprocess" but you compare live_children to
> max_connections in the next method, so connection seemed
> appropriate.)

These are copied exactly from git-daemon, so I'd rather
avoid the churn in renaming things.

>> +static struct strvec cld_argv = STRVEC_INIT;
>> +static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
>> +{
>> +	struct child_process cld = CHILD_PROCESS_INIT;
>> +
>> +	if (max_connections && live_children >= max_connections) {
>> +		kill_some_child();
>> +		sleep(1);  /* give it some time to die */
>> +		check_dead_children();
>> +		if (live_children >= max_connections) {
>> +			close(incoming);
>> +			logerror("Too many children, dropping connection");
>> +			return;
>> +		}
>> +	}
> 
> Do we anticipate exercising concurrent requests in our
> tests? Perhaps it's not worth putting a cap on the
> connection count so we can keep the test helpers simple.

Probably not, but again.. 100% of the boilerplate here came from
the prior art in daemon.c, so didn't want to touch any of it!
I'm happy to start deleting things however if needed?

>> +	if (addr->sa_family == AF_INET) {
>> +		char buf[128] = "";
>> +		struct sockaddr_in *sin_addr = (void *) addr;
>> +		inet_ntop(addr->sa_family, &sin_addr->sin_addr, buf, sizeof(buf));
>> +		strvec_pushf(&cld.env, "REMOTE_ADDR=%s", buf);
>> +		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
>> +				 ntohs(sin_addr->sin_port));
>> +#ifndef NO_IPV6
>> +	} else if (addr->sa_family == AF_INET6) {
>> +		char buf[128] = "";
>> +		struct sockaddr_in6 *sin6_addr = (void *) addr;
>> +		inet_ntop(AF_INET6, &sin6_addr->sin6_addr, buf, sizeof(buf));
>> +		strvec_pushf(&cld.env, "REMOTE_ADDR=[%s]", buf);
>> +		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
>> +				 ntohs(sin6_addr->sin6_port));
>> +#endif
>> +	}
>> +
>> +	strvec_pushv(&cld.args, cld_argv.v);
>> +	cld.in = incoming;
>> +	cld.out = dup(incoming);
>> +
>> +	if (cld.out < 0)
>> +		logerror("could not dup() `incoming`");
>> +	else if (start_command(&cld))
>> +		logerror("unable to fork");
>> +	else
>> +		add_child(&cld, addr, addrlen);
>> +}
>> +
> 
> I scanned the socket creation code, but my eyes were
> glazing over. I'm definitely in the camp of "if it works,
> that's enough for our tests." If we start to rely on this
> test harness in more places, we can improve any shortcomings
> as they arise.
> 
>> +//////////////////////////////////////////////////////////////////
>> +// This section is executed by both the primary instance and all
>> +// worker instances.  So, yes, each child-process re-parses the
>> +// command line argument and re-discovers how it should behave.
>> +//////////////////////////////////////////////////////////////////
>> +
>> +int cmd_main(int argc, const char **argv)
>> +{
>> +	int listen_port = 0;
>> +	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
>> +	int worker_mode = 0;
>> +	int i;
>> +	struct auth_module *mod = NULL;
>> +
>> +	trace2_cmd_name("test-http-server");
>> +	setup_git_directory_gently(NULL);
>> +
>> +	for (i = 1; i < argc; i++) {
>> +		const char *arg = argv[i];
>> +		const char *v;
>> +
>> +		if (skip_prefix(arg, "--listen=", &v)) {
>> +			string_list_append(&listen_addr, xstrdup_tolower(v));
>> +			continue;
>> +		}
>> +		if (skip_prefix(arg, "--port=", &v)) {
>> +			char *end;
>> +			unsigned long n;
>> +			n = strtoul(v, &end, 0);
>> +			if (*v && !*end) {
>> +				listen_port = n;
>> +				continue;
>> +			}
>> +		}
>> +		if (!strcmp(arg, "--worker")) {
>> +			worker_mode = 1;
>> +			trace2_cmd_mode("worker");
>> +			continue;
>> +		}
>> +		if (!strcmp(arg, "--verbose")) {
>> +			verbose = 1;
>> +			continue;
>> +		}
>> +		if (skip_prefix(arg, "--timeout=", &v)) {
>> +			timeout = atoi(v);
>> +			continue;
>> +		}
>> +		if (skip_prefix(arg, "--init-timeout=", &v)) {
>> +			init_timeout = atoi(v);
>> +			continue;
>> +		}
>> +		if (skip_prefix(arg, "--max-connections=", &v)) {
>> +			max_connections = atoi(v);
>> +			if (max_connections < 0)
>> +				max_connections = 0; /* unlimited */
>> +			continue;
>> +		}
>> +		if (!strcmp(arg, "--reuseaddr")) {
>> +			reuseaddr = 1;
>> +			continue;
>> +		}
>> +		if (skip_prefix(arg, "--pid-file=", &v)) {
>> +			pid_file = v;
>> +			continue;
>> +		}
> 
> ok, most of these arguments are actually about the per-connection
> subprocesses.
> 
>> +		if (skip_prefix(arg, "--allow-anonymous", &v)) {
>> +			allow_anonymous = 1;
>> +			continue;
>> +		}
> 
> Here is how we choose to allo anonymous access.
> 
>> +		if (skip_prefix(arg, "--auth=", &v)) {
>> +			struct strbuf **p = strbuf_split_str(v, ':', 2);
>> +
>> +			if (!p[0]) {
>> +				error("invalid argument '%s'", v);
>> +				usage(test_http_auth_usage);
>> +			}
>> +
>> +			// trim trailing ':'
>> +			if (p[1])
>> +				strbuf_setlen(p[0], p[0]->len - 1);
>> +
>> +			if (get_auth_module(p[0])) {
>> +				error("duplicate auth scheme '%s'\n", p[0]->buf);
>> +				usage(test_http_auth_usage);
>> +			}
>> +
>> +			mod = xmalloc(sizeof(struct auth_module));
>> +			mod->scheme = xstrdup(p[0]->buf);
>> +			mod->challenge_params = p[1] ? xstrdup(p[1]->buf) : NULL;
> 
> Here, you xstrdup() into a 'const char *', but you are really
> passing ownership so it shouldn't be conts.
Ok

> 
>> +			mod->tokens = xmalloc(sizeof(struct string_list));
> 
> nit: this could also be "CALLOC_ARRAY(mod->tokens, 1);"
Sure!
>> +			string_list_init_dup(mod->tokens);
>> +
>> +			add_auth_module(mod);
>> +
>> +			strbuf_list_free(p);
>> +			continue;
> 
> Ok, we gain the auth schemes from the command line.
> 
>> +		}
>> +		if (skip_prefix(arg, "--auth-token=", &v)) {
>> +			struct strbuf **p = strbuf_split_str(v, ':', 2);
>> +			if (!p[0]) {
>> +				error("invalid argument '%s'", v);
>> +				usage(test_http_auth_usage);
>> +			}
>> +
>> +			if (!p[1]) {
>> +				error("missing token value '%s'\n", v);
>> +				usage(test_http_auth_usage);
>> +			}
>> +
>> +			// trim trailing ':'
> 
> Use /* */ (Aside: I'm surprised we don't have a build option in
> DEVELOPER=1 that catches the use of these comments.)
Me too! Appologies here.
>> +			strbuf_setlen(p[0], p[0]->len - 1);
>> +
>> +			mod = get_auth_module(p[0]);
>> +			if (!mod) {
>> +				error("auth scheme not defined '%s'\n", p[0]->buf);
>> +				usage(test_http_auth_usage);
>> +			}
>> +
>> +			string_list_append(mod->tokens, p[1]->buf);
>> +			strbuf_list_free(p);
>> +			continue;
>> +		}
> 
> And the token lists. It is important that the scheme is added
> before any token is added.
> 
>> +		fprintf(stderr, "error: unknown argument '%s'\n", arg);
>> +		usage(test_http_auth_usage);
>> +	}
>> +
>> +	/* avoid splitting a message in the middle */
>> +	setvbuf(stderr, NULL, _IOFBF, 4096);
>> +
>> +	if (listen_port == 0)
>> +		listen_port = DEFAULT_GIT_PORT;
>> +
>> +	/*
>> +	 * If no --listen=<addr> args are given, the setup_named_sock()
>> +	 * code will use receive a NULL address and set INADDR_ANY.
>> +	 * This exposes both internal and external interfaces on the
>> +	 * port.
>> +	 *
>> +	 * Disallow that and default to the internal-use-only loopback
>> +	 * address.
>> +	 */
>> +	if (!listen_addr.nr)
>> +		string_list_append(&listen_addr, "127.0.0.1");
>> +
>> +	/*
>> +	 * worker_mode is set in our own child process instances
>> +	 * (that are bound to a connected socket from a client).
>> +	 */
>> +	if (worker_mode)
>> +		return worker();
>> +
>> +	/*
>> +	 * `cld_argv` is a bit of a clever hack. The top-level instance
>> +	 * of test-http-server does the normal bind/listen/accept stuff.
>> +	 * For each incoming socket, the top-level process spawns
>> +	 * a child instance of test-http-server *WITH* the additional
>> +	 * `--worker` argument. This causes the child to set `worker_mode`
>> +	 * and immediately call `worker()` using the connected socket (and
>> +	 * without the usual need for fork() or threads).
>> +	 *
>> +	 * The magic here is made possible because `cld_argv` is static
>> +	 * and handle() (called by service_loop()) knows about it.
>> +	 */
>> +	strvec_push(&cld_argv, argv[0]);
>> +	strvec_push(&cld_argv, "--worker");
>> +	for (i = 1; i < argc; ++i)
>> +		strvec_push(&cld_argv, argv[i]);
>> +
>> +	/*
>> +	 * Setup primary instance to listen for connections.
>> +	 */
>> +	return serve(&listen_addr, listen_port);
>> +}
> 
> And complete the thing with some boilerplate.
> 
> This was a lot to read, and the interesting bits are all mixed in
> with the http server code, which is less interesting to what we
> are trying to accomplish. It would be beneficial to split this
> into one or two patches before we actually introduce the tests.
> 
> The most important thing that I think would be helpful is to
> isolate all the authentication behavior into its own patch so
> we can see how those connections from the command-line arguments
> affect the behavior of the server responses.
> 
> I think ideally we would have the following split:
> 
>  1. All server boilerblate. All requests 500 not-implemented.
> 
>  2. Add Git fall-through with no authentication. Add the tests
>     that are intended to allow anonymous auth.
> 
>  3. Add authentication data structures read from command-line,
>     but not processed at all in the logic.
> 
>  4. Act on the authentication data structures to alter the
>     requests. Add the tests that use these authentication
>     schemes.
> 
> I could easily see a case for combining 1&2 as well as 3&4,
> for slightly larger but more completely-testable changes at
> every step.
I agree, and my appologies for not splitting these out.
I'll follow up with a split that should make more sense.
> From what I read, I don't think there is much to change in
> the end result of the code, but it definitely was hard to read
> the important things when surrounded by many lines of
> boilerplate.
> 
>> diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
> 
> I'm going to pause here and come back to the test script in
> a separate reply.
> 
> Thanks,
> -Stolee
Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v2 6/6] t5556-http-auth: add test for HTTP auth hdr logic
  2022-11-01 23:14         ` Matthew John Cheetham
@ 2022-11-02 14:38           ` Derrick Stolee
  0 siblings, 0 replies; 171+ messages in thread
From: Derrick Stolee @ 2022-11-02 14:38 UTC (permalink / raw)
  To: Matthew John Cheetham, Jeff Hostetler,
	Matthew John Cheetham via GitGitGadget, git
  Cc: Lessley Dennington, Matthew John Cheetham

On 11/1/22 7:14 PM, Matthew John Cheetham wrote:
> On 2022-10-28 12:14, Jeff Hostetler wrote:
>> On 10/28/22 11:08 AM, Derrick Stolee wrote:

>>>> +static void kill_some_child(void)
>>>
>>>> +static void check_dead_children(void)
>>>
>>> These technically sound methods have unfortunate names.
>>> Using something like "connection" over "child" might
>>> alleviate some of the horror. (I initially wanted to
>>> suggest "subprocess" but you compare live_children to
>>> max_connections in the next method, so connection seemed
>>> appropriate.)
>>
>> These names were inherited from `daemon.c` IIRC. I wouldn't change
>> them since it'll just introduce noise when diffing.  Especially,
>> if we do the copy commit first.
> 
> Indeed. These functions are untouched from daemon.c. I do plan to split
> this mega-patch up however in to a single 'add the boilerplate' based on
> git-daemon patch, then add the extra pieces like HTTP request parsing and
> the auth pieces in a v3.

If these are copied from daemon.c, it may be worth trying
to lib-ify these data structures and code so they can be
shared across the two places. That can also come up as a
cleanup later, too.

For now, don't bother changing the names since they exist
somewhere else.
 
>> [...]
>>>> +static struct strvec cld_argv = STRVEC_INIT;
>>>> +static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
>>>> +{
>>>> +    struct child_process cld = CHILD_PROCESS_INIT;
>>>> +
>>>> +    if (max_connections && live_children >= max_connections) {
>>>> +        kill_some_child();
>>>> +        sleep(1);  /* give it some time to die */
>>>> +        check_dead_children();
>>>> +        if (live_children >= max_connections) {
>>>> +            close(incoming);
>>>> +            logerror("Too many children, dropping connection");
>>>> +            return;
>>>> +        }
>>>> +    }
>>>
>>> Do we anticipate exercising concurrent requests in our
>>> tests? Perhaps it's not worth putting a cap on the
>>> connection count so we can keep the test helpers simple.
>>
>> again, this code was inherited from `daemon.c`, so we could leave it.

I wonder how much could be extracted from daemon.c using a
copy into a 'daemon-lib.c' with methods defined in 'daemon-lib.h'
then consumed from this file instead. Not sure it's worth the
churn to daemon.c, though.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* [PATCH v3 00/11] Enhance credential helper protocol to include auth headers
  2022-10-21 17:07 ` [PATCH v2 0/6] " Matthew John Cheetham via GitGitGadget
                     ` (6 preceding siblings ...)
  2022-10-25  2:26   ` git-credential.txt M Hickford
@ 2022-11-02 22:09   ` Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 01/11] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
                       ` (15 more replies)
  7 siblings, 16 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham

Following from my original RFC submission [0], this submission is considered
ready for full review. This patch series is now based on top of current
master (9c32cfb49c60fa8173b9666db02efe3b45a8522f) that includes my now
separately submitted patches [1] to fix up the other credential helpers'
behaviour.

In this patch series I update the existing credential helper design in order
to allow for some new scenarios, and future evolution of auth methods that
Git hosts may wish to provide. I outline the background, summary of changes
and some challenges below.

Testing these new additions, I introduce a new test helper test-http-server
that acts as a frontend to git-http-backend; a mini HTTP server based
heavily on git-daemon, with simple authentication configurable by command
line args.


Background
==========

Git uses a variety of protocols [2]: local, Smart HTTP, Dumb HTTP, SSH, and
Git. Here I focus on the Smart HTTP protocol, and attempt to enhance the
authentication capabilities of this protocol to address limitations (see
below).

The Smart HTTP protocol in Git supports a few different types of HTTP
authentication - Basic and Digest (RFC 2617) [3], and Negotiate (RFC 2478)
[4]. Git uses a extensible model where credential helpers can provide
credentials for protocols [5]. Several helpers support alternatives such as
OAuth authentication (RFC 6749) [6], but this is typically done as an
extension. For example, a helper might use basic auth and set the password
to an OAuth Bearer access token. Git uses standard input and output to
communicate with credential helpers.

After a HTTP 401 response, Git would call a credential helper with the
following over standard input:

protocol=https
host=example.com


And then a credential helper would return over standard output:

protocol=https
host=example.com
username=bob@id.example.com
password=<BEARER-TOKEN>


Git then the following request to the remote, including the standard HTTP
Authorization header (RFC 7235 Section 4.2) [7]:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: Basic base64(bob@id.example.com:<BEARER-TOKEN>)


Credential helpers are encouraged (see gitcredentials.txt) to return the
minimum information necessary.


Limitations
===========

Because this credential model was built mostly for password based
authentication systems, it's somewhat limited. In particular:

 1. To generate valid credentials, additional information about the request
    (or indeed the requestee and their device) may be required. For example,
    OAuth is based around scopes. A scope, like "git.read", might be
    required to read data from the remote. However, the remote cannot tell
    the credential helper what scope is required for this request.

 2. This system is not fully extensible. Each time a new type of
    authentication (like OAuth Bearer) is invented, Git needs updates before
    credential helpers can take advantage of it (or leverage a new
    capability in libcurl).


Goals
=====

 * As a user with multiple federated cloud identities:
   
   * Reach out to a remote and have my credential helper automatically
     prompt me for the correct identity.
   * Allow credential helpers to differentiate between different authorities
     or authentication/authorization challenge types, even from the same DNS
     hostname (and without needing to use credential.useHttpPath).
   * Leverage existing authentication systems built-in to many operating
     systems and devices to boost security and reduce reliance on passwords.

 * As a Git host and/or cloud identity provider:
   
   * Leverage newest identity standards, enhancements, and threat
     mitigations - all without updating Git.
   * Enforce security policies (like requiring two-factor authentication)
     dynamically.
   * Allow integration with third party standard based identity providers in
     enterprises allowing customers to have a single plane of control for
     critical identities with access to source code.


Design Principles
=================

 * Use the existing infrastructure. Git credential helpers are an
   already-working model.
 * Follow widely-adopted time-proven open standards, avoid net new ideas in
   the authentication space.
 * Minimize knowledge of authentication in Git; maintain modularity and
   extensibility.


Proposed Changes
================

 1. Teach Git to read HTTP response headers, specifically the standard
    WWW-Authenticate (RFC 7235 Section 4.1) headers.

 2. Teach Git to include extra information about HTTP responses that require
    authentication when calling credential helpers. Specifically the
    WWW-Authenticate header information.
    
    Because the extra information forms an ordered list, and the existing
    credential helper I/O format only provides for simple key=value pairs,
    we introduce a new convention for transmitting an ordered list of
    values. Key names that are suffixed with a C-style array syntax should
    have values considered to form an order list, i.e. key[]=value, where
    the order of the key=value pairs in the stream specifies the order.
    
    For the WWW-Authenticate header values we opt to use the key wwwauth[].

 3. Teach Git to specify authentication schemes other than Basic in
    subsequent HTTP requests based on credential helper responses.


Handling the WWW-Authenticate header in detail
==============================================

RFC 6750 [8] envisions that OAuth Bearer resource servers would give
responses that include WWW-Authenticate headers, for example:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


Specifically, a WWW-Authenticate header consists of a scheme and arbitrary
attributes, depending on the scheme. This pattern enables generic OAuth or
OpenID Connect [9] authorities. Note that it is possible to have several
WWW-Authenticate challenges in a response.

First Git attempts to make a request, unauthenticated, which fails with a
401 response and includes WWW-Authenticate header(s).

Next, Git invokes a credential helper which may prompt the user. If the user
approves, a credential helper can generate a token (or any auth challenge
response) to be used for that request.

For example: with a remote that supports bearer tokens from an OpenID
Connect [9] authority, a credential helper can use OpenID Connect's
Discovery [10] and Dynamic Client Registration [11] to register a client and
make a request with the correct permissions to access the remote. In this
manner, a user can be dynamically sent to the right federated identity
provider for a remote without any up-front configuration or manual
processes.

Following from the principle of keeping authentication knowledge in Git to a
minimum, we modify Git to add all WWW-Authenticate values to the credential
helper call.

Git sends over standard input:

protocol=https
host=example.com
wwwauth[]=Bearer realm="login.example", scope="git.readwrite"
wwwauth[]=Basic realm="login.example"


A credential helper that understands the extra wwwauth[n] property can
decide on the "best" or correct authentication scheme, generate credentials
for the request, and interact with the user.

The credential helper would then return over standard output:

protocol=https
host=example.com
path=foo.git
username=bob@identity.example
password=<BEARER-TOKEN>


Note that WWW-Authenticate supports multiple challenges, either in one
header:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"


or in multiple headers:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


These have equivalent meaning (RFC 2616 Section 4.2 [12]). To simplify the
implementation, Git will not merge or split up any of these WWW-Authenticate
headers, and instead pass each header line as one credential helper
property. The credential helper is responsible for splitting, merging, and
otherwise parsing these header values.

An alternative option to sending the header fields individually would be to
merge the header values in to one key=value property, for example:

...
wwwauth=Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"



Future flexibility
==================

By allowing the credential helpers decide the best authentication scheme, we
can allow the remote Git server to both offer new schemes (or remove old
ones) that enlightened credential helpers could take immediate advantage of,
and to use credentials that are much more tightly scoped and bound to the
specific request.

For example imagine a new "FooBar" authentication scheme that is surfaced in
the following response:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: FooBar realm="login.example", algs="ES256 PS256"


With support for arbitrary authentication schemes, Git would call credential
helpers with the following over standard input:

protocol=https
host=example.com
wwwauth[]=FooBar realm="login.example", algs="ES256 PS256", nonce="abc123"


And then an enlightened credential helper would return over standard output:

protocol=https
host=example.com
authtype=FooBar
username=bob@id.example.com
password=<FooBar credential>


Git would be expected to attach this authorization header to the next
request:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: FooBar <FooBar credential>



Should Git not control the set of authentication schemes?
=========================================================

One concern that the reader may have regarding these changes is in allowing
helpers to select the authentication mechanism to use, it may be possible
that a weaker form of authentication is used.

Take for example a Git remote server that responds with the following
authentication schemes:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Negotiate ...
WWW-Authenticate: Basic ...


Today Git (and libcurl) prefer to Negotiate over Basic authentication [13].
If a helper responded with authtype=basic Git would now be using a "less
secure" mechanism.

The reason we still propose the credential helper decide on the
authentication scheme is that Git is not the best placed entity to decide
what type of authentication should be used for a particular request (see
Design Principle 3).

OAuth Bearer tokens are often bundled in Basic Authorization headers [14],
but given that the tokens are/can be short-lived and have a highly scoped
set of permissions, this solution could be argued as being more secure than
something like NTLM [15]. Similarly, the user may wish to be consulted on
selecting a particular user account, or directly selecting an authentication
mechanism for a request that otherwise they would not be able to use.

Also, as new authentication protocols appear Git does not need to be
modified or updated for the user to take advantage of them; the credential
helpers take on the responsibility of learning and selecting the "best"
option.


Why not SSH?
============

There's nothing wrong with SSH. However, Git's Smart HTTP transport is
widely used, often with OAuth Bearer tokens. Git's Smart HTTP transport
sometimes requires less client setup than SSH transport, and works in
environments when SSH ports may be blocked. As long as Git supports HTTP
transport, it should support common and popular HTTP authentication methods.


References
==========

 * [0] [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth
   headers
   https://lore.kernel.org/git/pull.1352.git.1663097156.gitgitgadget@gmail.com/

 * [1] [PATCH 0/3] Correct credential helper discrepancies handling input
   https://lore.kernel.org/git/pull.1363.git.1663865974.gitgitgadget@gmail.com/

 * [2] Git on the Server - The Protocols
   https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols

 * [3] HTTP Authentication: Basic and Digest Access Authentication
   https://datatracker.ietf.org/doc/html/rfc2617

 * [4] The Simple and Protected GSS-API Negotiation Mechanism
   https://datatracker.ietf.org/doc/html/rfc2478

 * [5] Git Credentials - Custom Helpers
   https://git-scm.com/docs/gitcredentials#_custom_helpers

 * [6] The OAuth 2.0 Authorization Framework
   https://datatracker.ietf.org/doc/html/rfc6749

 * [7] Hypertext Transfer Protocol (HTTP/1.1): Authentication
   https://datatracker.ietf.org/doc/html/rfc7235

 * [8] The OAuth 2.0 Authorization Framework: Bearer Token Usage
   https://datatracker.ietf.org/doc/html/rfc6750

 * [9] OpenID Connect Core 1.0
   https://openid.net/specs/openid-connect-core-1_0.html

 * [10] OpenID Connect Discovery 1.0
   https://openid.net/specs/openid-connect-discovery-1_0.html

 * [11] OpenID Connect Dynamic Client Registration 1.0
   https://openid.net/specs/openid-connect-registration-1_0.html

 * [12] Hypertext Transfer Protocol (HTTP/1.1)
   https://datatracker.ietf.org/doc/html/rfc2616

 * [13] libcurl http.c pickoneauth Function
   https://github.com/curl/curl/blob/c495dcd02e885fc3f35164b1c3c5f72fa4b60c46/lib/http.c#L381-L416

 * [14] Git Credential Manager GitHub Host Provider (using PAT as password)
   https://github.com/GitCredentialManager/git-credential-manager/blob/f77b766f6875b90251249f2aa1702b921309cf00/src/shared/GitHub/GitHubHostProvider.cs#L157

 * [15] NT LAN Manager (NTLM) Authentication Protocol
   https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-nlmp/b38c36ed-2804-4868-a9ff-8dd3182128e4


Updates from RFC
================

 * Submitted first three patches as separate submission:
   https://lore.kernel.org/git/pull.1363.git.1663865974.gitgitgadget@gmail.com/

 * Various style fixes and updates to- and addition of comments.

 * Drop the explicit integer index in new 'array' style credential helper
   attrbiutes ("key[n]=value" becomes just "key[]=value").

 * Added test helper; a mini HTTP server, and several tests.


Updates in v3
=============

 * Split final patch that added the test-http-server in to several, easier
   to review patches.

 * Updated wording in git-credential.txt to clarify which side of the
   credential helper protocol is sending/receiving the new wwwauth and
   authtype attributes.

Matthew John Cheetham (11):
  http: read HTTP WWW-Authenticate response headers
  credential: add WWW-Authenticate header to cred requests
  http: store all request headers on active_request_slot
  http: move proactive auth to first slot creation
  http: set specific auth scheme depending on credential
  test-http-server: add stub HTTP server test helper
  test-http-server: add HTTP error response function
  test-http-server: add HTTP request parsing
  test-http-server: pass Git requests to http-backend
  test-http-server: add simple authentication
  t5556: add HTTP authentication tests

 Documentation/git-credential.txt          |   29 +-
 Makefile                                  |    2 +
 contrib/buildsystems/CMakeLists.txt       |   13 +
 credential.c                              |   18 +
 credential.h                              |   16 +
 git-curl-compat.h                         |   10 +
 http-push.c                               |  103 +-
 http-walker.c                             |    2 +-
 http.c                                    |  200 +++-
 http.h                                    |    4 +-
 remote-curl.c                             |   36 +-
 t/helper/.gitignore                       |    1 +
 t/helper/test-credential-helper-replay.sh |   14 +
 t/helper/test-http-server.c               | 1146 +++++++++++++++++++++
 t/t5556-http-auth.sh                      |  260 +++++
 15 files changed, 1717 insertions(+), 137 deletions(-)
 create mode 100755 t/helper/test-credential-helper-replay.sh
 create mode 100644 t/helper/test-http-server.c
 create mode 100755 t/t5556-http-auth.sh


base-commit: 9c32cfb49c60fa8173b9666db02efe3b45a8522f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1352%2Fmjcheetham%2Femu-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1352/mjcheetham/emu-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1352

Range-diff vs v2:

  1:  f297c78f60a =  1:  f297c78f60a http: read HTTP WWW-Authenticate response headers
  2:  0838d992744 !  2:  e45e23406a5 credential: add WWW-Authenticate header to cred requests
     @@ Commit message
          C-style array syntax is used in the property name to denote multiple
          ordered values for the same property.
      
     -    In this case we send multiple `wwwauth[n]` properties where `n` is a
     -    zero-indexed number, reflecting the order the WWW-Authenticate headers
     -    appeared in the HTTP response.
     +    In this case we send multiple `wwwauth[]` properties where the order
     +    that the repeated attributes appear in the conversation reflects the
     +    order that the WWW-Authenticate headers appeared in the HTTP response.
      
          [1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47
      
          Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
      
       ## Documentation/git-credential.txt ##
     +@@ Documentation/git-credential.txt: separated by an `=` (equals) sign, followed by a newline.
     + The key may contain any bytes except `=`, newline, or NUL. The value may
     + contain any bytes except newline or NUL.
     + 
     +-In both cases, all bytes are treated as-is (i.e., there is no quoting,
     ++Attributes with keys that end with C-style array brackets `[]` can have
     ++multiple values. Each instance of a multi-valued attribute forms an
     ++ordered list of values - the order of the repeated attributes defines
     ++the order of the values. An empty multi-valued attribute (`key[]=\n`)
     ++acts to clear any previous entries and reset the list.
     ++
     ++In all cases, all bytes are treated as-is (i.e., there is no quoting,
     + and one cannot transmit a value with newline or NUL in it). The list of
     + attributes is terminated by a blank line or end-of-file.
     + 
      @@ Documentation/git-credential.txt: empty string.
       Components which are missing from the URL (e.g., there is no
       username in the example above) will be left unset.
       
      +`wwwauth[]`::
      +
     -+	When an HTTP response is received that includes one or more
     -+	'WWW-Authenticate' authentication headers, these can be passed to Git
     -+	(and subsequent credential helpers) with these attributes.
     -+	Each 'WWW-Authenticate' header value should be passed as a separate
     -+	attribute 'wwwauth[]' where the order of the attributes is the same
     -+	as they appear in the HTTP response.
     ++	When an HTTP response is received by Git that includes one or more
     ++	'WWW-Authenticate' authentication headers, these will be passed by Git
     ++	to credential helpers.
     ++	Each 'WWW-Authenticate' header value is passed as a multi-valued
     ++	attribute 'wwwauth[]', where the order of the attributes is the same as
     ++	they appear in the HTTP response.
      +
       GIT
       ---
  3:  c62fef65f46 =  3:  65ac638b8a0 http: store all request headers on active_request_slot
  4:  a790c01f9f2 =  4:  4d75ca29cc5 http: move proactive auth to first slot creation
  5:  b0b7cd7ee5e !  5:  2f38427aa8d http: set specific auth scheme depending on credential
     @@ Commit message
      
       ## Documentation/git-credential.txt ##
      @@ Documentation/git-credential.txt: username in the example above) will be left unset.
     - 	attribute 'wwwauth[]' where the order of the attributes is the same
     - 	as they appear in the HTTP response.
     + 	attribute 'wwwauth[]', where the order of the attributes is the same as
     + 	they appear in the HTTP response.
       
      +`authtype`::
      +
     -+	Indicates the type of authentication scheme used. If this is not
     -+	present the default is "Basic".
     ++	Indicates the type of authentication scheme that should be used by Git.
     ++	Credential helpers may reply to a request from Git with this attribute,
     ++	such that subsequent authenticated requests include the correct
     ++	`Authorization` header.
     ++	If this attribute is not present, the default value is "Basic".
      +	Known values include "Basic", "Digest", and "Bearer".
      +	If an unknown value is provided, this is taken as the authentication
      +	scheme for the `Authorization` header, and the `password` field is
  6:  f3f13ed8c82 !  6:  4947e81546a t5556-http-auth: add test for HTTP auth hdr logic
     @@ Metadata
      Author: Matthew John Cheetham <mjcheetham@outlook.com>
      
       ## Commit message ##
     -    t5556-http-auth: add test for HTTP auth hdr logic
     +    test-http-server: add stub HTTP server test helper
      
     -    Add a series of tests to exercise the HTTP authentication header parsing
     -    and the interop with credential helpers. Credential helpers can respond
     -    to requests that contain WWW-Authenticate information with the ability
     -    to select the response Authenticate header scheme.
     +    Introduce a mini HTTP server helper that in the future will be enhanced
     +    to provide a frontend for the git-http-backend, with support for
     +    arbitrary authentication schemes.
      
     -    Introduce a mini HTTP server helper that provides a frontend for the
     -    git-http-backend, with support for arbitrary authentication schemes.
     -    The test-http-server is based heavily on the git-daemon, and forwards
     -    all successfully authenticated requests to the http-backend.
     +    Right now, test-http-server is a pared-down copy of the git-daemon that
     +    always returns a 501 Not Implemented response to all callers.
      
          Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
      
     @@ t/helper/.gitignore
      @@
       /test-tool
       /test-fake-ssh
     -+test-http-server
     -
     - ## t/helper/test-credential-helper-replay.sh (new) ##
     -@@
     -+cmd=$1
     -+teefile=$cmd-actual.cred
     -+catfile=$cmd-response.cred
     -+rm -f $teefile
     -+while read line;
     -+do
     -+	if test -z "$line"; then
     -+		break;
     -+	fi
     -+	echo "$line" >> $teefile
     -+done
     -+if test "$cmd" = "get"; then
     -+	cat $catfile
     -+fi
     ++/test-http-server
      
       ## t/helper/test-http-server.c (new) ##
      @@
     @@ t/helper/test-http-server.c (new)
      +"           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
      +"           [--reuseaddr] [--pid-file=<file>]\n"
      +"           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
     -+"           [--anonymous-allowed]\n"
     -+"           [--auth=<scheme>[:<params>] [--auth-token=<scheme>:<token>]]*\n"
      +;
      +
      +/* Timeout, and initial timeout */
     @@ t/helper/test-http-server.c (new)
      +	}
      +}
      +
     -+//////////////////////////////////////////////////////////////////
     -+// The code in this section is used by "worker" instances to service
     -+// a single connection from a client.  The worker talks to the client
     -+// on 0 and 1.
     -+//////////////////////////////////////////////////////////////////
     ++/*
     ++ * The code in this section is used by "worker" instances to service
     ++ * a single connection from a client.  The worker talks to the client
     ++ * on 0 and 1.
     ++ */
      +
      +enum worker_result {
      +	/*
     @@ t/helper/test-http-server.c (new)
      +	 * Caller *might* keep the socket open and allow keep-alive.
      +	 */
      +	WR_OK       = 0,
     ++
      +	/*
      +	 * Various errors while processing the request and/or the response.
      +	 * Close the socket and clean up.
      +	 * Exit child-process with non-zero status.
      +	 */
      +	WR_IO_ERROR = 1<<0,
     ++
      +	/*
      +	 * Close the socket and clean up.  Does not imply an error.
      +	 */
     @@ t/helper/test-http-server.c (new)
      +	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
      +};
      +
     -+/*
     -+ * Fields from a parsed HTTP request.
     -+ */
     -+struct req {
     -+	struct strbuf start_line;
     -+
     -+	const char *method;
     -+	const char *http_version;
     -+
     -+	struct strbuf uri_path;
     -+	struct strbuf query_args;
     -+
     -+	struct string_list header_list;
     -+	const char *content_type;
     -+	ssize_t content_length;
     -+};
     -+
     -+#define REQ__INIT { \
     -+	.start_line = STRBUF_INIT, \
     -+	.uri_path = STRBUF_INIT, \
     -+	.query_args = STRBUF_INIT, \
     -+	.header_list = STRING_LIST_INIT_NODUP, \
     -+	.content_type = NULL, \
     -+	.content_length = -1 \
     -+	}
     -+
     -+static void req__release(struct req *req)
     -+{
     -+	strbuf_release(&req->start_line);
     -+
     -+	strbuf_release(&req->uri_path);
     -+	strbuf_release(&req->query_args);
     -+
     -+	string_list_clear(&req->header_list, 0);
     -+}
     -+
     -+static enum worker_result send_http_error(
     -+	int fd,
     -+	int http_code, const char *http_code_name,
     -+	int retry_after_seconds, struct string_list *response_headers,
     -+	enum worker_result wr_in)
     -+{
     -+	struct strbuf response_header = STRBUF_INIT;
     -+	struct strbuf response_content = STRBUF_INIT;
     -+	struct string_list_item *h;
     -+	enum worker_result wr;
     -+
     -+	strbuf_addf(&response_content, "Error: %d %s\r\n",
     -+		    http_code, http_code_name);
     -+	if (retry_after_seconds > 0)
     -+		strbuf_addf(&response_content, "Retry-After: %d\r\n",
     -+			    retry_after_seconds);
     -+
     -+	strbuf_addf  (&response_header, "HTTP/1.1 %d %s\r\n", http_code, http_code_name);
     -+	strbuf_addstr(&response_header, "Cache-Control: private\r\n");
     -+	strbuf_addstr(&response_header,	"Content-Type: text/plain\r\n");
     -+	strbuf_addf  (&response_header,	"Content-Length: %d\r\n", (int)response_content.len);
     -+	if (retry_after_seconds > 0)
     -+		strbuf_addf  (&response_header, "Retry-After: %d\r\n", retry_after_seconds);
     -+	strbuf_addf(  &response_header,	"Server: test-http-server/%s\r\n", git_version_string);
     -+	strbuf_addf(  &response_header, "Date: %s\r\n", show_date(time(NULL), 0, DATE_MODE(RFC2822)));
     -+	if (response_headers)
     -+		for_each_string_list_item(h, response_headers)
     -+			strbuf_addf(&response_header, "%s\r\n", h->string);
     -+	strbuf_addstr(&response_header, "\r\n");
     -+
     -+	if (write_in_full(fd, response_header.buf, response_header.len) < 0) {
     -+		logerror("unable to write response header");
     -+		wr = WR_IO_ERROR;
     -+		goto done;
     -+	}
     -+
     -+	if (write_in_full(fd, response_content.buf, response_content.len) < 0) {
     -+		logerror("unable to write response content body");
     -+		wr = WR_IO_ERROR;
     -+		goto done;
     -+	}
     -+
     -+	wr = wr_in;
     -+
     -+done:
     -+	strbuf_release(&response_header);
     -+	strbuf_release(&response_content);
     -+
     -+	return wr;
     -+}
     -+
     -+/*
     -+ * Read the HTTP request up to the start of the optional message-body.
     -+ * We do this byte-by-byte because we have keep-alive turned on and
     -+ * cannot rely on an EOF.
     -+ *
     -+ * https://tools.ietf.org/html/rfc7230
     -+ *
     -+ * We cannot call die() here because our caller needs to properly
     -+ * respond to the client and/or close the socket before this
     -+ * child exits so that the client doesn't get a connection reset
     -+ * by peer error.
     -+ */
     -+static enum worker_result req__read(struct req *req, int fd)
     -+{
     -+	struct strbuf h = STRBUF_INIT;
     -+	struct string_list start_line_fields = STRING_LIST_INIT_DUP;
     -+	int nr_start_line_fields;
     -+	const char *uri_target;
     -+	const char *query;
     -+	char *hp;
     -+	const char *hv;
     -+
     -+	enum worker_result result = WR_OK;
     -+
     -+	/*
     -+	 * Read line 0 of the request and split it into component parts:
     -+	 *
     -+	 *    <method> SP <uri-target> SP <HTTP-version> CRLF
     -+	 *
     -+	 */
     -+	if (strbuf_getwholeline_fd(&req->start_line, fd, '\n') == EOF) {
     -+		result = WR_OK | WR_HANGUP;
     -+		goto done;
     -+	}
     -+
     -+	strbuf_trim_trailing_newline(&req->start_line);
     -+
     -+	nr_start_line_fields = string_list_split(&start_line_fields,
     -+						 req->start_line.buf,
     -+						 ' ', -1);
     -+	if (nr_start_line_fields != 3) {
     -+		logerror("could not parse request start-line '%s'",
     -+			 req->start_line.buf);
     -+		result = WR_IO_ERROR;
     -+		goto done;
     -+	}
     -+
     -+	req->method = xstrdup(start_line_fields.items[0].string);
     -+	req->http_version = xstrdup(start_line_fields.items[2].string);
     -+
     -+	uri_target = start_line_fields.items[1].string;
     -+
     -+	if (strcmp(req->http_version, "HTTP/1.1")) {
     -+		logerror("unsupported version '%s' (expecting HTTP/1.1)",
     -+			 req->http_version);
     -+		result = WR_IO_ERROR;
     -+		goto done;
     -+	}
     -+
     -+	query = strchr(uri_target, '?');
     -+
     -+	if (query) {
     -+		strbuf_add(&req->uri_path, uri_target, (query - uri_target));
     -+		strbuf_trim_trailing_dir_sep(&req->uri_path);
     -+		strbuf_addstr(&req->query_args, query + 1);
     -+	} else {
     -+		strbuf_addstr(&req->uri_path, uri_target);
     -+		strbuf_trim_trailing_dir_sep(&req->uri_path);
     -+	}
     -+
     -+	/*
     -+	 * Read the set of HTTP headers into a string-list.
     -+	 */
     -+	while (1) {
     -+		if (strbuf_getwholeline_fd(&h, fd, '\n') == EOF)
     -+			goto done;
     -+		strbuf_trim_trailing_newline(&h);
     -+
     -+		if (!h.len)
     -+			goto done; /* a blank line ends the header */
     -+
     -+		hp = strbuf_detach(&h, NULL);
     -+		string_list_append(&req->header_list, hp);
     -+
     -+		/* store common request headers separately */
     -+		if (skip_prefix(hp, "Content-Type: ", &hv)) {
     -+			req->content_type = hv;
     -+		} else if (skip_prefix(hp, "Content-Length: ", &hv)) {
     -+			req->content_length = strtol(hv, &hp, 10);
     -+		}
     -+	}
     -+
     -+	/*
     -+	 * We do not attempt to read the <message-body>, if it exists.
     -+	 * We let our caller read/chunk it in as appropriate.
     -+	 */
     -+
     -+done:
     -+	string_list_clear(&start_line_fields, 0);
     -+
     -+	/*
     -+	 * This is useful for debugging the request, but very noisy.
     -+	 */
     -+	if (trace2_is_enabled()) {
     -+		struct string_list_item *item;
     -+		trace2_printf("%s: %s", TR2_CAT, req->start_line.buf);
     -+		trace2_printf("%s: hver: %s", TR2_CAT, req->http_version);
     -+		trace2_printf("%s: hmth: %s", TR2_CAT, req->method);
     -+		trace2_printf("%s: path: %s", TR2_CAT, req->uri_path.buf);
     -+		trace2_printf("%s: qury: %s", TR2_CAT, req->query_args.buf);
     -+		if (req->content_length >= 0)
     -+			trace2_printf("%s: clen: %d", TR2_CAT, req->content_length);
     -+		if (req->content_type)
     -+			trace2_printf("%s: ctyp: %s", TR2_CAT, req->content_type);
     -+		for_each_string_list_item(item, &req->header_list)
     -+			trace2_printf("%s: hdrs: %s", TR2_CAT, item->string);
     -+	}
     -+
     -+	return result;
     -+}
     -+
     -+static int is_git_request(struct req *req)
     -+{
     -+	static regex_t *smart_http_regex;
     -+	static int initialized;
     -+
     -+	if (!initialized) {
     -+		smart_http_regex = xmalloc(sizeof(*smart_http_regex));
     -+		if (regcomp(smart_http_regex, "^/(HEAD|info/refs|"
     -+			    "objects/info/[^/]+|git-(upload|receive)-pack)$",
     -+			    REG_EXTENDED)) {
     -+			warning("could not compile smart HTTP regex");
     -+			smart_http_regex = NULL;
     -+		}
     -+		initialized = 1;
     -+	}
     -+
     -+	return smart_http_regex &&
     -+		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
     -+}
     -+
     -+static enum worker_result do__git(struct req *req, const char *user)
     -+{
     -+	const char *ok = "HTTP/1.1 200 OK\r\n";
     -+	struct child_process cp = CHILD_PROCESS_INIT;
     -+	int res;
     -+
     -+	if (write(1, ok, strlen(ok)) < 0)
     -+		return error(_("could not send '%s'"), ok);
     -+
     -+	if (user)
     -+		strvec_pushf(&cp.env, "REMOTE_USER=%s", user);
     -+
     -+	strvec_pushf(&cp.env, "REQUEST_METHOD=%s", req->method);
     -+	strvec_pushf(&cp.env, "PATH_TRANSLATED=%s",
     -+			req->uri_path.buf);
     -+	strvec_push(&cp.env, "SERVER_PROTOCOL=HTTP/1.1");
     -+	if (req->query_args.len)
     -+		strvec_pushf(&cp.env, "QUERY_STRING=%s",
     -+				req->query_args.buf);
     -+	if (req->content_type)
     -+		strvec_pushf(&cp.env, "CONTENT_TYPE=%s",
     -+				req->content_type);
     -+	if (req->content_length >= 0)
     -+		strvec_pushf(&cp.env, "CONTENT_LENGTH=%" PRIdMAX,
     -+				(intmax_t)req->content_length);
     -+	cp.git_cmd = 1;
     -+	strvec_push(&cp.args, "http-backend");
     -+	res = run_command(&cp);
     -+	close(1);
     -+	close(0);
     -+	return !!res;
     -+}
     -+
     -+enum auth_result {
     -+	AUTH_UNKNOWN = 0,
     -+	AUTH_DENY = 1,
     -+	AUTH_ALLOW = 2,
     -+};
     -+
     -+struct auth_module {
     -+	const char *scheme;
     -+	const char *challenge_params;
     -+	struct string_list *tokens;
     -+};
     -+
     -+static int allow_anonymous;
     -+static struct auth_module **auth_modules = NULL;
     -+static size_t auth_modules_nr = 0;
     -+static size_t auth_modules_alloc = 0;
     -+
     -+static struct auth_module *get_auth_module(struct strbuf *scheme)
     -+{
     -+	int i;
     -+	struct auth_module *mod;
     -+	for (i = 0; i < auth_modules_nr; i++) {
     -+		mod = auth_modules[i];
     -+		if (!strcasecmp(mod->scheme, scheme->buf))
     -+			return mod;
     -+	}
     -+
     -+	return NULL;
     -+}
     -+
     -+static void add_auth_module(struct auth_module *mod)
     -+{
     -+	ALLOC_GROW(auth_modules, auth_modules_nr + 1, auth_modules_alloc);
     -+	auth_modules[auth_modules_nr++] = mod;
     -+}
     -+
     -+static int is_authed(struct req *req, const char **user, enum worker_result *wr)
     -+{
     -+	enum auth_result result = AUTH_UNKNOWN;
     -+	struct string_list hdrs = STRING_LIST_INIT_NODUP;
     -+	struct auth_module *mod;
     -+
     -+	struct string_list_item *hdr;
     -+	struct string_list_item *token;
     -+	const char *v;
     -+	struct strbuf **split = NULL;
     -+	int i;
     -+	char *challenge;
     -+
     -+	/* ask all auth modules to validate the request */
     -+	for_each_string_list_item(hdr, &req->header_list) {
     -+		if (skip_iprefix(hdr->string, "Authorization: ", &v)) {
     -+			split = strbuf_split_str(v, ' ', 2);
     -+			if (!split[0] || !split[1]) continue;
     -+
     -+			// trim trailing space ' '
     -+			strbuf_setlen(split[0], split[0]->len - 1);
     -+
     -+			mod = get_auth_module(split[0]);
     -+			if (mod) {
     -+
     -+				for_each_string_list_item(token, mod->tokens) {
     -+					if (!strcmp(split[1]->buf, token->string)) {
     -+						result = AUTH_ALLOW;
     -+						goto done;
     -+					}
     -+				}
     -+
     -+				if (result != AUTH_UNKNOWN)
     -+					goto done;
     -+			}
     -+		}
     -+	}
     -+
     -+done:
     -+	switch (result) {
     -+	case AUTH_ALLOW:
     -+		trace2_printf("%s: auth '%s' ALLOW", TR2_CAT, mod->scheme);
     -+		*user = "VALID_TEST_USER";
     -+		*wr = WR_OK;
     -+		break;
     -+
     -+	case AUTH_DENY:
     -+		trace2_printf("%s: auth '%s' DENY", TR2_CAT, mod->scheme);
     -+		/* fall-through */
     -+
     -+	case AUTH_UNKNOWN:
     -+		if (allow_anonymous)
     -+			break;
     -+		for (i = 0; i < auth_modules_nr; i++) {
     -+			mod = auth_modules[i];
     -+			if (mod->challenge_params)
     -+				challenge = xstrfmt("WWW-Authenticate: %s %s",
     -+						    mod->scheme,
     -+						    mod->challenge_params);
     -+			else
     -+				challenge = xstrfmt("WWW-Authenticate: %s",
     -+						    mod->scheme);
     -+			string_list_append(&hdrs, challenge);
     -+		}
     -+		*wr = send_http_error(1, 401, "Unauthorized", -1, &hdrs, *wr);
     -+	}
     -+
     -+	strbuf_list_free(split);
     -+	string_list_clear(&hdrs, 0);
     -+
     -+	return result == AUTH_ALLOW ||
     -+	      (result == AUTH_UNKNOWN && allow_anonymous);
     -+}
     -+
     -+static enum worker_result dispatch(struct req *req)
     -+{
     -+	enum worker_result wr = WR_OK;
     -+	const char *user = NULL;
     -+
     -+	if (!is_authed(req, &user, &wr))
     -+		return wr;
     -+
     -+	if (is_git_request(req))
     -+		return do__git(req, user);
     -+
     -+	return send_http_error(1, 501, "Not Implemented", -1, NULL,
     -+			       WR_OK | WR_HANGUP);
     -+}
     -+
      +static enum worker_result worker(void)
      +{
     -+	struct req req = REQ__INIT;
     ++	const char *response = "HTTP/1.1 501 Not Implemented\r\n";
      +	char *client_addr = getenv("REMOTE_ADDR");
      +	char *client_port = getenv("REMOTE_PORT");
      +	enum worker_result wr = WR_OK;
     @@ t/helper/test-http-server.c (new)
      +	set_keep_alive(0);
      +
      +	while (1) {
     -+		req__release(&req);
     -+
     -+		alarm(init_timeout ? init_timeout : timeout);
     -+		wr = req__read(&req, 0);
     -+		alarm(0);
     -+
     -+		if (wr & WR_STOP_THE_MUSIC)
     -+			break;
     ++		if (write_in_full(1, response, strlen(response)) < 0) {
     ++			logerror("unable to write response");
     ++			wr = WR_IO_ERROR;
     ++		}
      +
     -+		wr = dispatch(&req);
      +		if (wr & WR_STOP_THE_MUSIC)
      +			break;
      +	}
     @@ t/helper/test-http-server.c (new)
      +	return !!(wr & WR_IO_ERROR);
      +}
      +
     -+//////////////////////////////////////////////////////////////////
     -+// This section contains the listener and child-process management
     -+// code used by the primary instance to accept incoming connections
     -+// and dispatch them to async child process "worker" instances.
     -+//////////////////////////////////////////////////////////////////
     ++/*
     ++ * This section contains the listener and child-process management
     ++ * code used by the primary instance to accept incoming connections
     ++ * and dispatch them to async child process "worker" instances.
     ++ */
      +
      +static int addrcmp(const struct sockaddr_storage *s1,
      +		   const struct sockaddr_storage *s2)
     @@ t/helper/test-http-server.c (new)
      +
      +	set_keep_alive(sockfd);
      +
     -+	if ( bind(sockfd, (struct sockaddr *)&sin, sizeof sin) < 0 ) {
     ++	if (bind(sockfd, (struct sockaddr *)&sin, sizeof sin) < 0) {
      +		logerror("Could not bind to %s: %s",
      +			 ip2str(AF_INET, (struct sockaddr *)&sin, sizeof(sin)),
      +			 strerror(errno));
     @@ t/helper/test-http-server.c (new)
      +	return service_loop(&socklist);
      +}
      +
     -+//////////////////////////////////////////////////////////////////
     -+// This section is executed by both the primary instance and all
     -+// worker instances.  So, yes, each child-process re-parses the
     -+// command line argument and re-discovers how it should behave.
     -+//////////////////////////////////////////////////////////////////
     ++/*
     ++ * This section is executed by both the primary instance and all
     ++ * worker instances.  So, yes, each child-process re-parses the
     ++ * command line argument and re-discovers how it should behave.
     ++ */
      +
      +int cmd_main(int argc, const char **argv)
      +{
     @@ t/helper/test-http-server.c (new)
      +	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
      +	int worker_mode = 0;
      +	int i;
     -+	struct auth_module *mod = NULL;
      +
      +	trace2_cmd_name("test-http-server");
      +	setup_git_directory_gently(NULL);
     @@ t/helper/test-http-server.c (new)
      +			pid_file = v;
      +			continue;
      +		}
     -+		if (skip_prefix(arg, "--allow-anonymous", &v)) {
     -+			allow_anonymous = 1;
     -+			continue;
     -+		}
     -+		if (skip_prefix(arg, "--auth=", &v)) {
     -+			struct strbuf **p = strbuf_split_str(v, ':', 2);
     -+
     -+			if (!p[0]) {
     -+				error("invalid argument '%s'", v);
     -+				usage(test_http_auth_usage);
     -+			}
     -+
     -+			// trim trailing ':'
     -+			if (p[1])
     -+				strbuf_setlen(p[0], p[0]->len - 1);
     -+
     -+			if (get_auth_module(p[0])) {
     -+				error("duplicate auth scheme '%s'\n", p[0]->buf);
     -+				usage(test_http_auth_usage);
     -+			}
     -+
     -+			mod = xmalloc(sizeof(struct auth_module));
     -+			mod->scheme = xstrdup(p[0]->buf);
     -+			mod->challenge_params = p[1] ? xstrdup(p[1]->buf) : NULL;
     -+			mod->tokens = xmalloc(sizeof(struct string_list));
     -+			string_list_init_dup(mod->tokens);
     -+
     -+			add_auth_module(mod);
     -+
     -+			strbuf_list_free(p);
     -+			continue;
     -+		}
     -+		if (skip_prefix(arg, "--auth-token=", &v)) {
     -+			struct strbuf **p = strbuf_split_str(v, ':', 2);
     -+			if (!p[0]) {
     -+				error("invalid argument '%s'", v);
     -+				usage(test_http_auth_usage);
     -+			}
     -+
     -+			if (!p[1]) {
     -+				error("missing token value '%s'\n", v);
     -+				usage(test_http_auth_usage);
     -+			}
     -+
     -+			// trim trailing ':'
     -+			strbuf_setlen(p[0], p[0]->len - 1);
     -+
     -+			mod = get_auth_module(p[0]);
     -+			if (!mod) {
     -+				error("auth scheme not defined '%s'\n", p[0]->buf);
     -+				usage(test_http_auth_usage);
     -+			}
     -+
     -+			string_list_append(mod->tokens, p[1]->buf);
     -+			strbuf_list_free(p);
     -+			continue;
     -+		}
      +
      +		fprintf(stderr, "error: unknown argument '%s'\n", arg);
      +		usage(test_http_auth_usage);
     @@ t/helper/test-http-server.c (new)
      +	 */
      +	return serve(&listen_addr, listen_port);
      +}
     -
     - ## t/t5556-http-auth.sh (new) ##
     -@@
     -+#!/bin/sh
     -+
     -+test_description='test http auth header and credential helper interop'
     -+
     -+. ./test-lib.sh
     -+
     -+test_set_port GIT_TEST_HTTP_PROTOCOL_PORT
     -+
     -+# Setup a repository
     -+#
     -+REPO_DIR="$(pwd)"/repo
     -+
     -+# Setup some lookback URLs where test-http-server will be listening.
     -+# We will spawn it directly inside the repo directory, so we avoid
     -+# any need to configure directory mappings etc - we only serve this
     -+# repository from the root '/' of the server.
     -+#
     -+HOST_PORT=127.0.0.1:$GIT_TEST_HTTP_PROTOCOL_PORT
     -+ORIGIN_URL=http://$HOST_PORT/
     -+
     -+# The pid-file is created by test-http-server when it starts.
     -+# The server will shutdown if/when we delete it (this is easier than
     -+# killing it by PID).
     -+#
     -+PID_FILE="$(pwd)"/pid-file.pid
     -+SERVER_LOG="$(pwd)"/OUT.server.log
     -+
     -+PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
     -+CREDENTIAL_HELPER="$GIT_BUILD_DIR/t/helper/test-credential-helper-replay.sh" \
     -+	&& export CREDENTIAL_HELPER
     -+
     -+test_expect_success 'setup repos' '
     -+	test_create_repo "$REPO_DIR" &&
     -+	git -C "$REPO_DIR" branch -M main
     -+'
     -+
     -+stop_http_server () {
     -+	if ! test -f "$PID_FILE"
     -+	then
     -+		return 0
     -+	fi
     -+	#
     -+	# The server will shutdown automatically when we delete the pid-file.
     -+	#
     -+	rm -f "$PID_FILE"
     -+	#
     -+	# Give it a few seconds to shutdown (mainly to completely release the
     -+	# port before the next test start another instance and it attempts to
     -+	# bind to it).
     -+	#
     -+	for k in 0 1 2 3 4
     -+	do
     -+		if grep -q "Starting graceful shutdown" "$SERVER_LOG"
     -+		then
     -+			return 0
     -+		fi
     -+		sleep 1
     -+	done
     -+
     -+	echo "stop_http_server: timeout waiting for server shutdown"
     -+	return 1
     -+}
     -+
     -+start_http_server () {
     -+	#
     -+	# Launch our server into the background in repo_dir.
     -+	#
     -+	(
     -+		cd "$REPO_DIR"
     -+		test-http-server --verbose \
     -+			--listen=127.0.0.1 \
     -+			--port=$GIT_TEST_HTTP_PROTOCOL_PORT \
     -+			--reuseaddr \
     -+			--pid-file="$PID_FILE" \
     -+			"$@" \
     -+			2>"$SERVER_LOG" &
     -+	)
     -+	#
     -+	# Give it a few seconds to get started.
     -+	#
     -+	for k in 0 1 2 3 4
     -+	do
     -+		if test -f "$PID_FILE"
     -+		then
     -+			return 0
     -+		fi
     -+		sleep 1
     -+	done
     -+
     -+	echo "start_http_server: timeout waiting for server startup"
     -+	return 1
     -+}
     -+
     -+per_test_cleanup () {
     -+	stop_http_server &&
     -+	rm -f OUT.* &&
     -+	rm -f *.cred
     -+}
     -+
     -+test_expect_success 'http auth anonymous no challenge' '
     -+	test_when_finished "per_test_cleanup" &&
     -+	start_http_server --allow-anonymous &&
     -+
     -+	# Attempt to read from a protected repository
     -+	git ls-remote $ORIGIN_URL
     -+'
     -+
     -+test_expect_success 'http auth www-auth headers to credential helper bearer valid' '
     -+	test_when_finished "per_test_cleanup" &&
     -+	start_http_server \
     -+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
     -+		--auth=basic:realm=\"example.com\" \
     -+		--auth-token=bearer:secret-token &&
     -+
     -+	cat >get-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
     -+	wwwauth[]=basic realm="example.com"
     -+	EOF
     -+
     -+	cat >store-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=secret-token
     -+	authtype=bearer
     -+	EOF
     -+
     -+	cat >get-response.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=secret-token
     -+	authtype=bearer
     -+	EOF
     -+
     -+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
     -+
     -+	test_cmp get-expected.cred get-actual.cred &&
     -+	test_cmp store-expected.cred store-actual.cred
     -+'
     -+
     -+test_expect_success 'http auth www-auth headers to credential helper basic valid' '
     -+	test_when_finished "per_test_cleanup" &&
     -+	# base64("alice:secret-passwd")
     -+	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
     -+	export USERPASS64 &&
     -+
     -+	start_http_server \
     -+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
     -+		--auth=basic:realm=\"example.com\" \
     -+		--auth-token=basic:$USERPASS64 &&
     -+
     -+	cat >get-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
     -+	wwwauth[]=basic realm="example.com"
     -+	EOF
     -+
     -+	cat >store-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=secret-passwd
     -+	authtype=basic
     -+	EOF
     -+
     -+	cat >get-response.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=secret-passwd
     -+	authtype=basic
     -+	EOF
     -+
     -+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
     -+
     -+	test_cmp get-expected.cred get-actual.cred &&
     -+	test_cmp store-expected.cred store-actual.cred
     -+'
     -+
     -+test_expect_success 'http auth www-auth headers to credential helper custom scheme' '
     -+	test_when_finished "per_test_cleanup" &&
     -+	start_http_server \
     -+		--auth=foobar:alg=test\ widget=1 \
     -+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
     -+		--auth=basic:realm=\"example.com\" \
     -+		--auth-token=foobar:SECRET-FOOBAR-VALUE &&
     -+
     -+	cat >get-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	wwwauth[]=foobar alg=test widget=1
     -+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
     -+	wwwauth[]=basic realm="example.com"
     -+	EOF
     -+
     -+	cat >store-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=SECRET-FOOBAR-VALUE
     -+	authtype=foobar
     -+	EOF
     -+
     -+	cat >get-response.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=SECRET-FOOBAR-VALUE
     -+	authtype=foobar
     -+	EOF
     -+
     -+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
     -+
     -+	test_cmp get-expected.cred get-actual.cred &&
     -+	test_cmp store-expected.cred store-actual.cred
     -+'
     -+
     -+test_expect_success 'http auth www-auth headers to credential helper invalid' '
     -+	test_when_finished "per_test_cleanup" &&
     -+	start_http_server \
     -+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
     -+		--auth=basic:realm=\"example.com\" \
     -+		--auth-token=bearer:secret-token &&
     -+
     -+	cat >get-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
     -+	wwwauth[]=basic realm="example.com"
     -+	EOF
     -+
     -+	cat >erase-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=invalid-token
     -+	authtype=bearer
     -+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
     -+	wwwauth[]=basic realm="example.com"
     -+	EOF
     -+
     -+	cat >get-response.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=invalid-token
     -+	authtype=bearer
     -+	EOF
     -+
     -+	test_must_fail git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
     -+
     -+	test_cmp get-expected.cred get-actual.cred &&
     -+	test_cmp erase-expected.cred erase-actual.cred
     -+'
     -+
     -+test_done
  -:  ----------- >  7:  93bdf1d7060 test-http-server: add HTTP error response function
  -:  ----------- >  8:  b3e9156755f test-http-server: add HTTP request parsing
  -:  ----------- >  9:  5fb248c074a test-http-server: pass Git requests to http-backend
  -:  ----------- > 10:  192f09b9de4 test-http-server: add simple authentication
  -:  ----------- > 11:  b64d2f2c473 t5556: add HTTP authentication tests

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 171+ messages in thread

* [PATCH v3 01/11] http: read HTTP WWW-Authenticate response headers
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 02/11] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
                       ` (14 subsequent siblings)
  15 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Read and store the HTTP WWW-Authenticate response headers made for
a particular request.

This will allow us to pass important authentication challenge
information to credential helpers or others that would otherwise have
been lost.

According to RFC2616 Section 4.2 [1], header field names are not
case-sensitive meaning when collecting multiple values for the same
field name, we can just use the case of the first observed instance of
each field name and no normalisation is required.

libcurl only provides us with the ability to read all headers recieved
for a particular request, including any intermediate redirect requests
or proxies. The lines returned by libcurl include HTTP status lines
delinating any intermediate requests such as "HTTP/1.1 200". We use
these lines to reset the strvec of WWW-Authenticate header values as
we encounter them in order to only capture the final response headers.

The collection of all header values matching the WWW-Authenticate
header is complicated by the fact that it is legal for header fields to
be continued over multiple lines, but libcurl only gives us one line at
a time.

In the future [2] we may be able to leverage functions to read headers
from libcurl itself, but as of today we must do this ourselves.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-4.2
[2] https://daniel.haxx.se/blog/2022/03/22/a-headers-api-for-libcurl/

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 credential.c |  1 +
 credential.h | 15 ++++++++++
 http.c       | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+)

diff --git a/credential.c b/credential.c
index f6389a50684..897b4679333 100644
--- a/credential.c
+++ b/credential.c
@@ -22,6 +22,7 @@ void credential_clear(struct credential *c)
 	free(c->username);
 	free(c->password);
 	string_list_clear(&c->helpers, 0);
+	strvec_clear(&c->wwwauth_headers);
 
 	credential_init(c);
 }
diff --git a/credential.h b/credential.h
index f430e77fea4..6f2e5bc610b 100644
--- a/credential.h
+++ b/credential.h
@@ -2,6 +2,7 @@
 #define CREDENTIAL_H
 
 #include "string-list.h"
+#include "strvec.h"
 
 /**
  * The credentials API provides an abstracted way of gathering username and
@@ -115,6 +116,19 @@ struct credential {
 	 */
 	struct string_list helpers;
 
+	/**
+	 * A `strvec` of WWW-Authenticate header values. Each string
+	 * is the value of a WWW-Authenticate header in an HTTP response,
+	 * in the order they were received in the response.
+	 */
+	struct strvec wwwauth_headers;
+
+	/**
+	 * Internal use only. Used to keep track of split header fields
+	 * in order to fold multiple lines into one value.
+	 */
+	unsigned header_is_last_match:1;
+
 	unsigned approved:1,
 		 configured:1,
 		 quit:1,
@@ -130,6 +144,7 @@ struct credential {
 
 #define CREDENTIAL_INIT { \
 	.helpers = STRING_LIST_INIT_DUP, \
+	.wwwauth_headers = STRVEC_INIT, \
 }
 
 /* Initialize a credential structure, setting all fields to empty. */
diff --git a/http.c b/http.c
index 5d0502f51fd..03d43d352e7 100644
--- a/http.c
+++ b/http.c
@@ -183,6 +183,82 @@ size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 	return nmemb;
 }
 
+static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
+{
+	size_t size = eltsize * nmemb;
+	struct strvec *values = &http_auth.wwwauth_headers;
+	struct strbuf buf = STRBUF_INIT;
+	const char *val;
+	const char *z = NULL;
+
+	/*
+	 * Header lines may not come NULL-terminated from libcurl so we must
+	 * limit all scans to the maximum length of the header line, or leverage
+	 * strbufs for all operations.
+	 *
+	 * In addition, it is possible that header values can be split over
+	 * multiple lines as per RFC 2616 (even though this has since been
+	 * deprecated in RFC 7230). A continuation header field value is
+	 * identified as starting with a space or horizontal tab.
+	 *
+	 * The formal definition of a header field as given in RFC 2616 is:
+	 *
+	 *   message-header = field-name ":" [ field-value ]
+	 *   field-name     = token
+	 *   field-value    = *( field-content | LWS )
+	 *   field-content  = <the OCTETs making up the field-value
+	 *                    and consisting of either *TEXT or combinations
+	 *                    of token, separators, and quoted-string>
+	 */
+
+	strbuf_add(&buf, ptr, size);
+
+	/* Strip the CRLF that should be present at the end of each field */
+	strbuf_trim_trailing_newline(&buf);
+
+	/* Start of a new WWW-Authenticate header */
+	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
+		while (isspace(*val))
+			val++;
+
+		strvec_push(values, val);
+		http_auth.header_is_last_match = 1;
+		goto exit;
+	}
+
+	/*
+	 * This line could be a continuation of the previously matched header
+	 * field. If this is the case then we should append this value to the
+	 * end of the previously consumed value.
+	 */
+	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
+		const char **v = values->v + values->nr - 1;
+		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
+
+		free((void*)*v);
+		*v = append;
+
+		goto exit;
+	}
+
+	/* This is the start of a new header we don't care about */
+	http_auth.header_is_last_match = 0;
+
+	/*
+	 * If this is a HTTP status line and not a header field, this signals
+	 * a different HTTP response. libcurl writes all the output of all
+	 * response headers of all responses, including redirects.
+	 * We only care about the last HTTP request response's headers so clear
+	 * the existing array.
+	 */
+	if (skip_iprefix(buf.buf, "http/", &z))
+		strvec_clear(values);
+
+exit:
+	strbuf_release(&buf);
+	return size;
+}
+
 size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
 {
 	return nmemb;
@@ -1829,6 +1905,8 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);
+
 	accept_language = http_get_accept_language_header();
 
 	if (accept_language)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 02/11] credential: add WWW-Authenticate header to cred requests
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 01/11] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 03/11] http: store all request headers on active_request_slot Matthew John Cheetham via GitGitGadget
                       ` (13 subsequent siblings)
  15 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add the value of the WWW-Authenticate response header to credential
requests. Credential helpers that understand and support HTTP
authentication and authorization can use this standard header (RFC 2616
Section 14.47 [1]) to generate valid credentials.

WWW-Authenticate headers can contain information pertaining to the
authority, authentication mechanism, or extra parameters/scopes that are
required.

The current I/O format for credential helpers only allows for unique
names for properties/attributes, so in order to transmit multiple header
values (with a specific order) we introduce a new convention whereby a
C-style array syntax is used in the property name to denote multiple
ordered values for the same property.

In this case we send multiple `wwwauth[]` properties where the order
that the repeated attributes appear in the conversation reflects the
order that the WWW-Authenticate headers appeared in the HTTP response.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Documentation/git-credential.txt | 17 ++++++++++++++++-
 credential.c                     | 12 ++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index f18673017f5..791a57dddfb 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -113,7 +113,13 @@ separated by an `=` (equals) sign, followed by a newline.
 The key may contain any bytes except `=`, newline, or NUL. The value may
 contain any bytes except newline or NUL.
 
-In both cases, all bytes are treated as-is (i.e., there is no quoting,
+Attributes with keys that end with C-style array brackets `[]` can have
+multiple values. Each instance of a multi-valued attribute forms an
+ordered list of values - the order of the repeated attributes defines
+the order of the values. An empty multi-valued attribute (`key[]=\n`)
+acts to clear any previous entries and reset the list.
+
+In all cases, all bytes are treated as-is (i.e., there is no quoting,
 and one cannot transmit a value with newline or NUL in it). The list of
 attributes is terminated by a blank line or end-of-file.
 
@@ -160,6 +166,15 @@ empty string.
 Components which are missing from the URL (e.g., there is no
 username in the example above) will be left unset.
 
+`wwwauth[]`::
+
+	When an HTTP response is received by Git that includes one or more
+	'WWW-Authenticate' authentication headers, these will be passed by Git
+	to credential helpers.
+	Each 'WWW-Authenticate' header value is passed as a multi-valued
+	attribute 'wwwauth[]', where the order of the attributes is the same as
+	they appear in the HTTP response.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/credential.c b/credential.c
index 897b4679333..8a3ad6c0ae2 100644
--- a/credential.c
+++ b/credential.c
@@ -263,6 +263,17 @@ static void credential_write_item(FILE *fp, const char *key, const char *value,
 	fprintf(fp, "%s=%s\n", key, value);
 }
 
+static void credential_write_strvec(FILE *fp, const char *key,
+				    const struct strvec *vec)
+{
+	int i = 0;
+	const char *full_key = xstrfmt("%s[]", key);
+	for (; i < vec->nr; i++) {
+		credential_write_item(fp, full_key, vec->v[i], 0);
+	}
+	free((void*)full_key);
+}
+
 void credential_write(const struct credential *c, FILE *fp)
 {
 	credential_write_item(fp, "protocol", c->protocol, 1);
@@ -270,6 +281,7 @@ void credential_write(const struct credential *c, FILE *fp)
 	credential_write_item(fp, "path", c->path, 0);
 	credential_write_item(fp, "username", c->username, 0);
 	credential_write_item(fp, "password", c->password, 0);
+	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);
 }
 
 static int run_credential_helper(struct credential *c,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 03/11] http: store all request headers on active_request_slot
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 01/11] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 02/11] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-09 23:18       ` Glen Choo
  2022-11-02 22:09     ` [PATCH v3 04/11] http: move proactive auth to first slot creation Matthew John Cheetham via GitGitGadget
                       ` (12 subsequent siblings)
  15 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Once a list of headers has been set on the curl handle, it is not
possible to recover that `struct curl_slist` instance to add or modify
headers.

In future commits we will want to modify the set of request headers in
response to an authentication challenge/401 response from the server,
with information provided by a credential helper.

There are a number of different places where curl is used for an HTTP
request, and they do not have a common handling of request headers.
However, given that they all do call the `start_active_slot()` function,
either directly or indirectly via `run_slot()` or `run_one_slot()`, we
use this as the point to set the `CURLOPT_HTTPHEADER` option just
before the request is made.

We collect all request headers in a `struct curl_slist` on the
`struct active_request_slot` that is obtained from a call to
`get_active_slot(int)`. This function now takes a single argument to
define if the initial set of headers on the slot should include the
"Pragma: no-cache" header, along with all extra headers specified via
`http.extraHeader` config values.

The active request slot obtained from `get_active_slot(int)` will always
contain a fresh set of default headers and any headers set in previous
usages of this slot will be freed.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 http-push.c   | 103 ++++++++++++++++++++++----------------------------
 http-walker.c |   2 +-
 http.c        |  82 ++++++++++++++++++----------------------
 http.h        |   4 +-
 remote-curl.c |  36 +++++++++---------
 5 files changed, 101 insertions(+), 126 deletions(-)

diff --git a/http-push.c b/http-push.c
index 5f4340a36e6..2b40959b376 100644
--- a/http-push.c
+++ b/http-push.c
@@ -211,29 +211,29 @@ static void curl_setup_http(CURL *curl, const char *url,
 	curl_easy_setopt(curl, CURLOPT_UPLOAD, 1);
 }
 
-static struct curl_slist *get_dav_token_headers(struct remote_lock *lock, enum dav_header_flag options)
+static struct curl_slist *append_dav_token_headers(struct curl_slist *headers,
+	struct remote_lock *lock, enum dav_header_flag options)
 {
 	struct strbuf buf = STRBUF_INIT;
-	struct curl_slist *dav_headers = http_copy_default_headers();
 
 	if (options & DAV_HEADER_IF) {
 		strbuf_addf(&buf, "If: (<%s>)", lock->token);
-		dav_headers = curl_slist_append(dav_headers, buf.buf);
+		headers = curl_slist_append(headers, buf.buf);
 		strbuf_reset(&buf);
 	}
 	if (options & DAV_HEADER_LOCK) {
 		strbuf_addf(&buf, "Lock-Token: <%s>", lock->token);
-		dav_headers = curl_slist_append(dav_headers, buf.buf);
+		headers = curl_slist_append(headers, buf.buf);
 		strbuf_reset(&buf);
 	}
 	if (options & DAV_HEADER_TIMEOUT) {
 		strbuf_addf(&buf, "Timeout: Second-%ld", lock->timeout);
-		dav_headers = curl_slist_append(dav_headers, buf.buf);
+		headers = curl_slist_append(headers, buf.buf);
 		strbuf_reset(&buf);
 	}
 	strbuf_release(&buf);
 
-	return dav_headers;
+	return headers;
 }
 
 static void finish_request(struct transfer_request *request);
@@ -281,7 +281,7 @@ static void start_mkcol(struct transfer_request *request)
 
 	request->url = get_remote_object_url(repo->url, hex, 1);
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_response;
 	slot->callback_data = request;
 	curl_setup_http_get(slot->curl, request->url, DAV_MKCOL);
@@ -399,7 +399,7 @@ static void start_put(struct transfer_request *request)
 	strbuf_add(&buf, request->lock->tmpfile_suffix, the_hash_algo->hexsz + 1);
 	request->url = strbuf_detach(&buf, NULL);
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_response;
 	slot->callback_data = request;
 	curl_setup_http(slot->curl, request->url, DAV_PUT,
@@ -417,15 +417,13 @@ static void start_put(struct transfer_request *request)
 static void start_move(struct transfer_request *request)
 {
 	struct active_request_slot *slot;
-	struct curl_slist *dav_headers = http_copy_default_headers();
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_response;
 	slot->callback_data = request;
 	curl_setup_http_get(slot->curl, request->url, DAV_MOVE);
-	dav_headers = curl_slist_append(dav_headers, request->dest);
-	dav_headers = curl_slist_append(dav_headers, "Overwrite: T");
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
+	slot->headers = curl_slist_append(slot->headers, request->dest);
+	slot->headers = curl_slist_append(slot->headers, "Overwrite: T");
 
 	if (start_active_slot(slot)) {
 		request->slot = slot;
@@ -440,17 +438,16 @@ static int refresh_lock(struct remote_lock *lock)
 {
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *dav_headers;
 	int rc = 0;
 
 	lock->refreshing = 1;
 
-	dav_headers = get_dav_token_headers(lock, DAV_HEADER_IF | DAV_HEADER_TIMEOUT);
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = append_dav_token_headers(slot->headers, lock,
+		DAV_HEADER_IF | DAV_HEADER_TIMEOUT);
+
 	curl_setup_http_get(slot->curl, lock->url, DAV_LOCK);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -464,7 +461,6 @@ static int refresh_lock(struct remote_lock *lock)
 	}
 
 	lock->refreshing = 0;
-	curl_slist_free_all(dav_headers);
 
 	return rc;
 }
@@ -838,7 +834,6 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	char *ep;
 	char timeout_header[25];
 	struct remote_lock *lock = NULL;
-	struct curl_slist *dav_headers = http_copy_default_headers();
 	struct xml_ctx ctx;
 	char *escaped;
 
@@ -849,7 +844,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	while (ep) {
 		char saved_character = ep[1];
 		ep[1] = '\0';
-		slot = get_active_slot();
+		slot = get_active_slot(0);
 		slot->results = &results;
 		curl_setup_http_get(slot->curl, url, DAV_MKCOL);
 		if (start_active_slot(slot)) {
@@ -875,14 +870,15 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	strbuf_addf(&out_buffer.buf, LOCK_REQUEST, escaped);
 	free(escaped);
 
+	slot = get_active_slot(0);
+	slot->results = &results;
+
 	xsnprintf(timeout_header, sizeof(timeout_header), "Timeout: Second-%ld", timeout);
-	dav_headers = curl_slist_append(dav_headers, timeout_header);
-	dav_headers = curl_slist_append(dav_headers, "Content-Type: text/xml");
+	slot->headers = curl_slist_append(slot->headers, timeout_header);
+	slot->headers = curl_slist_append(slot->headers,
+		"Content-Type: text/xml");
 
-	slot = get_active_slot();
-	slot->results = &results;
 	curl_setup_http(slot->curl, url, DAV_LOCK, &out_buffer, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	CALLOC_ARRAY(lock, 1);
@@ -921,7 +917,6 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 		fprintf(stderr, "Unable to start LOCK request\n");
 	}
 
-	curl_slist_free_all(dav_headers);
 	strbuf_release(&out_buffer.buf);
 	strbuf_release(&in_buffer);
 
@@ -945,15 +940,14 @@ static int unlock_remote(struct remote_lock *lock)
 	struct active_request_slot *slot;
 	struct slot_results results;
 	struct remote_lock *prev = repo->locks;
-	struct curl_slist *dav_headers;
 	int rc = 0;
 
-	dav_headers = get_dav_token_headers(lock, DAV_HEADER_LOCK);
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = append_dav_token_headers(slot->headers, lock,
+		DAV_HEADER_LOCK);
+
 	curl_setup_http_get(slot->curl, lock->url, DAV_UNLOCK);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -966,8 +960,6 @@ static int unlock_remote(struct remote_lock *lock)
 		fprintf(stderr, "Unable to start UNLOCK request\n");
 	}
 
-	curl_slist_free_all(dav_headers);
-
 	if (repo->locks == lock) {
 		repo->locks = lock->next;
 	} else {
@@ -1121,7 +1113,6 @@ static void remote_ls(const char *path, int flags,
 	struct slot_results results;
 	struct strbuf in_buffer = STRBUF_INIT;
 	struct buffer out_buffer = { STRBUF_INIT, 0 };
-	struct curl_slist *dav_headers = http_copy_default_headers();
 	struct xml_ctx ctx;
 	struct remote_ls_ctx ls;
 
@@ -1134,14 +1125,14 @@ static void remote_ls(const char *path, int flags,
 
 	strbuf_addstr(&out_buffer.buf, PROPFIND_ALL_REQUEST);
 
-	dav_headers = curl_slist_append(dav_headers, "Depth: 1");
-	dav_headers = curl_slist_append(dav_headers, "Content-Type: text/xml");
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = curl_slist_append(slot->headers, "Depth: 1");
+	slot->headers = curl_slist_append(slot->headers,
+		"Content-Type: text/xml");
+
 	curl_setup_http(slot->curl, url, DAV_PROPFIND,
 			&out_buffer, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	if (start_active_slot(slot)) {
@@ -1177,7 +1168,6 @@ static void remote_ls(const char *path, int flags,
 	free(url);
 	strbuf_release(&out_buffer.buf);
 	strbuf_release(&in_buffer);
-	curl_slist_free_all(dav_headers);
 }
 
 static void get_remote_object_list(unsigned char parent)
@@ -1199,7 +1189,6 @@ static int locking_available(void)
 	struct slot_results results;
 	struct strbuf in_buffer = STRBUF_INIT;
 	struct buffer out_buffer = { STRBUF_INIT, 0 };
-	struct curl_slist *dav_headers = http_copy_default_headers();
 	struct xml_ctx ctx;
 	int lock_flags = 0;
 	char *escaped;
@@ -1208,14 +1197,14 @@ static int locking_available(void)
 	strbuf_addf(&out_buffer.buf, PROPFIND_SUPPORTEDLOCK_REQUEST, escaped);
 	free(escaped);
 
-	dav_headers = curl_slist_append(dav_headers, "Depth: 0");
-	dav_headers = curl_slist_append(dav_headers, "Content-Type: text/xml");
-
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = curl_slist_append(slot->headers, "Depth: 0");
+	slot->headers = curl_slist_append(slot->headers,
+		"Content-Type: text/xml");
+
 	curl_setup_http(slot->curl, repo->url, DAV_PROPFIND,
 			&out_buffer, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	if (start_active_slot(slot)) {
@@ -1257,7 +1246,6 @@ static int locking_available(void)
 
 	strbuf_release(&out_buffer.buf);
 	strbuf_release(&in_buffer);
-	curl_slist_free_all(dav_headers);
 
 	return lock_flags;
 }
@@ -1374,17 +1362,16 @@ static int update_remote(const struct object_id *oid, struct remote_lock *lock)
 	struct active_request_slot *slot;
 	struct slot_results results;
 	struct buffer out_buffer = { STRBUF_INIT, 0 };
-	struct curl_slist *dav_headers;
-
-	dav_headers = get_dav_token_headers(lock, DAV_HEADER_IF);
 
 	strbuf_addf(&out_buffer.buf, "%s\n", oid_to_hex(oid));
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
+	slot->headers = append_dav_token_headers(slot->headers, lock,
+		DAV_HEADER_IF);
+
 	curl_setup_http(slot->curl, lock->url, DAV_PUT,
 			&out_buffer, fwrite_null);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -1486,18 +1473,18 @@ static void update_remote_info_refs(struct remote_lock *lock)
 	struct buffer buffer = { STRBUF_INIT, 0 };
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *dav_headers;
 
 	remote_ls("refs/", (PROCESS_FILES | RECURSIVE),
 		  add_remote_info_ref, &buffer.buf);
 	if (!aborted) {
-		dav_headers = get_dav_token_headers(lock, DAV_HEADER_IF);
 
-		slot = get_active_slot();
+		slot = get_active_slot(0);
 		slot->results = &results;
+		slot->headers = append_dav_token_headers(slot->headers, lock,
+			DAV_HEADER_IF);
+
 		curl_setup_http(slot->curl, lock->url, DAV_PUT,
 				&buffer, fwrite_null);
-		curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
 
 		if (start_active_slot(slot)) {
 			run_active_slot(slot);
@@ -1652,7 +1639,7 @@ static int delete_remote_branch(const char *pattern, int force)
 	if (dry_run)
 		return 0;
 	url = xstrfmt("%s%s", repo->url, remote_ref->name);
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->results = &results;
 	curl_setup_http_get(slot->curl, url, DAV_DELETE);
 	if (start_active_slot(slot)) {
diff --git a/http-walker.c b/http-walker.c
index b8f0f98ae14..8747de2fcdb 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -373,7 +373,7 @@ static void fetch_alternates(struct walker *walker, const char *base)
 	 * Use a callback to process the result, since another request
 	 * may fail and need to have alternates loaded before continuing
 	 */
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 	slot->callback_func = process_alternates_response;
 	alt_req.walker = walker;
 	slot->callback_data = &alt_req;
diff --git a/http.c b/http.c
index 03d43d352e7..f2ebb17c8c4 100644
--- a/http.c
+++ b/http.c
@@ -124,8 +124,6 @@ static unsigned long empty_auth_useless =
 	| CURLAUTH_DIGEST_IE
 	| CURLAUTH_DIGEST;
 
-static struct curl_slist *pragma_header;
-static struct curl_slist *no_pragma_header;
 static struct string_list extra_http_headers = STRING_LIST_INIT_DUP;
 
 static struct curl_slist *host_resolutions;
@@ -1133,11 +1131,6 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
-	pragma_header = curl_slist_append(http_copy_default_headers(),
-		"Pragma: no-cache");
-	no_pragma_header = curl_slist_append(http_copy_default_headers(),
-		"Pragma:");
-
 	{
 		char *http_max_requests = getenv("GIT_HTTP_MAX_REQUESTS");
 		if (http_max_requests)
@@ -1199,6 +1192,8 @@ void http_cleanup(void)
 
 	while (slot != NULL) {
 		struct active_request_slot *next = slot->next;
+		if (slot->headers)
+			curl_slist_free_all(slot->headers);
 		if (slot->curl) {
 			xmulti_remove_handle(slot);
 			curl_easy_cleanup(slot->curl);
@@ -1215,12 +1210,6 @@ void http_cleanup(void)
 
 	string_list_clear(&extra_http_headers, 0);
 
-	curl_slist_free_all(pragma_header);
-	pragma_header = NULL;
-
-	curl_slist_free_all(no_pragma_header);
-	no_pragma_header = NULL;
-
 	curl_slist_free_all(host_resolutions);
 	host_resolutions = NULL;
 
@@ -1255,7 +1244,18 @@ void http_cleanup(void)
 	FREE_AND_NULL(cached_accept_language);
 }
 
-struct active_request_slot *get_active_slot(void)
+static struct curl_slist *http_copy_default_headers(void)
+{
+	struct curl_slist *headers = NULL;
+	const struct string_list_item *item;
+
+	for_each_string_list_item(item, &extra_http_headers)
+		headers = curl_slist_append(headers, item->string);
+
+	return headers;
+}
+
+struct active_request_slot *get_active_slot(int no_pragma_header)
 {
 	struct active_request_slot *slot = active_queue_head;
 	struct active_request_slot *newslot;
@@ -1277,6 +1277,7 @@ struct active_request_slot *get_active_slot(void)
 		newslot->curl = NULL;
 		newslot->in_use = 0;
 		newslot->next = NULL;
+		newslot->headers = NULL;
 
 		slot = active_queue_head;
 		if (!slot) {
@@ -1294,6 +1295,15 @@ struct active_request_slot *get_active_slot(void)
 		curl_session_count++;
 	}
 
+	if (slot->headers)
+		curl_slist_free_all(slot->headers);
+
+	slot->headers = http_copy_default_headers();
+
+	if (!no_pragma_header)
+		slot->headers = curl_slist_append(slot->headers,
+			"Pragma: no-cache");
+
 	active_requests++;
 	slot->in_use = 1;
 	slot->results = NULL;
@@ -1303,7 +1313,6 @@ struct active_request_slot *get_active_slot(void)
 	curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file);
 	if (curl_save_cookies)
 		curl_easy_setopt(slot->curl, CURLOPT_COOKIEJAR, curl_cookie_file);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, pragma_header);
 	curl_easy_setopt(slot->curl, CURLOPT_RESOLVE, host_resolutions);
 	curl_easy_setopt(slot->curl, CURLOPT_ERRORBUFFER, curl_errorstr);
 	curl_easy_setopt(slot->curl, CURLOPT_CUSTOMREQUEST, NULL);
@@ -1335,9 +1344,12 @@ struct active_request_slot *get_active_slot(void)
 
 int start_active_slot(struct active_request_slot *slot)
 {
-	CURLMcode curlm_result = curl_multi_add_handle(curlm, slot->curl);
+	CURLMcode curlm_result;
 	int num_transfers;
 
+	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, slot->headers);
+	curlm_result = curl_multi_add_handle(curlm, slot->curl);
+
 	if (curlm_result != CURLM_OK &&
 	    curlm_result != CURLM_CALL_MULTI_PERFORM) {
 		warning("curl_multi_add_handle failed: %s",
@@ -1652,17 +1664,6 @@ int run_one_slot(struct active_request_slot *slot,
 	return handle_curl_result(results);
 }
 
-struct curl_slist *http_copy_default_headers(void)
-{
-	struct curl_slist *headers = NULL;
-	const struct string_list_item *item;
-
-	for_each_string_list_item(item, &extra_http_headers)
-		headers = curl_slist_append(headers, item->string);
-
-	return headers;
-}
-
 static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf)
 {
 	char *ptr;
@@ -1880,12 +1881,11 @@ static int http_request(const char *url,
 {
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *headers = http_copy_default_headers();
-	struct strbuf buf = STRBUF_INIT;
+	int no_cache = options && options->no_cache;
 	const char *accept_language;
 	int ret;
 
-	slot = get_active_slot();
+	slot = get_active_slot(!no_cache);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
 
 	if (!result) {
@@ -1910,27 +1910,23 @@ static int http_request(const char *url,
 	accept_language = http_get_accept_language_header();
 
 	if (accept_language)
-		headers = curl_slist_append(headers, accept_language);
+		slot->headers = curl_slist_append(slot->headers,
+			accept_language);
 
-	strbuf_addstr(&buf, "Pragma:");
-	if (options && options->no_cache)
-		strbuf_addstr(&buf, " no-cache");
 	if (options && options->initial_request &&
 	    http_follow_config == HTTP_FOLLOW_INITIAL)
 		curl_easy_setopt(slot->curl, CURLOPT_FOLLOWLOCATION, 1);
 
-	headers = curl_slist_append(headers, buf.buf);
-
 	/* Add additional headers here */
 	if (options && options->extra_headers) {
 		const struct string_list_item *item;
 		for_each_string_list_item(item, options->extra_headers) {
-			headers = curl_slist_append(headers, item->string);
+			slot->headers = curl_slist_append(slot->headers,
+				item->string);
 		}
 	}
 
 	curl_easy_setopt(slot->curl, CURLOPT_URL, url);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
@@ -1948,9 +1944,6 @@ static int http_request(const char *url,
 		curlinfo_strbuf(slot->curl, CURLINFO_EFFECTIVE_URL,
 				options->effective_url);
 
-	curl_slist_free_all(headers);
-	strbuf_release(&buf);
-
 	return ret;
 }
 
@@ -2311,12 +2304,10 @@ struct http_pack_request *new_direct_http_pack_request(
 		goto abort;
 	}
 
-	preq->slot = get_active_slot();
+	preq->slot = get_active_slot(1);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEDATA, preq->packfile);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_URL, preq->url);
-	curl_easy_setopt(preq->slot->curl, CURLOPT_HTTPHEADER,
-		no_pragma_header);
 
 	/*
 	 * If there is data present from a previous transfer attempt,
@@ -2481,14 +2472,13 @@ struct http_object_request *new_http_object_request(const char *base_url,
 		}
 	}
 
-	freq->slot = get_active_slot();
+	freq->slot = get_active_slot(1);
 
 	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEDATA, freq);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_FAILONERROR, 0);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite_sha1_file);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_ERRORBUFFER, freq->errorstr);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_URL, freq->url);
-	curl_easy_setopt(freq->slot->curl, CURLOPT_HTTPHEADER, no_pragma_header);
 
 	/*
 	 * If we have successfully processed data from a previous fetch
diff --git a/http.h b/http.h
index 3c94c479100..a304cc408b2 100644
--- a/http.h
+++ b/http.h
@@ -22,6 +22,7 @@ struct slot_results {
 struct active_request_slot {
 	CURL *curl;
 	int in_use;
+	struct curl_slist *headers;
 	CURLcode curl_result;
 	long http_code;
 	int *finished;
@@ -43,7 +44,7 @@ size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf);
 curlioerr ioctl_buffer(CURL *handle, int cmd, void *clientp);
 
 /* Slot lifecycle functions */
-struct active_request_slot *get_active_slot(void);
+struct active_request_slot *get_active_slot(int no_pragma_header);
 int start_active_slot(struct active_request_slot *slot);
 void run_active_slot(struct active_request_slot *slot);
 void finish_all_active_slots(void);
@@ -64,7 +65,6 @@ void step_active_slots(void);
 void http_init(struct remote *remote, const char *url,
 	       int proactive_auth);
 void http_cleanup(void);
-struct curl_slist *http_copy_default_headers(void);
 
 extern long int git_curl_ipresolve;
 extern int active_requests;
diff --git a/remote-curl.c b/remote-curl.c
index 72dfb8fb86a..edbd4504beb 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -847,14 +847,13 @@ static int run_slot(struct active_request_slot *slot,
 static int probe_rpc(struct rpc_state *rpc, struct slot_results *results)
 {
 	struct active_request_slot *slot;
-	struct curl_slist *headers = http_copy_default_headers();
 	struct strbuf buf = STRBUF_INIT;
 	int err;
 
-	slot = get_active_slot();
+	slot = get_active_slot(0);
 
-	headers = curl_slist_append(headers, rpc->hdr_content_type);
-	headers = curl_slist_append(headers, rpc->hdr_accept);
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_content_type);
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_accept);
 
 	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
 	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
@@ -862,13 +861,11 @@ static int probe_rpc(struct rpc_state *rpc, struct slot_results *results)
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, NULL);
 	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, "0000");
 	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, 4);
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite_buffer);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &buf);
 
 	err = run_slot(slot, results);
 
-	curl_slist_free_all(headers);
 	strbuf_release(&buf);
 	return err;
 }
@@ -888,7 +885,6 @@ static curl_off_t xcurl_off_t(size_t len)
 static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_received)
 {
 	struct active_request_slot *slot;
-	struct curl_slist *headers = http_copy_default_headers();
 	int use_gzip = rpc->gzip_request;
 	char *gzip_body = NULL;
 	size_t gzip_size = 0;
@@ -930,21 +926,23 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece
 			needs_100_continue = 1;
 	}
 
-	headers = curl_slist_append(headers, rpc->hdr_content_type);
-	headers = curl_slist_append(headers, rpc->hdr_accept);
-	headers = curl_slist_append(headers, needs_100_continue ?
+retry:
+	slot = get_active_slot(0);
+
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_content_type);
+	slot->headers = curl_slist_append(slot->headers, rpc->hdr_accept);
+	slot->headers = curl_slist_append(slot->headers, needs_100_continue ?
 		"Expect: 100-continue" : "Expect:");
 
 	/* Add Accept-Language header */
 	if (rpc->hdr_accept_language)
-		headers = curl_slist_append(headers, rpc->hdr_accept_language);
+		slot->headers = curl_slist_append(slot->headers,
+			rpc->hdr_accept_language);
 
 	/* Add the extra Git-Protocol header */
 	if (rpc->protocol_header)
-		headers = curl_slist_append(headers, rpc->protocol_header);
-
-retry:
-	slot = get_active_slot();
+		slot->headers = curl_slist_append(slot->headers,
+			rpc->protocol_header);
 
 	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
 	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
@@ -955,7 +953,8 @@ retry:
 		/* The request body is large and the size cannot be predicted.
 		 * We must use chunked encoding to send it.
 		 */
-		headers = curl_slist_append(headers, "Transfer-Encoding: chunked");
+		slot->headers = curl_slist_append(slot->headers,
+			"Transfer-Encoding: chunked");
 		rpc->initial_buffer = 1;
 		curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, rpc_out);
 		curl_easy_setopt(slot->curl, CURLOPT_INFILE, rpc);
@@ -1002,7 +1001,8 @@ retry:
 
 		gzip_size = stream.total_out;
 
-		headers = curl_slist_append(headers, "Content-Encoding: gzip");
+		slot->headers = curl_slist_append(slot->headers,
+			"Content-Encoding: gzip");
 		curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, gzip_body);
 		curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE_LARGE, xcurl_off_t(gzip_size));
 
@@ -1025,7 +1025,6 @@ retry:
 		}
 	}
 
-	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, rpc_in);
 	rpc_in_data.rpc = rpc;
 	rpc_in_data.slot = slot;
@@ -1055,7 +1054,6 @@ retry:
 	if (stateless_connect)
 		packet_response_end(rpc->in);
 
-	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 04/11] http: move proactive auth to first slot creation
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-11-02 22:09     ` [PATCH v3 03/11] http: store all request headers on active_request_slot Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 05/11] http: set specific auth scheme depending on credential Matthew John Cheetham via GitGitGadget
                       ` (11 subsequent siblings)
  15 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Rather than proactively seek credentials to authenticate a request at
`http_init()` time, do it when the first `active_request_slot` is
created.

Because credential helpers may modify the headers used for a request we
can only auth when a slot is created (when we can first start to gather
request headers).

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 http.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/http.c b/http.c
index f2ebb17c8c4..17b47195d22 100644
--- a/http.c
+++ b/http.c
@@ -515,18 +515,18 @@ static int curl_empty_auth_enabled(void)
 	return 0;
 }
 
-static void init_curl_http_auth(CURL *result)
+static void init_curl_http_auth(struct active_request_slot *slot)
 {
 	if (!http_auth.username || !*http_auth.username) {
 		if (curl_empty_auth_enabled())
-			curl_easy_setopt(result, CURLOPT_USERPWD, ":");
+			curl_easy_setopt(slot->curl, CURLOPT_USERPWD, ":");
 		return;
 	}
 
 	credential_fill(&http_auth);
 
-	curl_easy_setopt(result, CURLOPT_USERNAME, http_auth.username);
-	curl_easy_setopt(result, CURLOPT_PASSWORD, http_auth.password);
+	curl_easy_setopt(slot->curl, CURLOPT_USERNAME, http_auth.username);
+	curl_easy_setopt(slot->curl, CURLOPT_PASSWORD, http_auth.password);
 }
 
 /* *var must be free-able */
@@ -901,9 +901,6 @@ static CURL *get_curl_handle(void)
 #endif
 	}
 
-	if (http_proactive_auth)
-		init_curl_http_auth(result);
-
 	if (getenv("GIT_SSL_VERSION"))
 		ssl_version = getenv("GIT_SSL_VERSION");
 	if (ssl_version && *ssl_version) {
@@ -1260,6 +1257,7 @@ struct active_request_slot *get_active_slot(int no_pragma_header)
 	struct active_request_slot *slot = active_queue_head;
 	struct active_request_slot *newslot;
 
+	int proactive_auth = 0;
 	int num_transfers;
 
 	/* Wait for a slot to open up if the queue is full */
@@ -1282,6 +1280,9 @@ struct active_request_slot *get_active_slot(int no_pragma_header)
 		slot = active_queue_head;
 		if (!slot) {
 			active_queue_head = newslot;
+
+			/* Auth first slot if asked for proactive auth */
+			proactive_auth = http_proactive_auth;
 		} else {
 			while (slot->next != NULL)
 				slot = slot->next;
@@ -1336,8 +1337,9 @@ struct active_request_slot *get_active_slot(int no_pragma_header)
 
 	curl_easy_setopt(slot->curl, CURLOPT_IPRESOLVE, git_curl_ipresolve);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, http_auth_methods);
-	if (http_auth.password || curl_empty_auth_enabled())
-		init_curl_http_auth(slot->curl);
+
+	if (http_auth.password || curl_empty_auth_enabled() || proactive_auth)
+		init_curl_http_auth(slot);
 
 	return slot;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 05/11] http: set specific auth scheme depending on credential
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-11-02 22:09     ` [PATCH v3 04/11] http: move proactive auth to first slot creation Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-09 23:40       ` Glen Choo
  2022-11-02 22:09     ` [PATCH v3 06/11] test-http-server: add stub HTTP server test helper Matthew John Cheetham via GitGitGadget
                       ` (10 subsequent siblings)
  15 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Introduce a new credential field `authtype` that can be used by
credential helpers to indicate the type of the credential or
authentication mechanism to use for a request.

Modify http.c to now specify the correct authentication scheme or
credential type when authenticating the curl handle. If the new
`authtype` field in the credential structure is `NULL` or "Basic" then
use the existing username/password options. If the field is "Bearer"
then use the OAuth bearer token curl option. Otherwise, the `authtype`
field is the authentication scheme and the `password` field is the
raw, unencoded value.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Documentation/git-credential.txt | 12 ++++++++++++
 credential.c                     |  5 +++++
 credential.h                     |  1 +
 git-curl-compat.h                | 10 ++++++++++
 http.c                           | 24 +++++++++++++++++++++---
 5 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index 791a57dddfb..9069bfb2d50 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -175,6 +175,18 @@ username in the example above) will be left unset.
 	attribute 'wwwauth[]', where the order of the attributes is the same as
 	they appear in the HTTP response.
 
+`authtype`::
+
+	Indicates the type of authentication scheme that should be used by Git.
+	Credential helpers may reply to a request from Git with this attribute,
+	such that subsequent authenticated requests include the correct
+	`Authorization` header.
+	If this attribute is not present, the default value is "Basic".
+	Known values include "Basic", "Digest", and "Bearer".
+	If an unknown value is provided, this is taken as the authentication
+	scheme for the `Authorization` header, and the `password` field is
+	used as the raw unencoded authorization parameters of the same header.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/credential.c b/credential.c
index 8a3ad6c0ae2..a556f9f375a 100644
--- a/credential.c
+++ b/credential.c
@@ -21,6 +21,7 @@ void credential_clear(struct credential *c)
 	free(c->path);
 	free(c->username);
 	free(c->password);
+	free(c->authtype);
 	string_list_clear(&c->helpers, 0);
 	strvec_clear(&c->wwwauth_headers);
 
@@ -235,6 +236,9 @@ int credential_read(struct credential *c, FILE *fp)
 		} else if (!strcmp(key, "path")) {
 			free(c->path);
 			c->path = xstrdup(value);
+		} else if (!strcmp(key, "authtype")) {
+			free(c->authtype);
+			c->authtype = xstrdup(value);
 		} else if (!strcmp(key, "url")) {
 			credential_from_url(c, value);
 		} else if (!strcmp(key, "quit")) {
@@ -281,6 +285,7 @@ void credential_write(const struct credential *c, FILE *fp)
 	credential_write_item(fp, "path", c->path, 0);
 	credential_write_item(fp, "username", c->username, 0);
 	credential_write_item(fp, "password", c->password, 0);
+	credential_write_item(fp, "authtype", c->authtype, 0);
 	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);
 }
 
diff --git a/credential.h b/credential.h
index 6f2e5bc610b..8d580b054d0 100644
--- a/credential.h
+++ b/credential.h
@@ -140,6 +140,7 @@ struct credential {
 	char *protocol;
 	char *host;
 	char *path;
+	char *authtype;
 };
 
 #define CREDENTIAL_INIT { \
diff --git a/git-curl-compat.h b/git-curl-compat.h
index 56a83b6bbd8..839049f6dfe 100644
--- a/git-curl-compat.h
+++ b/git-curl-compat.h
@@ -126,4 +126,14 @@
 #define GIT_CURL_HAVE_CURLSSLSET_NO_BACKENDS
 #endif
 
+/**
+ * CURLAUTH_BEARER was added in 7.61.0, released in July 2018.
+ * However, only 7.69.0 fixes a bug where Bearer headers were not
+ * actually sent with reused connections on subsequent transfers
+ * (curl/curl@dea17b519dc1).
+ */
+#if LIBCURL_VERSION_NUM >= 0x074500
+#define GIT_CURL_HAVE_CURLAUTH_BEARER
+#endif
+
 #endif
diff --git a/http.c b/http.c
index 17b47195d22..ac620bcbf0c 100644
--- a/http.c
+++ b/http.c
@@ -517,7 +517,8 @@ static int curl_empty_auth_enabled(void)
 
 static void init_curl_http_auth(struct active_request_slot *slot)
 {
-	if (!http_auth.username || !*http_auth.username) {
+	if (!http_auth.authtype &&
+		(!http_auth.username || !*http_auth.username)) {
 		if (curl_empty_auth_enabled())
 			curl_easy_setopt(slot->curl, CURLOPT_USERPWD, ":");
 		return;
@@ -525,8 +526,25 @@ static void init_curl_http_auth(struct active_request_slot *slot)
 
 	credential_fill(&http_auth);
 
-	curl_easy_setopt(slot->curl, CURLOPT_USERNAME, http_auth.username);
-	curl_easy_setopt(slot->curl, CURLOPT_PASSWORD, http_auth.password);
+	if (!http_auth.authtype || !strcasecmp(http_auth.authtype, "basic")
+				|| !strcasecmp(http_auth.authtype, "digest")) {
+		curl_easy_setopt(slot->curl, CURLOPT_USERNAME,
+			http_auth.username);
+		curl_easy_setopt(slot->curl, CURLOPT_PASSWORD,
+			http_auth.password);
+#ifdef GIT_CURL_HAVE_CURLAUTH_BEARER
+	} else if (!strcasecmp(http_auth.authtype, "bearer")) {
+		curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, CURLAUTH_BEARER);
+		curl_easy_setopt(slot->curl, CURLOPT_XOAUTH2_BEARER,
+			http_auth.password);
+#endif
+	} else {
+		struct strbuf auth = STRBUF_INIT;
+		strbuf_addf(&auth, "Authorization: %s %s",
+			http_auth.authtype, http_auth.password);
+		slot->headers = curl_slist_append(slot->headers, auth.buf);
+		strbuf_release(&auth);
+	}
 }
 
 /* *var must be free-able */
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 06/11] test-http-server: add stub HTTP server test helper
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (4 preceding siblings ...)
  2022-11-02 22:09     ` [PATCH v3 05/11] http: set specific auth scheme depending on credential Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-07 19:19       ` Derrick Stolee
  2022-11-02 22:09     ` [PATCH v3 07/11] test-http-server: add HTTP error response function Matthew John Cheetham via GitGitGadget
                       ` (9 subsequent siblings)
  15 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Introduce a mini HTTP server helper that in the future will be enhanced
to provide a frontend for the git-http-backend, with support for
arbitrary authentication schemes.

Right now, test-http-server is a pared-down copy of the git-daemon that
always returns a 501 Not Implemented response to all callers.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Makefile                            |   2 +
 contrib/buildsystems/CMakeLists.txt |  13 +
 t/helper/.gitignore                 |   1 +
 t/helper/test-http-server.c         | 685 ++++++++++++++++++++++++++++
 4 files changed, 701 insertions(+)
 create mode 100644 t/helper/test-http-server.c

diff --git a/Makefile b/Makefile
index d93ad956e58..39b130f711d 100644
--- a/Makefile
+++ b/Makefile
@@ -1500,6 +1500,8 @@ else
 	endif
 	BASIC_CFLAGS += $(CURL_CFLAGS)
 
+	TEST_PROGRAMS_NEED_X += test-http-server
+
 	REMOTE_CURL_PRIMARY = git-remote-http$X
 	REMOTE_CURL_ALIASES = git-remote-https$X git-remote-ftp$X git-remote-ftps$X
 	REMOTE_CURL_NAMES = $(REMOTE_CURL_PRIMARY) $(REMOTE_CURL_ALIASES)
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 787738e6fa3..45251695ce0 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -989,6 +989,19 @@ set(wrapper_scripts
 set(wrapper_test_scripts
 	test-fake-ssh test-tool)
 
+if(CURL_FOUND)
+       list(APPEND wrapper_test_scripts test-http-server)
+
+       add_executable(test-http-server ${CMAKE_SOURCE_DIR}/t/helper/test-http-server.c)
+       target_link_libraries(test-http-server common-main)
+
+       if(MSVC)
+               set_target_properties(test-http-server
+                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
+               set_target_properties(test-http-server
+                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
+       endif()
+endif()
 
 foreach(script ${wrapper_scripts})
 	file(STRINGS ${CMAKE_SOURCE_DIR}/wrap-for-bin.sh content NEWLINE_CONSUME)
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 8c2ddcce95f..9aa9c752997 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -1,2 +1,3 @@
 /test-tool
 /test-fake-ssh
+/test-http-server
diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
new file mode 100644
index 00000000000..18f1f741305
--- /dev/null
+++ b/t/helper/test-http-server.c
@@ -0,0 +1,685 @@
+#include "config.h"
+#include "run-command.h"
+#include "strbuf.h"
+#include "string-list.h"
+#include "trace2.h"
+#include "version.h"
+#include "dir.h"
+#include "date.h"
+
+#define TR2_CAT "test-http-server"
+
+static const char *pid_file;
+static int verbose;
+static int reuseaddr;
+
+static const char test_http_auth_usage[] =
+"http-server [--verbose]\n"
+"           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
+"           [--reuseaddr] [--pid-file=<file>]\n"
+"           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
+;
+
+/* Timeout, and initial timeout */
+static unsigned int timeout;
+static unsigned int init_timeout;
+
+static void logreport(const char *label, const char *err, va_list params)
+{
+	struct strbuf msg = STRBUF_INIT;
+
+	strbuf_addf(&msg, "[%"PRIuMAX"] %s: ", (uintmax_t)getpid(), label);
+	strbuf_vaddf(&msg, err, params);
+	strbuf_addch(&msg, '\n');
+
+	fwrite(msg.buf, sizeof(char), msg.len, stderr);
+	fflush(stderr);
+
+	strbuf_release(&msg);
+}
+
+__attribute__((format (printf, 1, 2)))
+static void logerror(const char *err, ...)
+{
+	va_list params;
+	va_start(params, err);
+	logreport("error", err, params);
+	va_end(params);
+}
+
+__attribute__((format (printf, 1, 2)))
+static void loginfo(const char *err, ...)
+{
+	va_list params;
+	if (!verbose)
+		return;
+	va_start(params, err);
+	logreport("info", err, params);
+	va_end(params);
+}
+
+static void set_keep_alive(int sockfd)
+{
+	int ka = 1;
+
+	if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &ka, sizeof(ka)) < 0) {
+		if (errno != ENOTSOCK)
+			logerror("unable to set SO_KEEPALIVE on socket: %s",
+				strerror(errno));
+	}
+}
+
+/*
+ * The code in this section is used by "worker" instances to service
+ * a single connection from a client.  The worker talks to the client
+ * on 0 and 1.
+ */
+
+enum worker_result {
+	/*
+	 * Operation successful.
+	 * Caller *might* keep the socket open and allow keep-alive.
+	 */
+	WR_OK       = 0,
+
+	/*
+	 * Various errors while processing the request and/or the response.
+	 * Close the socket and clean up.
+	 * Exit child-process with non-zero status.
+	 */
+	WR_IO_ERROR = 1<<0,
+
+	/*
+	 * Close the socket and clean up.  Does not imply an error.
+	 */
+	WR_HANGUP   = 1<<1,
+
+	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
+};
+
+static enum worker_result worker(void)
+{
+	const char *response = "HTTP/1.1 501 Not Implemented\r\n";
+	char *client_addr = getenv("REMOTE_ADDR");
+	char *client_port = getenv("REMOTE_PORT");
+	enum worker_result wr = WR_OK;
+
+	if (client_addr)
+		loginfo("Connection from %s:%s", client_addr, client_port);
+
+	set_keep_alive(0);
+
+	while (1) {
+		if (write_in_full(1, response, strlen(response)) < 0) {
+			logerror("unable to write response");
+			wr = WR_IO_ERROR;
+		}
+
+		if (wr & WR_STOP_THE_MUSIC)
+			break;
+	}
+
+	close(0);
+	close(1);
+
+	return !!(wr & WR_IO_ERROR);
+}
+
+/*
+ * This section contains the listener and child-process management
+ * code used by the primary instance to accept incoming connections
+ * and dispatch them to async child process "worker" instances.
+ */
+
+static int addrcmp(const struct sockaddr_storage *s1,
+		   const struct sockaddr_storage *s2)
+{
+	const struct sockaddr *sa1 = (const struct sockaddr*) s1;
+	const struct sockaddr *sa2 = (const struct sockaddr*) s2;
+
+	if (sa1->sa_family != sa2->sa_family)
+		return sa1->sa_family - sa2->sa_family;
+	if (sa1->sa_family == AF_INET)
+		return memcmp(&((struct sockaddr_in *)s1)->sin_addr,
+		    &((struct sockaddr_in *)s2)->sin_addr,
+		    sizeof(struct in_addr));
+#ifndef NO_IPV6
+	if (sa1->sa_family == AF_INET6)
+		return memcmp(&((struct sockaddr_in6 *)s1)->sin6_addr,
+		    &((struct sockaddr_in6 *)s2)->sin6_addr,
+		    sizeof(struct in6_addr));
+#endif
+	return 0;
+}
+
+static int max_connections = 32;
+
+static unsigned int live_children;
+
+static struct child {
+	struct child *next;
+	struct child_process cld;
+	struct sockaddr_storage address;
+} *firstborn;
+
+static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen)
+{
+	struct child *newborn, **cradle;
+
+	newborn = xcalloc(1, sizeof(*newborn));
+	live_children++;
+	memcpy(&newborn->cld, cld, sizeof(*cld));
+	memcpy(&newborn->address, addr, addrlen);
+	for (cradle = &firstborn; *cradle; cradle = &(*cradle)->next)
+		if (!addrcmp(&(*cradle)->address, &newborn->address))
+			break;
+	newborn->next = *cradle;
+	*cradle = newborn;
+}
+
+/*
+ * This gets called if the number of connections grows
+ * past "max_connections".
+ *
+ * We kill the newest connection from a duplicate IP.
+ */
+static void kill_some_child(void)
+{
+	const struct child *blanket, *next;
+
+	if (!(blanket = firstborn))
+		return;
+
+	for (; (next = blanket->next); blanket = next)
+		if (!addrcmp(&blanket->address, &next->address)) {
+			kill(blanket->cld.pid, SIGTERM);
+			break;
+		}
+}
+
+static void check_dead_children(void)
+{
+	int status;
+	pid_t pid;
+
+	struct child **cradle, *blanket;
+	for (cradle = &firstborn; (blanket = *cradle);)
+		if ((pid = waitpid(blanket->cld.pid, &status, WNOHANG)) > 1) {
+			const char *dead = "";
+			if (status)
+				dead = " (with error)";
+			loginfo("[%"PRIuMAX"] Disconnected%s", (uintmax_t)pid, dead);
+
+			/* remove the child */
+			*cradle = blanket->next;
+			live_children--;
+			child_process_clear(&blanket->cld);
+			free(blanket);
+		} else
+			cradle = &blanket->next;
+}
+
+static struct strvec cld_argv = STRVEC_INIT;
+static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
+{
+	struct child_process cld = CHILD_PROCESS_INIT;
+
+	if (max_connections && live_children >= max_connections) {
+		kill_some_child();
+		sleep(1);  /* give it some time to die */
+		check_dead_children();
+		if (live_children >= max_connections) {
+			close(incoming);
+			logerror("Too many children, dropping connection");
+			return;
+		}
+	}
+
+	if (addr->sa_family == AF_INET) {
+		char buf[128] = "";
+		struct sockaddr_in *sin_addr = (void *) addr;
+		inet_ntop(addr->sa_family, &sin_addr->sin_addr, buf, sizeof(buf));
+		strvec_pushf(&cld.env, "REMOTE_ADDR=%s", buf);
+		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
+				 ntohs(sin_addr->sin_port));
+#ifndef NO_IPV6
+	} else if (addr->sa_family == AF_INET6) {
+		char buf[128] = "";
+		struct sockaddr_in6 *sin6_addr = (void *) addr;
+		inet_ntop(AF_INET6, &sin6_addr->sin6_addr, buf, sizeof(buf));
+		strvec_pushf(&cld.env, "REMOTE_ADDR=[%s]", buf);
+		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
+				 ntohs(sin6_addr->sin6_port));
+#endif
+	}
+
+	strvec_pushv(&cld.args, cld_argv.v);
+	cld.in = incoming;
+	cld.out = dup(incoming);
+
+	if (cld.out < 0)
+		logerror("could not dup() `incoming`");
+	else if (start_command(&cld))
+		logerror("unable to fork");
+	else
+		add_child(&cld, addr, addrlen);
+}
+
+static void child_handler(int signo)
+{
+	/*
+	 * Otherwise empty handler because systemcalls will get interrupted
+	 * upon signal receipt
+	 * SysV needs the handler to be rearmed
+	 */
+	signal(SIGCHLD, child_handler);
+}
+
+static int set_reuse_addr(int sockfd)
+{
+	int on = 1;
+
+	if (!reuseaddr)
+		return 0;
+	return setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR,
+			  &on, sizeof(on));
+}
+
+struct socketlist {
+	int *list;
+	size_t nr;
+	size_t alloc;
+};
+
+static const char *ip2str(int family, struct sockaddr *sin, socklen_t len)
+{
+#ifdef NO_IPV6
+	static char ip[INET_ADDRSTRLEN];
+#else
+	static char ip[INET6_ADDRSTRLEN];
+#endif
+
+	switch (family) {
+#ifndef NO_IPV6
+	case AF_INET6:
+		inet_ntop(family, &((struct sockaddr_in6*)sin)->sin6_addr, ip, len);
+		break;
+#endif
+	case AF_INET:
+		inet_ntop(family, &((struct sockaddr_in*)sin)->sin_addr, ip, len);
+		break;
+	default:
+		xsnprintf(ip, sizeof(ip), "<unknown>");
+	}
+	return ip;
+}
+
+#ifndef NO_IPV6
+
+static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
+{
+	int socknum = 0;
+	char pbuf[NI_MAXSERV];
+	struct addrinfo hints, *ai0, *ai;
+	int gai;
+	long flags;
+
+	xsnprintf(pbuf, sizeof(pbuf), "%d", listen_port);
+	memset(&hints, 0, sizeof(hints));
+	hints.ai_family = AF_UNSPEC;
+	hints.ai_socktype = SOCK_STREAM;
+	hints.ai_protocol = IPPROTO_TCP;
+	hints.ai_flags = AI_PASSIVE;
+
+	gai = getaddrinfo(listen_addr, pbuf, &hints, &ai0);
+	if (gai) {
+		logerror("getaddrinfo() for %s failed: %s", listen_addr, gai_strerror(gai));
+		return 0;
+	}
+
+	for (ai = ai0; ai; ai = ai->ai_next) {
+		int sockfd;
+
+		sockfd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
+		if (sockfd < 0)
+			continue;
+		if (sockfd >= FD_SETSIZE) {
+			logerror("Socket descriptor too large");
+			close(sockfd);
+			continue;
+		}
+
+#ifdef IPV6_V6ONLY
+		if (ai->ai_family == AF_INET6) {
+			int on = 1;
+			setsockopt(sockfd, IPPROTO_IPV6, IPV6_V6ONLY,
+				   &on, sizeof(on));
+			/* Note: error is not fatal */
+		}
+#endif
+
+		if (set_reuse_addr(sockfd)) {
+			logerror("Could not set SO_REUSEADDR: %s", strerror(errno));
+			close(sockfd);
+			continue;
+		}
+
+		set_keep_alive(sockfd);
+
+		if (bind(sockfd, ai->ai_addr, ai->ai_addrlen) < 0) {
+			logerror("Could not bind to %s: %s",
+				 ip2str(ai->ai_family, ai->ai_addr, ai->ai_addrlen),
+				 strerror(errno));
+			close(sockfd);
+			continue;	/* not fatal */
+		}
+		if (listen(sockfd, 5) < 0) {
+			logerror("Could not listen to %s: %s",
+				 ip2str(ai->ai_family, ai->ai_addr, ai->ai_addrlen),
+				 strerror(errno));
+			close(sockfd);
+			continue;	/* not fatal */
+		}
+
+		flags = fcntl(sockfd, F_GETFD, 0);
+		if (flags >= 0)
+			fcntl(sockfd, F_SETFD, flags | FD_CLOEXEC);
+
+		ALLOC_GROW(socklist->list, socklist->nr + 1, socklist->alloc);
+		socklist->list[socklist->nr++] = sockfd;
+		socknum++;
+	}
+
+	freeaddrinfo(ai0);
+
+	return socknum;
+}
+
+#else /* NO_IPV6 */
+
+static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
+{
+	struct sockaddr_in sin;
+	int sockfd;
+	long flags;
+
+	memset(&sin, 0, sizeof sin);
+	sin.sin_family = AF_INET;
+	sin.sin_port = htons(listen_port);
+
+	if (listen_addr) {
+		/* Well, host better be an IP address here. */
+		if (inet_pton(AF_INET, listen_addr, &sin.sin_addr.s_addr) <= 0)
+			return 0;
+	} else {
+		sin.sin_addr.s_addr = htonl(INADDR_ANY);
+	}
+
+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
+	if (sockfd < 0)
+		return 0;
+
+	if (set_reuse_addr(sockfd)) {
+		logerror("Could not set SO_REUSEADDR: %s", strerror(errno));
+		close(sockfd);
+		return 0;
+	}
+
+	set_keep_alive(sockfd);
+
+	if (bind(sockfd, (struct sockaddr *)&sin, sizeof sin) < 0) {
+		logerror("Could not bind to %s: %s",
+			 ip2str(AF_INET, (struct sockaddr *)&sin, sizeof(sin)),
+			 strerror(errno));
+		close(sockfd);
+		return 0;
+	}
+
+	if (listen(sockfd, 5) < 0) {
+		logerror("Could not listen to %s: %s",
+			 ip2str(AF_INET, (struct sockaddr *)&sin, sizeof(sin)),
+			 strerror(errno));
+		close(sockfd);
+		return 0;
+	}
+
+	flags = fcntl(sockfd, F_GETFD, 0);
+	if (flags >= 0)
+		fcntl(sockfd, F_SETFD, flags | FD_CLOEXEC);
+
+	ALLOC_GROW(socklist->list, socklist->nr + 1, socklist->alloc);
+	socklist->list[socklist->nr++] = sockfd;
+	return 1;
+}
+
+#endif
+
+static void socksetup(struct string_list *listen_addr, int listen_port, struct socketlist *socklist)
+{
+	if (!listen_addr->nr)
+		setup_named_sock("127.0.0.1", listen_port, socklist);
+	else {
+		int i, socknum;
+		for (i = 0; i < listen_addr->nr; i++) {
+			socknum = setup_named_sock(listen_addr->items[i].string,
+						   listen_port, socklist);
+
+			if (socknum == 0)
+				logerror("unable to allocate any listen sockets for host %s on port %u",
+					 listen_addr->items[i].string, listen_port);
+		}
+	}
+}
+
+static int service_loop(struct socketlist *socklist)
+{
+	struct pollfd *pfd;
+	int i;
+
+	CALLOC_ARRAY(pfd, socklist->nr);
+
+	for (i = 0; i < socklist->nr; i++) {
+		pfd[i].fd = socklist->list[i];
+		pfd[i].events = POLLIN;
+	}
+
+	signal(SIGCHLD, child_handler);
+
+	for (;;) {
+		int i;
+		int nr_ready;
+		int timeout = (pid_file ? 100 : -1);
+
+		check_dead_children();
+
+		nr_ready = poll(pfd, socklist->nr, timeout);
+		if (nr_ready < 0) {
+			if (errno != EINTR) {
+				logerror("Poll failed, resuming: %s",
+				      strerror(errno));
+				sleep(1);
+			}
+			continue;
+		}
+		else if (nr_ready == 0) {
+			/*
+			 * If we have a pid_file, then we watch it.
+			 * If someone deletes it, we shutdown the service.
+			 * The shell scripts in the test suite will use this.
+			 */
+			if (!pid_file || file_exists(pid_file))
+				continue;
+			goto shutdown;
+		}
+
+		for (i = 0; i < socklist->nr; i++) {
+			if (pfd[i].revents & POLLIN) {
+				union {
+					struct sockaddr sa;
+					struct sockaddr_in sai;
+#ifndef NO_IPV6
+					struct sockaddr_in6 sai6;
+#endif
+				} ss;
+				socklen_t sslen = sizeof(ss);
+				int incoming = accept(pfd[i].fd, &ss.sa, &sslen);
+				if (incoming < 0) {
+					switch (errno) {
+					case EAGAIN:
+					case EINTR:
+					case ECONNABORTED:
+						continue;
+					default:
+						die_errno("accept returned");
+					}
+				}
+				handle(incoming, &ss.sa, sslen);
+			}
+		}
+	}
+
+shutdown:
+	loginfo("Starting graceful shutdown (pid-file gone)");
+	for (i = 0; i < socklist->nr; i++)
+		close(socklist->list[i]);
+
+	return 0;
+}
+
+static int serve(struct string_list *listen_addr, int listen_port)
+{
+	struct socketlist socklist = { NULL, 0, 0 };
+
+	socksetup(listen_addr, listen_port, &socklist);
+	if (socklist.nr == 0)
+		die("unable to allocate any listen sockets on port %u",
+		    listen_port);
+
+	loginfo("Ready to rumble");
+
+	/*
+	 * Wait to create the pid-file until we've setup the sockets
+	 * and are open for business.
+	 */
+	if (pid_file)
+		write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
+
+	return service_loop(&socklist);
+}
+
+/*
+ * This section is executed by both the primary instance and all
+ * worker instances.  So, yes, each child-process re-parses the
+ * command line argument and re-discovers how it should behave.
+ */
+
+int cmd_main(int argc, const char **argv)
+{
+	int listen_port = 0;
+	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
+	int worker_mode = 0;
+	int i;
+
+	trace2_cmd_name("test-http-server");
+	setup_git_directory_gently(NULL);
+
+	for (i = 1; i < argc; i++) {
+		const char *arg = argv[i];
+		const char *v;
+
+		if (skip_prefix(arg, "--listen=", &v)) {
+			string_list_append(&listen_addr, xstrdup_tolower(v));
+			continue;
+		}
+		if (skip_prefix(arg, "--port=", &v)) {
+			char *end;
+			unsigned long n;
+			n = strtoul(v, &end, 0);
+			if (*v && !*end) {
+				listen_port = n;
+				continue;
+			}
+		}
+		if (!strcmp(arg, "--worker")) {
+			worker_mode = 1;
+			trace2_cmd_mode("worker");
+			continue;
+		}
+		if (!strcmp(arg, "--verbose")) {
+			verbose = 1;
+			continue;
+		}
+		if (skip_prefix(arg, "--timeout=", &v)) {
+			timeout = atoi(v);
+			continue;
+		}
+		if (skip_prefix(arg, "--init-timeout=", &v)) {
+			init_timeout = atoi(v);
+			continue;
+		}
+		if (skip_prefix(arg, "--max-connections=", &v)) {
+			max_connections = atoi(v);
+			if (max_connections < 0)
+				max_connections = 0; /* unlimited */
+			continue;
+		}
+		if (!strcmp(arg, "--reuseaddr")) {
+			reuseaddr = 1;
+			continue;
+		}
+		if (skip_prefix(arg, "--pid-file=", &v)) {
+			pid_file = v;
+			continue;
+		}
+
+		fprintf(stderr, "error: unknown argument '%s'\n", arg);
+		usage(test_http_auth_usage);
+	}
+
+	/* avoid splitting a message in the middle */
+	setvbuf(stderr, NULL, _IOFBF, 4096);
+
+	if (listen_port == 0)
+		listen_port = DEFAULT_GIT_PORT;
+
+	/*
+	 * If no --listen=<addr> args are given, the setup_named_sock()
+	 * code will use receive a NULL address and set INADDR_ANY.
+	 * This exposes both internal and external interfaces on the
+	 * port.
+	 *
+	 * Disallow that and default to the internal-use-only loopback
+	 * address.
+	 */
+	if (!listen_addr.nr)
+		string_list_append(&listen_addr, "127.0.0.1");
+
+	/*
+	 * worker_mode is set in our own child process instances
+	 * (that are bound to a connected socket from a client).
+	 */
+	if (worker_mode)
+		return worker();
+
+	/*
+	 * `cld_argv` is a bit of a clever hack. The top-level instance
+	 * of test-http-server does the normal bind/listen/accept stuff.
+	 * For each incoming socket, the top-level process spawns
+	 * a child instance of test-http-server *WITH* the additional
+	 * `--worker` argument. This causes the child to set `worker_mode`
+	 * and immediately call `worker()` using the connected socket (and
+	 * without the usual need for fork() or threads).
+	 *
+	 * The magic here is made possible because `cld_argv` is static
+	 * and handle() (called by service_loop()) knows about it.
+	 */
+	strvec_push(&cld_argv, argv[0]);
+	strvec_push(&cld_argv, "--worker");
+	for (i = 1; i < argc; ++i)
+		strvec_push(&cld_argv, argv[i]);
+
+	/*
+	 * Setup primary instance to listen for connections.
+	 */
+	return serve(&listen_addr, listen_port);
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 07/11] test-http-server: add HTTP error response function
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (5 preceding siblings ...)
  2022-11-02 22:09     ` [PATCH v3 06/11] test-http-server: add stub HTTP server test helper Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 08/11] test-http-server: add HTTP request parsing Matthew John Cheetham via GitGitGadget
                       ` (8 subsequent siblings)
  15 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Introduce a function to the test-http-server test helper to write more
full and valid HTTP error responses, including all the standard response
headers like `Server` and `Date`.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-http-server.c | 59 +++++++++++++++++++++++++++++++++----
 1 file changed, 53 insertions(+), 6 deletions(-)

diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
index 18f1f741305..53508639714 100644
--- a/t/helper/test-http-server.c
+++ b/t/helper/test-http-server.c
@@ -97,9 +97,59 @@ enum worker_result {
 	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
 };
 
+static enum worker_result send_http_error(
+	int fd,
+	int http_code, const char *http_code_name,
+	int retry_after_seconds, struct string_list *response_headers,
+	enum worker_result wr_in)
+{
+	struct strbuf response_header = STRBUF_INIT;
+	struct strbuf response_content = STRBUF_INIT;
+	struct string_list_item *h;
+	enum worker_result wr;
+
+	strbuf_addf(&response_content, "Error: %d %s\r\n",
+		    http_code, http_code_name);
+	if (retry_after_seconds > 0)
+		strbuf_addf(&response_content, "Retry-After: %d\r\n",
+			    retry_after_seconds);
+
+	strbuf_addf  (&response_header, "HTTP/1.1 %d %s\r\n", http_code, http_code_name);
+	strbuf_addstr(&response_header, "Cache-Control: private\r\n");
+	strbuf_addstr(&response_header,	"Content-Type: text/plain\r\n");
+	strbuf_addf  (&response_header,	"Content-Length: %d\r\n", (int)response_content.len);
+	if (retry_after_seconds > 0)
+		strbuf_addf(&response_header, "Retry-After: %d\r\n", retry_after_seconds);
+	strbuf_addf(  &response_header,	"Server: test-http-server/%s\r\n", git_version_string);
+	strbuf_addf(  &response_header, "Date: %s\r\n", show_date(time(NULL), 0, DATE_MODE(RFC2822)));
+	if (response_headers)
+		for_each_string_list_item(h, response_headers)
+			strbuf_addf(&response_header, "%s\r\n", h->string);
+	strbuf_addstr(&response_header, "\r\n");
+
+	if (write_in_full(fd, response_header.buf, response_header.len) < 0) {
+		logerror("unable to write response header");
+		wr = WR_IO_ERROR;
+		goto done;
+	}
+
+	if (write_in_full(fd, response_content.buf, response_content.len) < 0) {
+		logerror("unable to write response content body");
+		wr = WR_IO_ERROR;
+		goto done;
+	}
+
+	wr = wr_in;
+
+done:
+	strbuf_release(&response_header);
+	strbuf_release(&response_content);
+
+	return wr;
+}
+
 static enum worker_result worker(void)
 {
-	const char *response = "HTTP/1.1 501 Not Implemented\r\n";
 	char *client_addr = getenv("REMOTE_ADDR");
 	char *client_port = getenv("REMOTE_PORT");
 	enum worker_result wr = WR_OK;
@@ -110,11 +160,8 @@ static enum worker_result worker(void)
 	set_keep_alive(0);
 
 	while (1) {
-		if (write_in_full(1, response, strlen(response)) < 0) {
-			logerror("unable to write response");
-			wr = WR_IO_ERROR;
-		}
-
+		wr = send_http_error(1, 501, "Not Implemented", -1, NULL,
+			WR_OK | WR_HANGUP);
 		if (wr & WR_STOP_THE_MUSIC)
 			break;
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 08/11] test-http-server: add HTTP request parsing
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (6 preceding siblings ...)
  2022-11-02 22:09     ` [PATCH v3 07/11] test-http-server: add HTTP error response function Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 09/11] test-http-server: pass Git requests to http-backend Matthew John Cheetham via GitGitGadget
                       ` (7 subsequent siblings)
  15 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add ability to parse HTTP requests to the test-http-server test helper.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-http-server.c | 176 +++++++++++++++++++++++++++++++++++-
 1 file changed, 174 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
index 53508639714..7bde678e264 100644
--- a/t/helper/test-http-server.c
+++ b/t/helper/test-http-server.c
@@ -97,6 +97,42 @@ enum worker_result {
 	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
 };
 
+/*
+ * Fields from a parsed HTTP request.
+ */
+struct req {
+	struct strbuf start_line;
+
+	const char *method;
+	const char *http_version;
+
+	struct strbuf uri_path;
+	struct strbuf query_args;
+
+	struct string_list header_list;
+	const char *content_type;
+	ssize_t content_length;
+};
+
+#define REQ__INIT { \
+	.start_line = STRBUF_INIT, \
+	.uri_path = STRBUF_INIT, \
+	.query_args = STRBUF_INIT, \
+	.header_list = STRING_LIST_INIT_NODUP, \
+	.content_type = NULL, \
+	.content_length = -1 \
+	}
+
+static void req__release(struct req *req)
+{
+	strbuf_release(&req->start_line);
+
+	strbuf_release(&req->uri_path);
+	strbuf_release(&req->query_args);
+
+	string_list_clear(&req->header_list, 0);
+}
+
 static enum worker_result send_http_error(
 	int fd,
 	int http_code, const char *http_code_name,
@@ -148,8 +184,136 @@ done:
 	return wr;
 }
 
+/*
+ * Read the HTTP request up to the start of the optional message-body.
+ * We do this byte-by-byte because we have keep-alive turned on and
+ * cannot rely on an EOF.
+ *
+ * https://tools.ietf.org/html/rfc7230
+ *
+ * We cannot call die() here because our caller needs to properly
+ * respond to the client and/or close the socket before this
+ * child exits so that the client doesn't get a connection reset
+ * by peer error.
+ */
+static enum worker_result req__read(struct req *req, int fd)
+{
+	struct strbuf h = STRBUF_INIT;
+	struct string_list start_line_fields = STRING_LIST_INIT_DUP;
+	int nr_start_line_fields;
+	const char *uri_target;
+	const char *query;
+	char *hp;
+	const char *hv;
+
+	enum worker_result result = WR_OK;
+
+	/*
+	 * Read line 0 of the request and split it into component parts:
+	 *
+	 *    <method> SP <uri-target> SP <HTTP-version> CRLF
+	 *
+	 */
+	if (strbuf_getwholeline_fd(&req->start_line, fd, '\n') == EOF) {
+		result = WR_OK | WR_HANGUP;
+		goto done;
+	}
+
+	strbuf_trim_trailing_newline(&req->start_line);
+
+	nr_start_line_fields = string_list_split(&start_line_fields,
+						 req->start_line.buf,
+						 ' ', -1);
+	if (nr_start_line_fields != 3) {
+		logerror("could not parse request start-line '%s'",
+			 req->start_line.buf);
+		result = WR_IO_ERROR;
+		goto done;
+	}
+
+	req->method = xstrdup(start_line_fields.items[0].string);
+	req->http_version = xstrdup(start_line_fields.items[2].string);
+
+	uri_target = start_line_fields.items[1].string;
+
+	if (strcmp(req->http_version, "HTTP/1.1")) {
+		logerror("unsupported version '%s' (expecting HTTP/1.1)",
+			 req->http_version);
+		result = WR_IO_ERROR;
+		goto done;
+	}
+
+	query = strchr(uri_target, '?');
+
+	if (query) {
+		strbuf_add(&req->uri_path, uri_target, (query - uri_target));
+		strbuf_trim_trailing_dir_sep(&req->uri_path);
+		strbuf_addstr(&req->query_args, query + 1);
+	} else {
+		strbuf_addstr(&req->uri_path, uri_target);
+		strbuf_trim_trailing_dir_sep(&req->uri_path);
+	}
+
+	/*
+	 * Read the set of HTTP headers into a string-list.
+	 */
+	while (1) {
+		if (strbuf_getwholeline_fd(&h, fd, '\n') == EOF)
+			goto done;
+		strbuf_trim_trailing_newline(&h);
+
+		if (!h.len)
+			goto done; /* a blank line ends the header */
+
+		hp = strbuf_detach(&h, NULL);
+		string_list_append(&req->header_list, hp);
+
+		/* store common request headers separately */
+		if (skip_prefix(hp, "Content-Type: ", &hv)) {
+			req->content_type = hv;
+		} else if (skip_prefix(hp, "Content-Length: ", &hv)) {
+			req->content_length = strtol(hv, &hp, 10);
+		}
+	}
+
+	/*
+	 * We do not attempt to read the <message-body>, if it exists.
+	 * We let our caller read/chunk it in as appropriate.
+	 */
+
+done:
+	string_list_clear(&start_line_fields, 0);
+
+	/*
+	 * This is useful for debugging the request, but very noisy.
+	 */
+	if (trace2_is_enabled()) {
+		struct string_list_item *item;
+		trace2_printf("%s: %s", TR2_CAT, req->start_line.buf);
+		trace2_printf("%s: hver: %s", TR2_CAT, req->http_version);
+		trace2_printf("%s: hmth: %s", TR2_CAT, req->method);
+		trace2_printf("%s: path: %s", TR2_CAT, req->uri_path.buf);
+		trace2_printf("%s: qury: %s", TR2_CAT, req->query_args.buf);
+		if (req->content_length >= 0)
+			trace2_printf("%s: clen: %d", TR2_CAT, req->content_length);
+		if (req->content_type)
+			trace2_printf("%s: ctyp: %s", TR2_CAT, req->content_type);
+		for_each_string_list_item(item, &req->header_list)
+			trace2_printf("%s: hdrs: %s", TR2_CAT, item->string);
+	}
+
+	return result;
+}
+
+static enum worker_result dispatch(struct req *req)
+{
+	return send_http_error(1, 501, "Not Implemented", -1, NULL,
+			       WR_OK | WR_HANGUP);
+}
+
 static enum worker_result worker(void)
 {
+	struct req req = REQ__INIT;
 	char *client_addr = getenv("REMOTE_ADDR");
 	char *client_port = getenv("REMOTE_PORT");
 	enum worker_result wr = WR_OK;
@@ -160,8 +324,16 @@ static enum worker_result worker(void)
 	set_keep_alive(0);
 
 	while (1) {
-		wr = send_http_error(1, 501, "Not Implemented", -1, NULL,
-			WR_OK | WR_HANGUP);
+		req__release(&req);
+
+		alarm(init_timeout ? init_timeout : timeout);
+		wr = req__read(&req, 0);
+		alarm(0);
+
+		if (wr & WR_STOP_THE_MUSIC)
+			break;
+
+		wr = dispatch(&req);
 		if (wr & WR_STOP_THE_MUSIC)
 			break;
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 09/11] test-http-server: pass Git requests to http-backend
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (7 preceding siblings ...)
  2022-11-02 22:09     ` [PATCH v3 08/11] test-http-server: add HTTP request parsing Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 10/11] test-http-server: add simple authentication Matthew John Cheetham via GitGitGadget
                       ` (6 subsequent siblings)
  15 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Teach the test-http-sever test helper to forward Git requests to the
`git-http-backend`.

Introduce a new test script t5556-http-auth.sh that spins up the test
HTTP server and attempts an `ls-remote` on the served repository,
without any authentication.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-http-server.c |  56 +++++++++++++++++++
 t/t5556-http-auth.sh        | 105 ++++++++++++++++++++++++++++++++++++
 2 files changed, 161 insertions(+)
 create mode 100755 t/t5556-http-auth.sh

diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
index 7bde678e264..9f1d6b58067 100644
--- a/t/helper/test-http-server.c
+++ b/t/helper/test-http-server.c
@@ -305,8 +305,64 @@ done:
 	return result;
 }
 
+static int is_git_request(struct req *req)
+{
+	static regex_t *smart_http_regex;
+	static int initialized;
+
+	if (!initialized) {
+		smart_http_regex = xmalloc(sizeof(*smart_http_regex));
+		if (regcomp(smart_http_regex, "^/(HEAD|info/refs|"
+			    "objects/info/[^/]+|git-(upload|receive)-pack)$",
+			    REG_EXTENDED)) {
+			warning("could not compile smart HTTP regex");
+			smart_http_regex = NULL;
+		}
+		initialized = 1;
+	}
+
+	return smart_http_regex &&
+		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
+}
+
+static enum worker_result do__git(struct req *req, const char *user)
+{
+	const char *ok = "HTTP/1.1 200 OK\r\n";
+	struct child_process cp = CHILD_PROCESS_INIT;
+	int res;
+
+	if (write(1, ok, strlen(ok)) < 0)
+		return error(_("could not send '%s'"), ok);
+
+	if (user)
+		strvec_pushf(&cp.env, "REMOTE_USER=%s", user);
+
+	strvec_pushf(&cp.env, "REQUEST_METHOD=%s", req->method);
+	strvec_pushf(&cp.env, "PATH_TRANSLATED=%s",
+			req->uri_path.buf);
+	strvec_push(&cp.env, "SERVER_PROTOCOL=HTTP/1.1");
+	if (req->query_args.len)
+		strvec_pushf(&cp.env, "QUERY_STRING=%s",
+				req->query_args.buf);
+	if (req->content_type)
+		strvec_pushf(&cp.env, "CONTENT_TYPE=%s",
+				req->content_type);
+	if (req->content_length >= 0)
+		strvec_pushf(&cp.env, "CONTENT_LENGTH=%" PRIdMAX,
+				(intmax_t)req->content_length);
+	cp.git_cmd = 1;
+	strvec_push(&cp.args, "http-backend");
+	res = run_command(&cp);
+	close(1);
+	close(0);
+	return !!res;
+}
+
 static enum worker_result dispatch(struct req *req)
 {
+	if (is_git_request(req))
+		return do__git(req, NULL);
+
 	return send_http_error(1, 501, "Not Implemented", -1, NULL,
 			       WR_OK | WR_HANGUP);
 }
diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
new file mode 100755
index 00000000000..78da151f122
--- /dev/null
+++ b/t/t5556-http-auth.sh
@@ -0,0 +1,105 @@
+#!/bin/sh
+
+test_description='test http auth header and credential helper interop'
+
+. ./test-lib.sh
+
+test_set_port GIT_TEST_HTTP_PROTOCOL_PORT
+
+# Setup a repository
+#
+REPO_DIR="$(pwd)"/repo
+
+# Setup some lookback URLs where test-http-server will be listening.
+# We will spawn it directly inside the repo directory, so we avoid
+# any need to configure directory mappings etc - we only serve this
+# repository from the root '/' of the server.
+#
+HOST_PORT=127.0.0.1:$GIT_TEST_HTTP_PROTOCOL_PORT
+ORIGIN_URL=http://$HOST_PORT/
+
+# The pid-file is created by test-http-server when it starts.
+# The server will shutdown if/when we delete it (this is easier than
+# killing it by PID).
+#
+PID_FILE="$(pwd)"/pid-file.pid
+SERVER_LOG="$(pwd)"/OUT.server.log
+
+PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
+
+test_expect_success 'setup repos' '
+	test_create_repo "$REPO_DIR" &&
+	git -C "$REPO_DIR" branch -M main
+'
+
+stop_http_server () {
+	if ! test -f "$PID_FILE"
+	then
+		return 0
+	fi
+	#
+	# The server will shutdown automatically when we delete the pid-file.
+	#
+	rm -f "$PID_FILE"
+	#
+	# Give it a few seconds to shutdown (mainly to completely release the
+	# port before the next test start another instance and it attempts to
+	# bind to it).
+	#
+	for k in 0 1 2 3 4
+	do
+		if grep -q "Starting graceful shutdown" "$SERVER_LOG"
+		then
+			return 0
+		fi
+		sleep 1
+	done
+
+	echo "stop_http_server: timeout waiting for server shutdown"
+	return 1
+}
+
+start_http_server () {
+	#
+	# Launch our server into the background in repo_dir.
+	#
+	(
+		cd "$REPO_DIR"
+		test-http-server --verbose \
+			--listen=127.0.0.1 \
+			--port=$GIT_TEST_HTTP_PROTOCOL_PORT \
+			--reuseaddr \
+			--pid-file="$PID_FILE" \
+			"$@" \
+			2>"$SERVER_LOG" &
+	)
+	#
+	# Give it a few seconds to get started.
+	#
+	for k in 0 1 2 3 4
+	do
+		if test -f "$PID_FILE"
+		then
+			return 0
+		fi
+		sleep 1
+	done
+
+	echo "start_http_server: timeout waiting for server startup"
+	return 1
+}
+
+per_test_cleanup () {
+	stop_http_server &&
+	rm -f OUT.*
+}
+
+test_expect_success 'http auth anonymous no challenge' '
+	test_when_finished "per_test_cleanup" &&
+	start_http_server --allow-anonymous &&
+
+	# Attempt to read from a protected repository
+	git ls-remote $ORIGIN_URL
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 10/11] test-http-server: add simple authentication
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (8 preceding siblings ...)
  2022-11-02 22:09     ` [PATCH v3 09/11] test-http-server: pass Git requests to http-backend Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-02 22:09     ` [PATCH v3 11/11] t5556: add HTTP authentication tests Matthew John Cheetham via GitGitGadget
                       ` (5 subsequent siblings)
  15 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add simple authentication to the test-http-server test helper.
Authentication schemes and sets of valid tokens can be specified via
command-line arguments. Incoming requests are compared against the set
of valid schemes and tokens and only approved if a matching token is
found, or if no auth was provided and anonymous auth is enabled.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-http-server.c | 188 +++++++++++++++++++++++++++++++++++-
 1 file changed, 187 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
index 9f1d6b58067..9a458743d13 100644
--- a/t/helper/test-http-server.c
+++ b/t/helper/test-http-server.c
@@ -18,6 +18,8 @@ static const char test_http_auth_usage[] =
 "           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
 "           [--reuseaddr] [--pid-file=<file>]\n"
 "           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
+"           [--anonymous-allowed]\n"
+"           [--auth=<scheme>[:<params>] [--auth-token=<scheme>:<token>]]*\n"
 ;
 
 /* Timeout, and initial timeout */
@@ -358,10 +360,136 @@ static enum worker_result do__git(struct req *req, const char *user)
 	return !!res;
 }
 
+enum auth_result {
+	/* No auth module matches the request. */
+	AUTH_UNKNOWN = 0,
+
+	/* Auth module denied the request. */
+	AUTH_DENY = 1,
+
+	/* Auth module successfully validated the request. */
+	AUTH_ALLOW = 2,
+};
+
+struct auth_module {
+	char *scheme;
+	char *challenge_params;
+	struct string_list *tokens;
+};
+
+static int allow_anonymous;
+static struct auth_module **auth_modules = NULL;
+static size_t auth_modules_nr = 0;
+static size_t auth_modules_alloc = 0;
+
+static struct auth_module *get_auth_module(const char *scheme)
+{
+	int i;
+	struct auth_module *mod;
+	for (i = 0; i < auth_modules_nr; i++) {
+		mod = auth_modules[i];
+		if (!strcasecmp(mod->scheme, scheme))
+			return mod;
+	}
+
+	return NULL;
+}
+
+static void add_auth_module(struct auth_module *mod)
+{
+	ALLOC_GROW(auth_modules, auth_modules_nr + 1, auth_modules_alloc);
+	auth_modules[auth_modules_nr++] = mod;
+}
+
+static int is_authed(struct req *req, const char **user, enum worker_result *wr)
+{
+	enum auth_result result = AUTH_UNKNOWN;
+	struct string_list hdrs = STRING_LIST_INIT_NODUP;
+	struct auth_module *mod;
+
+	struct string_list_item *hdr;
+	struct string_list_item *token;
+	const char *v;
+	struct strbuf **split = NULL;
+	int i;
+	char *challenge;
+
+	/*
+	 * Check all auth modules and try to validate the request.
+	 * The first module that matches a valid token approves the request.
+	 * If no module is found, or if there is no valid token, then 401 error.
+	 * Otherwise, only permit the request if anonymous auth is enabled.
+	 */
+	for_each_string_list_item(hdr, &req->header_list) {
+		if (skip_iprefix(hdr->string, "Authorization: ", &v)) {
+			split = strbuf_split_str(v, ' ', 2);
+			if (!split[0] || !split[1]) continue;
+
+			/* trim trailing space ' ' */
+			strbuf_setlen(split[0], split[0]->len - 1);
+
+			mod = get_auth_module(split[0]->buf);
+			if (mod) {
+				result = AUTH_DENY;
+
+				for_each_string_list_item(token, mod->tokens) {
+					if (!strcmp(split[1]->buf, token->string)) {
+						result = AUTH_ALLOW;
+						break;
+					}
+				}
+
+				goto done;
+			}
+		}
+	}
+
+done:
+	switch (result) {
+	case AUTH_ALLOW:
+		trace2_printf("%s: auth '%s' ALLOW", TR2_CAT, mod->scheme);
+		*user = "VALID_TEST_USER";
+		*wr = WR_OK;
+		break;
+
+	case AUTH_DENY:
+		trace2_printf("%s: auth '%s' DENY", TR2_CAT, mod->scheme);
+		/* fall-through */
+
+	case AUTH_UNKNOWN:
+		if (result != AUTH_DENY && allow_anonymous)
+			break;
+		for (i = 0; i < auth_modules_nr; i++) {
+			mod = auth_modules[i];
+			if (mod->challenge_params)
+				challenge = xstrfmt("WWW-Authenticate: %s %s",
+						    mod->scheme,
+						    mod->challenge_params);
+			else
+				challenge = xstrfmt("WWW-Authenticate: %s",
+						    mod->scheme);
+			string_list_append(&hdrs, challenge);
+		}
+		*wr = send_http_error(1, 401, "Unauthorized", -1, &hdrs, *wr);
+	}
+
+	strbuf_list_free(split);
+	string_list_clear(&hdrs, 0);
+
+	return result == AUTH_ALLOW ||
+	      (result == AUTH_UNKNOWN && allow_anonymous);
+}
+
 static enum worker_result dispatch(struct req *req)
 {
+	enum worker_result wr = WR_OK;
+	const char *user = NULL;
+
+	if (!is_authed(req, &user, &wr))
+		return wr;
+
 	if (is_git_request(req))
-		return do__git(req, NULL);
+		return do__git(req, user);
 
 	return send_http_error(1, 501, "Not Implemented", -1, NULL,
 			       WR_OK | WR_HANGUP);
@@ -854,6 +982,7 @@ int cmd_main(int argc, const char **argv)
 	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
 	int worker_mode = 0;
 	int i;
+	struct auth_module *mod = NULL;
 
 	trace2_cmd_name("test-http-server");
 	setup_git_directory_gently(NULL);
@@ -906,6 +1035,63 @@ int cmd_main(int argc, const char **argv)
 			pid_file = v;
 			continue;
 		}
+		if (skip_prefix(arg, "--allow-anonymous", &v)) {
+			allow_anonymous = 1;
+			continue;
+		}
+		if (skip_prefix(arg, "--auth=", &v)) {
+			struct strbuf **p = strbuf_split_str(v, ':', 2);
+
+			if (!p[0]) {
+				error("invalid argument '%s'", v);
+				usage(test_http_auth_usage);
+			}
+
+			/* trim trailing ':' */
+			if (p[1])
+				strbuf_setlen(p[0], p[0]->len - 1);
+
+			if (get_auth_module(p[0]->buf)) {
+				error("duplicate auth scheme '%s'\n", p[0]->buf);
+				usage(test_http_auth_usage);
+			}
+
+			mod = xmalloc(sizeof(struct auth_module));
+			mod->scheme = xstrdup(p[0]->buf);
+			mod->challenge_params = p[1] ? xstrdup(p[1]->buf) : NULL;
+			CALLOC_ARRAY(mod->tokens, 1);
+			string_list_init_dup(mod->tokens);
+
+			add_auth_module(mod);
+
+			strbuf_list_free(p);
+			continue;
+		}
+		if (skip_prefix(arg, "--auth-token=", &v)) {
+			struct strbuf **p = strbuf_split_str(v, ':', 2);
+			if (!p[0]) {
+				error("invalid argument '%s'", v);
+				usage(test_http_auth_usage);
+			}
+
+			if (!p[1]) {
+				error("missing token value '%s'\n", v);
+				usage(test_http_auth_usage);
+			}
+
+			/* trim trailing ':' */
+			strbuf_setlen(p[0], p[0]->len - 1);
+
+			mod = get_auth_module(p[0]->buf);
+			if (!mod) {
+				error("auth scheme not defined '%s'\n", p[0]->buf);
+				usage(test_http_auth_usage);
+			}
+
+			string_list_append(mod->tokens, p[1]->buf);
+			strbuf_list_free(p);
+			continue;
+		}
 
 		fprintf(stderr, "error: unknown argument '%s'\n", arg);
 		usage(test_http_auth_usage);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v3 11/11] t5556: add HTTP authentication tests
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (9 preceding siblings ...)
  2022-11-02 22:09     ` [PATCH v3 10/11] test-http-server: add simple authentication Matthew John Cheetham via GitGitGadget
@ 2022-11-02 22:09     ` Matthew John Cheetham via GitGitGadget
  2022-11-03 19:00     ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers M Hickford
                       ` (4 subsequent siblings)
  15 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-11-02 22:09 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add a series of tests to exercise the HTTP authentication header parsing
and the interop with credential helpers. Credential helpers can respond
to requests that contain WWW-Authenticate information with the ability
to select the response Authenticate header scheme.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-credential-helper-replay.sh |  14 ++
 t/t5556-http-auth.sh                      | 157 +++++++++++++++++++++-
 2 files changed, 170 insertions(+), 1 deletion(-)
 create mode 100755 t/helper/test-credential-helper-replay.sh

diff --git a/t/helper/test-credential-helper-replay.sh b/t/helper/test-credential-helper-replay.sh
new file mode 100755
index 00000000000..03e5e63dad6
--- /dev/null
+++ b/t/helper/test-credential-helper-replay.sh
@@ -0,0 +1,14 @@
+cmd=$1
+teefile=$cmd-actual.cred
+catfile=$cmd-response.cred
+rm -f $teefile
+while read line;
+do
+	if test -z "$line"; then
+		break;
+	fi
+	echo "$line" >> $teefile
+done
+if test "$cmd" = "get"; then
+	cat $catfile
+fi
diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
index 78da151f122..43f1791a0fe 100755
--- a/t/t5556-http-auth.sh
+++ b/t/t5556-http-auth.sh
@@ -26,6 +26,8 @@ PID_FILE="$(pwd)"/pid-file.pid
 SERVER_LOG="$(pwd)"/OUT.server.log
 
 PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
+CREDENTIAL_HELPER="$GIT_BUILD_DIR/t/helper/test-credential-helper-replay.sh" \
+	&& export CREDENTIAL_HELPER
 
 test_expect_success 'setup repos' '
 	test_create_repo "$REPO_DIR" &&
@@ -91,7 +93,8 @@ start_http_server () {
 
 per_test_cleanup () {
 	stop_http_server &&
-	rm -f OUT.*
+	rm -f OUT.* &&
+	rm -f *.cred
 }
 
 test_expect_success 'http auth anonymous no challenge' '
@@ -102,4 +105,156 @@ test_expect_success 'http auth anonymous no challenge' '
 	git ls-remote $ORIGIN_URL
 '
 
+test_expect_success 'http auth www-auth headers to credential helper bearer valid' '
+	test_when_finished "per_test_cleanup" &&
+	start_http_server \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=bearer:secret-token &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >store-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-token
+	authtype=bearer
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-token
+	authtype=bearer
+	EOF
+
+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp store-expected.cred store-actual.cred
+'
+
+test_expect_success 'http auth www-auth headers to credential helper basic valid' '
+	test_when_finished "per_test_cleanup" &&
+	# base64("alice:secret-passwd")
+	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
+	export USERPASS64 &&
+
+	start_http_server \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=basic:$USERPASS64 &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >store-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-passwd
+	authtype=basic
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-passwd
+	authtype=basic
+	EOF
+
+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp store-expected.cred store-actual.cred
+'
+
+test_expect_success 'http auth www-auth headers to credential helper custom scheme' '
+	test_when_finished "per_test_cleanup" &&
+	start_http_server \
+		--auth=foobar:alg=test\ widget=1 \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=foobar:SECRET-FOOBAR-VALUE &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=foobar alg=test widget=1
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >store-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=SECRET-FOOBAR-VALUE
+	authtype=foobar
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=SECRET-FOOBAR-VALUE
+	authtype=foobar
+	EOF
+
+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp store-expected.cred store-actual.cred
+'
+
+test_expect_success 'http auth www-auth headers to credential helper invalid' '
+	test_when_finished "per_test_cleanup" &&
+	start_http_server \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=bearer:secret-token &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >erase-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=invalid-token
+	authtype=bearer
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=invalid-token
+	authtype=bearer
+	EOF
+
+	test_must_fail git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp erase-expected.cred erase-actual.cred
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 00/11] Enhance credential helper protocol to include auth headers
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (10 preceding siblings ...)
  2022-11-02 22:09     ` [PATCH v3 11/11] t5556: add HTTP authentication tests Matthew John Cheetham via GitGitGadget
@ 2022-11-03 19:00     ` M Hickford
  2022-12-12 22:07       ` Matthew John Cheetham
  2022-11-07 19:23     ` Derrick Stolee
                       ` (3 subsequent siblings)
  15 siblings, 1 reply; 171+ messages in thread
From: M Hickford @ 2022-11-03 19:00 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget
  Cc: git, Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham

On Wed, 2 Nov 2022 at 22:09, Matthew John Cheetham via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> `authtype`::
>
> Indicates the type of authentication scheme that should be used by Git.
> Credential helpers may reply to a request from Git with this attribute,
> such that subsequent authenticated requests include the correct
> `Authorization` header.
> If this attribute is not present, the default value is "Basic".
> Known values include "Basic", "Digest", and "Bearer".
> If an unknown value is provided, this is taken as the authentication
> scheme for the `Authorization` header, and the `password` field is
> used as the raw unencoded authorization parameters of the same header.

Do you have an example using authtype=Digest? Would the helper
populate the password field with the user's verbatim password or the
Digest challenge response? Put another way, is the Digest
challenge-response logic in Git (libcurl) or the helper?

https://www.rfc-editor.org/rfc/rfc7616#section-3.4

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 06/11] test-http-server: add stub HTTP server test helper
  2022-11-02 22:09     ` [PATCH v3 06/11] test-http-server: add stub HTTP server test helper Matthew John Cheetham via GitGitGadget
@ 2022-11-07 19:19       ` Derrick Stolee
  0 siblings, 0 replies; 171+ messages in thread
From: Derrick Stolee @ 2022-11-07 19:19 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Lessley Dennington, Matthew John Cheetham, M Hickford,
	Jeff Hostetler, Matthew John Cheetham

On 11/2/22 6:09 PM, Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
> Introduce a mini HTTP server helper that in the future will be enhanced
> to provide a frontend for the git-http-backend, with support for
> arbitrary authentication schemes.
> 
> Right now, test-http-server is a pared-down copy of the git-daemon that
> always returns a 501 Not Implemented response to all callers.

Thanks for splitting this out. I ran a diff between daemon.c and
this version of t/helper/test-http-server.c. Most of the diff was
functionality removed from daemon.c, and the small bits that were
new to this file are either comments detailing how the helper
works or custom bits related to the test environment (like the
pid file). It was much easier to validate that these changes made
sense.

Looking good.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 00/11] Enhance credential helper protocol to include auth headers
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (11 preceding siblings ...)
  2022-11-03 19:00     ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers M Hickford
@ 2022-11-07 19:23     ` Derrick Stolee
  2022-11-09 23:06     ` Glen Choo
                       ` (2 subsequent siblings)
  15 siblings, 0 replies; 171+ messages in thread
From: Derrick Stolee @ 2022-11-07 19:23 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Lessley Dennington, Matthew John Cheetham, M Hickford,
	Jeff Hostetler, Matthew John Cheetham

On 11/2/22 6:09 PM, Matthew John Cheetham via GitGitGadget wrote:
> Following from my original RFC submission [0], this submission is considered
> ready for full review. This patch series is now based on top of current
> master (9c32cfb49c60fa8173b9666db02efe3b45a8522f) that includes my now
> separately submitted patches [1] to fix up the other credential helpers'
> behaviour.

> Updates in v3
> =============
> 
>  * Split final patch that added the test-http-server in to several, easier
>    to review patches.
> 
>  * Updated wording in git-credential.txt to clarify which side of the
>    credential helper protocol is sending/receiving the new wwwauth and
>    authtype attributes.

You also updated some commit messages based on v2 feedback. Thanks!

The commit splitting you did in this version is greatly appreciated.
I found this version to be in good shape. It's a solid foundation to
build upon (if any future work is necessary).

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 00/11] Enhance credential helper protocol to include auth headers
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (12 preceding siblings ...)
  2022-11-07 19:23     ` Derrick Stolee
@ 2022-11-09 23:06     ` Glen Choo
  2022-12-12 22:03       ` Matthew John Cheetham
  2022-11-28  9:40     ` Junio C Hamano
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
  15 siblings, 1 reply; 171+ messages in thread
From: Glen Choo @ 2022-11-09 23:06 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham

Hi Matthew!

We covered this series in Review Club. As usual, participants will send
their own feedback on this thread, but you may also find the meeting
notes handy:

  https://docs.google.com/document/d/14L8BAumGTpsXpjDY8VzZ4rRtpAjuGrFSRqn3stCuS_w/edit?pli=1#

"Matthew John Cheetham via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Background
> ==========
>
> [...]
>
> Limitations
> ===========
>
> [...]
>
> Goals
> =====
>
> [...]
>
> Design Principles
> =================
>
> [...]

Thanks for the well-written cover letter! I suspect that not many folks
are familiar with the history and workings of credential helpers, the
current state of auth and how credential helper limitations create
challenges for auth.

I've learned a lot reading this, and it makes the motivations of this
series clear :)

> Proposed Changes
> ================
>
>  1. Teach Git to read HTTP response headers, specifically the standard
>     WWW-Authenticate (RFC 7235 Section 4.1) headers.
>
>  2. Teach Git to include extra information about HTTP responses that require
>     authentication when calling credential helpers. Specifically the
>     WWW-Authenticate header information.
>     
>     Because the extra information forms an ordered list, and the existing
>     credential helper I/O format only provides for simple key=value pairs,
>     we introduce a new convention for transmitting an ordered list of
>     values. Key names that are suffixed with a C-style array syntax should
>     have values considered to form an order list, i.e. key[]=value, where
>     the order of the key=value pairs in the stream specifies the order.
>     
>     For the WWW-Authenticate header values we opt to use the key wwwauth[].
>
>  3. Teach Git to specify authentication schemes other than Basic in
>     subsequent HTTP requests based on credential helper responses.
>

From a reading of this section + the subject line, it's not immediately
obvious that 3. also requires extending the credential helper protocol
to include the "authtype" field. IMO it's significant enough to warrant
an explicit call-out.

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 03/11] http: store all request headers on active_request_slot
  2022-11-02 22:09     ` [PATCH v3 03/11] http: store all request headers on active_request_slot Matthew John Cheetham via GitGitGadget
@ 2022-11-09 23:18       ` Glen Choo
  0 siblings, 0 replies; 171+ messages in thread
From: Glen Choo @ 2022-11-09 23:18 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham

"Matthew John Cheetham via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Once a list of headers has been set on the curl handle, it is not
> possible to recover that `struct curl_slist` instance to add or modify
> headers.
>
> In future commits we will want to modify the set of request headers in
> response to an authentication challenge/401 response from the server,
> with information provided by a credential helper.
>
> There are a number of different places where curl is used for an HTTP
> request, and they do not have a common handling of request headers.
> However, given that they all do call the `start_active_slot()` function,
> either directly or indirectly via `run_slot()` or `run_one_slot()`, we
> use this as the point to set the `CURLOPT_HTTPHEADER` option just
> before the request is made.
>
> We collect all request headers in a `struct curl_slist` on the
> `struct active_request_slot` that is obtained from a call to
> `get_active_slot(int)`. This function now takes a single argument to
> define if the initial set of headers on the slot should include the
> "Pragma: no-cache" header, along with all extra headers specified via
> `http.extraHeader` config values.

I admit that I'm not that familiar with the http subsystem, so I'll
focus on the style.

If I'm reading this patch correctly, there are two related, but distinct
changes:

- store and modify the headers on the slot
- change how headers are initialized and remove now-unncessary libcurl
  calls that set headers

Both are simple, but given the number of LoCs changed, I found it quite
difficult to track which LoCs were part of which work. Could this be
broken up into two patches instead, i.e.:

- store headers on the slot without changing how they are initialized
- add extra header initialization logic to get_active_slot() and remove
  the unnecessary libcurl calls


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 05/11] http: set specific auth scheme depending on credential
  2022-11-02 22:09     ` [PATCH v3 05/11] http: set specific auth scheme depending on credential Matthew John Cheetham via GitGitGadget
@ 2022-11-09 23:40       ` Glen Choo
  2022-12-12 21:53         ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Glen Choo @ 2022-11-09 23:40 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham

"Matthew John Cheetham via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Matthew John Cheetham <mjcheetham@outlook.com>
>
> Introduce a new credential field `authtype` that can be used by
> credential helpers to indicate the type of the credential or
> authentication mechanism to use for a request.
>
> Modify http.c to now specify the correct authentication scheme or
> credential type when authenticating the curl handle. If the new
> `authtype` field in the credential structure is `NULL` or "Basic" then
> use the existing username/password options. If the field is "Bearer"
> then use the OAuth bearer token curl option. Otherwise, the `authtype`
> field is the authentication scheme and the `password` field is the
> raw, unencoded value.
>
> Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
> ---
>  Documentation/git-credential.txt | 12 ++++++++++++
>  credential.c                     |  5 +++++
>  credential.h                     |  1 +
>  git-curl-compat.h                | 10 ++++++++++
>  http.c                           | 24 +++++++++++++++++++++---
>  5 files changed, 49 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
> index 791a57dddfb..9069bfb2d50 100644
> --- a/Documentation/git-credential.txt
> +++ b/Documentation/git-credential.txt
> @@ -175,6 +175,18 @@ username in the example above) will be left unset.
>  	attribute 'wwwauth[]', where the order of the attributes is the same as
>  	they appear in the HTTP response.
>  
> +`authtype`::
> +
> +	Indicates the type of authentication scheme that should be used by Git.
> +	Credential helpers may reply to a request from Git with this attribute,
> +	such that subsequent authenticated requests include the correct
> +	`Authorization` header.
> +	If this attribute is not present, the default value is "Basic".
> +	Known values include "Basic", "Digest", and "Bearer".
> +	If an unknown value is provided, this is taken as the authentication
> +	scheme for the `Authorization` header, and the `password` field is
> +	used as the raw unencoded authorization parameters of the same header.
> +

[...]

> @@ -525,8 +526,25 @@ static void init_curl_http_auth(struct active_request_slot *slot)
>  
>  	credential_fill(&http_auth);
>  
> -	curl_easy_setopt(slot->curl, CURLOPT_USERNAME, http_auth.username);
> -	curl_easy_setopt(slot->curl, CURLOPT_PASSWORD, http_auth.password);
> +	if (!http_auth.authtype || !strcasecmp(http_auth.authtype, "basic")
> +				|| !strcasecmp(http_auth.authtype, "digest")) {
> +		curl_easy_setopt(slot->curl, CURLOPT_USERNAME,
> +			http_auth.username);
> +		curl_easy_setopt(slot->curl, CURLOPT_PASSWORD,
> +			http_auth.password);
> +#ifdef GIT_CURL_HAVE_CURLAUTH_BEARER
> +	} else if (!strcasecmp(http_auth.authtype, "bearer")) {
> +		curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, CURLAUTH_BEARER);
> +		curl_easy_setopt(slot->curl, CURLOPT_XOAUTH2_BEARER,
> +			http_auth.password);
> +#endif
> +	} else {
> +		struct strbuf auth = STRBUF_INIT;
> +		strbuf_addf(&auth, "Authorization: %s %s",
> +			http_auth.authtype, http_auth.password);
> +		slot->headers = curl_slist_append(slot->headers, auth.buf);
> +		strbuf_release(&auth);
> +	}

As expected, a "Bearer" authtype doesn't require passing a username to
curl, but as you noted in the cover letter, credential helpers were
designed with username-password authentication in mind, which raises the
question of what a credential helper should do with "Bearer"
credentials.

e.g. it is not clear to me where the "username" comes from in the tests, e.g.

  +test_expect_success 'http auth www-auth headers to credential helper basic valid' '
  +	test_when_finished "per_test_cleanup" &&
  +	# base64("alice:secret-passwd")
  +	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
  +	export USERPASS64 &&
  +
  +	start_http_server \
  +		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
  +		--auth=basic:realm=\"example.com\" \
  +		--auth-token=basic:$USERPASS64 &&
  +
  +	cat >get-expected.cred <<-EOF &&
  +	protocol=http
  +	host=$HOST_PORT
  +	wwwauth[]=bearer authority="id.example.com" q=1 p=0
  +	wwwauth[]=basic realm="example.com"
  +	EOF
  +
  +	cat >store-expected.cred <<-EOF &&
  +	protocol=http
  +	host=$HOST_PORT
  +	username=alice
  +	password=secret-passwd
  +	authtype=basic
  +	EOF
  +
  +	cat >get-response.cred <<-EOF &&
  +	protocol=http
  +	host=$HOST_PORT
  +	username=alice
  +	password=secret-passwd
  +	authtype=basic
  +	EOF
  +
  +	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
  +
  +	test_cmp get-expected.cred get-actual.cred &&
  +	test_cmp store-expected.cred store-actual.cred
  +'

I'm not sure how we plan to handle this. Some approaches I can see are:

- We require that credential helpers set a reasonable value for
  "username". Presumably most credential helpers generating bearer
  tokens have some idea of user identity, so this might be reasonable,
  though it is wasteful, since we never use it in a meaningul way, e.g.
  I don't think Git asks the credential helper for "username=alice" and
  the credential helper decides to return the 'alice' credential instead
  of the 'bob' credential (but I could be mistaken).

- We require that credential helpers set _some_ value for "username",
  even if it is bogus. If so, we should communicate this explicitly.

- It is okay for "username" to be missing. This seems like the most
  elegant approach for credential helpers. I'm not sure if we're there
  yet with this series, e.g. http.c::handle_curl_result() reads:

    else if (results->http_code == 401) {
      if (http_auth.username && http_auth.password) {
        credential_reject(&http_auth);
        return HTTP_NOAUTH;

  which seems to assume both a username _and_ password. If the username
  is missing, we presumably don't send "erase", which might be a problem
  for revoked access tokens (though presumably not an issue for OIDC id
  tokens).

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 00/11] Enhance credential helper protocol to include auth headers
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (13 preceding siblings ...)
  2022-11-09 23:06     ` Glen Choo
@ 2022-11-28  9:40     ` Junio C Hamano
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
  15 siblings, 0 replies; 171+ messages in thread
From: Junio C Hamano @ 2022-11-28  9:40 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget,
	Ævar Arnfjörð Bjarmason
  Cc: git, Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Matthew John Cheetham

"Matthew John Cheetham via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Testing these new additions, I introduce a new test helper test-http-server
> that acts as a frontend to git-http-backend; a mini HTTP server based
> heavily on git-daemon, with simple authentication configurable by command
> line args.

I did not try to figure out the reason but the topic with its tests
seem to break in 'seen' the linux-cmake-ctest CI job.

  https://github.com/git/git/actions/runs/3562942886/jobs/5985179202

but the same test does not break under usual "make test".

Can people who are interested in the cmake-ctest stuff take a look?

It is tempting to eject the ab/cmake-nix-and-ci topic that is
already in 'next', under the theory that what that topic does to the
tests "works" for some tests but not for others, and this topic is
an unfortunate collateral damage whose tests weren't something the
other topic did not support well.  If the cmake-ctest stuff is in
such a shape, then it may have been a bit premature to merge it
down.

Thanks.

^ permalink raw reply	[flat|nested] 171+ messages in thread

* [PATCH v4 0/8] Enhance credential helper protocol to include auth headers
  2022-11-02 22:09   ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
                       ` (14 preceding siblings ...)
  2022-11-28  9:40     ` Junio C Hamano
@ 2022-12-12 21:36     ` Matthew John Cheetham via GitGitGadget
  2022-12-12 21:36       ` [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
                         ` (8 more replies)
  15 siblings, 9 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-12-12 21:36 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham

Following from my original RFC submission [0], this submission is considered
ready for full review. This patch series is now based on top of current
master (9c32cfb49c60fa8173b9666db02efe3b45a8522f) that includes my now
separately submitted patches [1] to fix up the other credential helpers'
behaviour.

In this patch series I update the existing credential helper design in order
to allow for some new scenarios, and future evolution of auth methods that
Git hosts may wish to provide. I outline the background, summary of changes
and some challenges below.

Testing these new additions, I introduce a new test helper test-http-server
that acts as a frontend to git-http-backend; a mini HTTP server based
heavily on git-daemon, with simple authentication configurable by command
line args.


Background
==========

Git uses a variety of protocols [2]: local, Smart HTTP, Dumb HTTP, SSH, and
Git. Here I focus on the Smart HTTP protocol, and attempt to enhance the
authentication capabilities of this protocol to address limitations (see
below).

The Smart HTTP protocol in Git supports a few different types of HTTP
authentication - Basic and Digest (RFC 2617) [3], and Negotiate (RFC 2478)
[4]. Git uses a extensible model where credential helpers can provide
credentials for protocols [5]. Several helpers support alternatives such as
OAuth authentication (RFC 6749) [6], but this is typically done as an
extension. For example, a helper might use basic auth and set the password
to an OAuth Bearer access token. Git uses standard input and output to
communicate with credential helpers.

After a HTTP 401 response, Git would call a credential helper with the
following over standard input:

protocol=https
host=example.com


And then a credential helper would return over standard output:

protocol=https
host=example.com
username=bob@id.example.com
password=<BEARER-TOKEN>


Git then the following request to the remote, including the standard HTTP
Authorization header (RFC 7235 Section 4.2) [7]:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: Basic base64(bob@id.example.com:<BEARER-TOKEN>)


Credential helpers are encouraged (see gitcredentials.txt) to return the
minimum information necessary.


Limitations
===========

Because this credential model was built mostly for password based
authentication systems, it's somewhat limited. In particular:

 1. To generate valid credentials, additional information about the request
    (or indeed the requestee and their device) may be required. For example,
    OAuth is based around scopes. A scope, like "git.read", might be
    required to read data from the remote. However, the remote cannot tell
    the credential helper what scope is required for this request.

 2. This system is not fully extensible. Each time a new type of
    authentication (like OAuth Bearer) is invented, Git needs updates before
    credential helpers can take advantage of it (or leverage a new
    capability in libcurl).


Goals
=====

 * As a user with multiple federated cloud identities:
   
   * Reach out to a remote and have my credential helper automatically
     prompt me for the correct identity.
   * Allow credential helpers to differentiate between different authorities
     or authentication/authorization challenge types, even from the same DNS
     hostname (and without needing to use credential.useHttpPath).
   * Leverage existing authentication systems built-in to many operating
     systems and devices to boost security and reduce reliance on passwords.

 * As a Git host and/or cloud identity provider:
   
   * Enforce security policies (like requiring two-factor authentication)
     dynamically.
   * Allow integration with third party standard based identity providers in
     enterprises allowing customers to have a single plane of control for
     critical identities with access to source code.


Design Principles
=================

 * Use the existing infrastructure. Git credential helpers are an
   already-working model.
 * Follow widely-adopted time-proven open standards, avoid net new ideas in
   the authentication space.
 * Minimize knowledge of authentication in Git; maintain modularity and
   extensibility.


Proposed Changes
================

 1. Teach Git to read HTTP response headers, specifically the standard
    WWW-Authenticate (RFC 7235 Section 4.1) headers.

 2. Teach Git to include extra information about HTTP responses that require
    authentication when calling credential helpers. Specifically the
    WWW-Authenticate header information.
    
    Because the extra information forms an ordered list, and the existing
    credential helper I/O format only provides for simple key=value pairs,
    we introduce a new convention for transmitting an ordered list of
    values. Key names that are suffixed with a C-style array syntax should
    have values considered to form an order list, i.e. key[]=value, where
    the order of the key=value pairs in the stream specifies the order.
    
    For the WWW-Authenticate header values we opt to use the key wwwauth[].


Handling the WWW-Authenticate header in detail
==============================================

RFC 6750 [8] envisions that OAuth Bearer resource servers would give
responses that include WWW-Authenticate headers, for example:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


Specifically, a WWW-Authenticate header consists of a scheme and arbitrary
attributes, depending on the scheme. This pattern enables generic OAuth or
OpenID Connect [9] authorities. Note that it is possible to have several
WWW-Authenticate challenges in a response.

First Git attempts to make a request, unauthenticated, which fails with a
401 response and includes WWW-Authenticate header(s).

Next, Git invokes a credential helper which may prompt the user. If the user
approves, a credential helper can generate a token (or any auth challenge
response) to be used for that request.

For example: with a remote that supports bearer tokens from an OpenID
Connect [9] authority, a credential helper can use OpenID Connect's
Discovery [10] and Dynamic Client Registration [11] to register a client and
make a request with the correct permissions to access the remote. In this
manner, a user can be dynamically sent to the right federated identity
provider for a remote without any up-front configuration or manual
processes.

Following from the principle of keeping authentication knowledge in Git to a
minimum, we modify Git to add all WWW-Authenticate values to the credential
helper call.

Git sends over standard input:

protocol=https
host=example.com
wwwauth[]=Bearer realm="login.example", scope="git.readwrite"
wwwauth[]=Basic realm="login.example"


A credential helper that understands the extra wwwauth[n] property can
decide on the "best" or correct authentication scheme, generate credentials
for the request, and interact with the user.

The credential helper would then return over standard output:

protocol=https
host=example.com
path=foo.git
username=bob@identity.example
password=<BEARER-TOKEN>


Note that WWW-Authenticate supports multiple challenges, either in one
header:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"


or in multiple headers:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


These have equivalent meaning (RFC 2616 Section 4.2 [12]). To simplify the
implementation, Git will not merge or split up any of these WWW-Authenticate
headers, and instead pass each header line as one credential helper
property. The credential helper is responsible for splitting, merging, and
otherwise parsing these header values.

An alternative option to sending the header fields individually would be to
merge the header values in to one key=value property, for example:

...
wwwauth=Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"



Future work
===========

In the future we can further expand the protocol to allow credential helpers
decide the best authentication scheme. Today credential helpers are still
only expected to return a username/password pair to Git, meaning the other
authentication schemes that may be offered still need challenge responses
sent via a Basic Authorization header. The changes outlined above still
permit helpers to select and configure an available authentication mode, but
require the remote for example to unpack a bearer token from a basic
challenge.

More careful consideration is required in the handling of custom
authentication schemes which may not have a username, or may require
arbitrary additional request header values be set.

For example imagine a new "FooBar" authentication scheme that is surfaced in
the following response:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: FooBar realm="login.example", algs="ES256 PS256"


With support for arbitrary authentication schemes, Git would call credential
helpers with the following over standard input:

protocol=https
host=example.com
wwwauth[]=FooBar realm="login.example", algs="ES256 PS256", nonce="abc123"


And then an enlightened credential helper could return over standard output:

protocol=https
host=example.com
authtype=FooBar
username=bob@id.example.com
password=<FooBar credential>
header[]=X-FooBar: 12345
header[]=X-FooBar-Alt: ABCDEF


Git would be expected to attach this authorization header to the next
request:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: FooBar <FooBar credential>
X-FooBar: 12345
X-FooBar-Alt: ABCDEF



Why not SSH?
============

There's nothing wrong with SSH. However, Git's Smart HTTP transport is
widely used, often with OAuth Bearer tokens. Git's Smart HTTP transport
sometimes requires less client setup than SSH transport, and works in
environments when SSH ports may be blocked. As long as Git supports HTTP
transport, it should support common and popular HTTP authentication methods.


References
==========

 * [0] [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth
   headers
   https://lore.kernel.org/git/pull.1352.git.1663097156.gitgitgadget@gmail.com/

 * [1] [PATCH 0/3] Correct credential helper discrepancies handling input
   https://lore.kernel.org/git/pull.1363.git.1663865974.gitgitgadget@gmail.com/

 * [2] Git on the Server - The Protocols
   https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols

 * [3] HTTP Authentication: Basic and Digest Access Authentication
   https://datatracker.ietf.org/doc/html/rfc2617

 * [4] The Simple and Protected GSS-API Negotiation Mechanism
   https://datatracker.ietf.org/doc/html/rfc2478

 * [5] Git Credentials - Custom Helpers
   https://git-scm.com/docs/gitcredentials#_custom_helpers

 * [6] The OAuth 2.0 Authorization Framework
   https://datatracker.ietf.org/doc/html/rfc6749

 * [7] Hypertext Transfer Protocol (HTTP/1.1): Authentication
   https://datatracker.ietf.org/doc/html/rfc7235

 * [8] The OAuth 2.0 Authorization Framework: Bearer Token Usage
   https://datatracker.ietf.org/doc/html/rfc6750

 * [9] OpenID Connect Core 1.0
   https://openid.net/specs/openid-connect-core-1_0.html

 * [10] OpenID Connect Discovery 1.0
   https://openid.net/specs/openid-connect-discovery-1_0.html

 * [11] OpenID Connect Dynamic Client Registration 1.0
   https://openid.net/specs/openid-connect-registration-1_0.html

 * [12] Hypertext Transfer Protocol (HTTP/1.1)
   https://datatracker.ietf.org/doc/html/rfc2616


Updates from RFC
================

 * Submitted first three patches as separate submission:
   https://lore.kernel.org/git/pull.1363.git.1663865974.gitgitgadget@gmail.com/

 * Various style fixes and updates to- and addition of comments.

 * Drop the explicit integer index in new 'array' style credential helper
   attrbiutes ("key[n]=value" becomes just "key[]=value").

 * Added test helper; a mini HTTP server, and several tests.


Updates in v3
=============

 * Split final patch that added the test-http-server in to several, easier
   to review patches.

 * Updated wording in git-credential.txt to clarify which side of the
   credential helper protocol is sending/receiving the new wwwauth and
   authtype attributes.


Updates in v4
=============

 * Drop authentication scheme selection authtype attribute patches to
   greatly simplify the series; auth scheme selection is punted to a future
   series. This series still allows credential helpers to generate
   credentials and intelligently select correct identities for a given auth
   challenge.

Matthew John Cheetham (8):
  http: read HTTP WWW-Authenticate response headers
  credential: add WWW-Authenticate header to cred requests
  test-http-server: add stub HTTP server test helper
  test-http-server: add HTTP error response function
  test-http-server: add HTTP request parsing
  test-http-server: pass Git requests to http-backend
  test-http-server: add simple authentication
  t5556: add HTTP authentication tests

 Documentation/git-credential.txt          |   18 +-
 Makefile                                  |    2 +
 contrib/buildsystems/CMakeLists.txt       |   13 +
 credential.c                              |   13 +
 credential.h                              |   15 +
 http.c                                    |   78 ++
 t/helper/.gitignore                       |    1 +
 t/helper/test-credential-helper-replay.sh |   14 +
 t/helper/test-http-server.c               | 1146 +++++++++++++++++++++
 t/t5556-http-auth.sh                      |  223 ++++
 10 files changed, 1522 insertions(+), 1 deletion(-)
 create mode 100755 t/helper/test-credential-helper-replay.sh
 create mode 100644 t/helper/test-http-server.c
 create mode 100755 t/t5556-http-auth.sh


base-commit: c48035d29b4e524aed3a32f0403676f0d9128863
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1352%2Fmjcheetham%2Femu-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1352/mjcheetham/emu-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1352

Range-diff vs v3:

  1:  f297c78f60a =  1:  b5b56ccd941 http: read HTTP WWW-Authenticate response headers
  2:  e45e23406a5 !  2:  d02875dda7c credential: add WWW-Authenticate header to cred requests
     @@ Documentation/git-credential.txt: empty string.
      +	to credential helpers.
      +	Each 'WWW-Authenticate' header value is passed as a multi-valued
      +	attribute 'wwwauth[]', where the order of the attributes is the same as
     -+	they appear in the HTTP response.
     ++	they appear in the HTTP response. This attribute is 'one-way' from Git
     ++	to pass additional information to credential helpers.
      +
     + Unrecognised attributes are silently discarded.
     + 
       GIT
     - ---
     - Part of the linkgit:git[1] suite
      
       ## credential.c ##
      @@ credential.c: static void credential_write_item(FILE *fp, const char *key, const char *value,
  3:  65ac638b8a0 <  -:  ----------- http: store all request headers on active_request_slot
  4:  4d75ca29cc5 <  -:  ----------- http: move proactive auth to first slot creation
  5:  2f38427aa8d <  -:  ----------- http: set specific auth scheme depending on credential
  6:  4947e81546a =  3:  07a1845ea56 test-http-server: add stub HTTP server test helper
  7:  93bdf1d7060 =  4:  98dd286db7c test-http-server: add HTTP error response function
  8:  b3e9156755f =  5:  5c4e36e23ee test-http-server: add HTTP request parsing
  9:  5fb248c074a =  6:  0a0f4fd10c8 test-http-server: pass Git requests to http-backend
 10:  192f09b9de4 =  7:  794256754c1 test-http-server: add simple authentication
 11:  b64d2f2c473 !  8:  8ecf6383522 t5556: add HTTP authentication tests
     @@ Commit message
          t5556: add HTTP authentication tests
      
          Add a series of tests to exercise the HTTP authentication header parsing
     -    and the interop with credential helpers. Credential helpers can respond
     -    to requests that contain WWW-Authenticate information with the ability
     -    to select the response Authenticate header scheme.
     +    and the interop with credential helpers. Credential helpers will receive
     +    WWW-Authenticate information in credential requests.
      
          Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
      
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
       	git ls-remote $ORIGIN_URL
       '
       
     -+test_expect_success 'http auth www-auth headers to credential helper bearer valid' '
     -+	test_when_finished "per_test_cleanup" &&
     -+	start_http_server \
     -+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
     -+		--auth=basic:realm=\"example.com\" \
     -+		--auth-token=bearer:secret-token &&
     -+
     -+	cat >get-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
     -+	wwwauth[]=basic realm="example.com"
     -+	EOF
     -+
     -+	cat >store-expected.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=secret-token
     -+	authtype=bearer
     -+	EOF
     -+
     -+	cat >get-response.cred <<-EOF &&
     -+	protocol=http
     -+	host=$HOST_PORT
     -+	username=alice
     -+	password=secret-token
     -+	authtype=bearer
     -+	EOF
     -+
     -+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
     -+
     -+	test_cmp get-expected.cred get-actual.cred &&
     -+	test_cmp store-expected.cred store-actual.cred
     -+'
     -+
      +test_expect_success 'http auth www-auth headers to credential helper basic valid' '
      +	test_when_finished "per_test_cleanup" &&
      +	# base64("alice:secret-passwd")
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +	export USERPASS64 &&
      +
      +	start_http_server \
     -+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
      +		--auth=basic:realm=\"example.com\" \
      +		--auth-token=basic:$USERPASS64 &&
      +
      +	cat >get-expected.cred <<-EOF &&
      +	protocol=http
      +	host=$HOST_PORT
     -+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
      +	wwwauth[]=basic realm="example.com"
      +	EOF
      +
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +	host=$HOST_PORT
      +	username=alice
      +	password=secret-passwd
     -+	authtype=basic
      +	EOF
      +
      +	cat >get-response.cred <<-EOF &&
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +	host=$HOST_PORT
      +	username=alice
      +	password=secret-passwd
     -+	authtype=basic
      +	EOF
      +
      +	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +	test_cmp store-expected.cred store-actual.cred
      +'
      +
     -+test_expect_success 'http auth www-auth headers to credential helper custom scheme' '
     ++test_expect_success 'http auth www-auth headers to credential helper custom schemes' '
      +	test_when_finished "per_test_cleanup" &&
     ++	# base64("alice:secret-passwd")
     ++	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
     ++	export USERPASS64 &&
     ++
      +	start_http_server \
      +		--auth=foobar:alg=test\ widget=1 \
      +		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
      +		--auth=basic:realm=\"example.com\" \
     -+		--auth-token=foobar:SECRET-FOOBAR-VALUE &&
     ++		--auth-token=basic:$USERPASS64 &&
      +
      +	cat >get-expected.cred <<-EOF &&
      +	protocol=http
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +	protocol=http
      +	host=$HOST_PORT
      +	username=alice
     -+	password=SECRET-FOOBAR-VALUE
     -+	authtype=foobar
     ++	password=secret-passwd
      +	EOF
      +
      +	cat >get-response.cred <<-EOF &&
      +	protocol=http
      +	host=$HOST_PORT
      +	username=alice
     -+	password=SECRET-FOOBAR-VALUE
     -+	authtype=foobar
     ++	password=secret-passwd
      +	EOF
      +
      +	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +
      +test_expect_success 'http auth www-auth headers to credential helper invalid' '
      +	test_when_finished "per_test_cleanup" &&
     ++	# base64("alice:secret-passwd")
     ++	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
     ++	export USERPASS64 &&
      +	start_http_server \
      +		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
      +		--auth=basic:realm=\"example.com\" \
     -+		--auth-token=bearer:secret-token &&
     ++		--auth-token=basic:$USERPASS64 &&
      +
      +	cat >get-expected.cred <<-EOF &&
      +	protocol=http
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +	protocol=http
      +	host=$HOST_PORT
      +	username=alice
     -+	password=invalid-token
     -+	authtype=bearer
     ++	password=invalid-passwd
      +	wwwauth[]=bearer authority="id.example.com" q=1 p=0
      +	wwwauth[]=basic realm="example.com"
      +	EOF
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +	protocol=http
      +	host=$HOST_PORT
      +	username=alice
     -+	password=invalid-token
     -+	authtype=bearer
     ++	password=invalid-passwd
      +	EOF
      +
      +	test_must_fail git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 171+ messages in thread

* [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
@ 2022-12-12 21:36       ` Matthew John Cheetham via GitGitGadget
  2022-12-14 23:15         ` Victoria Dye
  2022-12-15  9:27         ` Ævar Arnfjörð Bjarmason
  2022-12-12 21:36       ` [PATCH v4 2/8] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
                         ` (7 subsequent siblings)
  8 siblings, 2 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-12-12 21:36 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Read and store the HTTP WWW-Authenticate response headers made for
a particular request.

This will allow us to pass important authentication challenge
information to credential helpers or others that would otherwise have
been lost.

According to RFC2616 Section 4.2 [1], header field names are not
case-sensitive meaning when collecting multiple values for the same
field name, we can just use the case of the first observed instance of
each field name and no normalisation is required.

libcurl only provides us with the ability to read all headers recieved
for a particular request, including any intermediate redirect requests
or proxies. The lines returned by libcurl include HTTP status lines
delinating any intermediate requests such as "HTTP/1.1 200". We use
these lines to reset the strvec of WWW-Authenticate header values as
we encounter them in order to only capture the final response headers.

The collection of all header values matching the WWW-Authenticate
header is complicated by the fact that it is legal for header fields to
be continued over multiple lines, but libcurl only gives us one line at
a time.

In the future [2] we may be able to leverage functions to read headers
from libcurl itself, but as of today we must do this ourselves.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-4.2
[2] https://daniel.haxx.se/blog/2022/03/22/a-headers-api-for-libcurl/

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 credential.c |  1 +
 credential.h | 15 ++++++++++
 http.c       | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+)

diff --git a/credential.c b/credential.c
index f6389a50684..897b4679333 100644
--- a/credential.c
+++ b/credential.c
@@ -22,6 +22,7 @@ void credential_clear(struct credential *c)
 	free(c->username);
 	free(c->password);
 	string_list_clear(&c->helpers, 0);
+	strvec_clear(&c->wwwauth_headers);
 
 	credential_init(c);
 }
diff --git a/credential.h b/credential.h
index f430e77fea4..6f2e5bc610b 100644
--- a/credential.h
+++ b/credential.h
@@ -2,6 +2,7 @@
 #define CREDENTIAL_H
 
 #include "string-list.h"
+#include "strvec.h"
 
 /**
  * The credentials API provides an abstracted way of gathering username and
@@ -115,6 +116,19 @@ struct credential {
 	 */
 	struct string_list helpers;
 
+	/**
+	 * A `strvec` of WWW-Authenticate header values. Each string
+	 * is the value of a WWW-Authenticate header in an HTTP response,
+	 * in the order they were received in the response.
+	 */
+	struct strvec wwwauth_headers;
+
+	/**
+	 * Internal use only. Used to keep track of split header fields
+	 * in order to fold multiple lines into one value.
+	 */
+	unsigned header_is_last_match:1;
+
 	unsigned approved:1,
 		 configured:1,
 		 quit:1,
@@ -130,6 +144,7 @@ struct credential {
 
 #define CREDENTIAL_INIT { \
 	.helpers = STRING_LIST_INIT_DUP, \
+	.wwwauth_headers = STRVEC_INIT, \
 }
 
 /* Initialize a credential structure, setting all fields to empty. */
diff --git a/http.c b/http.c
index 8a5ba3f4776..c4e9cd73e14 100644
--- a/http.c
+++ b/http.c
@@ -183,6 +183,82 @@ size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 	return nmemb;
 }
 
+static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
+{
+	size_t size = eltsize * nmemb;
+	struct strvec *values = &http_auth.wwwauth_headers;
+	struct strbuf buf = STRBUF_INIT;
+	const char *val;
+	const char *z = NULL;
+
+	/*
+	 * Header lines may not come NULL-terminated from libcurl so we must
+	 * limit all scans to the maximum length of the header line, or leverage
+	 * strbufs for all operations.
+	 *
+	 * In addition, it is possible that header values can be split over
+	 * multiple lines as per RFC 2616 (even though this has since been
+	 * deprecated in RFC 7230). A continuation header field value is
+	 * identified as starting with a space or horizontal tab.
+	 *
+	 * The formal definition of a header field as given in RFC 2616 is:
+	 *
+	 *   message-header = field-name ":" [ field-value ]
+	 *   field-name     = token
+	 *   field-value    = *( field-content | LWS )
+	 *   field-content  = <the OCTETs making up the field-value
+	 *                    and consisting of either *TEXT or combinations
+	 *                    of token, separators, and quoted-string>
+	 */
+
+	strbuf_add(&buf, ptr, size);
+
+	/* Strip the CRLF that should be present at the end of each field */
+	strbuf_trim_trailing_newline(&buf);
+
+	/* Start of a new WWW-Authenticate header */
+	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
+		while (isspace(*val))
+			val++;
+
+		strvec_push(values, val);
+		http_auth.header_is_last_match = 1;
+		goto exit;
+	}
+
+	/*
+	 * This line could be a continuation of the previously matched header
+	 * field. If this is the case then we should append this value to the
+	 * end of the previously consumed value.
+	 */
+	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
+		const char **v = values->v + values->nr - 1;
+		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
+
+		free((void*)*v);
+		*v = append;
+
+		goto exit;
+	}
+
+	/* This is the start of a new header we don't care about */
+	http_auth.header_is_last_match = 0;
+
+	/*
+	 * If this is a HTTP status line and not a header field, this signals
+	 * a different HTTP response. libcurl writes all the output of all
+	 * response headers of all responses, including redirects.
+	 * We only care about the last HTTP request response's headers so clear
+	 * the existing array.
+	 */
+	if (skip_iprefix(buf.buf, "http/", &z))
+		strvec_clear(values);
+
+exit:
+	strbuf_release(&buf);
+	return size;
+}
+
 size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
 {
 	return nmemb;
@@ -1864,6 +1940,8 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);
+
 	accept_language = http_get_accept_language_header();
 
 	if (accept_language)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v4 2/8] credential: add WWW-Authenticate header to cred requests
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
  2022-12-12 21:36       ` [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
@ 2022-12-12 21:36       ` Matthew John Cheetham via GitGitGadget
  2022-12-14 23:15         ` Victoria Dye
  2022-12-12 21:36       ` [PATCH v4 3/8] test-http-server: add stub HTTP server test helper Matthew John Cheetham via GitGitGadget
                         ` (6 subsequent siblings)
  8 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-12-12 21:36 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add the value of the WWW-Authenticate response header to credential
requests. Credential helpers that understand and support HTTP
authentication and authorization can use this standard header (RFC 2616
Section 14.47 [1]) to generate valid credentials.

WWW-Authenticate headers can contain information pertaining to the
authority, authentication mechanism, or extra parameters/scopes that are
required.

The current I/O format for credential helpers only allows for unique
names for properties/attributes, so in order to transmit multiple header
values (with a specific order) we introduce a new convention whereby a
C-style array syntax is used in the property name to denote multiple
ordered values for the same property.

In this case we send multiple `wwwauth[]` properties where the order
that the repeated attributes appear in the conversation reflects the
order that the WWW-Authenticate headers appeared in the HTTP response.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Documentation/git-credential.txt | 18 +++++++++++++++++-
 credential.c                     | 12 ++++++++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index ac2818b9f66..bf0de0e9408 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -113,7 +113,13 @@ separated by an `=` (equals) sign, followed by a newline.
 The key may contain any bytes except `=`, newline, or NUL. The value may
 contain any bytes except newline or NUL.
 
-In both cases, all bytes are treated as-is (i.e., there is no quoting,
+Attributes with keys that end with C-style array brackets `[]` can have
+multiple values. Each instance of a multi-valued attribute forms an
+ordered list of values - the order of the repeated attributes defines
+the order of the values. An empty multi-valued attribute (`key[]=\n`)
+acts to clear any previous entries and reset the list.
+
+In all cases, all bytes are treated as-is (i.e., there is no quoting,
 and one cannot transmit a value with newline or NUL in it). The list of
 attributes is terminated by a blank line or end-of-file.
 
@@ -160,6 +166,16 @@ empty string.
 Components which are missing from the URL (e.g., there is no
 username in the example above) will be left unset.
 
+`wwwauth[]`::
+
+	When an HTTP response is received by Git that includes one or more
+	'WWW-Authenticate' authentication headers, these will be passed by Git
+	to credential helpers.
+	Each 'WWW-Authenticate' header value is passed as a multi-valued
+	attribute 'wwwauth[]', where the order of the attributes is the same as
+	they appear in the HTTP response. This attribute is 'one-way' from Git
+	to pass additional information to credential helpers.
+
 Unrecognised attributes are silently discarded.
 
 GIT
diff --git a/credential.c b/credential.c
index 897b4679333..8a3ad6c0ae2 100644
--- a/credential.c
+++ b/credential.c
@@ -263,6 +263,17 @@ static void credential_write_item(FILE *fp, const char *key, const char *value,
 	fprintf(fp, "%s=%s\n", key, value);
 }
 
+static void credential_write_strvec(FILE *fp, const char *key,
+				    const struct strvec *vec)
+{
+	int i = 0;
+	const char *full_key = xstrfmt("%s[]", key);
+	for (; i < vec->nr; i++) {
+		credential_write_item(fp, full_key, vec->v[i], 0);
+	}
+	free((void*)full_key);
+}
+
 void credential_write(const struct credential *c, FILE *fp)
 {
 	credential_write_item(fp, "protocol", c->protocol, 1);
@@ -270,6 +281,7 @@ void credential_write(const struct credential *c, FILE *fp)
 	credential_write_item(fp, "path", c->path, 0);
 	credential_write_item(fp, "username", c->username, 0);
 	credential_write_item(fp, "password", c->password, 0);
+	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);
 }
 
 static int run_credential_helper(struct credential *c,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v4 3/8] test-http-server: add stub HTTP server test helper
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
  2022-12-12 21:36       ` [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
  2022-12-12 21:36       ` [PATCH v4 2/8] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
@ 2022-12-12 21:36       ` Matthew John Cheetham via GitGitGadget
  2022-12-14 23:16         ` Victoria Dye
  2022-12-12 21:36       ` [PATCH v4 4/8] test-http-server: add HTTP error response function Matthew John Cheetham via GitGitGadget
                         ` (5 subsequent siblings)
  8 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-12-12 21:36 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Introduce a mini HTTP server helper that in the future will be enhanced
to provide a frontend for the git-http-backend, with support for
arbitrary authentication schemes.

Right now, test-http-server is a pared-down copy of the git-daemon that
always returns a 501 Not Implemented response to all callers.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 Makefile                            |   2 +
 contrib/buildsystems/CMakeLists.txt |  13 +
 t/helper/.gitignore                 |   1 +
 t/helper/test-http-server.c         | 685 ++++++++++++++++++++++++++++
 4 files changed, 701 insertions(+)
 create mode 100644 t/helper/test-http-server.c

diff --git a/Makefile b/Makefile
index b258fdbed86..1eb795bbfd4 100644
--- a/Makefile
+++ b/Makefile
@@ -1611,6 +1611,8 @@ else
 	endif
 	BASIC_CFLAGS += $(CURL_CFLAGS)
 
+	TEST_PROGRAMS_NEED_X += test-http-server
+
 	REMOTE_CURL_PRIMARY = git-remote-http$X
 	REMOTE_CURL_ALIASES = git-remote-https$X git-remote-ftp$X git-remote-ftps$X
 	REMOTE_CURL_NAMES = $(REMOTE_CURL_PRIMARY) $(REMOTE_CURL_ALIASES)
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 2f6e0197ffa..e9b9bfbb437 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -989,6 +989,19 @@ set(wrapper_scripts
 set(wrapper_test_scripts
 	test-fake-ssh test-tool)
 
+if(CURL_FOUND)
+       list(APPEND wrapper_test_scripts test-http-server)
+
+       add_executable(test-http-server ${CMAKE_SOURCE_DIR}/t/helper/test-http-server.c)
+       target_link_libraries(test-http-server common-main)
+
+       if(MSVC)
+               set_target_properties(test-http-server
+                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
+               set_target_properties(test-http-server
+                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
+       endif()
+endif()
 
 foreach(script ${wrapper_scripts})
 	file(STRINGS ${CMAKE_SOURCE_DIR}/wrap-for-bin.sh content NEWLINE_CONSUME)
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 8c2ddcce95f..9aa9c752997 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -1,2 +1,3 @@
 /test-tool
 /test-fake-ssh
+/test-http-server
diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
new file mode 100644
index 00000000000..18f1f741305
--- /dev/null
+++ b/t/helper/test-http-server.c
@@ -0,0 +1,685 @@
+#include "config.h"
+#include "run-command.h"
+#include "strbuf.h"
+#include "string-list.h"
+#include "trace2.h"
+#include "version.h"
+#include "dir.h"
+#include "date.h"
+
+#define TR2_CAT "test-http-server"
+
+static const char *pid_file;
+static int verbose;
+static int reuseaddr;
+
+static const char test_http_auth_usage[] =
+"http-server [--verbose]\n"
+"           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
+"           [--reuseaddr] [--pid-file=<file>]\n"
+"           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
+;
+
+/* Timeout, and initial timeout */
+static unsigned int timeout;
+static unsigned int init_timeout;
+
+static void logreport(const char *label, const char *err, va_list params)
+{
+	struct strbuf msg = STRBUF_INIT;
+
+	strbuf_addf(&msg, "[%"PRIuMAX"] %s: ", (uintmax_t)getpid(), label);
+	strbuf_vaddf(&msg, err, params);
+	strbuf_addch(&msg, '\n');
+
+	fwrite(msg.buf, sizeof(char), msg.len, stderr);
+	fflush(stderr);
+
+	strbuf_release(&msg);
+}
+
+__attribute__((format (printf, 1, 2)))
+static void logerror(const char *err, ...)
+{
+	va_list params;
+	va_start(params, err);
+	logreport("error", err, params);
+	va_end(params);
+}
+
+__attribute__((format (printf, 1, 2)))
+static void loginfo(const char *err, ...)
+{
+	va_list params;
+	if (!verbose)
+		return;
+	va_start(params, err);
+	logreport("info", err, params);
+	va_end(params);
+}
+
+static void set_keep_alive(int sockfd)
+{
+	int ka = 1;
+
+	if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &ka, sizeof(ka)) < 0) {
+		if (errno != ENOTSOCK)
+			logerror("unable to set SO_KEEPALIVE on socket: %s",
+				strerror(errno));
+	}
+}
+
+/*
+ * The code in this section is used by "worker" instances to service
+ * a single connection from a client.  The worker talks to the client
+ * on 0 and 1.
+ */
+
+enum worker_result {
+	/*
+	 * Operation successful.
+	 * Caller *might* keep the socket open and allow keep-alive.
+	 */
+	WR_OK       = 0,
+
+	/*
+	 * Various errors while processing the request and/or the response.
+	 * Close the socket and clean up.
+	 * Exit child-process with non-zero status.
+	 */
+	WR_IO_ERROR = 1<<0,
+
+	/*
+	 * Close the socket and clean up.  Does not imply an error.
+	 */
+	WR_HANGUP   = 1<<1,
+
+	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
+};
+
+static enum worker_result worker(void)
+{
+	const char *response = "HTTP/1.1 501 Not Implemented\r\n";
+	char *client_addr = getenv("REMOTE_ADDR");
+	char *client_port = getenv("REMOTE_PORT");
+	enum worker_result wr = WR_OK;
+
+	if (client_addr)
+		loginfo("Connection from %s:%s", client_addr, client_port);
+
+	set_keep_alive(0);
+
+	while (1) {
+		if (write_in_full(1, response, strlen(response)) < 0) {
+			logerror("unable to write response");
+			wr = WR_IO_ERROR;
+		}
+
+		if (wr & WR_STOP_THE_MUSIC)
+			break;
+	}
+
+	close(0);
+	close(1);
+
+	return !!(wr & WR_IO_ERROR);
+}
+
+/*
+ * This section contains the listener and child-process management
+ * code used by the primary instance to accept incoming connections
+ * and dispatch them to async child process "worker" instances.
+ */
+
+static int addrcmp(const struct sockaddr_storage *s1,
+		   const struct sockaddr_storage *s2)
+{
+	const struct sockaddr *sa1 = (const struct sockaddr*) s1;
+	const struct sockaddr *sa2 = (const struct sockaddr*) s2;
+
+	if (sa1->sa_family != sa2->sa_family)
+		return sa1->sa_family - sa2->sa_family;
+	if (sa1->sa_family == AF_INET)
+		return memcmp(&((struct sockaddr_in *)s1)->sin_addr,
+		    &((struct sockaddr_in *)s2)->sin_addr,
+		    sizeof(struct in_addr));
+#ifndef NO_IPV6
+	if (sa1->sa_family == AF_INET6)
+		return memcmp(&((struct sockaddr_in6 *)s1)->sin6_addr,
+		    &((struct sockaddr_in6 *)s2)->sin6_addr,
+		    sizeof(struct in6_addr));
+#endif
+	return 0;
+}
+
+static int max_connections = 32;
+
+static unsigned int live_children;
+
+static struct child {
+	struct child *next;
+	struct child_process cld;
+	struct sockaddr_storage address;
+} *firstborn;
+
+static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen)
+{
+	struct child *newborn, **cradle;
+
+	newborn = xcalloc(1, sizeof(*newborn));
+	live_children++;
+	memcpy(&newborn->cld, cld, sizeof(*cld));
+	memcpy(&newborn->address, addr, addrlen);
+	for (cradle = &firstborn; *cradle; cradle = &(*cradle)->next)
+		if (!addrcmp(&(*cradle)->address, &newborn->address))
+			break;
+	newborn->next = *cradle;
+	*cradle = newborn;
+}
+
+/*
+ * This gets called if the number of connections grows
+ * past "max_connections".
+ *
+ * We kill the newest connection from a duplicate IP.
+ */
+static void kill_some_child(void)
+{
+	const struct child *blanket, *next;
+
+	if (!(blanket = firstborn))
+		return;
+
+	for (; (next = blanket->next); blanket = next)
+		if (!addrcmp(&blanket->address, &next->address)) {
+			kill(blanket->cld.pid, SIGTERM);
+			break;
+		}
+}
+
+static void check_dead_children(void)
+{
+	int status;
+	pid_t pid;
+
+	struct child **cradle, *blanket;
+	for (cradle = &firstborn; (blanket = *cradle);)
+		if ((pid = waitpid(blanket->cld.pid, &status, WNOHANG)) > 1) {
+			const char *dead = "";
+			if (status)
+				dead = " (with error)";
+			loginfo("[%"PRIuMAX"] Disconnected%s", (uintmax_t)pid, dead);
+
+			/* remove the child */
+			*cradle = blanket->next;
+			live_children--;
+			child_process_clear(&blanket->cld);
+			free(blanket);
+		} else
+			cradle = &blanket->next;
+}
+
+static struct strvec cld_argv = STRVEC_INIT;
+static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
+{
+	struct child_process cld = CHILD_PROCESS_INIT;
+
+	if (max_connections && live_children >= max_connections) {
+		kill_some_child();
+		sleep(1);  /* give it some time to die */
+		check_dead_children();
+		if (live_children >= max_connections) {
+			close(incoming);
+			logerror("Too many children, dropping connection");
+			return;
+		}
+	}
+
+	if (addr->sa_family == AF_INET) {
+		char buf[128] = "";
+		struct sockaddr_in *sin_addr = (void *) addr;
+		inet_ntop(addr->sa_family, &sin_addr->sin_addr, buf, sizeof(buf));
+		strvec_pushf(&cld.env, "REMOTE_ADDR=%s", buf);
+		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
+				 ntohs(sin_addr->sin_port));
+#ifndef NO_IPV6
+	} else if (addr->sa_family == AF_INET6) {
+		char buf[128] = "";
+		struct sockaddr_in6 *sin6_addr = (void *) addr;
+		inet_ntop(AF_INET6, &sin6_addr->sin6_addr, buf, sizeof(buf));
+		strvec_pushf(&cld.env, "REMOTE_ADDR=[%s]", buf);
+		strvec_pushf(&cld.env, "REMOTE_PORT=%d",
+				 ntohs(sin6_addr->sin6_port));
+#endif
+	}
+
+	strvec_pushv(&cld.args, cld_argv.v);
+	cld.in = incoming;
+	cld.out = dup(incoming);
+
+	if (cld.out < 0)
+		logerror("could not dup() `incoming`");
+	else if (start_command(&cld))
+		logerror("unable to fork");
+	else
+		add_child(&cld, addr, addrlen);
+}
+
+static void child_handler(int signo)
+{
+	/*
+	 * Otherwise empty handler because systemcalls will get interrupted
+	 * upon signal receipt
+	 * SysV needs the handler to be rearmed
+	 */
+	signal(SIGCHLD, child_handler);
+}
+
+static int set_reuse_addr(int sockfd)
+{
+	int on = 1;
+
+	if (!reuseaddr)
+		return 0;
+	return setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR,
+			  &on, sizeof(on));
+}
+
+struct socketlist {
+	int *list;
+	size_t nr;
+	size_t alloc;
+};
+
+static const char *ip2str(int family, struct sockaddr *sin, socklen_t len)
+{
+#ifdef NO_IPV6
+	static char ip[INET_ADDRSTRLEN];
+#else
+	static char ip[INET6_ADDRSTRLEN];
+#endif
+
+	switch (family) {
+#ifndef NO_IPV6
+	case AF_INET6:
+		inet_ntop(family, &((struct sockaddr_in6*)sin)->sin6_addr, ip, len);
+		break;
+#endif
+	case AF_INET:
+		inet_ntop(family, &((struct sockaddr_in*)sin)->sin_addr, ip, len);
+		break;
+	default:
+		xsnprintf(ip, sizeof(ip), "<unknown>");
+	}
+	return ip;
+}
+
+#ifndef NO_IPV6
+
+static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
+{
+	int socknum = 0;
+	char pbuf[NI_MAXSERV];
+	struct addrinfo hints, *ai0, *ai;
+	int gai;
+	long flags;
+
+	xsnprintf(pbuf, sizeof(pbuf), "%d", listen_port);
+	memset(&hints, 0, sizeof(hints));
+	hints.ai_family = AF_UNSPEC;
+	hints.ai_socktype = SOCK_STREAM;
+	hints.ai_protocol = IPPROTO_TCP;
+	hints.ai_flags = AI_PASSIVE;
+
+	gai = getaddrinfo(listen_addr, pbuf, &hints, &ai0);
+	if (gai) {
+		logerror("getaddrinfo() for %s failed: %s", listen_addr, gai_strerror(gai));
+		return 0;
+	}
+
+	for (ai = ai0; ai; ai = ai->ai_next) {
+		int sockfd;
+
+		sockfd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
+		if (sockfd < 0)
+			continue;
+		if (sockfd >= FD_SETSIZE) {
+			logerror("Socket descriptor too large");
+			close(sockfd);
+			continue;
+		}
+
+#ifdef IPV6_V6ONLY
+		if (ai->ai_family == AF_INET6) {
+			int on = 1;
+			setsockopt(sockfd, IPPROTO_IPV6, IPV6_V6ONLY,
+				   &on, sizeof(on));
+			/* Note: error is not fatal */
+		}
+#endif
+
+		if (set_reuse_addr(sockfd)) {
+			logerror("Could not set SO_REUSEADDR: %s", strerror(errno));
+			close(sockfd);
+			continue;
+		}
+
+		set_keep_alive(sockfd);
+
+		if (bind(sockfd, ai->ai_addr, ai->ai_addrlen) < 0) {
+			logerror("Could not bind to %s: %s",
+				 ip2str(ai->ai_family, ai->ai_addr, ai->ai_addrlen),
+				 strerror(errno));
+			close(sockfd);
+			continue;	/* not fatal */
+		}
+		if (listen(sockfd, 5) < 0) {
+			logerror("Could not listen to %s: %s",
+				 ip2str(ai->ai_family, ai->ai_addr, ai->ai_addrlen),
+				 strerror(errno));
+			close(sockfd);
+			continue;	/* not fatal */
+		}
+
+		flags = fcntl(sockfd, F_GETFD, 0);
+		if (flags >= 0)
+			fcntl(sockfd, F_SETFD, flags | FD_CLOEXEC);
+
+		ALLOC_GROW(socklist->list, socklist->nr + 1, socklist->alloc);
+		socklist->list[socklist->nr++] = sockfd;
+		socknum++;
+	}
+
+	freeaddrinfo(ai0);
+
+	return socknum;
+}
+
+#else /* NO_IPV6 */
+
+static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
+{
+	struct sockaddr_in sin;
+	int sockfd;
+	long flags;
+
+	memset(&sin, 0, sizeof sin);
+	sin.sin_family = AF_INET;
+	sin.sin_port = htons(listen_port);
+
+	if (listen_addr) {
+		/* Well, host better be an IP address here. */
+		if (inet_pton(AF_INET, listen_addr, &sin.sin_addr.s_addr) <= 0)
+			return 0;
+	} else {
+		sin.sin_addr.s_addr = htonl(INADDR_ANY);
+	}
+
+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
+	if (sockfd < 0)
+		return 0;
+
+	if (set_reuse_addr(sockfd)) {
+		logerror("Could not set SO_REUSEADDR: %s", strerror(errno));
+		close(sockfd);
+		return 0;
+	}
+
+	set_keep_alive(sockfd);
+
+	if (bind(sockfd, (struct sockaddr *)&sin, sizeof sin) < 0) {
+		logerror("Could not bind to %s: %s",
+			 ip2str(AF_INET, (struct sockaddr *)&sin, sizeof(sin)),
+			 strerror(errno));
+		close(sockfd);
+		return 0;
+	}
+
+	if (listen(sockfd, 5) < 0) {
+		logerror("Could not listen to %s: %s",
+			 ip2str(AF_INET, (struct sockaddr *)&sin, sizeof(sin)),
+			 strerror(errno));
+		close(sockfd);
+		return 0;
+	}
+
+	flags = fcntl(sockfd, F_GETFD, 0);
+	if (flags >= 0)
+		fcntl(sockfd, F_SETFD, flags | FD_CLOEXEC);
+
+	ALLOC_GROW(socklist->list, socklist->nr + 1, socklist->alloc);
+	socklist->list[socklist->nr++] = sockfd;
+	return 1;
+}
+
+#endif
+
+static void socksetup(struct string_list *listen_addr, int listen_port, struct socketlist *socklist)
+{
+	if (!listen_addr->nr)
+		setup_named_sock("127.0.0.1", listen_port, socklist);
+	else {
+		int i, socknum;
+		for (i = 0; i < listen_addr->nr; i++) {
+			socknum = setup_named_sock(listen_addr->items[i].string,
+						   listen_port, socklist);
+
+			if (socknum == 0)
+				logerror("unable to allocate any listen sockets for host %s on port %u",
+					 listen_addr->items[i].string, listen_port);
+		}
+	}
+}
+
+static int service_loop(struct socketlist *socklist)
+{
+	struct pollfd *pfd;
+	int i;
+
+	CALLOC_ARRAY(pfd, socklist->nr);
+
+	for (i = 0; i < socklist->nr; i++) {
+		pfd[i].fd = socklist->list[i];
+		pfd[i].events = POLLIN;
+	}
+
+	signal(SIGCHLD, child_handler);
+
+	for (;;) {
+		int i;
+		int nr_ready;
+		int timeout = (pid_file ? 100 : -1);
+
+		check_dead_children();
+
+		nr_ready = poll(pfd, socklist->nr, timeout);
+		if (nr_ready < 0) {
+			if (errno != EINTR) {
+				logerror("Poll failed, resuming: %s",
+				      strerror(errno));
+				sleep(1);
+			}
+			continue;
+		}
+		else if (nr_ready == 0) {
+			/*
+			 * If we have a pid_file, then we watch it.
+			 * If someone deletes it, we shutdown the service.
+			 * The shell scripts in the test suite will use this.
+			 */
+			if (!pid_file || file_exists(pid_file))
+				continue;
+			goto shutdown;
+		}
+
+		for (i = 0; i < socklist->nr; i++) {
+			if (pfd[i].revents & POLLIN) {
+				union {
+					struct sockaddr sa;
+					struct sockaddr_in sai;
+#ifndef NO_IPV6
+					struct sockaddr_in6 sai6;
+#endif
+				} ss;
+				socklen_t sslen = sizeof(ss);
+				int incoming = accept(pfd[i].fd, &ss.sa, &sslen);
+				if (incoming < 0) {
+					switch (errno) {
+					case EAGAIN:
+					case EINTR:
+					case ECONNABORTED:
+						continue;
+					default:
+						die_errno("accept returned");
+					}
+				}
+				handle(incoming, &ss.sa, sslen);
+			}
+		}
+	}
+
+shutdown:
+	loginfo("Starting graceful shutdown (pid-file gone)");
+	for (i = 0; i < socklist->nr; i++)
+		close(socklist->list[i]);
+
+	return 0;
+}
+
+static int serve(struct string_list *listen_addr, int listen_port)
+{
+	struct socketlist socklist = { NULL, 0, 0 };
+
+	socksetup(listen_addr, listen_port, &socklist);
+	if (socklist.nr == 0)
+		die("unable to allocate any listen sockets on port %u",
+		    listen_port);
+
+	loginfo("Ready to rumble");
+
+	/*
+	 * Wait to create the pid-file until we've setup the sockets
+	 * and are open for business.
+	 */
+	if (pid_file)
+		write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
+
+	return service_loop(&socklist);
+}
+
+/*
+ * This section is executed by both the primary instance and all
+ * worker instances.  So, yes, each child-process re-parses the
+ * command line argument and re-discovers how it should behave.
+ */
+
+int cmd_main(int argc, const char **argv)
+{
+	int listen_port = 0;
+	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
+	int worker_mode = 0;
+	int i;
+
+	trace2_cmd_name("test-http-server");
+	setup_git_directory_gently(NULL);
+
+	for (i = 1; i < argc; i++) {
+		const char *arg = argv[i];
+		const char *v;
+
+		if (skip_prefix(arg, "--listen=", &v)) {
+			string_list_append(&listen_addr, xstrdup_tolower(v));
+			continue;
+		}
+		if (skip_prefix(arg, "--port=", &v)) {
+			char *end;
+			unsigned long n;
+			n = strtoul(v, &end, 0);
+			if (*v && !*end) {
+				listen_port = n;
+				continue;
+			}
+		}
+		if (!strcmp(arg, "--worker")) {
+			worker_mode = 1;
+			trace2_cmd_mode("worker");
+			continue;
+		}
+		if (!strcmp(arg, "--verbose")) {
+			verbose = 1;
+			continue;
+		}
+		if (skip_prefix(arg, "--timeout=", &v)) {
+			timeout = atoi(v);
+			continue;
+		}
+		if (skip_prefix(arg, "--init-timeout=", &v)) {
+			init_timeout = atoi(v);
+			continue;
+		}
+		if (skip_prefix(arg, "--max-connections=", &v)) {
+			max_connections = atoi(v);
+			if (max_connections < 0)
+				max_connections = 0; /* unlimited */
+			continue;
+		}
+		if (!strcmp(arg, "--reuseaddr")) {
+			reuseaddr = 1;
+			continue;
+		}
+		if (skip_prefix(arg, "--pid-file=", &v)) {
+			pid_file = v;
+			continue;
+		}
+
+		fprintf(stderr, "error: unknown argument '%s'\n", arg);
+		usage(test_http_auth_usage);
+	}
+
+	/* avoid splitting a message in the middle */
+	setvbuf(stderr, NULL, _IOFBF, 4096);
+
+	if (listen_port == 0)
+		listen_port = DEFAULT_GIT_PORT;
+
+	/*
+	 * If no --listen=<addr> args are given, the setup_named_sock()
+	 * code will use receive a NULL address and set INADDR_ANY.
+	 * This exposes both internal and external interfaces on the
+	 * port.
+	 *
+	 * Disallow that and default to the internal-use-only loopback
+	 * address.
+	 */
+	if (!listen_addr.nr)
+		string_list_append(&listen_addr, "127.0.0.1");
+
+	/*
+	 * worker_mode is set in our own child process instances
+	 * (that are bound to a connected socket from a client).
+	 */
+	if (worker_mode)
+		return worker();
+
+	/*
+	 * `cld_argv` is a bit of a clever hack. The top-level instance
+	 * of test-http-server does the normal bind/listen/accept stuff.
+	 * For each incoming socket, the top-level process spawns
+	 * a child instance of test-http-server *WITH* the additional
+	 * `--worker` argument. This causes the child to set `worker_mode`
+	 * and immediately call `worker()` using the connected socket (and
+	 * without the usual need for fork() or threads).
+	 *
+	 * The magic here is made possible because `cld_argv` is static
+	 * and handle() (called by service_loop()) knows about it.
+	 */
+	strvec_push(&cld_argv, argv[0]);
+	strvec_push(&cld_argv, "--worker");
+	for (i = 1; i < argc; ++i)
+		strvec_push(&cld_argv, argv[i]);
+
+	/*
+	 * Setup primary instance to listen for connections.
+	 */
+	return serve(&listen_addr, listen_port);
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v4 4/8] test-http-server: add HTTP error response function
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-12-12 21:36       ` [PATCH v4 3/8] test-http-server: add stub HTTP server test helper Matthew John Cheetham via GitGitGadget
@ 2022-12-12 21:36       ` Matthew John Cheetham via GitGitGadget
  2022-12-14 23:17         ` Victoria Dye
  2022-12-12 21:36       ` [PATCH v4 5/8] test-http-server: add HTTP request parsing Matthew John Cheetham via GitGitGadget
                         ` (4 subsequent siblings)
  8 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-12-12 21:36 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Introduce a function to the test-http-server test helper to write more
full and valid HTTP error responses, including all the standard response
headers like `Server` and `Date`.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-http-server.c | 59 +++++++++++++++++++++++++++++++++----
 1 file changed, 53 insertions(+), 6 deletions(-)

diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
index 18f1f741305..53508639714 100644
--- a/t/helper/test-http-server.c
+++ b/t/helper/test-http-server.c
@@ -97,9 +97,59 @@ enum worker_result {
 	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
 };
 
+static enum worker_result send_http_error(
+	int fd,
+	int http_code, const char *http_code_name,
+	int retry_after_seconds, struct string_list *response_headers,
+	enum worker_result wr_in)
+{
+	struct strbuf response_header = STRBUF_INIT;
+	struct strbuf response_content = STRBUF_INIT;
+	struct string_list_item *h;
+	enum worker_result wr;
+
+	strbuf_addf(&response_content, "Error: %d %s\r\n",
+		    http_code, http_code_name);
+	if (retry_after_seconds > 0)
+		strbuf_addf(&response_content, "Retry-After: %d\r\n",
+			    retry_after_seconds);
+
+	strbuf_addf  (&response_header, "HTTP/1.1 %d %s\r\n", http_code, http_code_name);
+	strbuf_addstr(&response_header, "Cache-Control: private\r\n");
+	strbuf_addstr(&response_header,	"Content-Type: text/plain\r\n");
+	strbuf_addf  (&response_header,	"Content-Length: %d\r\n", (int)response_content.len);
+	if (retry_after_seconds > 0)
+		strbuf_addf(&response_header, "Retry-After: %d\r\n", retry_after_seconds);
+	strbuf_addf(  &response_header,	"Server: test-http-server/%s\r\n", git_version_string);
+	strbuf_addf(  &response_header, "Date: %s\r\n", show_date(time(NULL), 0, DATE_MODE(RFC2822)));
+	if (response_headers)
+		for_each_string_list_item(h, response_headers)
+			strbuf_addf(&response_header, "%s\r\n", h->string);
+	strbuf_addstr(&response_header, "\r\n");
+
+	if (write_in_full(fd, response_header.buf, response_header.len) < 0) {
+		logerror("unable to write response header");
+		wr = WR_IO_ERROR;
+		goto done;
+	}
+
+	if (write_in_full(fd, response_content.buf, response_content.len) < 0) {
+		logerror("unable to write response content body");
+		wr = WR_IO_ERROR;
+		goto done;
+	}
+
+	wr = wr_in;
+
+done:
+	strbuf_release(&response_header);
+	strbuf_release(&response_content);
+
+	return wr;
+}
+
 static enum worker_result worker(void)
 {
-	const char *response = "HTTP/1.1 501 Not Implemented\r\n";
 	char *client_addr = getenv("REMOTE_ADDR");
 	char *client_port = getenv("REMOTE_PORT");
 	enum worker_result wr = WR_OK;
@@ -110,11 +160,8 @@ static enum worker_result worker(void)
 	set_keep_alive(0);
 
 	while (1) {
-		if (write_in_full(1, response, strlen(response)) < 0) {
-			logerror("unable to write response");
-			wr = WR_IO_ERROR;
-		}
-
+		wr = send_http_error(1, 501, "Not Implemented", -1, NULL,
+			WR_OK | WR_HANGUP);
 		if (wr & WR_STOP_THE_MUSIC)
 			break;
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v4 5/8] test-http-server: add HTTP request parsing
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
                         ` (3 preceding siblings ...)
  2022-12-12 21:36       ` [PATCH v4 4/8] test-http-server: add HTTP error response function Matthew John Cheetham via GitGitGadget
@ 2022-12-12 21:36       ` Matthew John Cheetham via GitGitGadget
  2022-12-14 23:18         ` Victoria Dye
  2022-12-12 21:36       ` [PATCH v4 6/8] test-http-server: pass Git requests to http-backend Matthew John Cheetham via GitGitGadget
                         ` (3 subsequent siblings)
  8 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-12-12 21:36 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add ability to parse HTTP requests to the test-http-server test helper.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-http-server.c | 176 +++++++++++++++++++++++++++++++++++-
 1 file changed, 174 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
index 53508639714..7bde678e264 100644
--- a/t/helper/test-http-server.c
+++ b/t/helper/test-http-server.c
@@ -97,6 +97,42 @@ enum worker_result {
 	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
 };
 
+/*
+ * Fields from a parsed HTTP request.
+ */
+struct req {
+	struct strbuf start_line;
+
+	const char *method;
+	const char *http_version;
+
+	struct strbuf uri_path;
+	struct strbuf query_args;
+
+	struct string_list header_list;
+	const char *content_type;
+	ssize_t content_length;
+};
+
+#define REQ__INIT { \
+	.start_line = STRBUF_INIT, \
+	.uri_path = STRBUF_INIT, \
+	.query_args = STRBUF_INIT, \
+	.header_list = STRING_LIST_INIT_NODUP, \
+	.content_type = NULL, \
+	.content_length = -1 \
+	}
+
+static void req__release(struct req *req)
+{
+	strbuf_release(&req->start_line);
+
+	strbuf_release(&req->uri_path);
+	strbuf_release(&req->query_args);
+
+	string_list_clear(&req->header_list, 0);
+}
+
 static enum worker_result send_http_error(
 	int fd,
 	int http_code, const char *http_code_name,
@@ -148,8 +184,136 @@ done:
 	return wr;
 }
 
+/*
+ * Read the HTTP request up to the start of the optional message-body.
+ * We do this byte-by-byte because we have keep-alive turned on and
+ * cannot rely on an EOF.
+ *
+ * https://tools.ietf.org/html/rfc7230
+ *
+ * We cannot call die() here because our caller needs to properly
+ * respond to the client and/or close the socket before this
+ * child exits so that the client doesn't get a connection reset
+ * by peer error.
+ */
+static enum worker_result req__read(struct req *req, int fd)
+{
+	struct strbuf h = STRBUF_INIT;
+	struct string_list start_line_fields = STRING_LIST_INIT_DUP;
+	int nr_start_line_fields;
+	const char *uri_target;
+	const char *query;
+	char *hp;
+	const char *hv;
+
+	enum worker_result result = WR_OK;
+
+	/*
+	 * Read line 0 of the request and split it into component parts:
+	 *
+	 *    <method> SP <uri-target> SP <HTTP-version> CRLF
+	 *
+	 */
+	if (strbuf_getwholeline_fd(&req->start_line, fd, '\n') == EOF) {
+		result = WR_OK | WR_HANGUP;
+		goto done;
+	}
+
+	strbuf_trim_trailing_newline(&req->start_line);
+
+	nr_start_line_fields = string_list_split(&start_line_fields,
+						 req->start_line.buf,
+						 ' ', -1);
+	if (nr_start_line_fields != 3) {
+		logerror("could not parse request start-line '%s'",
+			 req->start_line.buf);
+		result = WR_IO_ERROR;
+		goto done;
+	}
+
+	req->method = xstrdup(start_line_fields.items[0].string);
+	req->http_version = xstrdup(start_line_fields.items[2].string);
+
+	uri_target = start_line_fields.items[1].string;
+
+	if (strcmp(req->http_version, "HTTP/1.1")) {
+		logerror("unsupported version '%s' (expecting HTTP/1.1)",
+			 req->http_version);
+		result = WR_IO_ERROR;
+		goto done;
+	}
+
+	query = strchr(uri_target, '?');
+
+	if (query) {
+		strbuf_add(&req->uri_path, uri_target, (query - uri_target));
+		strbuf_trim_trailing_dir_sep(&req->uri_path);
+		strbuf_addstr(&req->query_args, query + 1);
+	} else {
+		strbuf_addstr(&req->uri_path, uri_target);
+		strbuf_trim_trailing_dir_sep(&req->uri_path);
+	}
+
+	/*
+	 * Read the set of HTTP headers into a string-list.
+	 */
+	while (1) {
+		if (strbuf_getwholeline_fd(&h, fd, '\n') == EOF)
+			goto done;
+		strbuf_trim_trailing_newline(&h);
+
+		if (!h.len)
+			goto done; /* a blank line ends the header */
+
+		hp = strbuf_detach(&h, NULL);
+		string_list_append(&req->header_list, hp);
+
+		/* store common request headers separately */
+		if (skip_prefix(hp, "Content-Type: ", &hv)) {
+			req->content_type = hv;
+		} else if (skip_prefix(hp, "Content-Length: ", &hv)) {
+			req->content_length = strtol(hv, &hp, 10);
+		}
+	}
+
+	/*
+	 * We do not attempt to read the <message-body>, if it exists.
+	 * We let our caller read/chunk it in as appropriate.
+	 */
+
+done:
+	string_list_clear(&start_line_fields, 0);
+
+	/*
+	 * This is useful for debugging the request, but very noisy.
+	 */
+	if (trace2_is_enabled()) {
+		struct string_list_item *item;
+		trace2_printf("%s: %s", TR2_CAT, req->start_line.buf);
+		trace2_printf("%s: hver: %s", TR2_CAT, req->http_version);
+		trace2_printf("%s: hmth: %s", TR2_CAT, req->method);
+		trace2_printf("%s: path: %s", TR2_CAT, req->uri_path.buf);
+		trace2_printf("%s: qury: %s", TR2_CAT, req->query_args.buf);
+		if (req->content_length >= 0)
+			trace2_printf("%s: clen: %d", TR2_CAT, req->content_length);
+		if (req->content_type)
+			trace2_printf("%s: ctyp: %s", TR2_CAT, req->content_type);
+		for_each_string_list_item(item, &req->header_list)
+			trace2_printf("%s: hdrs: %s", TR2_CAT, item->string);
+	}
+
+	return result;
+}
+
+static enum worker_result dispatch(struct req *req)
+{
+	return send_http_error(1, 501, "Not Implemented", -1, NULL,
+			       WR_OK | WR_HANGUP);
+}
+
 static enum worker_result worker(void)
 {
+	struct req req = REQ__INIT;
 	char *client_addr = getenv("REMOTE_ADDR");
 	char *client_port = getenv("REMOTE_PORT");
 	enum worker_result wr = WR_OK;
@@ -160,8 +324,16 @@ static enum worker_result worker(void)
 	set_keep_alive(0);
 
 	while (1) {
-		wr = send_http_error(1, 501, "Not Implemented", -1, NULL,
-			WR_OK | WR_HANGUP);
+		req__release(&req);
+
+		alarm(init_timeout ? init_timeout : timeout);
+		wr = req__read(&req, 0);
+		alarm(0);
+
+		if (wr & WR_STOP_THE_MUSIC)
+			break;
+
+		wr = dispatch(&req);
 		if (wr & WR_STOP_THE_MUSIC)
 			break;
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v4 6/8] test-http-server: pass Git requests to http-backend
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
                         ` (4 preceding siblings ...)
  2022-12-12 21:36       ` [PATCH v4 5/8] test-http-server: add HTTP request parsing Matthew John Cheetham via GitGitGadget
@ 2022-12-12 21:36       ` Matthew John Cheetham via GitGitGadget
  2022-12-14 23:20         ` Victoria Dye
  2022-12-12 21:36       ` [PATCH v4 7/8] test-http-server: add simple authentication Matthew John Cheetham via GitGitGadget
                         ` (2 subsequent siblings)
  8 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-12-12 21:36 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Teach the test-http-sever test helper to forward Git requests to the
`git-http-backend`.

Introduce a new test script t5556-http-auth.sh that spins up the test
HTTP server and attempts an `ls-remote` on the served repository,
without any authentication.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-http-server.c |  56 +++++++++++++++++++
 t/t5556-http-auth.sh        | 105 ++++++++++++++++++++++++++++++++++++
 2 files changed, 161 insertions(+)
 create mode 100755 t/t5556-http-auth.sh

diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
index 7bde678e264..9f1d6b58067 100644
--- a/t/helper/test-http-server.c
+++ b/t/helper/test-http-server.c
@@ -305,8 +305,64 @@ done:
 	return result;
 }
 
+static int is_git_request(struct req *req)
+{
+	static regex_t *smart_http_regex;
+	static int initialized;
+
+	if (!initialized) {
+		smart_http_regex = xmalloc(sizeof(*smart_http_regex));
+		if (regcomp(smart_http_regex, "^/(HEAD|info/refs|"
+			    "objects/info/[^/]+|git-(upload|receive)-pack)$",
+			    REG_EXTENDED)) {
+			warning("could not compile smart HTTP regex");
+			smart_http_regex = NULL;
+		}
+		initialized = 1;
+	}
+
+	return smart_http_regex &&
+		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
+}
+
+static enum worker_result do__git(struct req *req, const char *user)
+{
+	const char *ok = "HTTP/1.1 200 OK\r\n";
+	struct child_process cp = CHILD_PROCESS_INIT;
+	int res;
+
+	if (write(1, ok, strlen(ok)) < 0)
+		return error(_("could not send '%s'"), ok);
+
+	if (user)
+		strvec_pushf(&cp.env, "REMOTE_USER=%s", user);
+
+	strvec_pushf(&cp.env, "REQUEST_METHOD=%s", req->method);
+	strvec_pushf(&cp.env, "PATH_TRANSLATED=%s",
+			req->uri_path.buf);
+	strvec_push(&cp.env, "SERVER_PROTOCOL=HTTP/1.1");
+	if (req->query_args.len)
+		strvec_pushf(&cp.env, "QUERY_STRING=%s",
+				req->query_args.buf);
+	if (req->content_type)
+		strvec_pushf(&cp.env, "CONTENT_TYPE=%s",
+				req->content_type);
+	if (req->content_length >= 0)
+		strvec_pushf(&cp.env, "CONTENT_LENGTH=%" PRIdMAX,
+				(intmax_t)req->content_length);
+	cp.git_cmd = 1;
+	strvec_push(&cp.args, "http-backend");
+	res = run_command(&cp);
+	close(1);
+	close(0);
+	return !!res;
+}
+
 static enum worker_result dispatch(struct req *req)
 {
+	if (is_git_request(req))
+		return do__git(req, NULL);
+
 	return send_http_error(1, 501, "Not Implemented", -1, NULL,
 			       WR_OK | WR_HANGUP);
 }
diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
new file mode 100755
index 00000000000..78da151f122
--- /dev/null
+++ b/t/t5556-http-auth.sh
@@ -0,0 +1,105 @@
+#!/bin/sh
+
+test_description='test http auth header and credential helper interop'
+
+. ./test-lib.sh
+
+test_set_port GIT_TEST_HTTP_PROTOCOL_PORT
+
+# Setup a repository
+#
+REPO_DIR="$(pwd)"/repo
+
+# Setup some lookback URLs where test-http-server will be listening.
+# We will spawn it directly inside the repo directory, so we avoid
+# any need to configure directory mappings etc - we only serve this
+# repository from the root '/' of the server.
+#
+HOST_PORT=127.0.0.1:$GIT_TEST_HTTP_PROTOCOL_PORT
+ORIGIN_URL=http://$HOST_PORT/
+
+# The pid-file is created by test-http-server when it starts.
+# The server will shutdown if/when we delete it (this is easier than
+# killing it by PID).
+#
+PID_FILE="$(pwd)"/pid-file.pid
+SERVER_LOG="$(pwd)"/OUT.server.log
+
+PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
+
+test_expect_success 'setup repos' '
+	test_create_repo "$REPO_DIR" &&
+	git -C "$REPO_DIR" branch -M main
+'
+
+stop_http_server () {
+	if ! test -f "$PID_FILE"
+	then
+		return 0
+	fi
+	#
+	# The server will shutdown automatically when we delete the pid-file.
+	#
+	rm -f "$PID_FILE"
+	#
+	# Give it a few seconds to shutdown (mainly to completely release the
+	# port before the next test start another instance and it attempts to
+	# bind to it).
+	#
+	for k in 0 1 2 3 4
+	do
+		if grep -q "Starting graceful shutdown" "$SERVER_LOG"
+		then
+			return 0
+		fi
+		sleep 1
+	done
+
+	echo "stop_http_server: timeout waiting for server shutdown"
+	return 1
+}
+
+start_http_server () {
+	#
+	# Launch our server into the background in repo_dir.
+	#
+	(
+		cd "$REPO_DIR"
+		test-http-server --verbose \
+			--listen=127.0.0.1 \
+			--port=$GIT_TEST_HTTP_PROTOCOL_PORT \
+			--reuseaddr \
+			--pid-file="$PID_FILE" \
+			"$@" \
+			2>"$SERVER_LOG" &
+	)
+	#
+	# Give it a few seconds to get started.
+	#
+	for k in 0 1 2 3 4
+	do
+		if test -f "$PID_FILE"
+		then
+			return 0
+		fi
+		sleep 1
+	done
+
+	echo "start_http_server: timeout waiting for server startup"
+	return 1
+}
+
+per_test_cleanup () {
+	stop_http_server &&
+	rm -f OUT.*
+}
+
+test_expect_success 'http auth anonymous no challenge' '
+	test_when_finished "per_test_cleanup" &&
+	start_http_server --allow-anonymous &&
+
+	# Attempt to read from a protected repository
+	git ls-remote $ORIGIN_URL
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v4 7/8] test-http-server: add simple authentication
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
                         ` (5 preceding siblings ...)
  2022-12-12 21:36       ` [PATCH v4 6/8] test-http-server: pass Git requests to http-backend Matthew John Cheetham via GitGitGadget
@ 2022-12-12 21:36       ` Matthew John Cheetham via GitGitGadget
  2022-12-14 23:23         ` Victoria Dye
  2022-12-12 21:36       ` [PATCH v4 8/8] t5556: add HTTP authentication tests Matthew John Cheetham via GitGitGadget
  2023-01-11 22:13       ` [PATCH v5 00/10] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
  8 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-12-12 21:36 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add simple authentication to the test-http-server test helper.
Authentication schemes and sets of valid tokens can be specified via
command-line arguments. Incoming requests are compared against the set
of valid schemes and tokens and only approved if a matching token is
found, or if no auth was provided and anonymous auth is enabled.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-http-server.c | 188 +++++++++++++++++++++++++++++++++++-
 1 file changed, 187 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
index 9f1d6b58067..9a458743d13 100644
--- a/t/helper/test-http-server.c
+++ b/t/helper/test-http-server.c
@@ -18,6 +18,8 @@ static const char test_http_auth_usage[] =
 "           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
 "           [--reuseaddr] [--pid-file=<file>]\n"
 "           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
+"           [--anonymous-allowed]\n"
+"           [--auth=<scheme>[:<params>] [--auth-token=<scheme>:<token>]]*\n"
 ;
 
 /* Timeout, and initial timeout */
@@ -358,10 +360,136 @@ static enum worker_result do__git(struct req *req, const char *user)
 	return !!res;
 }
 
+enum auth_result {
+	/* No auth module matches the request. */
+	AUTH_UNKNOWN = 0,
+
+	/* Auth module denied the request. */
+	AUTH_DENY = 1,
+
+	/* Auth module successfully validated the request. */
+	AUTH_ALLOW = 2,
+};
+
+struct auth_module {
+	char *scheme;
+	char *challenge_params;
+	struct string_list *tokens;
+};
+
+static int allow_anonymous;
+static struct auth_module **auth_modules = NULL;
+static size_t auth_modules_nr = 0;
+static size_t auth_modules_alloc = 0;
+
+static struct auth_module *get_auth_module(const char *scheme)
+{
+	int i;
+	struct auth_module *mod;
+	for (i = 0; i < auth_modules_nr; i++) {
+		mod = auth_modules[i];
+		if (!strcasecmp(mod->scheme, scheme))
+			return mod;
+	}
+
+	return NULL;
+}
+
+static void add_auth_module(struct auth_module *mod)
+{
+	ALLOC_GROW(auth_modules, auth_modules_nr + 1, auth_modules_alloc);
+	auth_modules[auth_modules_nr++] = mod;
+}
+
+static int is_authed(struct req *req, const char **user, enum worker_result *wr)
+{
+	enum auth_result result = AUTH_UNKNOWN;
+	struct string_list hdrs = STRING_LIST_INIT_NODUP;
+	struct auth_module *mod;
+
+	struct string_list_item *hdr;
+	struct string_list_item *token;
+	const char *v;
+	struct strbuf **split = NULL;
+	int i;
+	char *challenge;
+
+	/*
+	 * Check all auth modules and try to validate the request.
+	 * The first module that matches a valid token approves the request.
+	 * If no module is found, or if there is no valid token, then 401 error.
+	 * Otherwise, only permit the request if anonymous auth is enabled.
+	 */
+	for_each_string_list_item(hdr, &req->header_list) {
+		if (skip_iprefix(hdr->string, "Authorization: ", &v)) {
+			split = strbuf_split_str(v, ' ', 2);
+			if (!split[0] || !split[1]) continue;
+
+			/* trim trailing space ' ' */
+			strbuf_setlen(split[0], split[0]->len - 1);
+
+			mod = get_auth_module(split[0]->buf);
+			if (mod) {
+				result = AUTH_DENY;
+
+				for_each_string_list_item(token, mod->tokens) {
+					if (!strcmp(split[1]->buf, token->string)) {
+						result = AUTH_ALLOW;
+						break;
+					}
+				}
+
+				goto done;
+			}
+		}
+	}
+
+done:
+	switch (result) {
+	case AUTH_ALLOW:
+		trace2_printf("%s: auth '%s' ALLOW", TR2_CAT, mod->scheme);
+		*user = "VALID_TEST_USER";
+		*wr = WR_OK;
+		break;
+
+	case AUTH_DENY:
+		trace2_printf("%s: auth '%s' DENY", TR2_CAT, mod->scheme);
+		/* fall-through */
+
+	case AUTH_UNKNOWN:
+		if (result != AUTH_DENY && allow_anonymous)
+			break;
+		for (i = 0; i < auth_modules_nr; i++) {
+			mod = auth_modules[i];
+			if (mod->challenge_params)
+				challenge = xstrfmt("WWW-Authenticate: %s %s",
+						    mod->scheme,
+						    mod->challenge_params);
+			else
+				challenge = xstrfmt("WWW-Authenticate: %s",
+						    mod->scheme);
+			string_list_append(&hdrs, challenge);
+		}
+		*wr = send_http_error(1, 401, "Unauthorized", -1, &hdrs, *wr);
+	}
+
+	strbuf_list_free(split);
+	string_list_clear(&hdrs, 0);
+
+	return result == AUTH_ALLOW ||
+	      (result == AUTH_UNKNOWN && allow_anonymous);
+}
+
 static enum worker_result dispatch(struct req *req)
 {
+	enum worker_result wr = WR_OK;
+	const char *user = NULL;
+
+	if (!is_authed(req, &user, &wr))
+		return wr;
+
 	if (is_git_request(req))
-		return do__git(req, NULL);
+		return do__git(req, user);
 
 	return send_http_error(1, 501, "Not Implemented", -1, NULL,
 			       WR_OK | WR_HANGUP);
@@ -854,6 +982,7 @@ int cmd_main(int argc, const char **argv)
 	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
 	int worker_mode = 0;
 	int i;
+	struct auth_module *mod = NULL;
 
 	trace2_cmd_name("test-http-server");
 	setup_git_directory_gently(NULL);
@@ -906,6 +1035,63 @@ int cmd_main(int argc, const char **argv)
 			pid_file = v;
 			continue;
 		}
+		if (skip_prefix(arg, "--allow-anonymous", &v)) {
+			allow_anonymous = 1;
+			continue;
+		}
+		if (skip_prefix(arg, "--auth=", &v)) {
+			struct strbuf **p = strbuf_split_str(v, ':', 2);
+
+			if (!p[0]) {
+				error("invalid argument '%s'", v);
+				usage(test_http_auth_usage);
+			}
+
+			/* trim trailing ':' */
+			if (p[1])
+				strbuf_setlen(p[0], p[0]->len - 1);
+
+			if (get_auth_module(p[0]->buf)) {
+				error("duplicate auth scheme '%s'\n", p[0]->buf);
+				usage(test_http_auth_usage);
+			}
+
+			mod = xmalloc(sizeof(struct auth_module));
+			mod->scheme = xstrdup(p[0]->buf);
+			mod->challenge_params = p[1] ? xstrdup(p[1]->buf) : NULL;
+			CALLOC_ARRAY(mod->tokens, 1);
+			string_list_init_dup(mod->tokens);
+
+			add_auth_module(mod);
+
+			strbuf_list_free(p);
+			continue;
+		}
+		if (skip_prefix(arg, "--auth-token=", &v)) {
+			struct strbuf **p = strbuf_split_str(v, ':', 2);
+			if (!p[0]) {
+				error("invalid argument '%s'", v);
+				usage(test_http_auth_usage);
+			}
+
+			if (!p[1]) {
+				error("missing token value '%s'\n", v);
+				usage(test_http_auth_usage);
+			}
+
+			/* trim trailing ':' */
+			strbuf_setlen(p[0], p[0]->len - 1);
+
+			mod = get_auth_module(p[0]->buf);
+			if (!mod) {
+				error("auth scheme not defined '%s'\n", p[0]->buf);
+				usage(test_http_auth_usage);
+			}
+
+			string_list_append(mod->tokens, p[1]->buf);
+			strbuf_list_free(p);
+			continue;
+		}
 
 		fprintf(stderr, "error: unknown argument '%s'\n", arg);
 		usage(test_http_auth_usage);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* [PATCH v4 8/8] t5556: add HTTP authentication tests
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
                         ` (6 preceding siblings ...)
  2022-12-12 21:36       ` [PATCH v4 7/8] test-http-server: add simple authentication Matthew John Cheetham via GitGitGadget
@ 2022-12-12 21:36       ` Matthew John Cheetham via GitGitGadget
  2022-12-14 23:48         ` Victoria Dye
  2023-01-11 22:13       ` [PATCH v5 00/10] Enhance credential helper protocol to include auth headers Matthew John Cheetham via GitGitGadget
  8 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-12-12 21:36 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham,
	Matthew John Cheetham

From: Matthew John Cheetham <mjcheetham@outlook.com>

Add a series of tests to exercise the HTTP authentication header parsing
and the interop with credential helpers. Credential helpers will receive
WWW-Authenticate information in credential requests.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 t/helper/test-credential-helper-replay.sh |  14 +++
 t/t5556-http-auth.sh                      | 120 +++++++++++++++++++++-
 2 files changed, 133 insertions(+), 1 deletion(-)
 create mode 100755 t/helper/test-credential-helper-replay.sh

diff --git a/t/helper/test-credential-helper-replay.sh b/t/helper/test-credential-helper-replay.sh
new file mode 100755
index 00000000000..03e5e63dad6
--- /dev/null
+++ b/t/helper/test-credential-helper-replay.sh
@@ -0,0 +1,14 @@
+cmd=$1
+teefile=$cmd-actual.cred
+catfile=$cmd-response.cred
+rm -f $teefile
+while read line;
+do
+	if test -z "$line"; then
+		break;
+	fi
+	echo "$line" >> $teefile
+done
+if test "$cmd" = "get"; then
+	cat $catfile
+fi
diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
index 78da151f122..541fa32bd77 100755
--- a/t/t5556-http-auth.sh
+++ b/t/t5556-http-auth.sh
@@ -26,6 +26,8 @@ PID_FILE="$(pwd)"/pid-file.pid
 SERVER_LOG="$(pwd)"/OUT.server.log
 
 PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
+CREDENTIAL_HELPER="$GIT_BUILD_DIR/t/helper/test-credential-helper-replay.sh" \
+	&& export CREDENTIAL_HELPER
 
 test_expect_success 'setup repos' '
 	test_create_repo "$REPO_DIR" &&
@@ -91,7 +93,8 @@ start_http_server () {
 
 per_test_cleanup () {
 	stop_http_server &&
-	rm -f OUT.*
+	rm -f OUT.* &&
+	rm -f *.cred
 }
 
 test_expect_success 'http auth anonymous no challenge' '
@@ -102,4 +105,119 @@ test_expect_success 'http auth anonymous no challenge' '
 	git ls-remote $ORIGIN_URL
 '
 
+test_expect_success 'http auth www-auth headers to credential helper basic valid' '
+	test_when_finished "per_test_cleanup" &&
+	# base64("alice:secret-passwd")
+	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
+	export USERPASS64 &&
+
+	start_http_server \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=basic:$USERPASS64 &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >store-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-passwd
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-passwd
+	EOF
+
+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp store-expected.cred store-actual.cred
+'
+
+test_expect_success 'http auth www-auth headers to credential helper custom schemes' '
+	test_when_finished "per_test_cleanup" &&
+	# base64("alice:secret-passwd")
+	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
+	export USERPASS64 &&
+
+	start_http_server \
+		--auth=foobar:alg=test\ widget=1 \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=basic:$USERPASS64 &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=foobar alg=test widget=1
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >store-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-passwd
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=secret-passwd
+	EOF
+
+	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp store-expected.cred store-actual.cred
+'
+
+test_expect_success 'http auth www-auth headers to credential helper invalid' '
+	test_when_finished "per_test_cleanup" &&
+	# base64("alice:secret-passwd")
+	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
+	export USERPASS64 &&
+	start_http_server \
+		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
+		--auth=basic:realm=\"example.com\" \
+		--auth-token=basic:$USERPASS64 &&
+
+	cat >get-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >erase-expected.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=invalid-passwd
+	wwwauth[]=bearer authority="id.example.com" q=1 p=0
+	wwwauth[]=basic realm="example.com"
+	EOF
+
+	cat >get-response.cred <<-EOF &&
+	protocol=http
+	host=$HOST_PORT
+	username=alice
+	password=invalid-passwd
+	EOF
+
+	test_must_fail git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
+
+	test_cmp get-expected.cred get-actual.cred &&
+	test_cmp erase-expected.cred erase-actual.cred
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 05/11] http: set specific auth scheme depending on credential
  2022-11-09 23:40       ` Glen Choo
@ 2022-12-12 21:53         ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2022-12-12 21:53 UTC (permalink / raw)
  To: Glen Choo, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, M Hickford, Jeff Hostetler,
	Matthew John Cheetham

On 2022-11-09 15:40, Glen Choo wrote:
> "Matthew John Cheetham via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
> 
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
>>
>> Introduce a new credential field `authtype` that can be used by
>> credential helpers to indicate the type of the credential or
>> authentication mechanism to use for a request.
>>
>> Modify http.c to now specify the correct authentication scheme or
>> credential type when authenticating the curl handle. If the new
>> `authtype` field in the credential structure is `NULL` or "Basic" then
>> use the existing username/password options. If the field is "Bearer"
>> then use the OAuth bearer token curl option. Otherwise, the `authtype`
>> field is the authentication scheme and the `password` field is the
>> raw, unencoded value.
>>
>> Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
>> ---
>>  Documentation/git-credential.txt | 12 ++++++++++++
>>  credential.c                     |  5 +++++
>>  credential.h                     |  1 +
>>  git-curl-compat.h                | 10 ++++++++++
>>  http.c                           | 24 +++++++++++++++++++++---
>>  5 files changed, 49 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
>> index 791a57dddfb..9069bfb2d50 100644
>> --- a/Documentation/git-credential.txt
>> +++ b/Documentation/git-credential.txt
>> @@ -175,6 +175,18 @@ username in the example above) will be left unset.
>>  	attribute 'wwwauth[]', where the order of the attributes is the same as
>>  	they appear in the HTTP response.
>>  
>> +`authtype`::
>> +
>> +	Indicates the type of authentication scheme that should be used by Git.
>> +	Credential helpers may reply to a request from Git with this attribute,
>> +	such that subsequent authenticated requests include the correct
>> +	`Authorization` header.
>> +	If this attribute is not present, the default value is "Basic".
>> +	Known values include "Basic", "Digest", and "Bearer".
>> +	If an unknown value is provided, this is taken as the authentication
>> +	scheme for the `Authorization` header, and the `password` field is
>> +	used as the raw unencoded authorization parameters of the same header.
>> +
> 
> [...]
> 
>> @@ -525,8 +526,25 @@ static void init_curl_http_auth(struct active_request_slot *slot)
>>  
>>  	credential_fill(&http_auth);
>>  
>> -	curl_easy_setopt(slot->curl, CURLOPT_USERNAME, http_auth.username);
>> -	curl_easy_setopt(slot->curl, CURLOPT_PASSWORD, http_auth.password);
>> +	if (!http_auth.authtype || !strcasecmp(http_auth.authtype, "basic")
>> +				|| !strcasecmp(http_auth.authtype, "digest")) {
>> +		curl_easy_setopt(slot->curl, CURLOPT_USERNAME,
>> +			http_auth.username);
>> +		curl_easy_setopt(slot->curl, CURLOPT_PASSWORD,
>> +			http_auth.password);
>> +#ifdef GIT_CURL_HAVE_CURLAUTH_BEARER
>> +	} else if (!strcasecmp(http_auth.authtype, "bearer")) {
>> +		curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, CURLAUTH_BEARER);
>> +		curl_easy_setopt(slot->curl, CURLOPT_XOAUTH2_BEARER,
>> +			http_auth.password);
>> +#endif
>> +	} else {
>> +		struct strbuf auth = STRBUF_INIT;
>> +		strbuf_addf(&auth, "Authorization: %s %s",
>> +			http_auth.authtype, http_auth.password);
>> +		slot->headers = curl_slist_append(slot->headers, auth.buf);
>> +		strbuf_release(&auth);
>> +	}
> 
> As expected, a "Bearer" authtype doesn't require passing a username to
> curl, but as you noted in the cover letter, credential helpers were
> designed with username-password authentication in mind, which raises the
> question of what a credential helper should do with "Bearer"
> credentials.
> 
> e.g. it is not clear to me where the "username" comes from in the tests, e.g.
> 
>   +test_expect_success 'http auth www-auth headers to credential helper basic valid' '
>   +	test_when_finished "per_test_cleanup" &&
>   +	# base64("alice:secret-passwd")
>   +	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
>   +	export USERPASS64 &&
>   +
>   +	start_http_server \
>   +		--auth=bearer:authority=\"id.example.com\"\ q=1\ p=0 \
>   +		--auth=basic:realm=\"example.com\" \
>   +		--auth-token=basic:$USERPASS64 &&
>   +
>   +	cat >get-expected.cred <<-EOF &&
>   +	protocol=http
>   +	host=$HOST_PORT
>   +	wwwauth[]=bearer authority="id.example.com" q=1 p=0
>   +	wwwauth[]=basic realm="example.com"
>   +	EOF
>   +
>   +	cat >store-expected.cred <<-EOF &&
>   +	protocol=http
>   +	host=$HOST_PORT
>   +	username=alice
>   +	password=secret-passwd
>   +	authtype=basic
>   +	EOF
>   +
>   +	cat >get-response.cred <<-EOF &&
>   +	protocol=http
>   +	host=$HOST_PORT
>   +	username=alice
>   +	password=secret-passwd
>   +	authtype=basic
>   +	EOF
>   +
>   +	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
>   +
>   +	test_cmp get-expected.cred get-actual.cred &&
>   +	test_cmp store-expected.cred store-actual.cred
>   +'
> 
> I'm not sure how we plan to handle this. Some approaches I can see are:
> 
> - We require that credential helpers set a reasonable value for
>   "username". Presumably most credential helpers generating bearer
>   tokens have some idea of user identity, so this might be reasonable,
>   though it is wasteful, since we never use it in a meaningul way, e.g.
>   I don't think Git asks the credential helper for "username=alice" and
>   the credential helper decides to return the 'alice' credential instead
>   of the 'bob' credential (but I could be mistaken).
> 
> - We require that credential helpers set _some_ value for "username",
>   even if it is bogus. If so, we should communicate this explicitly.
> 
> - It is okay for "username" to be missing. This seems like the most
>   elegant approach for credential helpers. I'm not sure if we're there
>   yet with this series, e.g. http.c::handle_curl_result() reads:
> 
>     else if (results->http_code == 401) {
>       if (http_auth.username && http_auth.password) {
>         credential_reject(&http_auth);
>         return HTTP_NOAUTH;
> 
>   which seems to assume both a username _and_ password. If the username
>   is missing, we presumably don't send "erase", which might be a problem
>   for revoked access tokens (though presumably not an issue for OIDC id
>   tokens).
You are correct here that a missing username here may cause some unexpected
issues, and there should be more test coverage here.

My recent v4 iteration has actually dropped the `authtype` patches here,
and I'll pick these back up along with these concerns in a future series.
Splitting this in to a future series is probably a good idea as I feel
there's going to need to be several cleanup patches adjacent to the core
new-feature patch, so I wouldn't want to polute this series :)

Thanks!
Matthew


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 00/11] Enhance credential helper protocol to include auth headers
  2022-11-09 23:06     ` Glen Choo
@ 2022-12-12 22:03       ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2022-12-12 22:03 UTC (permalink / raw)
  To: Glen Choo, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, M Hickford, Jeff Hostetler,
	Matthew John Cheetham

On 2022-11-09 15:06, Glen Choo wrote:
>> Proposed Changes
>> ================
>>
>>  1. Teach Git to read HTTP response headers, specifically the standard
>>     WWW-Authenticate (RFC 7235 Section 4.1) headers.
>>
>>  2. Teach Git to include extra information about HTTP responses that require
>>     authentication when calling credential helpers. Specifically the
>>     WWW-Authenticate header information.
>>     
>>     Because the extra information forms an ordered list, and the existing
>>     credential helper I/O format only provides for simple key=value pairs,
>>     we introduce a new convention for transmitting an ordered list of
>>     values. Key names that are suffixed with a C-style array syntax should
>>     have values considered to form an order list, i.e. key[]=value, where
>>     the order of the key=value pairs in the stream specifies the order.
>>     
>>     For the WWW-Authenticate header values we opt to use the key wwwauth[].
>>
>>  3. Teach Git to specify authentication schemes other than Basic in
>>     subsequent HTTP requests based on credential helper responses.
>>
> 
> From a reading of this section + the subject line, it's not immediately
> obvious that 3. also requires extending the credential helper protocol
> to include the "authtype" field. IMO it's significant enough to warrant
> an explicit call-out.
After some consideration I've decided to split out #3 here to a future patch
series. Parts 1 and 2 surround Git to credential helper contextual information
which is still useful in it's own right. Part 3 should really be expanded here
to better cover and explain the reverse helper-to-Git direction, whereby
helpers can modify Git's response headers to the remote.

With 1+2 most of the benefits of having an enlightened helper understand the
auth challenge, and intelligently select identities is still possible. Remotes
just need to continue to extract tokens from the basic Authorization header as
they do today until then.


Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v3 00/11] Enhance credential helper protocol to include auth headers
  2022-11-03 19:00     ` [PATCH v3 00/11] Enhance credential helper protocol to include auth headers M Hickford
@ 2022-12-12 22:07       ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2022-12-12 22:07 UTC (permalink / raw)
  To: M Hickford, Matthew John Cheetham via GitGitGadget
  Cc: git, Derrick Stolee, Lessley Dennington, Jeff Hostetler,
	Matthew John Cheetham

On 2022-11-03 12:00, M Hickford wrote:
> On Wed, 2 Nov 2022 at 22:09, Matthew John Cheetham via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> `authtype`::
>>
>> Indicates the type of authentication scheme that should be used by Git.
>> Credential helpers may reply to a request from Git with this attribute,
>> such that subsequent authenticated requests include the correct
>> `Authorization` header.
>> If this attribute is not present, the default value is "Basic".
>> Known values include "Basic", "Digest", and "Bearer".
>> If an unknown value is provided, this is taken as the authentication
>> scheme for the `Authorization` header, and the `password` field is
>> used as the raw unencoded authorization parameters of the same header.
> 
> Do you have an example using authtype=Digest? Would the helper
> populate the password field with the user's verbatim password or the
> Digest challenge response? Put another way, is the Digest
> challenge-response logic in Git (libcurl) or the helper?
> 
> https://www.rfc-editor.org/rfc/rfc7616#section-3.4
Digest should be handled by libcurl, but you've spotted that I missed
configuring libcurl here to select digest over basic for a returned
username and password.

You may have noticed I've dropped these `authtype`/response config
patches from the latest iteration (v4) as I intend to expand this part
in a separate future series. I'll be sure to specifically test and handle
digest here! Thanks for spotting :)

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers
  2022-12-12 21:36       ` [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
@ 2022-12-14 23:15         ` Victoria Dye
  2023-01-11 22:09           ` Matthew John Cheetham
  2022-12-15  9:27         ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 171+ messages in thread
From: Victoria Dye @ 2022-12-14 23:15 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham

Matthew John Cheetham via GitGitGadget wrote:
> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
> +{
> +	size_t size = eltsize * nmemb;
> +	struct strvec *values = &http_auth.wwwauth_headers;
> +	struct strbuf buf = STRBUF_INIT;
> +	const char *val;
> +	const char *z = NULL;
> +
> +	/*
> +	 * Header lines may not come NULL-terminated from libcurl so we must
> +	 * limit all scans to the maximum length of the header line, or leverage
> +	 * strbufs for all operations.
> +	 *
> +	 * In addition, it is possible that header values can be split over
> +	 * multiple lines as per RFC 2616 (even though this has since been
> +	 * deprecated in RFC 7230). A continuation header field value is
> +	 * identified as starting with a space or horizontal tab.
> +	 *
> +	 * The formal definition of a header field as given in RFC 2616 is:
> +	 *
> +	 *   message-header = field-name ":" [ field-value ]
> +	 *   field-name     = token
> +	 *   field-value    = *( field-content | LWS )
> +	 *   field-content  = <the OCTETs making up the field-value
> +	 *                    and consisting of either *TEXT or combinations
> +	 *                    of token, separators, and quoted-string>
> +	 */
> +
> +	strbuf_add(&buf, ptr, size);
> +
> +	/* Strip the CRLF that should be present at the end of each field */
> +	strbuf_trim_trailing_newline(&buf);
> +
> +	/* Start of a new WWW-Authenticate header */
> +	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
> +		while (isspace(*val))
> +			val++;

Per the RFC [1]: 

> The field value MAY be preceded by any amount of LWS, though a single SP
> is preferred.

And LWS (linear whitespace) is defined as:

> CRLF           = CR LF 
> LWS            = [CRLF] 1*( SP | HT )

and 'isspace()' includes CR, LF, SP, and HT [2]. 

Looks good!

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-4-2
[2] https://linux.die.net/man/3/isspace

> +
> +		strvec_push(values, val);

I had the same question about "what happens with an empty 'val' here?" as
Stolee did earlier [3], but I *think* the "zero length" (i.e., single null
terminator) will be copied successfully. It's probably worth testing that
explicitly, though (I see you set up tests in later patches - ideally a 
"www-authenticate:<mix of whitespace>" line could be tested there).

[3] https://lore.kernel.org/git/9fded44b-c503-f8e5-c6a6-93e882d50e27@github.com/

> +		http_auth.header_is_last_match = 1;
> +		goto exit;
> +	}
> +
> +	/*
> +	 * This line could be a continuation of the previously matched header
> +	 * field. If this is the case then we should append this value to the
> +	 * end of the previously consumed value.
> +	 */
> +	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
> +		const char **v = values->v + values->nr - 1;
> +		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);

In this case (where the line is a continuation of a 'www-authenticate'
header), it looks like the code here expects *exactly* one LWS at the start
of the line ('isspace(*buf.buf)' requiring at least one space to append the
header, 'ptr + 1' skipping no more than one). But, according to the RFC, it
could be more than one:

> Header fields can be extended over multiple lines by preceding each extra
> line with at least one SP or HT.

So I think 'buf.buf' might need to have all preceding spaces removed, like
you did in the "Start of a new WWW-Authenticate header" block.

Also, if you're copying 'ptr' into 'buf' to avoid issues from a missing null
terminator, wouldn't you want to use 'buf.buf' (instead of 'ptr') in
'xstrfmt()'?

> +
> +		free((void*)*v);
> +		*v = append;

I was about to suggest (optionally) rewriting this to use 'strvec_pop()' and
'strvec_push_nodup()':

	strvec_pop(values); 
	strvec_push_nodup(values, append);

to maybe make this a bit easier to follow, but unfortunately
'strvec_push_nodup()' isn't available outside of 'strvec.c'. If you did want
to use 'strvec' functions, you could remove the 'static' from
'strvec_push_nodup()' and add it to 'strvec.h' it in a later reroll, but I
don't consider that change "blocking" or even important enough to warrant
its own reroll. 

> +
> +		goto exit;
> +	}
> +
> +	/* This is the start of a new header we don't care about */
> +	http_auth.header_is_last_match = 0;
> +
> +	/*
> +	 * If this is a HTTP status line and not a header field, this signals
> +	 * a different HTTP response. libcurl writes all the output of all
> +	 * response headers of all responses, including redirects.
> +	 * We only care about the last HTTP request response's headers so clear
> +	 * the existing array.
> +	 */
> +	if (skip_iprefix(buf.buf, "http/", &z))
> +		strvec_clear(values);

The comments describing the intended behavior (as well as the commit
message) are clear and explain the somewhat esoteric (at least to my
untrained eye ;) ) code. Thanks!

> +
> +exit:
> +	strbuf_release(&buf);
> +	return size;
> +}
> +
>  size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
>  {
>  	return nmemb;
> @@ -1864,6 +1940,8 @@ static int http_request(const char *url,
>  					 fwrite_buffer);
>  	}
>  
> +	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);
> +
>  	accept_language = http_get_accept_language_header();
>  
>  	if (accept_language)


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 2/8] credential: add WWW-Authenticate header to cred requests
  2022-12-12 21:36       ` [PATCH v4 2/8] credential: add WWW-Authenticate header to cred requests Matthew John Cheetham via GitGitGadget
@ 2022-12-14 23:15         ` Victoria Dye
  2023-01-11 20:37           ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Victoria Dye @ 2022-12-14 23:15 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham

Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
> Add the value of the WWW-Authenticate response header to credential
> requests. Credential helpers that understand and support HTTP
> authentication and authorization can use this standard header (RFC 2616
> Section 14.47 [1]) to generate valid credentials.
> 
> WWW-Authenticate headers can contain information pertaining to the
> authority, authentication mechanism, or extra parameters/scopes that are
> required.
> 
> The current I/O format for credential helpers only allows for unique
> names for properties/attributes, so in order to transmit multiple header
> values (with a specific order) we introduce a new convention whereby a
> C-style array syntax is used in the property name to denote multiple
> ordered values for the same property.
> 
> In this case we send multiple `wwwauth[]` properties where the order
> that the repeated attributes appear in the conversation reflects the
> order that the WWW-Authenticate headers appeared in the HTTP response.
> 
> [1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47

...

> +Attributes with keys that end with C-style array brackets `[]` can have
> +multiple values. Each instance of a multi-valued attribute forms an
> +ordered list of values - the order of the repeated attributes defines
> +the order of the values. An empty multi-valued attribute (`key[]=\n`)
> +acts to clear any previous entries and reset the list.
> +

The commit message & documentation changes (here and the 'www-auth[]'
definition below) are concise, easy-to-understand explanations of what
you're doing here with the 'www-authenticate' header values.

>  
> @@ -160,6 +166,16 @@ empty string.
>  Components which are missing from the URL (e.g., there is no
>  username in the example above) will be left unset.
>  
> +`wwwauth[]`::
> +
> +	When an HTTP response is received by Git that includes one or more
> +	'WWW-Authenticate' authentication headers, these will be passed by Git
> +	to credential helpers.
> +	Each 'WWW-Authenticate' header value is passed as a multi-valued
> +	attribute 'wwwauth[]', where the order of the attributes is the same as
> +	they appear in the HTTP response. This attribute is 'one-way' from Git
> +	to pass additional information to credential helpers.

nit: if you're trying to get a paragraph break between "...to credential
helpers." and "Each 'WWW-Authenticate' header value", you need to add an
explicit break:

-------- 8< --------

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index bf0de0e940..50759153ef 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -171,10 +171,11 @@ username in the example above) will be left unset.
 	When an HTTP response is received by Git that includes one or more
 	'WWW-Authenticate' authentication headers, these will be passed by Git
 	to credential helpers.
-	Each 'WWW-Authenticate' header value is passed as a multi-valued
-	attribute 'wwwauth[]', where the order of the attributes is the same as
-	they appear in the HTTP response. This attribute is 'one-way' from Git
-	to pass additional information to credential helpers.
++
+Each 'WWW-Authenticate' header value is passed as a multi-valued
+attribute 'wwwauth[]', where the order of the attributes is the same as
+they appear in the HTTP response. This attribute is 'one-way' from Git
+to pass additional information to credential helpers.
 
 Unrecognised attributes are silently discarded.
 
-------- >8 --------

You can test to see how the docs look by running 'make doc' from the
repository root and looking at the generated 'git-credential.html' (note
that, if you've installed Git dependencies with Homebrew, you might need to
specify 'XML_CATALOG_FILES=$(brew --prefix)/etc/xml/catalog' to get it to
work).

> +
>  Unrecognised attributes are silently discarded.
>  
>  GIT
> diff --git a/credential.c b/credential.c
> index 897b4679333..8a3ad6c0ae2 100644
> --- a/credential.c
> +++ b/credential.c
> @@ -263,6 +263,17 @@ static void credential_write_item(FILE *fp, const char *key, const char *value,
>  	fprintf(fp, "%s=%s\n", key, value);
>  }
>  
> +static void credential_write_strvec(FILE *fp, const char *key,
> +				    const struct strvec *vec)
> +{
> +	int i = 0;
> +	const char *full_key = xstrfmt("%s[]", key);
> +	for (; i < vec->nr; i++) {
> +		credential_write_item(fp, full_key, vec->v[i], 0);
> +	}
> +	free((void*)full_key);
> +}
> +
>  void credential_write(const struct credential *c, FILE *fp)
>  {
>  	credential_write_item(fp, "protocol", c->protocol, 1);
> @@ -270,6 +281,7 @@ void credential_write(const struct credential *c, FILE *fp)
>  	credential_write_item(fp, "path", c->path, 0);
>  	credential_write_item(fp, "username", c->username, 0);
>  	credential_write_item(fp, "password", c->password, 0);
> +	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);

This implementation looks good to me.

>  }
>  
>  static int run_credential_helper(struct credential *c,


^ permalink raw reply related	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 3/8] test-http-server: add stub HTTP server test helper
  2022-12-12 21:36       ` [PATCH v4 3/8] test-http-server: add stub HTTP server test helper Matthew John Cheetham via GitGitGadget
@ 2022-12-14 23:16         ` Victoria Dye
  2023-01-11 20:46           ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Victoria Dye @ 2022-12-14 23:16 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham

Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
> Introduce a mini HTTP server helper that in the future will be enhanced
> to provide a frontend for the git-http-backend, with support for
> arbitrary authentication schemes.

I really like this approach, particularly because it opens up the
possibility of writing more fine-grained tests in other contexts (e.g.,
testing how a bundle-uri client handles different kinds of erroneous server
responses by intercepting and customizing those responses).

> 
> Right now, test-http-server is a pared-down copy of the git-daemon that
> always returns a 501 Not Implemented response to all callers.
> 
> Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
> ---
>  Makefile                            |   2 +
>  contrib/buildsystems/CMakeLists.txt |  13 +
>  t/helper/.gitignore                 |   1 +
>  t/helper/test-http-server.c         | 685 ++++++++++++++++++++++++++++
>  4 files changed, 701 insertions(+)
>  create mode 100644 t/helper/test-http-server.c
> 
> diff --git a/Makefile b/Makefile
> index b258fdbed86..1eb795bbfd4 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1611,6 +1611,8 @@ else
>  	endif
>  	BASIC_CFLAGS += $(CURL_CFLAGS)
>  
> +	TEST_PROGRAMS_NEED_X += test-http-server

This works because all usage of 'TEST_PROGRAMS_NEED_X' are either lazily
evaluated (in the case of 'TEST_PROGRAMS') or are assigned later in the
'Makefile' than the addition here (in the case of 'test_bindir_programs'). 

On a related note, I think it would be helpful to mention 'test-http-server'
in the "=== Optional library: libcurl ===" section of the documentation at
the top of the Makefile, to clarify that it (like 'git-http-fetch' and
'git-http-push') are not built.

> +
>  	REMOTE_CURL_PRIMARY = git-remote-http$X
>  	REMOTE_CURL_ALIASES = git-remote-https$X git-remote-ftp$X git-remote-ftps$X
>  	REMOTE_CURL_NAMES = $(REMOTE_CURL_PRIMARY) $(REMOTE_CURL_ALIASES)
> diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
> index 2f6e0197ffa..e9b9bfbb437 100644
> --- a/contrib/buildsystems/CMakeLists.txt
> +++ b/contrib/buildsystems/CMakeLists.txt
> @@ -989,6 +989,19 @@ set(wrapper_scripts
>  set(wrapper_test_scripts
>  	test-fake-ssh test-tool)
>  
> +if(CURL_FOUND)
> +       list(APPEND wrapper_test_scripts test-http-server)
> +
> +       add_executable(test-http-server ${CMAKE_SOURCE_DIR}/t/helper/test-http-server.c)
> +       target_link_libraries(test-http-server common-main)
> +
> +       if(MSVC)
> +               set_target_properties(test-http-server
> +                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
> +               set_target_properties(test-http-server
> +                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
> +       endif()
> +endif()
>  
>  foreach(script ${wrapper_scripts})
>  	file(STRINGS ${CMAKE_SOURCE_DIR}/wrap-for-bin.sh content NEWLINE_CONSUME)
> diff --git a/t/helper/.gitignore b/t/helper/.gitignore
> index 8c2ddcce95f..9aa9c752997 100644
> --- a/t/helper/.gitignore
> +++ b/t/helper/.gitignore
> @@ -1,2 +1,3 @@
>  /test-tool
>  /test-fake-ssh
> +/test-http-server
> diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
> new file mode 100644
> index 00000000000..18f1f741305
> --- /dev/null
> +++ b/t/helper/test-http-server.c

A lot of the functions in this file are modified versions of ones in
'daemon.c'. It would help reviewers/future readers to mention that in the
commit message. 

My comments are mostly going to be around the similarities/differences from
'daemon.c', hopefully to understand how 'test-http-server' is meant to be
used.

> +static void logreport(const char *label, const char *err, va_list params)
> +{
> +	struct strbuf msg = STRBUF_INIT;
> +
> +	strbuf_addf(&msg, "[%"PRIuMAX"] %s: ", (uintmax_t)getpid(), label);
> +	strbuf_vaddf(&msg, err, params);
> +	strbuf_addch(&msg, '\n');
> +
> +	fwrite(msg.buf, sizeof(char), msg.len, stderr);
> +	fflush(stderr);
> +
> +	strbuf_release(&msg);

This looks like the 'LOG_DESTINATION_STDERR' case of 'logreport()' in
'daemon.c', but adds a "label" to represent the priority. Makes sense; these
logs will be helpful to have in stderr when running tests, and the priority
will be captured as well.

> +}
> +
> +__attribute__((format (printf, 1, 2)))
> +static void logerror(const char *err, ...)
> +{
> +	va_list params;
> +	va_start(params, err);
> +	logreport("error", err, params);
> +	va_end(params);
> +}
> +
> +__attribute__((format (printf, 1, 2)))
> +static void loginfo(const char *err, ...)
> +{
> +	va_list params;
> +	if (!verbose)
> +		return;
> +	va_start(params, err);
> +	logreport("info", err, params);
> +	va_end(params);
> +}

These two functions replace the "priority" int with the "label" string, but
otherwise capture the same information.

> +
> +static void set_keep_alive(int sockfd)

This function is identical to its 'daemon.c' counterpart; its usage in
'test-http-server.c' doesn't indicate any need to differ.

> +
> +/*
> + * The code in this section is used by "worker" instances to service
> + * a single connection from a client.  The worker talks to the client
> + * on 0 and 1.
> + */
> +
> +enum worker_result {
> +	/*
> +	 * Operation successful.
> +	 * Caller *might* keep the socket open and allow keep-alive.
> +	 */
> +	WR_OK       = 0,
> +
> +	/*
> +	 * Various errors while processing the request and/or the response.
> +	 * Close the socket and clean up.
> +	 * Exit child-process with non-zero status.
> +	 */
> +	WR_IO_ERROR = 1<<0,
> +
> +	/*
> +	 * Close the socket and clean up.  Does not imply an error.
> +	 */
> +	WR_HANGUP   = 1<<1,
> +
> +	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),

As much as I love the name, I'm not sure having this value defined makes
much sense as its own "state". AFAICT, 'WR_IO_ERROR' means "error AND exit",
but 'WR_HANGUP' just means "exit", so the latter is a superset of the
former. Even if you interpret 'WR_HANGUP' as "*no* error and exit", that
makes it and 'WR_IO_ERROR' mutually exclusive, so the "combined" state
doesn't represent anything "real".

> +};
> +
> +static enum worker_result worker(void)
> +{
> +	const char *response = "HTTP/1.1 501 Not Implemented\r\n";

Here's the hardcoded 501 error, as mentioned in the commit message.

> +	char *client_addr = getenv("REMOTE_ADDR");
> +	char *client_port = getenv("REMOTE_PORT");
> +	enum worker_result wr = WR_OK;
> +
> +	if (client_addr)
> +		loginfo("Connection from %s:%s", client_addr, client_port);
> +
> +	set_keep_alive(0);
> +
> +	while (1) {
> +		if (write_in_full(1, response, strlen(response)) < 0) {
> +			logerror("unable to write response");
> +			wr = WR_IO_ERROR;
> +		}

This tries to write the response out to stdout (optional nit: you could use
'STDOUT_FILENO' instead of '1' to make this clearer), and sets 'WR_IO_ERROR'
if it fails. 

> +
> +		if (wr & WR_STOP_THE_MUSIC)
> +			break;

This will trigger if 'wr' is 'WR_HANGUP' *or* 'WR_IO_ERROR'. Is that
intentional? If it is, I think 'wr != 'WR_OK' might make that more obvious?

> +	}
> +
> +	close(0);
> +	close(1);
> +
> +	return !!(wr & WR_IO_ERROR);

Then finish by closing out 'stdin' and 'stdout', and returning '0' for "no
error", '1' for "error".

> +}
> +
> +/*
> + * This section contains the listener and child-process management
> + * code used by the primary instance to accept incoming connections
> + * and dispatch them to async child process "worker" instances.
> + */
> +
> +static int addrcmp(const struct sockaddr_storage *s1,


Identical to 'daemon.c'.

> +static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen)
> +{
> +	struct child *newborn, **cradle;
> +
> +	newborn = xcalloc(1, sizeof(*newborn));
> +	live_children++;
> +	memcpy(&newborn->cld, cld, sizeof(*cld));
> +	memcpy(&newborn->address, addr, addrlen);
> +	for (cradle = &firstborn; *cradle; cradle = &(*cradle)->next)
> +		if (!addrcmp(&(*cradle)->address, &newborn->address))
> +			break;
> +	newborn->next = *cradle;
> +	*cradle = newborn;
> +}

This is mostly the same as 'daemon.c', but uses 'xcalloc()' instead of
'CALLOC_ARRAY()'. The latter is an alias for the former, so this is fine.

> +static void kill_some_child(void)

...

> +static void check_dead_children(void)
Both of these are identical to 'daemon.c'.

> +
> +static struct strvec cld_argv = STRVEC_INIT;
> +static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)

This matches 'daemon.c' except for the addition of:

> +	if (cld.out < 0)
> +		logerror("could not dup() `incoming`");

The extra context provided by this message could be helpful in debugging. If
nothing else, it doesn't hurt.

> +	else if (start_command(&cld))
> +		logerror("unable to fork");
> +	else
> +		add_child(&cld, addr, addrlen);
> +}
> +
> +static void child_handler(int signo)

...

> +static int set_reuse_addr(int sockfd)

...

> +static const char *ip2str(int family, struct sockaddr *sin, socklen_t len)

...

> +#ifndef NO_IPV6
> +
> +static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
...

> +#else /* NO_IPV6 */
> +
> +static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)

All of these functions match 'daemon.c' (save for some whitespace fixups).

> +
> +static void socksetup(struct string_list *listen_addr, int listen_port, struct socketlist *socklist)
> +{
> +	if (!listen_addr->nr)
> +		setup_named_sock("127.0.0.1", listen_port, socklist);

This is the only difference in this function from 'daemon.c' (there, the
first arg is 'NULL', which ends up mapping to 'INADDR_ANY'). Why the change
in default?

> +	else {
> +		int i, socknum;
> +		for (i = 0; i < listen_addr->nr; i++) {
> +			socknum = setup_named_sock(listen_addr->items[i].string,
> +						   listen_port, socklist);
> +
> +			if (socknum == 0)
> +				logerror("unable to allocate any listen sockets for host %s on port %u",
> +					 listen_addr->items[i].string, listen_port);
> +		}
> +	}
> +}
> +
> +static int service_loop(struct socketlist *socklist)

This function differs from 'daemon.c' by using removal of the 'pid_file' to
force a graceful shutdown of the server.

> +{
> +	struct pollfd *pfd;
> +	int i;
> +
> +	CALLOC_ARRAY(pfd, socklist->nr);
> +
> +	for (i = 0; i < socklist->nr; i++) {
> +		pfd[i].fd = socklist->list[i];
> +		pfd[i].events = POLLIN;
> +	}
> +
> +	signal(SIGCHLD, child_handler);
> +
> +	for (;;) {
> +		int i;
> +		int nr_ready;
> +		int timeout = (pid_file ? 100 : -1);
> +
> +		check_dead_children();
> +
> +		nr_ready = poll(pfd, socklist->nr, timeout);

Setting a timeout here (if 'pid_file' is present) allows us to operate in a
mode where the removal of a 'pid_file' indicates that the server should shut
down.

> +		if (nr_ready < 0) {

'nr_ready < 0' indicates an error [1]; handle the same way as 'daemon.c'.

[1] https://man7.org/linux/man-pages/man2/poll.2.html

> +			if (errno != EINTR) {
> +				logerror("Poll failed, resuming: %s",
> +				      strerror(errno));
> +				sleep(1);
> +			}
> +			continue;
> +		}
> +		else if (nr_ready == 0) {

'nr_ready == 0' indicates a polling timeout (see [1] above)...

> +			/*
> +			 * If we have a pid_file, then we watch it.
> +			 * If someone deletes it, we shutdown the service.
> +			 * The shell scripts in the test suite will use this.
> +			 */
> +			if (!pid_file || file_exists(pid_file))
> +				continue;
> +			goto shutdown;

...and that timeout exists so that we can check whether the 'pid_file' still
exists and, if so, shut down gracefully.

> +		}
> +

Otherwise, 'nr_ready > 1', so handle the polled events.

> +		for (i = 0; i < socklist->nr; i++) {
> +			if (pfd[i].revents & POLLIN) {
> +				union {
> +					struct sockaddr sa;
> +					struct sockaddr_in sai;
> +#ifndef NO_IPV6
> +					struct sockaddr_in6 sai6;
> +#endif
> +				} ss;
> +				socklen_t sslen = sizeof(ss);
> +				int incoming = accept(pfd[i].fd, &ss.sa, &sslen);
> +				if (incoming < 0) {
> +					switch (errno) {
> +					case EAGAIN:
> +					case EINTR:
> +					case ECONNABORTED:
> +						continue;
> +					default:
> +						die_errno("accept returned");
> +					}
> +				}
> +				handle(incoming, &ss.sa, sslen);
> +			}
> +		}
> +	}
> +
> +shutdown:
> +	loginfo("Starting graceful shutdown (pid-file gone)");
> +	for (i = 0; i < socklist->nr; i++)
> +		close(socklist->list[i]);
> +
> +	return 0;

This addition logs the shutdown and closes out sockets. Looks good!

> +}
> +
> +static int serve(struct string_list *listen_addr, int listen_port)
> +{
> +	struct socketlist socklist = { NULL, 0, 0 };
> +
> +	socksetup(listen_addr, listen_port, &socklist);
> +	if (socklist.nr == 0)
> +		die("unable to allocate any listen sockets on port %u",
> +		    listen_port);
> +
> +	loginfo("Ready to rumble");

I thought this was a leftover debug printout, but it turns out that
'serve()' in 'daemon.c' has the same message. :) 

> +
> +	/*
> +	 * Wait to create the pid-file until we've setup the sockets
> +	 * and are open for business.
> +	 */
> +	if (pid_file)
> +		write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
> +
> +	return service_loop(&socklist);
> +}
> +
> +/*
> + * This section is executed by both the primary instance and all
> + * worker instances.  So, yes, each child-process re-parses the
> + * command line argument and re-discovers how it should behave.
> + */
> +
> +int cmd_main(int argc, const char **argv)
> +{
> +	int listen_port = 0;
> +	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
> +	int worker_mode = 0;
> +	int i;
> +
> +	trace2_cmd_name("test-http-server");
> +	setup_git_directory_gently(NULL);

Since this isn't part of 'test-tool', it needs to do its own trace2 setup,
but it seems to be missing some of the relevant function calls. Could you
include 'trace2_cmd_list_config()' and 'trace2_cmd_list_env_vars()' as well? 

> +
> +	for (i = 1; i < argc; i++) {

Can this loop be replaced with 'parse_options()' and the appropriate 'struct
option[]'? Newer test helpers ('test-bundle-uri', 'test-cache-tree',
'test-getcwd') have been using it, and it generally seems much easier to
work with/more flexible than a custom 'if()' block (handling option
negation, interpreting both '--option=<value>' and '--option value' syntax
etc.).

That said, it looks this was mostly pulled from 'daemon.c' (which might
predate 'parse_options()'), so I'd also understand if you want to keep it as
similar to that as possible. Up to you!

> +	/* avoid splitting a message in the middle */
> +	setvbuf(stderr, NULL, _IOFBF, 4096);
> +
> +	if (listen_port == 0)
> +		listen_port = DEFAULT_GIT_PORT;
> +
> +	/*
> +	 * If no --listen=<addr> args are given, the setup_named_sock()
> +	 * code will use receive a NULL address and set INADDR_ANY.
> +	 * This exposes both internal and external interfaces on the
> +	 * port.
> +	 *
> +	 * Disallow that and default to the internal-use-only loopback
> +	 * address.
> +	 */
> +	if (!listen_addr.nr)
> +		string_list_append(&listen_addr, "127.0.0.1");
> +
> +	/*
> +	 * worker_mode is set in our own child process instances
> +	 * (that are bound to a connected socket from a client).
> +	 */
> +	if (worker_mode)
> +		return worker();
> +
> +	/*
> +	 * `cld_argv` is a bit of a clever hack. The top-level instance
> +	 * of test-http-server does the normal bind/listen/accept stuff.
> +	 * For each incoming socket, the top-level process spawns
> +	 * a child instance of test-http-server *WITH* the additional
> +	 * `--worker` argument. This causes the child to set `worker_mode`
> +	 * and immediately call `worker()` using the connected socket (and
> +	 * without the usual need for fork() or threads).
> +	 *
> +	 * The magic here is made possible because `cld_argv` is static
> +	 * and handle() (called by service_loop()) knows about it.
> +	 */
> +	strvec_push(&cld_argv, argv[0]);
> +	strvec_push(&cld_argv, "--worker");
> +	for (i = 1; i < argc; ++i)
> +		strvec_push(&cld_argv, argv[i]);
> +
> +	/*
> +	 * Setup primary instance to listen for connections.
> +	 */
> +	return serve(&listen_addr, listen_port);

The rest of the function is "new", but is well-documented and appears to
work as intended.

> +}

One last note/suggestion - while a lot of the functions in
'test-http-server.c' are modified from those in 'daemon.c', there are a fair
number of identical functions as well. Would it be possible to libify some
of 'daemon.c's functions (mainly by creating a 'daemon.h' and making the
functions non-static) so that they don't need to be copied?


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 4/8] test-http-server: add HTTP error response function
  2022-12-12 21:36       ` [PATCH v4 4/8] test-http-server: add HTTP error response function Matthew John Cheetham via GitGitGadget
@ 2022-12-14 23:17         ` Victoria Dye
  0 siblings, 0 replies; 171+ messages in thread
From: Victoria Dye @ 2022-12-14 23:17 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham

Matthew John Cheetham via GitGitGadget wrote:
> +static enum worker_result send_http_error(
> +	int fd,
> +	int http_code, const char *http_code_name,
> +	int retry_after_seconds, struct string_list *response_headers,
> +	enum worker_result wr_in)
> +{
> +	struct strbuf response_header = STRBUF_INIT;
> +	struct strbuf response_content = STRBUF_INIT;
> +	struct string_list_item *h;
> +	enum worker_result wr;
> +
> +	strbuf_addf(&response_content, "Error: %d %s\r\n",
> +		    http_code, http_code_name);
> +	if (retry_after_seconds > 0)
> +		strbuf_addf(&response_content, "Retry-After: %d\r\n",
> +			    retry_after_seconds);
> +
> +	strbuf_addf  (&response_header, "HTTP/1.1 %d %s\r\n", http_code, http_code_name);
> +	strbuf_addstr(&response_header, "Cache-Control: private\r\n");
> +	strbuf_addstr(&response_header,	"Content-Type: text/plain\r\n");
> +	strbuf_addf  (&response_header,	"Content-Length: %d\r\n", (int)response_content.len);
> +	if (retry_after_seconds > 0)
> +		strbuf_addf(&response_header, "Retry-After: %d\r\n", retry_after_seconds);
> +	strbuf_addf(  &response_header,	"Server: test-http-server/%s\r\n", git_version_string);
> +	strbuf_addf(  &response_header, "Date: %s\r\n", show_date(time(NULL), 0, DATE_MODE(RFC2822)));
> +	if (response_headers)
> +		for_each_string_list_item(h, response_headers)
> +			strbuf_addf(&response_header, "%s\r\n", h->string);
> +	strbuf_addstr(&response_header, "\r\n");
> +
> +	if (write_in_full(fd, response_header.buf, response_header.len) < 0) {
> +		logerror("unable to write response header");
> +		wr = WR_IO_ERROR;
> +		goto done;
> +	}
> +
> +	if (write_in_full(fd, response_content.buf, response_content.len) < 0) {
> +		logerror("unable to write response content body");
> +		wr = WR_IO_ERROR;
> +		goto done;
> +	}
> +
> +	wr = wr_in;

By setting this here, if there's a 'goto done' added sometime in the future
that doesn't explicitly set 'wr' first, it'll trigger a compiler error.
That's good for a case like this, where we don't want to assume a "default"
for 'wr' before handling it.

> +
> +done:
> +	strbuf_release(&response_header);
> +	strbuf_release(&response_content);
> +
> +	return wr;
> +}
> +
>  static enum worker_result worker(void)
>  {
> -	const char *response = "HTTP/1.1 501 Not Implemented\r\n";
>  	char *client_addr = getenv("REMOTE_ADDR");
>  	char *client_port = getenv("REMOTE_PORT");
>  	enum worker_result wr = WR_OK;
> @@ -110,11 +160,8 @@ static enum worker_result worker(void)
>  	set_keep_alive(0);
>  
>  	while (1) {
> -		if (write_in_full(1, response, strlen(response)) < 0) {
> -			logerror("unable to write response");
> -			wr = WR_IO_ERROR;
> -		}
> -
> +		wr = send_http_error(1, 501, "Not Implemented", -1, NULL,
> +			WR_OK | WR_HANGUP);

This is a nice incremental improvement on the original hardcoded response.

>  		if (wr & WR_STOP_THE_MUSIC)
>  			break;
>  	}


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 5/8] test-http-server: add HTTP request parsing
  2022-12-12 21:36       ` [PATCH v4 5/8] test-http-server: add HTTP request parsing Matthew John Cheetham via GitGitGadget
@ 2022-12-14 23:18         ` Victoria Dye
  2023-01-11 21:39           ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Victoria Dye @ 2022-12-14 23:18 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham

Matthew John Cheetham via GitGitGadget wrote:
> +/*
> + * Read the HTTP request up to the start of the optional message-body.
> + * We do this byte-by-byte because we have keep-alive turned on and
> + * cannot rely on an EOF.
> + *
> + * https://tools.ietf.org/html/rfc7230
> + *
> + * We cannot call die() here because our caller needs to properly
> + * respond to the client and/or close the socket before this
> + * child exits so that the client doesn't get a connection reset
> + * by peer error.
> + */
> +static enum worker_result req__read(struct req *req, int fd)
> +{
> +	struct strbuf h = STRBUF_INIT;
> +	struct string_list start_line_fields = STRING_LIST_INIT_DUP;
> +	int nr_start_line_fields;
> +	const char *uri_target;
> +	const char *query;
> +	char *hp;
> +	const char *hv;
> +
> +	enum worker_result result = WR_OK;
> +
> +	/*
> +	 * Read line 0 of the request and split it into component parts:
> +	 *
> +	 *    <method> SP <uri-target> SP <HTTP-version> CRLF
> +	 *
> +	 */
> +	if (strbuf_getwholeline_fd(&req->start_line, fd, '\n') == EOF) {
> +		result = WR_OK | WR_HANGUP;
> +		goto done;
> +	}
> +
> +	strbuf_trim_trailing_newline(&req->start_line);
> +
> +	nr_start_line_fields = string_list_split(&start_line_fields,
> +						 req->start_line.buf,
> +						 ' ', -1);
> +	if (nr_start_line_fields != 3) {
> +		logerror("could not parse request start-line '%s'",
> +			 req->start_line.buf);
> +		result = WR_IO_ERROR;
> +		goto done;
> +	}
> +
> +	req->method = xstrdup(start_line_fields.items[0].string);
> +	req->http_version = xstrdup(start_line_fields.items[2].string);
> +
> +	uri_target = start_line_fields.items[1].string;
> +
> +	if (strcmp(req->http_version, "HTTP/1.1")) {
> +		logerror("unsupported version '%s' (expecting HTTP/1.1)",
> +			 req->http_version);
> +		result = WR_IO_ERROR;
> +		goto done;
> +	}
> +
> +	query = strchr(uri_target, '?');
> +
> +	if (query) {
> +		strbuf_add(&req->uri_path, uri_target, (query - uri_target));
> +		strbuf_trim_trailing_dir_sep(&req->uri_path);
> +		strbuf_addstr(&req->query_args, query + 1);
> +	} else {
> +		strbuf_addstr(&req->uri_path, uri_target);
> +		strbuf_trim_trailing_dir_sep(&req->uri_path);
> +	}

This "line 0" parsing looks good, and aligns with the RFC you linked
(specifically section 3.1.1 [1]).

[1] https://www.rfc-editor.org/rfc/rfc7230#section-3.1.1

> +
> +	/*
> +	 * Read the set of HTTP headers into a string-list.
> +	 */
> +	while (1) {
> +		if (strbuf_getwholeline_fd(&h, fd, '\n') == EOF)
> +			goto done;
> +		strbuf_trim_trailing_newline(&h);
> +
> +		if (!h.len)
> +			goto done; /* a blank line ends the header */
> +
> +		hp = strbuf_detach(&h, NULL);
> +		string_list_append(&req->header_list, hp);
> +
> +		/* store common request headers separately */
> +		if (skip_prefix(hp, "Content-Type: ", &hv)) {
> +			req->content_type = hv;
> +		} else if (skip_prefix(hp, "Content-Length: ", &hv)) {
> +			req->content_length = strtol(hv, &hp, 10);
> +		}

The "separately" is somewhat confusing - you unconditionally add 'hp' to
'req->header_list', so the "Content-Type" and "Content-Length" headers are
included there as well. If that's the desired behavior, a comment like "Also
store common headers as 'req' fields" might be clearer.

> +	}
> +
> +	/*
> +	 * We do not attempt to read the <message-body>, if it exists.
> +	 * We let our caller read/chunk it in as appropriate.
> +	 */
> +
> +done:
> +	string_list_clear(&start_line_fields, 0);
> +
> +	/*
> +	 * This is useful for debugging the request, but very noisy.
> +	 */
> +	if (trace2_is_enabled()) {

'trace2_printf()' is gated internally by 'trace2_enabled' anyway, so I don't
think this 'if()' is necessary. You could add a 'DEBUG_HTTP_SERVER'
preprocessor directive (like 'DEBUG_CACHE_TREE' in 'cache-tree.c') if you
wanted to prevent these printouts unless a developer sets it to '1'.

> +		struct string_list_item *item;
> +		trace2_printf("%s: %s", TR2_CAT, req->start_line.buf);
> +		trace2_printf("%s: hver: %s", TR2_CAT, req->http_version);
> +		trace2_printf("%s: hmth: %s", TR2_CAT, req->method);
> +		trace2_printf("%s: path: %s", TR2_CAT, req->uri_path.buf);
> +		trace2_printf("%s: qury: %s", TR2_CAT, req->query_args.buf);
> +		if (req->content_length >= 0)
> +			trace2_printf("%s: clen: %d", TR2_CAT, req->content_length);
> +		if (req->content_type)
> +			trace2_printf("%s: ctyp: %s", TR2_CAT, req->content_type);
> +		for_each_string_list_item(item, &req->header_list)
> +			trace2_printf("%s: hdrs: %s", TR2_CAT, item->string);
> +	}
> +
> +	return result;
> +}
> +
> +static enum worker_result dispatch(struct req *req)
> +{
> +	return send_http_error(1, 501, "Not Implemented", -1, NULL,
> +			       WR_OK | WR_HANGUP);

Although the request is now being read & parsed, the response creation code
is still a hardcoded "Not Implemented". This means that the now-parsed 'req'
is be temporarily unused, but I think that's reasonable (since it allows for
breaking up the implementation of 'test-http-server' into multiple, less
overwhelming patches).

> +}
> +
>  static enum worker_result worker(void)
>  {
> +	struct req req = REQ__INIT;
>  	char *client_addr = getenv("REMOTE_ADDR");
>  	char *client_port = getenv("REMOTE_PORT");
>  	enum worker_result wr = WR_OK;
> @@ -160,8 +324,16 @@ static enum worker_result worker(void)
>  	set_keep_alive(0);
>  
>  	while (1) {
> -		wr = send_http_error(1, 501, "Not Implemented", -1, NULL,
> -			WR_OK | WR_HANGUP);
> +		req__release(&req);
> +
> +		alarm(init_timeout ? init_timeout : timeout);
> +		wr = req__read(&req, 0);
> +		alarm(0);

I know 'init_timeout' and 'timeout' were pulled from 'daemon.c', but what's
the difference between them/why do they both exist? It looks like
'init_timeout' just acts as a permanent override to the value of 'timeout'.

> +
> +		if (wr & WR_STOP_THE_MUSIC)
> +			break;
> +
> +		wr = dispatch(&req);
>  		if (wr & WR_STOP_THE_MUSIC)
>  			break;
>  	}


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 6/8] test-http-server: pass Git requests to http-backend
  2022-12-12 21:36       ` [PATCH v4 6/8] test-http-server: pass Git requests to http-backend Matthew John Cheetham via GitGitGadget
@ 2022-12-14 23:20         ` Victoria Dye
  2023-01-11 21:45           ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Victoria Dye @ 2022-12-14 23:20 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham

Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
> Teach the test-http-sever test helper to forward Git requests to the
> `git-http-backend`.
> 
> Introduce a new test script t5556-http-auth.sh that spins up the test
> HTTP server and attempts an `ls-remote` on the served repository,
> without any authentication.
> 
> Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
> ---
>  t/helper/test-http-server.c |  56 +++++++++++++++++++
>  t/t5556-http-auth.sh        | 105 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 161 insertions(+)
>  create mode 100755 t/t5556-http-auth.sh
> 
> diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
> index 7bde678e264..9f1d6b58067 100644
> --- a/t/helper/test-http-server.c
> +++ b/t/helper/test-http-server.c
> @@ -305,8 +305,64 @@ done:
>  	return result;
>  }
>  
> +static int is_git_request(struct req *req)
> +{
> +	static regex_t *smart_http_regex;
> +	static int initialized;
> +
> +	if (!initialized) {
> +		smart_http_regex = xmalloc(sizeof(*smart_http_regex));
> +		if (regcomp(smart_http_regex, "^/(HEAD|info/refs|"
> +			    "objects/info/[^/]+|git-(upload|receive)-pack)$",
> +			    REG_EXTENDED)) {

Could you explain the reasoning behind this regex (e.g., in a comment)? What
sorts of valid/invalid requests does it represent? Is that the full set of
requests that are "valid" to Git, or is it a test-specific subset?

> +			warning("could not compile smart HTTP regex");
> +			smart_http_regex = NULL;
> +		}
> +		initialized = 1;
> +	}
> +
> +	return smart_http_regex &&
> +		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
> +}
> +
> +static enum worker_result do__git(struct req *req, const char *user)
> +{
> +	const char *ok = "HTTP/1.1 200 OK\r\n";
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +	int res;
> +
> +	if (write(1, ok, strlen(ok)) < 0)
> +		return error(_("could not send '%s'"), ok);

Is it correct to hardcode the response status to '200 OK'? Even when
'http-backend' exits with an error?

> +
> +	if (user)
> +		strvec_pushf(&cp.env, "REMOTE_USER=%s", user);

I'm guessing that 'user' isn't used until a later patch? I think it might be
better to not introduce that arg at all until it's needed (it'll put the
usage of 'user' in context with how its value is determined), rather than
hardcode it to 'NULL' for now.

> +
> +	strvec_pushf(&cp.env, "REQUEST_METHOD=%s", req->method);
> +	strvec_pushf(&cp.env, "PATH_TRANSLATED=%s",
> +			req->uri_path.buf);
> +	strvec_push(&cp.env, "SERVER_PROTOCOL=HTTP/1.1");
> +	if (req->query_args.len)
> +		strvec_pushf(&cp.env, "QUERY_STRING=%s",
> +				req->query_args.buf);
> +	if (req->content_type)
> +		strvec_pushf(&cp.env, "CONTENT_TYPE=%s",
> +				req->content_type);
> +	if (req->content_length >= 0)
> +		strvec_pushf(&cp.env, "CONTENT_LENGTH=%" PRIdMAX,
> +				(intmax_t)req->content_length);
> +	cp.git_cmd = 1;
> +	strvec_push(&cp.args, "http-backend");
> +	res = run_command(&cp);

I'm not super familiar with 'http-backend' but as long as it 1) uses the
content passed into the environment to parse the request, and 2) writes the
response to stdout, I think this is right.

> +	close(1);
> +	close(0);
> +	return !!res;
> +}
> +
>  static enum worker_result dispatch(struct req *req)
>  {
> +	if (is_git_request(req))
> +		return do__git(req, NULL);
> +
>  	return send_http_error(1, 501, "Not Implemented", -1, NULL,
>  			       WR_OK | WR_HANGUP);
>  }
> diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
> new file mode 100755
> index 00000000000..78da151f122
> --- /dev/null
> +++ b/t/t5556-http-auth.sh
> @@ -0,0 +1,105 @@
> +#!/bin/sh
> +
> +test_description='test http auth header and credential helper interop'
> +
> +. ./test-lib.sh
> +
> +test_set_port GIT_TEST_HTTP_PROTOCOL_PORT
> +
> +# Setup a repository
> +#
> +REPO_DIR="$(pwd)"/repo

nit: '$TEST_OUTPUT_DIRECTORY' instead of '$(pwd)' is more consistent with
what I see in other tests. 

Also, if you're creating a repo in its own subdirectory ('repo'), you can
set 'TEST_NO_CREATE_REPO=1' before importing './test-lib' to avoid creating
a repo at the root level of the test output dir - it can help avoid
potential weird/unexpected behavior as a result of being in a repo inside of
another repo.

> +
> +# Setup some lookback URLs where test-http-server will be listening.
> +# We will spawn it directly inside the repo directory, so we avoid
> +# any need to configure directory mappings etc - we only serve this
> +# repository from the root '/' of the server.
> +#
> +HOST_PORT=127.0.0.1:$GIT_TEST_HTTP_PROTOCOL_PORT
> +ORIGIN_URL=http://$HOST_PORT/
> +
> +# The pid-file is created by test-http-server when it starts.
> +# The server will shutdown if/when we delete it (this is easier than
> +# killing it by PID).
> +#
> +PID_FILE="$(pwd)"/pid-file.pid
> +SERVER_LOG="$(pwd)"/OUT.server.log
> +
> +PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
> +
> +test_expect_success 'setup repos' '
> +	test_create_repo "$REPO_DIR" &&
> +	git -C "$REPO_DIR" branch -M main
> +'
> +
> +stop_http_server () {
> +	if ! test -f "$PID_FILE"
> +	then
> +		return 0
> +	fi
> +	#
> +	# The server will shutdown automatically when we delete the pid-file.
> +	#
> +	rm -f "$PID_FILE"
> +	#
> +	# Give it a few seconds to shutdown (mainly to completely release the
> +	# port before the next test start another instance and it attempts to
> +	# bind to it).
> +	#
> +	for k in 0 1 2 3 4
> +	do
> +		if grep -q "Starting graceful shutdown" "$SERVER_LOG"
> +		then
> +			return 0
> +		fi
> +		sleep 1
> +	done
> +
> +	echo "stop_http_server: timeout waiting for server shutdown"
> +	return 1
> +}
> +
> +start_http_server () {
> +	#
> +	# Launch our server into the background in repo_dir.
> +	#
> +	(
> +		cd "$REPO_DIR"
> +		test-http-server --verbose \
> +			--listen=127.0.0.1 \
> +			--port=$GIT_TEST_HTTP_PROTOCOL_PORT \
> +			--reuseaddr \
> +			--pid-file="$PID_FILE" \
> +			"$@" \
> +			2>"$SERVER_LOG" &
> +	)
> +	#
> +	# Give it a few seconds to get started.
> +	#
> +	for k in 0 1 2 3 4
> +	do
> +		if test -f "$PID_FILE"
> +		then
> +			return 0
> +		fi
> +		sleep 1
> +	done
> +
> +	echo "start_http_server: timeout waiting for server startup"
> +	return 1
> +}

These start/stop functions look good to me!

> +
> +per_test_cleanup () {
> +	stop_http_server &&
> +	rm -f OUT.*
> +}
> +
> +test_expect_success 'http auth anonymous no challenge' '
> +	test_when_finished "per_test_cleanup" &&
> +	start_http_server --allow-anonymous &&

The '--allow-anonymous' option isn't added until patch 7 [1], so the test
will fail in this patch. I think the easiest way to solve that is to remove
it here (although I think it's fine to leave the title "anonymous no
challenge", though), then add it in patch 7. 

[1] https://lore.kernel.org/git/794256754c1f7d32e438dfb19a05444d423989aa.1670880984.git.gitgitgadget@gmail.com/

> +
> +	# Attempt to read from a protected repository
> +	git ls-remote $ORIGIN_URL
> +'
> +
> +test_done


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 7/8] test-http-server: add simple authentication
  2022-12-12 21:36       ` [PATCH v4 7/8] test-http-server: add simple authentication Matthew John Cheetham via GitGitGadget
@ 2022-12-14 23:23         ` Victoria Dye
  2023-01-11 22:00           ` Matthew John Cheetham
  0 siblings, 1 reply; 171+ messages in thread
From: Victoria Dye @ 2022-12-14 23:23 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham

Matthew John Cheetham via GitGitGadget wrote:
> +static int is_authed(struct req *req, const char **user, enum worker_result *wr)
> +{
> +	enum auth_result result = AUTH_UNKNOWN;
> +	struct string_list hdrs = STRING_LIST_INIT_NODUP;
> +	struct auth_module *mod;
> +
> +	struct string_list_item *hdr;
> +	struct string_list_item *token;
> +	const char *v;
> +	struct strbuf **split = NULL;
> +	int i;
> +	char *challenge;
> +
> +	/*
> +	 * Check all auth modules and try to validate the request.
> +	 * The first module that matches a valid token approves the request.
> +	 * If no module is found, or if there is no valid token, then 401 error.
> +	 * Otherwise, only permit the request if anonymous auth is enabled.
> +	 */
> +	for_each_string_list_item(hdr, &req->header_list) {
> +		if (skip_iprefix(hdr->string, "Authorization: ", &v)) {

Is only one "Authorization:" header allowed? If so, adding a 'break;' at the
end of this if-statement would make that clearer. If not, what's the
expected allow/deny behavior if e.g. one header is ALLOW'd by one auth
module, and another header is DENY'd by a different auth module?

> +			split = strbuf_split_str(v, ' ', 2);
> +			if (!split[0] || !split[1]) continue;
> +
> +			/* trim trailing space ' ' */
> +			strbuf_setlen(split[0], split[0]->len - 1);
> +
> +			mod = get_auth_module(split[0]->buf);
> +			if (mod) {
> +				result = AUTH_DENY;
> +
> +				for_each_string_list_item(token, mod->tokens) {
> +					if (!strcmp(split[1]->buf, token->string)) {
> +						result = AUTH_ALLOW;
> +						break;
> +					}
> +				}
> +
> +				goto done;
> +			}
> +		}
> +	}
> +
> +done:
> +	switch (result) {
> +	case AUTH_ALLOW:
> +		trace2_printf("%s: auth '%s' ALLOW", TR2_CAT, mod->scheme);
> +		*user = "VALID_TEST_USER";
> +		*wr = WR_OK;
> +		break;
> +
> +	case AUTH_DENY:
> +		trace2_printf("%s: auth '%s' DENY", TR2_CAT, mod->scheme);
> +		/* fall-through */
> +
> +	case AUTH_UNKNOWN:
> +		if (result != AUTH_DENY && allow_anonymous)
> +			break;

I think this just needs to be 'if (allow_anonymous)' - we already know
'result' is 'AUTH_UNKNOWN' once we reach this block.

> +		for (i = 0; i < auth_modules_nr; i++) {
> +			mod = auth_modules[i];
> +			if (mod->challenge_params)
> +				challenge = xstrfmt("WWW-Authenticate: %s %s",
> +						    mod->scheme,
> +						    mod->challenge_params);
> +			else
> +				challenge = xstrfmt("WWW-Authenticate: %s",
> +						    mod->scheme);
> +			string_list_append(&hdrs, challenge);
> +		}
> +		*wr = send_http_error(1, 401, "Unauthorized", -1, &hdrs, *wr);
> +	}
> +
> +	strbuf_list_free(split);
> +	string_list_clear(&hdrs, 0);
> +
> +	return result == AUTH_ALLOW ||
> +	      (result == AUTH_UNKNOWN && allow_anonymous);

So if a user is explicitly denied, even with 'allow_anonymous', this fails?
Is there a test case that uses that behavior and/or is that standard auth
behavior? Otherwise, it'd be simpler to skip the 'is_authed()' check (in
'dispatch()') altogether if 'allow_anonymous' is enabled.

> +}
> +
>  static enum worker_result dispatch(struct req *req)
>  {
> +	enum worker_result wr = WR_OK;
> +	const char *user = NULL;
> +
> +	if (!is_authed(req, &user, &wr))
> +		return wr;
> +
>  	if (is_git_request(req))
> -		return do__git(req, NULL);
> +		return do__git(req, user);
>  
>  	return send_http_error(1, 501, "Not Implemented", -1, NULL,
>  			       WR_OK | WR_HANGUP);
> @@ -854,6 +982,7 @@ int cmd_main(int argc, const char **argv)
>  	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
>  	int worker_mode = 0;
>  	int i;
> +	struct auth_module *mod = NULL;
>  
>  	trace2_cmd_name("test-http-server");
>  	setup_git_directory_gently(NULL);
> @@ -906,6 +1035,63 @@ int cmd_main(int argc, const char **argv)
>  			pid_file = v;
>  			continue;
>  		}
> +		if (skip_prefix(arg, "--allow-anonymous", &v)) {
> +			allow_anonymous = 1;
> +			continue;
> +		}
> +		if (skip_prefix(arg, "--auth=", &v)) {
...

> +		}
> +		if (skip_prefix(arg, "--auth-token=", &v)) {
> +			struct strbuf **p = strbuf_split_str(v, ':', 2);
> +			if (!p[0]) {
> +				error("invalid argument '%s'", v);
> +				usage(test_http_auth_usage);
> +			}
> +
> +			if (!p[1]) {
> +				error("missing token value '%s'\n", v);
> +				usage(test_http_auth_usage);
> +			}
> +
> +			/* trim trailing ':' */
> +			strbuf_setlen(p[0], p[0]->len - 1);
> +
> +			mod = get_auth_module(p[0]->buf);
> +			if (!mod) {
> +				error("auth scheme not defined '%s'\n", p[0]->buf);
> +				usage(test_http_auth_usage);
> +			}

Does this mean that '--auth' needs to be specified before '--auth-token' to
avoid the "auth scheme not defined" error? If so, this could be made less
fragile by just setting the string value of the arg in this 'if()' block,
then processing the value after the option-parsing loop.

> +
> +			string_list_append(mod->tokens, p[1]->buf);
> +			strbuf_list_free(p);
> +			continue;
> +		}
>  
>  		fprintf(stderr, "error: unknown argument '%s'\n", arg);
>  		usage(test_http_auth_usage);

I think a test (in this patch) showing how the auth headers are handled by
this HTTP server would be really helpful in demonstrating/exercising the
intended behavior. 


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 8/8] t5556: add HTTP authentication tests
  2022-12-12 21:36       ` [PATCH v4 8/8] t5556: add HTTP authentication tests Matthew John Cheetham via GitGitGadget
@ 2022-12-14 23:48         ` Victoria Dye
  2022-12-15  0:21           ` Junio C Hamano
  2023-01-11 22:04           ` Matthew John Cheetham
  0 siblings, 2 replies; 171+ messages in thread
From: Victoria Dye @ 2022-12-14 23:48 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Matthew John Cheetham

Matthew John Cheetham via GitGitGadget wrote:
> From: Matthew John Cheetham <mjcheetham@outlook.com>
> 
> Add a series of tests to exercise the HTTP authentication header parsing
> and the interop with credential helpers. Credential helpers will receive
> WWW-Authenticate information in credential requests.

A general comment about this series - the way you have the patches organized
means that the "feature" content you're trying to integrate (the first two
patches) is contextually separated from these tests. For people that
learn/understand code via examples in tests, this makes it really difficult
to understand what's going on. To avoid that, I think you could rearrange
the patches pretty easily:

1. test-http-server: add stub HTTP server test helper (prev. patch 3)
  - t5556 could be introduced here with the basic "anonymous" test in patch
    6, but marked 'test_expect_failure'.
2. test-http-server: add HTTP error response function (prev. patch 4)
3. test-http-server: add HTTP request parsing (prev. patch 5)
4. test-http-server: pass Git requests to http-backend (prev. patch 6)
5. test-http-server: add simple authentication (prev. patch 7)
6. http: read HTTP WWW-Authenticate response headers (prev. patch 1)
7. credential: add WWW-Authenticate header to cred requests (prev patch 2)
  - Some/all of the tests from the current patch (patch 8) could be squashed
    into this one so that the tests exist directly alongside the new
    functionality they're testing.

> 
> Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
> ---
>  t/helper/test-credential-helper-replay.sh |  14 +++
>  t/t5556-http-auth.sh                      | 120 +++++++++++++++++++++-
>  2 files changed, 133 insertions(+), 1 deletion(-)
>  create mode 100755 t/helper/test-credential-helper-replay.sh
> 
> diff --git a/t/helper/test-credential-helper-replay.sh b/t/helper/test-credential-helper-replay.sh
> new file mode 100755
> index 00000000000..03e5e63dad6
> --- /dev/null
> +++ b/t/helper/test-credential-helper-replay.sh

I'm not sure a 't/helper' file is the right place for this - it's a pretty
simple shell script, but it defines a lot of information (namely 'teefile',
'catfile') that is otherwise unexplained in 't5556'. 

What about something like 'lib-rebase.sh' and its 'set_fake_editor()'? You
could create a similar test lib ('lib-credential-helper.sh') and wrapper
function (' that writes out a custom credential helper. Something like
'set_fake_credential_helper()' could also take 'teefile' and 'catfile' as
arguments, making their names more transparent to 't5556'.

> @@ -0,0 +1,14 @@
> +cmd=$1
> +teefile=$cmd-actual.cred
> +catfile=$cmd-response.cred
> +rm -f $teefile
> +while read line;
> +do
> +	if test -z "$line"; then
> +		break;
> +	fi
> +	echo "$line" >> $teefile
> +done
> +if test "$cmd" = "get"; then
> +	cat $catfile
> +fi
> diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
> index 78da151f122..541fa32bd77 100755
> --- a/t/t5556-http-auth.sh
> +++ b/t/t5556-http-auth.sh
> @@ -26,6 +26,8 @@ PID_FILE="$(pwd)"/pid-file.pid
>  SERVER_LOG="$(pwd)"/OUT.server.log
>  
>  PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
> +CREDENTIAL_HELPER="$GIT_BUILD_DIR/t/helper/test-credential-helper-replay.sh" \
> +	&& export CREDENTIAL_HELPER

I see - this is how you connect the "test" credential helper to the HTTP
server and header parsing (as implemented in patches 1 & 2), so that the
results can be compared for correctness.

nit: you can just 'export CREDENTIAL_HELPER="..."', rather than breaking it
into two lines. You also shouldn't need to 'export' at all - the value will
be set in the context of the test.

>  
>  test_expect_success 'setup repos' '
>  	test_create_repo "$REPO_DIR" &&
> @@ -91,7 +93,8 @@ start_http_server () {
>  
>  per_test_cleanup () {
>  	stop_http_server &&
> -	rm -f OUT.*
> +	rm -f OUT.* &&
> +	rm -f *.cred
>  }
>  
>  test_expect_success 'http auth anonymous no challenge' '
> @@ -102,4 +105,119 @@ test_expect_success 'http auth anonymous no challenge' '
>  	git ls-remote $ORIGIN_URL
>  '
>  
> +test_expect_success 'http auth www-auth headers to credential helper basic valid' '

...

> +test_expect_success 'http auth www-auth headers to credential helper custom schemes' '

...

> +test_expect_success 'http auth www-auth headers to credential helper invalid' '

These tests all look good. That said, is there any way to test more
bizarre/edge cases (headers too long to fit on one line, headers that end
with a long string of whitespace, etc.)?


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 8/8] t5556: add HTTP authentication tests
  2022-12-14 23:48         ` Victoria Dye
@ 2022-12-15  0:21           ` Junio C Hamano
  2023-01-11 22:05             ` Matthew John Cheetham
  2023-01-11 22:04           ` Matthew John Cheetham
  1 sibling, 1 reply; 171+ messages in thread
From: Junio C Hamano @ 2022-12-15  0:21 UTC (permalink / raw)
  To: Victoria Dye
  Cc: Matthew John Cheetham via GitGitGadget, git, Derrick Stolee,
	Lessley Dennington, Matthew John Cheetham, M Hickford,
	Jeff Hostetler, Glen Choo, Matthew John Cheetham

Victoria Dye <vdye@github.com> writes:

> A general comment about this series - the way you have the patches organized
> means that the "feature" content you're trying to integrate (the first two
> patches) is contextually separated from these tests. For people that
> learn/understand code via examples in tests, this makes it really difficult
> to understand what's going on. To avoid that, I think you could rearrange
> the patches pretty easily:
> ...

Thanks for a thorough review of the entire series, with concrete
suggestions for improvements with encouragements sprinkled in.

Very much appreciated.


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers
  2022-12-12 21:36       ` [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers Matthew John Cheetham via GitGitGadget
  2022-12-14 23:15         ` Victoria Dye
@ 2022-12-15  9:27         ` Ævar Arnfjörð Bjarmason
  2023-01-11 22:11           ` Matthew John Cheetham
  1 sibling, 1 reply; 171+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  9:27 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget
  Cc: git, Derrick Stolee, Lessley Dennington, M Hickford,
	Jeff Hostetler, Glen Choo, Matthew John Cheetham,
	Matthew John Cheetham


On Mon, Dec 12 2022, Matthew John Cheetham via GitGitGadget wrote:

> From: Matthew John Cheetham <mjcheetham@outlook.com>
> [...]
>  /* Initialize a credential structure, setting all fields to empty. */
> diff --git a/http.c b/http.c
> index 8a5ba3f4776..c4e9cd73e14 100644
> --- a/http.c
> +++ b/http.c
> @@ -183,6 +183,82 @@ size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
>  	return nmemb;
>  }
>  
> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
> +{
> +	size_t size = eltsize * nmemb;

Just out of general paranoia: use st_mult() here, not "*" (checks for
overflows)?

> +	struct strvec *values = &http_auth.wwwauth_headers;
> +	struct strbuf buf = STRBUF_INIT;
> +	const char *val;
> +	const char *z = NULL;

Why NULL-init the "z" here, but not the "val"? Both look like they
should be un-init'd. We also tend to call a throw-away char pointer "p",
not "z", but anyway (more below).... 

> +
> +	/*
> +	 * Header lines may not come NULL-terminated from libcurl so we must
> +	 * limit all scans to the maximum length of the header line, or leverage
> +	 * strbufs for all operations.
> +	 *
> +	 * In addition, it is possible that header values can be split over
> +	 * multiple lines as per RFC 2616 (even though this has since been
> +	 * deprecated in RFC 7230). A continuation header field value is
> +	 * identified as starting with a space or horizontal tab.
> +	 *
> +	 * The formal definition of a header field as given in RFC 2616 is:
> +	 *
> +	 *   message-header = field-name ":" [ field-value ]
> +	 *   field-name     = token
> +	 *   field-value    = *( field-content | LWS )
> +	 *   field-content  = <the OCTETs making up the field-value
> +	 *                    and consisting of either *TEXT or combinations
> +	 *                    of token, separators, and quoted-string>
> +	 */
> +
> +	strbuf_add(&buf, ptr, size);
> +
> +	/* Strip the CRLF that should be present at the end of each field */
> +	strbuf_trim_trailing_newline(&buf);
> +
> +	/* Start of a new WWW-Authenticate header */
> +	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
> +		while (isspace(*val))
> +			val++;

As we already have a "struct strbuf" here, maybe we can instead
consistently use the strbuf functions, e.g. strbuf_ltrim() in this case.

I haven't reviewed this in detail, maybe it's not easy or worth it
here...

> +
> +		strvec_push(values, val);
> +		http_auth.header_is_last_match = 1;
> +		goto exit;
> +	}
> +
> +	/*
> +	 * This line could be a continuation of the previously matched header
> +	 * field. If this is the case then we should append this value to the
> +	 * end of the previously consumed value.
> +	 */
> +	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
> +		const char **v = values->v + values->nr - 1;

It makes no difference to the compiler, but perhaps using []-indexing
here is more idiomatic, for getting the nth member of this strvec?

> +		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
> +
> +		free((void*)*v);

Is this reaching into the strvec & manually memory-managing it
unavoidable, or can we use strvec_pop() etc?

> +		*v = append;
> +
> +		goto exit;
> +	}
> +
> +	/* This is the start of a new header we don't care about */
> +	http_auth.header_is_last_match = 0;
> +
> +	/*
> +	 * If this is a HTTP status line and not a header field, this signals
> +	 * a different HTTP response. libcurl writes all the output of all
> +	 * response headers of all responses, including redirects.
> +	 * We only care about the last HTTP request response's headers so clear
> +	 * the existing array.
> +	 */
> +	if (skip_iprefix(buf.buf, "http/", &z))

...Don't you want to just skip this "z" variable altogether and use
istarts_with() instead? All you seem to care about is whether it starts
with it, not what the offset is.


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 2/8] credential: add WWW-Authenticate header to cred requests
  2022-12-14 23:15         ` Victoria Dye
@ 2023-01-11 20:37           ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2023-01-11 20:37 UTC (permalink / raw)
  To: Victoria Dye, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, M Hickford, Jeff Hostetler,
	Glen Choo, Matthew John Cheetham

On 2022-12-14 15:15, Victoria Dye wrote:

> Matthew John Cheetham via GitGitGadget wrote:
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
>>
>> Add the value of the WWW-Authenticate response header to credential
>> requests. Credential helpers that understand and support HTTP
>> authentication and authorization can use this standard header (RFC 2616
>> Section 14.47 [1]) to generate valid credentials.
>>
>> WWW-Authenticate headers can contain information pertaining to the
>> authority, authentication mechanism, or extra parameters/scopes that are
>> required.
>>
>> The current I/O format for credential helpers only allows for unique
>> names for properties/attributes, so in order to transmit multiple header
>> values (with a specific order) we introduce a new convention whereby a
>> C-style array syntax is used in the property name to denote multiple
>> ordered values for the same property.
>>
>> In this case we send multiple `wwwauth[]` properties where the order
>> that the repeated attributes appear in the conversation reflects the
>> order that the WWW-Authenticate headers appeared in the HTTP response.
>>
>> [1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47
> 
> ...
> 
>> +Attributes with keys that end with C-style array brackets `[]` can have
>> +multiple values. Each instance of a multi-valued attribute forms an
>> +ordered list of values - the order of the repeated attributes defines
>> +the order of the values. An empty multi-valued attribute (`key[]=\n`)
>> +acts to clear any previous entries and reset the list.
>> +
> 
> The commit message & documentation changes (here and the 'www-auth[]'
> definition below) are concise, easy-to-understand explanations of what
> you're doing here with the 'www-authenticate' header values.
> 
>>  
>> @@ -160,6 +166,16 @@ empty string.
>>  Components which are missing from the URL (e.g., there is no
>>  username in the example above) will be left unset.
>>  
>> +`wwwauth[]`::
>> +
>> +	When an HTTP response is received by Git that includes one or more
>> +	'WWW-Authenticate' authentication headers, these will be passed by Git
>> +	to credential helpers.
>> +	Each 'WWW-Authenticate' header value is passed as a multi-valued
>> +	attribute 'wwwauth[]', where the order of the attributes is the same as
>> +	they appear in the HTTP response. This attribute is 'one-way' from Git
>> +	to pass additional information to credential helpers.
> 
> nit: if you're trying to get a paragraph break between "...to credential
> helpers." and "Each 'WWW-Authenticate' header value", you need to add an
> explicit break:
> 
> -------- 8< --------
> 
> diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
> index bf0de0e940..50759153ef 100644
> --- a/Documentation/git-credential.txt
> +++ b/Documentation/git-credential.txt
> @@ -171,10 +171,11 @@ username in the example above) will be left unset.
>  	When an HTTP response is received by Git that includes one or more
>  	'WWW-Authenticate' authentication headers, these will be passed by Git
>  	to credential helpers.
> -	Each 'WWW-Authenticate' header value is passed as a multi-valued
> -	attribute 'wwwauth[]', where the order of the attributes is the same as
> -	they appear in the HTTP response. This attribute is 'one-way' from Git
> -	to pass additional information to credential helpers.
> ++
> +Each 'WWW-Authenticate' header value is passed as a multi-valued
> +attribute 'wwwauth[]', where the order of the attributes is the same as
> +they appear in the HTTP response. This attribute is 'one-way' from Git
> +to pass additional information to credential helpers.
>  
>  Unrecognised attributes are silently discarded.
>  
> -------- >8 --------
> 
> You can test to see how the docs look by running 'make doc' from the
> repository root and looking at the generated 'git-credential.html' (note
> that, if you've installed Git dependencies with Homebrew, you might need to
> specify 'XML_CATALOG_FILES=$(brew --prefix)/etc/xml/catalog' to get it to
> work).

Thanks! Yes, I was intending there to be a line break. Thanks for the tip;
will be addressed in the next iteration.

>> +
>>  Unrecognised attributes are silently discarded.
>>  
>>  GIT
>> diff --git a/credential.c b/credential.c
>> index 897b4679333..8a3ad6c0ae2 100644
>> --- a/credential.c
>> +++ b/credential.c
>> @@ -263,6 +263,17 @@ static void credential_write_item(FILE *fp, const char *key, const char *value,
>>  	fprintf(fp, "%s=%s\n", key, value);
>>  }
>>  
>> +static void credential_write_strvec(FILE *fp, const char *key,
>> +				    const struct strvec *vec)
>> +{
>> +	int i = 0;
>> +	const char *full_key = xstrfmt("%s[]", key);
>> +	for (; i < vec->nr; i++) {
>> +		credential_write_item(fp, full_key, vec->v[i], 0);
>> +	}
>> +	free((void*)full_key);
>> +}
>> +
>>  void credential_write(const struct credential *c, FILE *fp)
>>  {
>>  	credential_write_item(fp, "protocol", c->protocol, 1);
>> @@ -270,6 +281,7 @@ void credential_write(const struct credential *c, FILE *fp)
>>  	credential_write_item(fp, "path", c->path, 0);
>>  	credential_write_item(fp, "username", c->username, 0);
>>  	credential_write_item(fp, "password", c->password, 0);
>> +	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);
> 
> This implementation looks good to me.
> 
>>  }
>>  
>>  static int run_credential_helper(struct credential *c,
> 

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 3/8] test-http-server: add stub HTTP server test helper
  2022-12-14 23:16         ` Victoria Dye
@ 2023-01-11 20:46           ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2023-01-11 20:46 UTC (permalink / raw)
  To: Victoria Dye, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, M Hickford, Jeff Hostetler,
	Glen Choo, Matthew John Cheetham

On 2022-12-14 15:16, Victoria Dye wrote:

> Matthew John Cheetham via GitGitGadget wrote:
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
>>
>> Introduce a mini HTTP server helper that in the future will be enhanced
>> to provide a frontend for the git-http-backend, with support for
>> arbitrary authentication schemes.
> 
> I really like this approach, particularly because it opens up the
> possibility of writing more fine-grained tests in other contexts (e.g.,
> testing how a bundle-uri client handles different kinds of erroneous server
> responses by intercepting and customizing those responses).

Having a mini server we can play around with makes it easier to simulate a
'bad' server, rather than use a real one like Apache and try and coerce it
in to doing 'bad' things.

>>
>> Right now, test-http-server is a pared-down copy of the git-daemon that
>> always returns a 501 Not Implemented response to all callers.
>>
>> Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
>> ---
>>  Makefile                            |   2 +
>>  contrib/buildsystems/CMakeLists.txt |  13 +
>>  t/helper/.gitignore                 |   1 +
>>  t/helper/test-http-server.c         | 685 ++++++++++++++++++++++++++++
>>  4 files changed, 701 insertions(+)
>>  create mode 100644 t/helper/test-http-server.c
>>
>> diff --git a/Makefile b/Makefile
>> index b258fdbed86..1eb795bbfd4 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -1611,6 +1611,8 @@ else
>>  	endif
>>  	BASIC_CFLAGS += $(CURL_CFLAGS)
>>  
>> +	TEST_PROGRAMS_NEED_X += test-http-server
> 
> This works because all usage of 'TEST_PROGRAMS_NEED_X' are either lazily
> evaluated (in the case of 'TEST_PROGRAMS') or are assigned later in the
> 'Makefile' than the addition here (in the case of 'test_bindir_programs'). 
> 
> On a related note, I think it would be helpful to mention 'test-http-server'
> in the "=== Optional library: libcurl ===" section of the documentation at
> the top of the Makefile, to clarify that it (like 'git-http-fetch' and
> 'git-http-push') are not built.

Upon closer inspection I noticed we don't actuall depend on libcurl here.
In my next iteration I've reworked the test helper to share some code with
daemon.c and changed where we add `test-http-server` in the Makefiles to
be the same as `test-fake-ssh`.

>> +
>>  	REMOTE_CURL_PRIMARY = git-remote-http$X
>>  	REMOTE_CURL_ALIASES = git-remote-https$X git-remote-ftp$X git-remote-ftps$X
>>  	REMOTE_CURL_NAMES = $(REMOTE_CURL_PRIMARY) $(REMOTE_CURL_ALIASES)
>> diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
>> index 2f6e0197ffa..e9b9bfbb437 100644
>> --- a/contrib/buildsystems/CMakeLists.txt
>> +++ b/contrib/buildsystems/CMakeLists.txt
>> @@ -989,6 +989,19 @@ set(wrapper_scripts
>>  set(wrapper_test_scripts
>>  	test-fake-ssh test-tool)
>>  
>> +if(CURL_FOUND)
>> +       list(APPEND wrapper_test_scripts test-http-server)
>> +
>> +       add_executable(test-http-server ${CMAKE_SOURCE_DIR}/t/helper/test-http-server.c)
>> +       target_link_libraries(test-http-server common-main)
>> +
>> +       if(MSVC)
>> +               set_target_properties(test-http-server
>> +                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
>> +               set_target_properties(test-http-server
>> +                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
>> +       endif()
>> +endif()
>>  
>>  foreach(script ${wrapper_scripts})
>>  	file(STRINGS ${CMAKE_SOURCE_DIR}/wrap-for-bin.sh content NEWLINE_CONSUME)
>> diff --git a/t/helper/.gitignore b/t/helper/.gitignore
>> index 8c2ddcce95f..9aa9c752997 100644
>> --- a/t/helper/.gitignore
>> +++ b/t/helper/.gitignore
>> @@ -1,2 +1,3 @@
>>  /test-tool
>>  /test-fake-ssh
>> +/test-http-server
>> diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
>> new file mode 100644
>> index 00000000000..18f1f741305
>> --- /dev/null
>> +++ b/t/helper/test-http-server.c
> 
> A lot of the functions in this file are modified versions of ones in
> 'daemon.c'. It would help reviewers/future readers to mention that in the
> commit message. 

I appreciate the thorough effort here in understanding what those daemon.c
functions do. Hopefully the next iteration will help other reviewers as I'm
going to be extracting the identical functions to share them between daemon.c
and test-http-server.c.

> My comments are mostly going to be around the similarities/differences from
> 'daemon.c', hopefully to understand how 'test-http-server' is meant to be
> used.
> 
>> +static void logreport(const char *label, const char *err, va_list params)
>> +{
>> +	struct strbuf msg = STRBUF_INIT;
>> +
>> +	strbuf_addf(&msg, "[%"PRIuMAX"] %s: ", (uintmax_t)getpid(), label);
>> +	strbuf_vaddf(&msg, err, params);
>> +	strbuf_addch(&msg, '\n');
>> +
>> +	fwrite(msg.buf, sizeof(char), msg.len, stderr);
>> +	fflush(stderr);
>> +
>> +	strbuf_release(&msg);
> 
> This looks like the 'LOG_DESTINATION_STDERR' case of 'logreport()' in
> 'daemon.c', but adds a "label" to represent the priority. Makes sense; these
> logs will be helpful to have in stderr when running tests, and the priority
> will be captured as well.
> 
>> +}
>> +
>> +__attribute__((format (printf, 1, 2)))
>> +static void logerror(const char *err, ...)
>> +{
>> +	va_list params;
>> +	va_start(params, err);
>> +	logreport("error", err, params);
>> +	va_end(params);
>> +}
>> +
>> +__attribute__((format (printf, 1, 2)))
>> +static void loginfo(const char *err, ...)
>> +{
>> +	va_list params;
>> +	if (!verbose)
>> +		return;
>> +	va_start(params, err);
>> +	logreport("info", err, params);
>> +	va_end(params);
>> +}
> 
> These two functions replace the "priority" int with the "label" string, but
> otherwise capture the same information.
> 
>> +
>> +static void set_keep_alive(int sockfd)
> 
> This function is identical to its 'daemon.c' counterpart; its usage in
> 'test-http-server.c' doesn't indicate any need to differ.
> 
>> +
>> +/*
>> + * The code in this section is used by "worker" instances to service
>> + * a single connection from a client.  The worker talks to the client
>> + * on 0 and 1.
>> + */
>> +
>> +enum worker_result {
>> +	/*
>> +	 * Operation successful.
>> +	 * Caller *might* keep the socket open and allow keep-alive.
>> +	 */
>> +	WR_OK       = 0,
>> +
>> +	/*
>> +	 * Various errors while processing the request and/or the response.
>> +	 * Close the socket and clean up.
>> +	 * Exit child-process with non-zero status.
>> +	 */
>> +	WR_IO_ERROR = 1<<0,
>> +
>> +	/*
>> +	 * Close the socket and clean up.  Does not imply an error.
>> +	 */
>> +	WR_HANGUP   = 1<<1,
>> +
>> +	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
> 
> As much as I love the name, I'm not sure having this value defined makes
> much sense as its own "state". AFAICT, 'WR_IO_ERROR' means "error AND exit",
> but 'WR_HANGUP' just means "exit", so the latter is a superset of the
> former. Even if you interpret 'WR_HANGUP' as "*no* error and exit", that
> makes it and 'WR_IO_ERROR' mutually exclusive, so the "combined" state
> doesn't represent anything "real".

Fair point. Will remove this extra value in next iteration.

>> +};
>> +
>> +static enum worker_result worker(void)
>> +{
>> +	const char *response = "HTTP/1.1 501 Not Implemented\r\n";
> 
> Here's the hardcoded 501 error, as mentioned in the commit message.
> 
>> +	char *client_addr = getenv("REMOTE_ADDR");
>> +	char *client_port = getenv("REMOTE_PORT");
>> +	enum worker_result wr = WR_OK;
>> +
>> +	if (client_addr)
>> +		loginfo("Connection from %s:%s", client_addr, client_port);
>> +
>> +	set_keep_alive(0);
>> +
>> +	while (1) {
>> +		if (write_in_full(1, response, strlen(response)) < 0) {
>> +			logerror("unable to write response");
>> +			wr = WR_IO_ERROR;
>> +		}
> 
> This tries to write the response out to stdout (optional nit: you could use
> 'STDOUT_FILENO' instead of '1' to make this clearer), and sets 'WR_IO_ERROR'
> if it fails. 

Good point; will use `STDOUT_FILENO` in all applicable places in next iteration.

>> +
>> +		if (wr & WR_STOP_THE_MUSIC)
>> +			break;
> 
> This will trigger if 'wr' is 'WR_HANGUP' *or* 'WR_IO_ERROR'. Is that
> intentional? If it is, I think 'wr != 'WR_OK' might make that more obvious?
> 
>> +	}
>> +
>> +	close(0);
>> +	close(1);
>> +
>> +	return !!(wr & WR_IO_ERROR);
> 
> Then finish by closing out 'stdin' and 'stdout', and returning '0' for "no
> error", '1' for "error".
> 
>> +}
>> +
>> +/*
>> + * This section contains the listener and child-process management
>> + * code used by the primary instance to accept incoming connections
>> + * and dispatch them to async child process "worker" instances.
>> + */
>> +
>> +static int addrcmp(const struct sockaddr_storage *s1,
> 
> 
> Identical to 'daemon.c'.
> 
>> +static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen)
>> +{
>> +	struct child *newborn, **cradle;
>> +
>> +	newborn = xcalloc(1, sizeof(*newborn));
>> +	live_children++;
>> +	memcpy(&newborn->cld, cld, sizeof(*cld));
>> +	memcpy(&newborn->address, addr, addrlen);
>> +	for (cradle = &firstborn; *cradle; cradle = &(*cradle)->next)
>> +		if (!addrcmp(&(*cradle)->address, &newborn->address))
>> +			break;
>> +	newborn->next = *cradle;
>> +	*cradle = newborn;
>> +}
> 
> This is mostly the same as 'daemon.c', but uses 'xcalloc()' instead of
> 'CALLOC_ARRAY()'. The latter is an alias for the former, so this is fine.
> 
>> +static void kill_some_child(void)
> 
> ...
> 
>> +static void check_dead_children(void)
> Both of these are identical to 'daemon.c'.
> 
>> +
>> +static struct strvec cld_argv = STRVEC_INIT;
>> +static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
> 
> This matches 'daemon.c' except for the addition of:
> 
>> +	if (cld.out < 0)
>> +		logerror("could not dup() `incoming`");
> 
> The extra context provided by this message could be helpful in debugging. If
> nothing else, it doesn't hurt.
> 
>> +	else if (start_command(&cld))
>> +		logerror("unable to fork");
>> +	else
>> +		add_child(&cld, addr, addrlen);
>> +}
>> +
>> +static void child_handler(int signo)
> 
> ...
> 
>> +static int set_reuse_addr(int sockfd)
> 
> ...
> 
>> +static const char *ip2str(int family, struct sockaddr *sin, socklen_t len)
> 
> ...
> 
>> +#ifndef NO_IPV6
>> +
>> +static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
> ...
> 
>> +#else /* NO_IPV6 */
>> +
>> +static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
> 
> All of these functions match 'daemon.c' (save for some whitespace fixups).
> 
>> +
>> +static void socksetup(struct string_list *listen_addr, int listen_port, struct socketlist *socklist)
>> +{
>> +	if (!listen_addr->nr)
>> +		setup_named_sock("127.0.0.1", listen_port, socklist);
> 
> This is the only difference in this function from 'daemon.c' (there, the
> first arg is 'NULL', which ends up mapping to 'INADDR_ANY'). Why the change
> in default?

Next iteration will share implementation with daemon.c.

>> +	else {
>> +		int i, socknum;
>> +		for (i = 0; i < listen_addr->nr; i++) {
>> +			socknum = setup_named_sock(listen_addr->items[i].string,
>> +						   listen_port, socklist);
>> +
>> +			if (socknum == 0)
>> +				logerror("unable to allocate any listen sockets for host %s on port %u",
>> +					 listen_addr->items[i].string, listen_port);
>> +		}
>> +	}
>> +}
>> +
>> +static int service_loop(struct socketlist *socklist)
> 
> This function differs from 'daemon.c' by using removal of the 'pid_file' to
> force a graceful shutdown of the server.
> 
>> +{
>> +	struct pollfd *pfd;
>> +	int i;
>> +
>> +	CALLOC_ARRAY(pfd, socklist->nr);
>> +
>> +	for (i = 0; i < socklist->nr; i++) {
>> +		pfd[i].fd = socklist->list[i];
>> +		pfd[i].events = POLLIN;
>> +	}
>> +
>> +	signal(SIGCHLD, child_handler);
>> +
>> +	for (;;) {
>> +		int i;
>> +		int nr_ready;
>> +		int timeout = (pid_file ? 100 : -1);
>> +
>> +		check_dead_children();
>> +
>> +		nr_ready = poll(pfd, socklist->nr, timeout);
> 
> Setting a timeout here (if 'pid_file' is present) allows us to operate in a
> mode where the removal of a 'pid_file' indicates that the server should shut
> down.
> 
>> +		if (nr_ready < 0) {
> 
> 'nr_ready < 0' indicates an error [1]; handle the same way as 'daemon.c'.
> 
> [1] https://man7.org/linux/man-pages/man2/poll.2.html
> 
>> +			if (errno != EINTR) {
>> +				logerror("Poll failed, resuming: %s",
>> +				      strerror(errno));
>> +				sleep(1);
>> +			}
>> +			continue;
>> +		}
>> +		else if (nr_ready == 0) {
> 
> 'nr_ready == 0' indicates a polling timeout (see [1] above)...
> 
>> +			/*
>> +			 * If we have a pid_file, then we watch it.
>> +			 * If someone deletes it, we shutdown the service.
>> +			 * The shell scripts in the test suite will use this.
>> +			 */
>> +			if (!pid_file || file_exists(pid_file))
>> +				continue;
>> +			goto shutdown;
> 
> ...and that timeout exists so that we can check whether the 'pid_file' still
> exists and, if so, shut down gracefully.
> 
>> +		}
>> +
> 
> Otherwise, 'nr_ready > 1', so handle the polled events.
> 
>> +		for (i = 0; i < socklist->nr; i++) {
>> +			if (pfd[i].revents & POLLIN) {
>> +				union {
>> +					struct sockaddr sa;
>> +					struct sockaddr_in sai;
>> +#ifndef NO_IPV6
>> +					struct sockaddr_in6 sai6;
>> +#endif
>> +				} ss;
>> +				socklen_t sslen = sizeof(ss);
>> +				int incoming = accept(pfd[i].fd, &ss.sa, &sslen);
>> +				if (incoming < 0) {
>> +					switch (errno) {
>> +					case EAGAIN:
>> +					case EINTR:
>> +					case ECONNABORTED:
>> +						continue;
>> +					default:
>> +						die_errno("accept returned");
>> +					}
>> +				}
>> +				handle(incoming, &ss.sa, sslen);
>> +			}
>> +		}
>> +	}
>> +
>> +shutdown:
>> +	loginfo("Starting graceful shutdown (pid-file gone)");
>> +	for (i = 0; i < socklist->nr; i++)
>> +		close(socklist->list[i]);
>> +
>> +	return 0;
> 
> This addition logs the shutdown and closes out sockets. Looks good!
> 
>> +}
>> +
>> +static int serve(struct string_list *listen_addr, int listen_port)
>> +{
>> +	struct socketlist socklist = { NULL, 0, 0 };
>> +
>> +	socksetup(listen_addr, listen_port, &socklist);
>> +	if (socklist.nr == 0)
>> +		die("unable to allocate any listen sockets on port %u",
>> +		    listen_port);
>> +
>> +	loginfo("Ready to rumble");
> 
> I thought this was a leftover debug printout, but it turns out that
> 'serve()' in 'daemon.c' has the same message. :) 

Indeed! This made me chuckle when I first saw it..

>> +
>> +	/*
>> +	 * Wait to create the pid-file until we've setup the sockets
>> +	 * and are open for business.
>> +	 */
>> +	if (pid_file)
>> +		write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
>> +
>> +	return service_loop(&socklist);
>> +}
>> +
>> +/*
>> + * This section is executed by both the primary instance and all
>> + * worker instances.  So, yes, each child-process re-parses the
>> + * command line argument and re-discovers how it should behave.
>> + */
>> +
>> +int cmd_main(int argc, const char **argv)
>> +{
>> +	int listen_port = 0;
>> +	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
>> +	int worker_mode = 0;
>> +	int i;
>> +
>> +	trace2_cmd_name("test-http-server");
>> +	setup_git_directory_gently(NULL);
> 
> Since this isn't part of 'test-tool', it needs to do its own trace2 setup,
> but it seems to be missing some of the relevant function calls. Could you
> include 'trace2_cmd_list_config()' and 'trace2_cmd_list_env_vars()' as well? 

Sure!

>> +
>> +	for (i = 1; i < argc; i++) {
> 
> Can this loop be replaced with 'parse_options()' and the appropriate 'struct
> option[]'? Newer test helpers ('test-bundle-uri', 'test-cache-tree',
> 'test-getcwd') have been using it, and it generally seems much easier to
> work with/more flexible than a custom 'if()' block (handling option
> negation, interpreting both '--option=<value>' and '--option value' syntax
> etc.).
> 
> That said, it looks this was mostly pulled from 'daemon.c' (which might
> predate 'parse_options()'), so I'd also understand if you want to keep it as
> similar to that as possible. Up to you!

For now I think I'll keep it the same style as daemon.c.

>> +	/* avoid splitting a message in the middle */
>> +	setvbuf(stderr, NULL, _IOFBF, 4096);
>> +
>> +	if (listen_port == 0)
>> +		listen_port = DEFAULT_GIT_PORT;
>> +
>> +	/*
>> +	 * If no --listen=<addr> args are given, the setup_named_sock()
>> +	 * code will use receive a NULL address and set INADDR_ANY.
>> +	 * This exposes both internal and external interfaces on the
>> +	 * port.
>> +	 *
>> +	 * Disallow that and default to the internal-use-only loopback
>> +	 * address.
>> +	 */
>> +	if (!listen_addr.nr)
>> +		string_list_append(&listen_addr, "127.0.0.1");
>> +
>> +	/*
>> +	 * worker_mode is set in our own child process instances
>> +	 * (that are bound to a connected socket from a client).
>> +	 */
>> +	if (worker_mode)
>> +		return worker();
>> +
>> +	/*
>> +	 * `cld_argv` is a bit of a clever hack. The top-level instance
>> +	 * of test-http-server does the normal bind/listen/accept stuff.
>> +	 * For each incoming socket, the top-level process spawns
>> +	 * a child instance of test-http-server *WITH* the additional
>> +	 * `--worker` argument. This causes the child to set `worker_mode`
>> +	 * and immediately call `worker()` using the connected socket (and
>> +	 * without the usual need for fork() or threads).
>> +	 *
>> +	 * The magic here is made possible because `cld_argv` is static
>> +	 * and handle() (called by service_loop()) knows about it.
>> +	 */
>> +	strvec_push(&cld_argv, argv[0]);
>> +	strvec_push(&cld_argv, "--worker");
>> +	for (i = 1; i < argc; ++i)
>> +		strvec_push(&cld_argv, argv[i]);
>> +
>> +	/*
>> +	 * Setup primary instance to listen for connections.
>> +	 */
>> +	return serve(&listen_addr, listen_port);
> 
> The rest of the function is "new", but is well-documented and appears to
> work as intended.
> 
>> +}
> 
> One last note/suggestion - while a lot of the functions in
> 'test-http-server.c' are modified from those in 'daemon.c', there are a fair
> number of identical functions as well. Would it be possible to libify some
> of 'daemon.c's functions (mainly by creating a 'daemon.h' and making the
> functions non-static) so that they don't need to be copied?
> 

Watch for my next iteration for this!

Thanks,
Matthew


^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 5/8] test-http-server: add HTTP request parsing
  2022-12-14 23:18         ` Victoria Dye
@ 2023-01-11 21:39           ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2023-01-11 21:39 UTC (permalink / raw)
  To: Victoria Dye, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, M Hickford, Jeff Hostetler,
	Glen Choo, Matthew John Cheetham

On 2022-12-14 15:18, Victoria Dye wrote:

> Matthew John Cheetham via GitGitGadget wrote:
>> +/*
>> + * Read the HTTP request up to the start of the optional message-body.
>> + * We do this byte-by-byte because we have keep-alive turned on and
>> + * cannot rely on an EOF.
>> + *
>> + * https://tools.ietf.org/html/rfc7230
>> + *
>> + * We cannot call die() here because our caller needs to properly
>> + * respond to the client and/or close the socket before this
>> + * child exits so that the client doesn't get a connection reset
>> + * by peer error.
>> + */
>> +static enum worker_result req__read(struct req *req, int fd)
>> +{
>> +	struct strbuf h = STRBUF_INIT;
>> +	struct string_list start_line_fields = STRING_LIST_INIT_DUP;
>> +	int nr_start_line_fields;
>> +	const char *uri_target;
>> +	const char *query;
>> +	char *hp;
>> +	const char *hv;
>> +
>> +	enum worker_result result = WR_OK;
>> +
>> +	/*
>> +	 * Read line 0 of the request and split it into component parts:
>> +	 *
>> +	 *    <method> SP <uri-target> SP <HTTP-version> CRLF
>> +	 *
>> +	 */
>> +	if (strbuf_getwholeline_fd(&req->start_line, fd, '\n') == EOF) {
>> +		result = WR_OK | WR_HANGUP;
>> +		goto done;
>> +	}
>> +
>> +	strbuf_trim_trailing_newline(&req->start_line);
>> +
>> +	nr_start_line_fields = string_list_split(&start_line_fields,
>> +						 req->start_line.buf,
>> +						 ' ', -1);
>> +	if (nr_start_line_fields != 3) {
>> +		logerror("could not parse request start-line '%s'",
>> +			 req->start_line.buf);
>> +		result = WR_IO_ERROR;
>> +		goto done;
>> +	}
>> +
>> +	req->method = xstrdup(start_line_fields.items[0].string);
>> +	req->http_version = xstrdup(start_line_fields.items[2].string);
>> +
>> +	uri_target = start_line_fields.items[1].string;
>> +
>> +	if (strcmp(req->http_version, "HTTP/1.1")) {
>> +		logerror("unsupported version '%s' (expecting HTTP/1.1)",
>> +			 req->http_version);
>> +		result = WR_IO_ERROR;
>> +		goto done;
>> +	}
>> +
>> +	query = strchr(uri_target, '?');
>> +
>> +	if (query) {
>> +		strbuf_add(&req->uri_path, uri_target, (query - uri_target));
>> +		strbuf_trim_trailing_dir_sep(&req->uri_path);
>> +		strbuf_addstr(&req->query_args, query + 1);
>> +	} else {
>> +		strbuf_addstr(&req->uri_path, uri_target);
>> +		strbuf_trim_trailing_dir_sep(&req->uri_path);
>> +	}
> 
> This "line 0" parsing looks good, and aligns with the RFC you linked
> (specifically section 3.1.1 [1]).
> 
> [1] https://www.rfc-editor.org/rfc/rfc7230#section-3.1.1
> 
>> +
>> +	/*
>> +	 * Read the set of HTTP headers into a string-list.
>> +	 */
>> +	while (1) {
>> +		if (strbuf_getwholeline_fd(&h, fd, '\n') == EOF)
>> +			goto done;
>> +		strbuf_trim_trailing_newline(&h);
>> +
>> +		if (!h.len)
>> +			goto done; /* a blank line ends the header */
>> +
>> +		hp = strbuf_detach(&h, NULL);
>> +		string_list_append(&req->header_list, hp);
>> +
>> +		/* store common request headers separately */
>> +		if (skip_prefix(hp, "Content-Type: ", &hv)) {
>> +			req->content_type = hv;
>> +		} else if (skip_prefix(hp, "Content-Length: ", &hv)) {
>> +			req->content_length = strtol(hv, &hp, 10);
>> +		}
> 
> The "separately" is somewhat confusing - you unconditionally add 'hp' to
> 'req->header_list', so the "Content-Type" and "Content-Length" headers are
> included there as well. If that's the desired behavior, a comment like "Also
> store common headers as 'req' fields" might be clearer.

Will clarify this comment in next roll. You are correct, we *also* store these
common headers on `struct req`.

>> +	}
>> +
>> +	/*
>> +	 * We do not attempt to read the <message-body>, if it exists.
>> +	 * We let our caller read/chunk it in as appropriate.
>> +	 */
>> +
>> +done:
>> +	string_list_clear(&start_line_fields, 0);
>> +
>> +	/*
>> +	 * This is useful for debugging the request, but very noisy.
>> +	 */
>> +	if (trace2_is_enabled()) {
> 
> 'trace2_printf()' is gated internally by 'trace2_enabled' anyway, so I don't
> think this 'if()' is necessary. You could add a 'DEBUG_HTTP_SERVER'
> preprocessor directive (like 'DEBUG_CACHE_TREE' in 'cache-tree.c') if you
> wanted to prevent these printouts unless a developer sets it to '1'.

The overarching `trace2_is_enabled()` call is to avoid any possible repeated
evaluation within `trace2_printf` for potentially multiple request headers.

>> +		struct string_list_item *item;
>> +		trace2_printf("%s: %s", TR2_CAT, req->start_line.buf);
>> +		trace2_printf("%s: hver: %s", TR2_CAT, req->http_version);
>> +		trace2_printf("%s: hmth: %s", TR2_CAT, req->method);
>> +		trace2_printf("%s: path: %s", TR2_CAT, req->uri_path.buf);
>> +		trace2_printf("%s: qury: %s", TR2_CAT, req->query_args.buf);
>> +		if (req->content_length >= 0)
>> +			trace2_printf("%s: clen: %d", TR2_CAT, req->content_length);
>> +		if (req->content_type)
>> +			trace2_printf("%s: ctyp: %s", TR2_CAT, req->content_type);
>> +		for_each_string_list_item(item, &req->header_list)
>> +			trace2_printf("%s: hdrs: %s", TR2_CAT, item->string);
>> +	}
>> +
>> +	return result;
>> +}
>> +
>> +static enum worker_result dispatch(struct req *req)
>> +{
>> +	return send_http_error(1, 501, "Not Implemented", -1, NULL,
>> +			       WR_OK | WR_HANGUP);
> 
> Although the request is now being read & parsed, the response creation code
> is still a hardcoded "Not Implemented". This means that the now-parsed 'req'
> is be temporarily unused, but I think that's reasonable (since it allows for
> breaking up the implementation of 'test-http-server' into multiple, less
> overwhelming patches).
> 
>> +}
>> +
>>  static enum worker_result worker(void)
>>  {
>> +	struct req req = REQ__INIT;
>>  	char *client_addr = getenv("REMOTE_ADDR");
>>  	char *client_port = getenv("REMOTE_PORT");
>>  	enum worker_result wr = WR_OK;
>> @@ -160,8 +324,16 @@ static enum worker_result worker(void)
>>  	set_keep_alive(0);
>>  
>>  	while (1) {
>> -		wr = send_http_error(1, 501, "Not Implemented", -1, NULL,
>> -			WR_OK | WR_HANGUP);
>> +		req__release(&req);
>> +
>> +		alarm(init_timeout ? init_timeout : timeout);
>> +		wr = req__read(&req, 0);
>> +		alarm(0);
> 
> I know 'init_timeout' and 'timeout' were pulled from 'daemon.c', but what's
> the difference between them/why do they both exist? It looks like
> 'init_timeout' just acts as a permanent override to the value of 'timeout'.

Good catch. This split made sense in daemon.c whereby the `--timeout` are would
be passed to the `git-upload-pack` command, and `--init-timeout` is used as the
timeout value for the daemon server itself.

In the test HTTP server we don't need the differentiation so I'll just use the
simpler `--timeout` arg.

>> +
>> +		if (wr & WR_STOP_THE_MUSIC)
>> +			break;
>> +
>> +		wr = dispatch(&req);
>>  		if (wr & WR_STOP_THE_MUSIC)
>>  			break;
>>  	}
> 

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 6/8] test-http-server: pass Git requests to http-backend
  2022-12-14 23:20         ` Victoria Dye
@ 2023-01-11 21:45           ` Matthew John Cheetham
  2023-01-12 20:54             ` Victoria Dye
  0 siblings, 1 reply; 171+ messages in thread
From: Matthew John Cheetham @ 2023-01-11 21:45 UTC (permalink / raw)
  To: Victoria Dye, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, M Hickford, Jeff Hostetler,
	Glen Choo, Matthew John Cheetham


On 2022-12-14 15:20, Victoria Dye wrote:

> Matthew John Cheetham via GitGitGadget wrote:
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
>>
>> Teach the test-http-sever test helper to forward Git requests to the
>> `git-http-backend`.
>>
>> Introduce a new test script t5556-http-auth.sh that spins up the test
>> HTTP server and attempts an `ls-remote` on the served repository,
>> without any authentication.
>>
>> Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
>> ---
>>  t/helper/test-http-server.c |  56 +++++++++++++++++++
>>  t/t5556-http-auth.sh        | 105 ++++++++++++++++++++++++++++++++++++
>>  2 files changed, 161 insertions(+)
>>  create mode 100755 t/t5556-http-auth.sh
>>
>> diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
>> index 7bde678e264..9f1d6b58067 100644
>> --- a/t/helper/test-http-server.c
>> +++ b/t/helper/test-http-server.c
>> @@ -305,8 +305,64 @@ done:
>>  	return result;
>>  }
>>  
>> +static int is_git_request(struct req *req)
>> +{
>> +	static regex_t *smart_http_regex;
>> +	static int initialized;
>> +
>> +	if (!initialized) {
>> +		smart_http_regex = xmalloc(sizeof(*smart_http_regex));
>> +		if (regcomp(smart_http_regex, "^/(HEAD|info/refs|"
>> +			    "objects/info/[^/]+|git-(upload|receive)-pack)$",
>> +			    REG_EXTENDED)) {
> 
> Could you explain the reasoning behind this regex (e.g., in a comment)? What
> sorts of valid/invalid requests does it represent? Is that the full set of
> requests that are "valid" to Git, or is it a test-specific subset?

Explanatory comment will be added in next iteration. These are the valid Git
endpoints for the dumb and smart HTTP protocols as specified in the tech docs.

>> +			warning("could not compile smart HTTP regex");
>> +			smart_http_regex = NULL;
>> +		}
>> +		initialized = 1;
>> +	}
>> +
>> +	return smart_http_regex &&
>> +		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
>> +}
>> +
>> +static enum worker_result do__git(struct req *req, const char *user)
>> +{
>> +	const char *ok = "HTTP/1.1 200 OK\r\n";
>> +	struct child_process cp = CHILD_PROCESS_INIT;
>> +	int res;
>> +
>> +	if (write(1, ok, strlen(ok)) < 0)
>> +		return error(_("could not send '%s'"), ok);
> 
> Is it correct to hardcode the response status to '200 OK'? Even when
> 'http-backend' exits with an error?

We always respond with a 200 OK response even if the http-backend process exits
with an error. This helper is intended only to be used to exercise the HTTP
auth handling in the Git client, and specifically around authentication (not
handled by http-backend).

If we wanted to respond with a more 'valid' HTTP response status then we'd need
to buffer the output of http-backend, wait for and grok the exit status of the
process, then write the HTTP status line followed by the http-backend output.
This is outside of the scope of this test helper's use at time of writing.

Important auth responses (401) we are handling prior to getting to this point.

The above will also be summarised in a comment on the next roll.

>> +
>> +	if (user)
>> +		strvec_pushf(&cp.env, "REMOTE_USER=%s", user);
> 
> I'm guessing that 'user' isn't used until a later patch? I think it might be
> better to not introduce that arg at all until it's needed (it'll put the
> usage of 'user' in context with how its value is determined), rather than
> hardcode it to 'NULL' for now.

Good point!

>> +
>> +	strvec_pushf(&cp.env, "REQUEST_METHOD=%s", req->method);
>> +	strvec_pushf(&cp.env, "PATH_TRANSLATED=%s",
>> +			req->uri_path.buf);
>> +	strvec_push(&cp.env, "SERVER_PROTOCOL=HTTP/1.1");
>> +	if (req->query_args.len)
>> +		strvec_pushf(&cp.env, "QUERY_STRING=%s",
>> +				req->query_args.buf);
>> +	if (req->content_type)
>> +		strvec_pushf(&cp.env, "CONTENT_TYPE=%s",
>> +				req->content_type);
>> +	if (req->content_length >= 0)
>> +		strvec_pushf(&cp.env, "CONTENT_LENGTH=%" PRIdMAX,
>> +				(intmax_t)req->content_length);
>> +	cp.git_cmd = 1;
>> +	strvec_push(&cp.args, "http-backend");
>> +	res = run_command(&cp);
> 
> I'm not super familiar with 'http-backend' but as long as it 1) uses the
> content passed into the environment to parse the request, and 2) writes the
> response to stdout, I think this is right.
> 
>> +	close(1);
>> +	close(0);
>> +	return !!res;
>> +}
>> +
>>  static enum worker_result dispatch(struct req *req)
>>  {
>> +	if (is_git_request(req))
>> +		return do__git(req, NULL);
>> +
>>  	return send_http_error(1, 501, "Not Implemented", -1, NULL,
>>  			       WR_OK | WR_HANGUP);
>>  }
>> diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
>> new file mode 100755
>> index 00000000000..78da151f122
>> --- /dev/null
>> +++ b/t/t5556-http-auth.sh
>> @@ -0,0 +1,105 @@
>> +#!/bin/sh
>> +
>> +test_description='test http auth header and credential helper interop'
>> +
>> +. ./test-lib.sh
>> +
>> +test_set_port GIT_TEST_HTTP_PROTOCOL_PORT
>> +
>> +# Setup a repository
>> +#
>> +REPO_DIR="$(pwd)"/repo
> 
> nit: '$TEST_OUTPUT_DIRECTORY' instead of '$(pwd)' is more consistent with
> what I see in other tests. 

I don't see this? In fact I see more usages of `$(pwd)` than your suggestion.

> Also, if you're creating a repo in its own subdirectory ('repo'), you can
> set 'TEST_NO_CREATE_REPO=1' before importing './test-lib' to avoid creating
> a repo at the root level of the test output dir - it can help avoid
> potential weird/unexpected behavior as a result of being in a repo inside of
> another repo.

However.. after setting `TEST_NO_CREATE_REPO=1` I was getting CI failures
around a missing PWD, so my next iteration uses the `$TRASH_DIRECTORY` variable
explicitly in paths instead :-)

>> +
>> +# Setup some lookback URLs where test-http-server will be listening.
>> +# We will spawn it directly inside the repo directory, so we avoid
>> +# any need to configure directory mappings etc - we only serve this
>> +# repository from the root '/' of the server.
>> +#
>> +HOST_PORT=127.0.0.1:$GIT_TEST_HTTP_PROTOCOL_PORT
>> +ORIGIN_URL=http://$HOST_PORT/
>> +
>> +# The pid-file is created by test-http-server when it starts.
>> +# The server will shutdown if/when we delete it (this is easier than
>> +# killing it by PID).
>> +#
>> +PID_FILE="$(pwd)"/pid-file.pid
>> +SERVER_LOG="$(pwd)"/OUT.server.log
>> +
>> +PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
>> +
>> +test_expect_success 'setup repos' '
>> +	test_create_repo "$REPO_DIR" &&
>> +	git -C "$REPO_DIR" branch -M main
>> +'
>> +
>> +stop_http_server () {
>> +	if ! test -f "$PID_FILE"
>> +	then
>> +		return 0
>> +	fi
>> +	#
>> +	# The server will shutdown automatically when we delete the pid-file.
>> +	#
>> +	rm -f "$PID_FILE"
>> +	#
>> +	# Give it a few seconds to shutdown (mainly to completely release the
>> +	# port before the next test start another instance and it attempts to
>> +	# bind to it).
>> +	#
>> +	for k in 0 1 2 3 4
>> +	do
>> +		if grep -q "Starting graceful shutdown" "$SERVER_LOG"
>> +		then
>> +			return 0
>> +		fi
>> +		sleep 1
>> +	done
>> +
>> +	echo "stop_http_server: timeout waiting for server shutdown"
>> +	return 1
>> +}
>> +
>> +start_http_server () {
>> +	#
>> +	# Launch our server into the background in repo_dir.
>> +	#
>> +	(
>> +		cd "$REPO_DIR"
>> +		test-http-server --verbose \
>> +			--listen=127.0.0.1 \
>> +			--port=$GIT_TEST_HTTP_PROTOCOL_PORT \
>> +			--reuseaddr \
>> +			--pid-file="$PID_FILE" \
>> +			"$@" \
>> +			2>"$SERVER_LOG" &
>> +	)
>> +	#
>> +	# Give it a few seconds to get started.
>> +	#
>> +	for k in 0 1 2 3 4
>> +	do
>> +		if test -f "$PID_FILE"
>> +		then
>> +			return 0
>> +		fi
>> +		sleep 1
>> +	done
>> +
>> +	echo "start_http_server: timeout waiting for server startup"
>> +	return 1
>> +}
> 
> These start/stop functions look good to me!
> 
>> +
>> +per_test_cleanup () {
>> +	stop_http_server &&
>> +	rm -f OUT.*
>> +}
>> +
>> +test_expect_success 'http auth anonymous no challenge' '
>> +	test_when_finished "per_test_cleanup" &&
>> +	start_http_server --allow-anonymous &&
> 
> The '--allow-anonymous' option isn't added until patch 7 [1], so the test
> will fail in this patch. I think the easiest way to solve that is to remove
> it here (although I think it's fine to leave the title "anonymous no
> challenge", though), then add it in patch 7. 
> 
> [1] https://lore.kernel.org/git/794256754c1f7d32e438dfb19a05444d423989aa.1670880984.git.gitgitgadget@gmail.com/

Good catch! Will fix.

>> +
>> +	# Attempt to read from a protected repository
>> +	git ls-remote $ORIGIN_URL
>> +'
>> +
>> +test_done
> 

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 7/8] test-http-server: add simple authentication
  2022-12-14 23:23         ` Victoria Dye
@ 2023-01-11 22:00           ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2023-01-11 22:00 UTC (permalink / raw)
  To: Victoria Dye, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, M Hickford, Jeff Hostetler,
	Glen Choo, Matthew John Cheetham

On 2022-12-14 15:23, Victoria Dye wrote:

> Matthew John Cheetham via GitGitGadget wrote:
>> +static int is_authed(struct req *req, const char **user, enum worker_result *wr)
>> +{
>> +	enum auth_result result = AUTH_UNKNOWN;
>> +	struct string_list hdrs = STRING_LIST_INIT_NODUP;
>> +	struct auth_module *mod;
>> +
>> +	struct string_list_item *hdr;
>> +	struct string_list_item *token;
>> +	const char *v;
>> +	struct strbuf **split = NULL;
>> +	int i;
>> +	char *challenge;
>> +
>> +	/*
>> +	 * Check all auth modules and try to validate the request.
>> +	 * The first module that matches a valid token approves the request.
>> +	 * If no module is found, or if there is no valid token, then 401 error.
>> +	 * Otherwise, only permit the request if anonymous auth is enabled.
>> +	 */
>> +	for_each_string_list_item(hdr, &req->header_list) {
>> +		if (skip_iprefix(hdr->string, "Authorization: ", &v)) {
> 
> Is only one "Authorization:" header allowed? If so, adding a 'break;' at the
> end of this if-statement would make that clearer. If not, what's the
> expected allow/deny behavior if e.g. one header is ALLOW'd by one auth
> module, and another header is DENY'd by a different auth module?

Yes, only one Authorization header *should* be passed.. but the RFCs are not very
explicit about that. The test server supports multiple, but will `ALLOW` or `DENY`
based on the first matching auth scheme (module).

> 
>> +			split = strbuf_split_str(v, ' ', 2);
>> +			if (!split[0] || !split[1]) continue;
>> +
>> +			/* trim trailing space ' ' */
>> +			strbuf_setlen(split[0], split[0]->len - 1);
>> +
>> +			mod = get_auth_module(split[0]->buf);
>> +			if (mod) {
>> +				result = AUTH_DENY;
>> +
>> +				for_each_string_list_item(token, mod->tokens) {
>> +					if (!strcmp(split[1]->buf, token->string)) {
>> +						result = AUTH_ALLOW;
>> +						break;
>> +					}
>> +				}
>> +
>> +				goto done;
>> +			}
>> +		}
>> +	}
>> +
>> +done:
>> +	switch (result) {
>> +	case AUTH_ALLOW:
>> +		trace2_printf("%s: auth '%s' ALLOW", TR2_CAT, mod->scheme);
>> +		*user = "VALID_TEST_USER";
>> +		*wr = WR_OK;
>> +		break;
>> +
>> +	case AUTH_DENY:
>> +		trace2_printf("%s: auth '%s' DENY", TR2_CAT, mod->scheme);
>> +		/* fall-through */
>> +
>> +	case AUTH_UNKNOWN:
>> +		if (result != AUTH_DENY && allow_anonymous)
>> +			break;
> 
> I think this just needs to be 'if (allow_anonymous)' - we already know
> 'result' is 'AUTH_UNKNOWN' once we reach this block.

Note that `AUTH_DENY` falls-through to the `AUTH_UNKNOWN` case.
The only time we *DON'T* want to output the auth challenge response headers is
when there was no challenge provided (`AUTH_UNKNOWN`) *and* we are permitting
anonymous users.

  result      | allow_anoymous | Output Challenge?
---------------------------------------------------
 AUTH_DENY    |       1        |       Yes
 AUTH_DENY    |       0        |       Yes
 AUTH_UNKNOWN |       1        |       No
 AUTH_UNKNOWN |       0        |       Yes

>> +		for (i = 0; i < auth_modules_nr; i++) {
>> +			mod = auth_modules[i];
>> +			if (mod->challenge_params)
>> +				challenge = xstrfmt("WWW-Authenticate: %s %s",
>> +						    mod->scheme,
>> +						    mod->challenge_params);
>> +			else
>> +				challenge = xstrfmt("WWW-Authenticate: %s",
>> +						    mod->scheme);
>> +			string_list_append(&hdrs, challenge);
>> +		}
>> +		*wr = send_http_error(1, 401, "Unauthorized", -1, &hdrs, *wr);
>> +	}
>> +
>> +	strbuf_list_free(split);
>> +	string_list_clear(&hdrs, 0);
>> +
>> +	return result == AUTH_ALLOW ||
>> +	      (result == AUTH_UNKNOWN && allow_anonymous);
> 
> So if a user is explicitly denied, even with 'allow_anonymous', this fails?
> Is there a test case that uses that behavior and/or is that standard auth
> behavior? Otherwise, it'd be simpler to skip the 'is_authed()' check (in
> 'dispatch()') altogether if 'allow_anonymous' is enabled.

If the user is being denied by a module we should always deny access.

Admittedly, for this simple authentication scenario it's kind of silly to deny
a user who is trying to identify themselves, but permit an anoymous user.
However, if this was an authorization failure then denying a user based on their
token may be totally valid. Right now, we're only concerned about authentication
and not authorization, so I could move this check to `dispatch()` if you feel
strongly about it.

>> +}
>> +
>>  static enum worker_result dispatch(struct req *req)
>>  {
>> +	enum worker_result wr = WR_OK;
>> +	const char *user = NULL;
>> +
>> +	if (!is_authed(req, &user, &wr))
>> +		return wr;
>> +
>>  	if (is_git_request(req))
>> -		return do__git(req, NULL);
>> +		return do__git(req, user);
>>  
>>  	return send_http_error(1, 501, "Not Implemented", -1, NULL,
>>  			       WR_OK | WR_HANGUP);
>> @@ -854,6 +982,7 @@ int cmd_main(int argc, const char **argv)
>>  	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
>>  	int worker_mode = 0;
>>  	int i;
>> +	struct auth_module *mod = NULL;
>>  
>>  	trace2_cmd_name("test-http-server");
>>  	setup_git_directory_gently(NULL);
>> @@ -906,6 +1035,63 @@ int cmd_main(int argc, const char **argv)
>>  			pid_file = v;
>>  			continue;
>>  		}
>> +		if (skip_prefix(arg, "--allow-anonymous", &v)) {
>> +			allow_anonymous = 1;
>> +			continue;
>> +		}
>> +		if (skip_prefix(arg, "--auth=", &v)) {
> ...
> 
>> +		}
>> +		if (skip_prefix(arg, "--auth-token=", &v)) {
>> +			struct strbuf **p = strbuf_split_str(v, ':', 2);
>> +			if (!p[0]) {
>> +				error("invalid argument '%s'", v);
>> +				usage(test_http_auth_usage);
>> +			}
>> +
>> +			if (!p[1]) {
>> +				error("missing token value '%s'\n", v);
>> +				usage(test_http_auth_usage);
>> +			}
>> +
>> +			/* trim trailing ':' */
>> +			strbuf_setlen(p[0], p[0]->len - 1);
>> +
>> +			mod = get_auth_module(p[0]->buf);
>> +			if (!mod) {
>> +				error("auth scheme not defined '%s'\n", p[0]->buf);
>> +				usage(test_http_auth_usage);
>> +			}
> 
> Does this mean that '--auth' needs to be specified before '--auth-token' to
> avoid the "auth scheme not defined" error? If so, this could be made less
> fragile by just setting the string value of the arg in this 'if()' block,
> then processing the value after the option-parsing loop.

Yes, `--auth` needs to come first and 'setup' the module and challenge.

>> +
>> +			string_list_append(mod->tokens, p[1]->buf);
>> +			strbuf_list_free(p);
>> +			continue;
>> +		}
>>  
>>  		fprintf(stderr, "error: unknown argument '%s'\n", arg);
>>  		usage(test_http_auth_usage);
> 
> I think a test (in this patch) showing how the auth headers are handled by
> this HTTP server would be really helpful in demonstrating/exercising the
> intended behavior. 
> 

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 8/8] t5556: add HTTP authentication tests
  2022-12-14 23:48         ` Victoria Dye
  2022-12-15  0:21           ` Junio C Hamano
@ 2023-01-11 22:04           ` Matthew John Cheetham
  1 sibling, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2023-01-11 22:04 UTC (permalink / raw)
  To: Victoria Dye, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, M Hickford, Jeff Hostetler,
	Glen Choo, Matthew John Cheetham

On 2022-12-14 15:48, Victoria Dye wrote:

> Matthew John Cheetham via GitGitGadget wrote:
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
>>
>> Add a series of tests to exercise the HTTP authentication header parsing
>> and the interop with credential helpers. Credential helpers will receive
>> WWW-Authenticate information in credential requests.
> 
> A general comment about this series - the way you have the patches organized
> means that the "feature" content you're trying to integrate (the first two
> patches) is contextually separated from these tests. For people that
> learn/understand code via examples in tests, this makes it really difficult
> to understand what's going on. To avoid that, I think you could rearrange
> the patches pretty easily:
> 
> 1. test-http-server: add stub HTTP server test helper (prev. patch 3)
>   - t5556 could be introduced here with the basic "anonymous" test in patch
>     6, but marked 'test_expect_failure'.
> 2. test-http-server: add HTTP error response function (prev. patch 4)
> 3. test-http-server: add HTTP request parsing (prev. patch 5)
> 4. test-http-server: pass Git requests to http-backend (prev. patch 6)
> 5. test-http-server: add simple authentication (prev. patch 7)
> 6. http: read HTTP WWW-Authenticate response headers (prev. patch 1)
> 7. credential: add WWW-Authenticate header to cred requests (prev patch 2)
>   - Some/all of the tests from the current patch (patch 8) could be squashed
>     into this one so that the tests exist directly alongside the new
>     functionality they're testing.


I think that order make sense - I'll rearrange for my next iteration.
Thanks!

>>
>> Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
>> ---
>>  t/helper/test-credential-helper-replay.sh |  14 +++
>>  t/t5556-http-auth.sh                      | 120 +++++++++++++++++++++-
>>  2 files changed, 133 insertions(+), 1 deletion(-)
>>  create mode 100755 t/helper/test-credential-helper-replay.sh
>>
>> diff --git a/t/helper/test-credential-helper-replay.sh b/t/helper/test-credential-helper-replay.sh
>> new file mode 100755
>> index 00000000000..03e5e63dad6
>> --- /dev/null
>> +++ b/t/helper/test-credential-helper-replay.sh
> 
> I'm not sure a 't/helper' file is the right place for this - it's a pretty
> simple shell script, but it defines a lot of information (namely 'teefile',
> 'catfile') that is otherwise unexplained in 't5556'. 
> 
> What about something like 'lib-rebase.sh' and its 'set_fake_editor()'? You
> could create a similar test lib ('lib-credential-helper.sh') and wrapper
> function (' that writes out a custom credential helper. Something like
> 'set_fake_credential_helper()' could also take 'teefile' and 'catfile' as
> arguments, making their names more transparent to 't5556'.

The `lib-rebase.sh` script sets the fake editor by setting an environment
variable (from what I can see). Credential helpers can only be set via config
or command-line arg. Would it be easier to move writing of the test credential
helper script to the t5556 test script setup?

>> @@ -0,0 +1,14 @@
>> +cmd=$1
>> +teefile=$cmd-actual.cred
>> +catfile=$cmd-response.cred
>> +rm -f $teefile
>> +while read line;
>> +do
>> +	if test -z "$line"; then
>> +		break;
>> +	fi
>> +	echo "$line" >> $teefile
>> +done
>> +if test "$cmd" = "get"; then
>> +	cat $catfile
>> +fi
>> diff --git a/t/t5556-http-auth.sh b/t/t5556-http-auth.sh
>> index 78da151f122..541fa32bd77 100755
>> --- a/t/t5556-http-auth.sh
>> +++ b/t/t5556-http-auth.sh
>> @@ -26,6 +26,8 @@ PID_FILE="$(pwd)"/pid-file.pid
>>  SERVER_LOG="$(pwd)"/OUT.server.log
>>  
>>  PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
>> +CREDENTIAL_HELPER="$GIT_BUILD_DIR/t/helper/test-credential-helper-replay.sh" \
>> +	&& export CREDENTIAL_HELPER
> 
> I see - this is how you connect the "test" credential helper to the HTTP
> server and header parsing (as implemented in patches 1 & 2), so that the
> results can be compared for correctness.
> 
> nit: you can just 'export CREDENTIAL_HELPER="..."', rather than breaking it
> into two lines. You also shouldn't need to 'export' at all - the value will
> be set in the context of the test.

I tried this originally, but got errors from one of the environments in CI that
this was not portable.

>>  
>>  test_expect_success 'setup repos' '
>>  	test_create_repo "$REPO_DIR" &&
>> @@ -91,7 +93,8 @@ start_http_server () {
>>  
>>  per_test_cleanup () {
>>  	stop_http_server &&
>> -	rm -f OUT.*
>> +	rm -f OUT.* &&
>> +	rm -f *.cred
>>  }
>>  
>>  test_expect_success 'http auth anonymous no challenge' '
>> @@ -102,4 +105,119 @@ test_expect_success 'http auth anonymous no challenge' '
>>  	git ls-remote $ORIGIN_URL
>>  '
>>  
>> +test_expect_success 'http auth www-auth headers to credential helper basic valid' '
> 
> ...
> 
>> +test_expect_success 'http auth www-auth headers to credential helper custom schemes' '
> 
> ...
> 
>> +test_expect_success 'http auth www-auth headers to credential helper invalid' '
> 
> These tests all look good. That said, is there any way to test more
> bizarre/edge cases (headers too long to fit on one line, headers that end
> with a long string of whitespace, etc.)?
> 

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 8/8] t5556: add HTTP authentication tests
  2022-12-15  0:21           ` Junio C Hamano
@ 2023-01-11 22:05             ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2023-01-11 22:05 UTC (permalink / raw)
  To: Junio C Hamano, Victoria Dye
  Cc: Matthew John Cheetham via GitGitGadget, git, Derrick Stolee,
	Lessley Dennington, M Hickford, Jeff Hostetler, Glen Choo,
	Matthew John Cheetham

On 2022-12-14 16:21, Junio C Hamano wrote:

> Victoria Dye <vdye@github.com> writes:
> 
>> A general comment about this series - the way you have the patches organized
>> means that the "feature" content you're trying to integrate (the first two
>> patches) is contextually separated from these tests. For people that
>> learn/understand code via examples in tests, this makes it really difficult
>> to understand what's going on. To avoid that, I think you could rearrange
>> the patches pretty easily:
>> ...
> 
> Thanks for a thorough review of the entire series, with concrete
> suggestions for improvements with encouragements sprinkled in.
> 
> Very much appreciated.
> 

Yes! Thank you Victoria for the detailed and thorough review.
I also too very much appreciate it :-)

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers
  2022-12-14 23:15         ` Victoria Dye
@ 2023-01-11 22:09           ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2023-01-11 22:09 UTC (permalink / raw)
  To: Victoria Dye, Matthew John Cheetham via GitGitGadget, git
  Cc: Derrick Stolee, Lessley Dennington, M Hickford, Jeff Hostetler,
	Glen Choo, Matthew John Cheetham

On 2022-12-14 15:15, Victoria Dye wrote:

> Matthew John Cheetham via GitGitGadget wrote:
>> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
>> +{
>> +	size_t size = eltsize * nmemb;
>> +	struct strvec *values = &http_auth.wwwauth_headers;
>> +	struct strbuf buf = STRBUF_INIT;
>> +	const char *val;
>> +	const char *z = NULL;
>> +
>> +	/*
>> +	 * Header lines may not come NULL-terminated from libcurl so we must
>> +	 * limit all scans to the maximum length of the header line, or leverage
>> +	 * strbufs for all operations.
>> +	 *
>> +	 * In addition, it is possible that header values can be split over
>> +	 * multiple lines as per RFC 2616 (even though this has since been
>> +	 * deprecated in RFC 7230). A continuation header field value is
>> +	 * identified as starting with a space or horizontal tab.
>> +	 *
>> +	 * The formal definition of a header field as given in RFC 2616 is:
>> +	 *
>> +	 *   message-header = field-name ":" [ field-value ]
>> +	 *   field-name     = token
>> +	 *   field-value    = *( field-content | LWS )
>> +	 *   field-content  = <the OCTETs making up the field-value
>> +	 *                    and consisting of either *TEXT or combinations
>> +	 *                    of token, separators, and quoted-string>
>> +	 */
>> +
>> +	strbuf_add(&buf, ptr, size);
>> +
>> +	/* Strip the CRLF that should be present at the end of each field */
>> +	strbuf_trim_trailing_newline(&buf);
>> +
>> +	/* Start of a new WWW-Authenticate header */
>> +	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
>> +		while (isspace(*val))
>> +			val++;
> 
> Per the RFC [1]: 
> 
>> The field value MAY be preceded by any amount of LWS, though a single SP
>> is preferred.
> 
> And LWS (linear whitespace) is defined as:
> 
>> CRLF           = CR LF 
>> LWS            = [CRLF] 1*( SP | HT )
> 
> and 'isspace()' includes CR, LF, SP, and HT [2]. 
> 
> Looks good!
> 
> [1] https://datatracker.ietf.org/doc/html/rfc2616#section-4-2
> [2] https://linux.die.net/man/3/isspace
> 
>> +
>> +		strvec_push(values, val);
> 
> I had the same question about "what happens with an empty 'val' here?" as
> Stolee did earlier [3], but I *think* the "zero length" (i.e., single null
> terminator) will be copied successfully. It's probably worth testing that
> explicitly, though (I see you set up tests in later patches - ideally a 
> "www-authenticate:<mix of whitespace>" line could be tested there).
> 
> [3] https://lore.kernel.org/git/9fded44b-c503-f8e5-c6a6-93e882d50e27@github.com/

There is a bug here. Empty header values would indeed be appended
successfully, but this eventually results in empty values for `wwwauth[]`
being sent over to credential helpers (which should treat the empty value as
a reset of the existing list!!)

Really, empty values should be ignored.
My next iteration should hopefully be a bit more careful around these cases.

>> +		http_auth.header_is_last_match = 1;
>> +		goto exit;
>> +	}
>> +
>> +	/*
>> +	 * This line could be a continuation of the previously matched header
>> +	 * field. If this is the case then we should append this value to the
>> +	 * end of the previously consumed value.
>> +	 */
>> +	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
>> +		const char **v = values->v + values->nr - 1;
>> +		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
> 
> In this case (where the line is a continuation of a 'www-authenticate'
> header), it looks like the code here expects *exactly* one LWS at the start
> of the line ('isspace(*buf.buf)' requiring at least one space to append the
> header, 'ptr + 1' skipping no more than one). But, according to the RFC, it
> could be more than one:
> 
>> Header fields can be extended over multiple lines by preceding each extra
>> line with at least one SP or HT.
> 
> So I think 'buf.buf' might need to have all preceding spaces removed, like
> you did in the "Start of a new WWW-Authenticate header" block.
> 
> Also, if you're copying 'ptr' into 'buf' to avoid issues from a missing null
> terminator, wouldn't you want to use 'buf.buf' (instead of 'ptr') in
> 'xstrfmt()'?

Sure! Good points.

>> +
>> +		free((void*)*v);
>> +		*v = append;
> 
> I was about to suggest (optionally) rewriting this to use 'strvec_pop()' and
> 'strvec_push_nodup()':
> 
> 	strvec_pop(values); 
> 	strvec_push_nodup(values, append);
> 
> to maybe make this a bit easier to follow, but unfortunately
> 'strvec_push_nodup()' isn't available outside of 'strvec.c'. If you did want
> to use 'strvec' functions, you could remove the 'static' from
> 'strvec_push_nodup()' and add it to 'strvec.h' it in a later reroll, but I
> don't consider that change "blocking" or even important enough to warrant
> its own reroll. 

That wouldn't be too much effort, and would help simplify overall the move
to using `strbuf_` functions. Check my next iteration for this.

>> +
>> +		goto exit;
>> +	}
>> +
>> +	/* This is the start of a new header we don't care about */
>> +	http_auth.header_is_last_match = 0;
>> +
>> +	/*
>> +	 * If this is a HTTP status line and not a header field, this signals
>> +	 * a different HTTP response. libcurl writes all the output of all
>> +	 * response headers of all responses, including redirects.
>> +	 * We only care about the last HTTP request response's headers so clear
>> +	 * the existing array.
>> +	 */
>> +	if (skip_iprefix(buf.buf, "http/", &z))
>> +		strvec_clear(values);
> 
> The comments describing the intended behavior (as well as the commit
> message) are clear and explain the somewhat esoteric (at least to my
> untrained eye ;) ) code. Thanks!
> 
>> +
>> +exit:
>> +	strbuf_release(&buf);
>> +	return size;
>> +}
>> +
>>  size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
>>  {
>>  	return nmemb;
>> @@ -1864,6 +1940,8 @@ static int http_request(const char *url,
>>  					 fwrite_buffer);
>>  	}
>>  
>> +	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);
>> +
>>  	accept_language = http_get_accept_language_header();
>>  
>>  	if (accept_language)
> 

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* Re: [PATCH v4 1/8] http: read HTTP WWW-Authenticate response headers
  2022-12-15  9:27         ` Ævar Arnfjörð Bjarmason
@ 2023-01-11 22:11           ` Matthew John Cheetham
  0 siblings, 0 replies; 171+ messages in thread
From: Matthew John Cheetham @ 2023-01-11 22:11 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason,
	Matthew John Cheetham via GitGitGadget
  Cc: git, Derrick Stolee, Lessley Dennington, M Hickford,
	Jeff Hostetler, Glen Choo, Matthew John Cheetham

On 2022-12-15 01:27, Ævar Arnfjörð Bjarmason wrote:

> 
> On Mon, Dec 12 2022, Matthew John Cheetham via GitGitGadget wrote:
> 
>> From: Matthew John Cheetham <mjcheetham@outlook.com>
>> [...]
>>  /* Initialize a credential structure, setting all fields to empty. */
>> diff --git a/http.c b/http.c
>> index 8a5ba3f4776..c4e9cd73e14 100644
>> --- a/http.c
>> +++ b/http.c
>> @@ -183,6 +183,82 @@ size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
>>  	return nmemb;
>>  }
>>  
>> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
>> +{
>> +	size_t size = eltsize * nmemb;
> 
> Just out of general paranoia: use st_mult() here, not "*" (checks for
> overflows)?

Sure! Good point.

>> +	struct strvec *values = &http_auth.wwwauth_headers;
>> +	struct strbuf buf = STRBUF_INIT;
>> +	const char *val;
>> +	const char *z = NULL;
> 
> Why NULL-init the "z" here, but not the "val"? Both look like they
> should be un-init'd. We also tend to call a throw-away char pointer "p",
> not "z", but anyway (more below).... 
> 
>> +
>> +	/*
>> +	 * Header lines may not come NULL-terminated from libcurl so we must
>> +	 * limit all scans to the maximum length of the header line, or leverage
>> +	 * strbufs for all operations.
>> +	 *
>> +	 * In addition, it is possible that header values can be split over
>> +	 * multiple lines as per RFC 2616 (even though this has since been
>> +	 * deprecated in RFC 7230). A continuation header field value is
>> +	 * identified as starting with a space or horizontal tab.
>> +	 *
>> +	 * The formal definition of a header field as given in RFC 2616 is:
>> +	 *
>> +	 *   message-header = field-name ":" [ field-value ]
>> +	 *   field-name     = token
>> +	 *   field-value    = *( field-content | LWS )
>> +	 *   field-content  = <the OCTETs making up the field-value
>> +	 *                    and consisting of either *TEXT or combinations
>> +	 *                    of token, separators, and quoted-string>
>> +	 */
>> +
>> +	strbuf_add(&buf, ptr, size);
>> +
>> +	/* Strip the CRLF that should be present at the end of each field */
>> +	strbuf_trim_trailing_newline(&buf);
>> +
>> +	/* Start of a new WWW-Authenticate header */
>> +	if (skip_iprefix(buf.buf, "www-authenticate:", &val)) {
>> +		while (isspace(*val))
>> +			val++;
> 
> As we already have a "struct strbuf" here, maybe we can instead
> consistently use the strbuf functions, e.g. strbuf_ltrim() in this case.

That's a good point. I can move to using strbuf functions entirely.

> I haven't reviewed this in detail, maybe it's not easy or worth it
> here...
> 
>> +
>> +		strvec_push(values, val);
>> +		http_auth.header_is_last_match = 1;
>> +		goto exit;
>> +	}
>> +
>> +	/*
>> +	 * This line could be a continuation of the previously matched header
>> +	 * field. If this is the case then we should append this value to the
>> +	 * end of the previously consumed value.
>> +	 */
>> +	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
>> +		const char **v = values->v + values->nr - 1;
> 
> It makes no difference to the compiler, but perhaps using []-indexing
> here is more idiomatic, for getting the nth member of this strvec?

Sure!

>> +		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
>> +
>> +		free((void*)*v);
> 
> Is this reaching into the strvec & manually memory-managing it
> unavoidable, or can we use strvec_pop() etc?

Again, good point. I can rework this to pop and push a new, joined value.

>> +		*v = append;
>> +
>> +		goto exit;
>> +	}
>> +
>> +	/* This is the start of a new header we don't care about */
>> +	http_auth.header_is_last_match = 0;
>> +
>> +	/*
>> +	 * If this is a HTTP status line and not a header field, this signals
>> +	 * a different HTTP response. libcurl writes all the output of all
>> +	 * response headers of all responses, including redirects.
>> +	 * We only care about the last HTTP request response's headers so clear
>> +	 * the existing array.
>> +	 */
>> +	if (skip_iprefix(buf.buf, "http/", &z))
> 
> ...Don't you want to just skip this "z" variable altogether and use
> istarts_with() instead? All you seem to care about is whether it starts
> with it, not what the offset is.
> 

Again, a good point. Thanks for the suggestions. My next iteration will include
this.

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 171+ messages in thread

* [PATCH v5 00/10] Enhance credential helper protocol to include auth headers
  2022-12-12 21:36     ` [PATCH v4 0/8] " Matthew John Cheetham via GitGitGadget
                         ` (7 preceding siblings ...)
  2022-12-12 21:36       ` [PATCH v4 8/8] t5556: add HTTP authentication tests Matthew John Cheetham via GitGitGadget
@ 2023-01-11 22:13       ` Matthew John Cheetham via GitGitGadget
  2023-01-11 22:13         ` [PATCH v5 01/10] daemon: libify socket setup and option functions Matthew John Cheetham via GitGitGadget
                           ` (10 more replies)
  8 siblings, 11 replies; 171+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2023-01-11 22:13 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Victoria Dye,
	Matthew John Cheetham

Following from my original RFC submission [0], this submission is considered
ready for full review. This patch series is now based on top of current
master (9c32cfb49c60fa8173b9666db02efe3b45a8522f) that includes my now
separately submitted patches [1] to fix up the other credential helpers'
behaviour.

In this patch series I update the existing credential helper design in order
to allow for some new scenarios, and future evolution of auth methods that
Git hosts may wish to provide. I outline the background, summary of changes
and some challenges below.

Testing these new additions, I introduce a new test helper test-http-server
that acts as a frontend to git-http-backend; a mini HTTP server sharing code
with git-daemon, with simple authentication configurable by a config file.


Background
==========

Git uses a variety of protocols [2]: local, Smart HTTP, Dumb HTTP, SSH, and
Git. Here I focus on the Smart HTTP protocol, and attempt to enhance the
authentication capabilities of this protocol to address limitations (see
below).

The Smart HTTP protocol in Git supports a few different types of HTTP
authentication - Basic and Digest (RFC 2617) [3], and Negotiate (RFC 2478)
[4]. Git uses a extensible model where credential helpers can provide
credentials for protocols [5]. Several helpers support alternatives such as
OAuth authentication (RFC 6749) [6], but this is typically done as an
extension. For example, a helper might use basic auth and set the password
to an OAuth Bearer access token. Git uses standard input and output to
communicate with credential helpers.

After a HTTP 401 response, Git would call a credential helper with the
following over standard input:

protocol=https
host=example.com


And then a credential helper would return over standard output:

protocol=https
host=example.com
username=bob@id.example.com
password=<BEARER-TOKEN>


Git then the following request to the remote, including the standard HTTP
Authorization header (RFC 7235 Section 4.2) [7]:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: Basic base64(bob@id.example.com:<BEARER-TOKEN>)


Credential helpers are encouraged (see gitcredentials.txt) to return the
minimum information necessary.


Limitations
===========

Because this credential model was built mostly for password based
authentication systems, it's somewhat limited. In particular:

 1. To generate valid credentials, additional information about the request
    (or indeed the requestee and their device) may be required. For example,
    OAuth is based around scopes. A scope, like "git.read", might be
    required to read data from the remote. However, the remote cannot tell
    the credential helper what scope is required for this request.

 2. This system is not fully extensible. Each time a new type of
    authentication (like OAuth Bearer) is invented, Git needs updates before
    credential helpers can take advantage of it (or leverage a new
    capability in libcurl).


Goals
=====

 * As a user with multiple federated cloud identities:
   
   * Reach out to a remote and have my credential helper automatically
     prompt me for the correct identity.
   * Allow credential helpers to differentiate between different authorities
     or authentication/authorization challenge types, even from the same DNS
     hostname (and without needing to use credential.useHttpPath).
   * Leverage existing authentication systems built-in to many operating
     systems and devices to boost security and reduce reliance on passwords.

 * As a Git host and/or cloud identity provider:
   
   * Enforce security policies (like requiring two-factor authentication)
     dynamically.
   * Allow integration with third party standard based identity providers in
     enterprises allowing customers to have a single plane of control for
     critical identities with access to source code.


Design Principles
=================

 * Use the existing infrastructure. Git credential helpers are an
   already-working model.
 * Follow widely-adopted time-proven open standards, avoid net new ideas in
   the authentication space.
 * Minimize knowledge of authentication in Git; maintain modularity and
   extensibility.


Proposed Changes
================

 1. Teach Git to read HTTP response headers, specifically the standard
    WWW-Authenticate (RFC 7235 Section 4.1) headers.

 2. Teach Git to include extra information about HTTP responses that require
    authentication when calling credential helpers. Specifically the
    WWW-Authenticate header information.
    
    Because the extra information forms an ordered list, and the existing
    credential helper I/O format only provides for simple key=value pairs,
    we introduce a new convention for transmitting an ordered list of
    values. Key names that are suffixed with a C-style array syntax should
    have values considered to form an order list, i.e. key[]=value, where
    the order of the key=value pairs in the stream specifies the order.
    
    For the WWW-Authenticate header values we opt to use the key wwwauth[].


Handling the WWW-Authenticate header in detail
==============================================

RFC 6750 [8] envisions that OAuth Bearer resource servers would give
responses that include WWW-Authenticate headers, for example:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


Specifically, a WWW-Authenticate header consists of a scheme and arbitrary
attributes, depending on the scheme. This pattern enables generic OAuth or
OpenID Connect [9] authorities. Note that it is possible to have several
WWW-Authenticate challenges in a response.

First Git attempts to make a request, unauthenticated, which fails with a
401 response and includes WWW-Authenticate header(s).

Next, Git invokes a credential helper which may prompt the user. If the user
approves, a credential helper can generate a token (or any auth challenge
response) to be used for that request.

For example: with a remote that supports bearer tokens from an OpenID
Connect [9] authority, a credential helper can use OpenID Connect's
Discovery [10] and Dynamic Client Registration [11] to register a client and
make a request with the correct permissions to access the remote. In this
manner, a user can be dynamically sent to the right federated identity
provider for a remote without any up-front configuration or manual
processes.

Following from the principle of keeping authentication knowledge in Git to a
minimum, we modify Git to add all WWW-Authenticate values to the credential
helper call.

Git sends over standard input:

protocol=https
host=example.com
wwwauth[]=Bearer realm="login.example", scope="git.readwrite"
wwwauth[]=Basic realm="login.example"


A credential helper that understands the extra wwwauth[n] property can
decide on the "best" or correct authentication scheme, generate credentials
for the request, and interact with the user.

The credential helper would then return over standard output:

protocol=https
host=example.com
path=foo.git
username=bob@identity.example
password=<BEARER-TOKEN>


Note that WWW-Authenticate supports multiple challenges, either in one
header:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"


or in multiple headers:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="login.example", scope="git.readwrite"
WWW-Authenticate: Basic realm="login.example"


These have equivalent meaning (RFC 2616 Section 4.2 [12]). To simplify the
implementation, Git will not merge or split up any of these WWW-Authenticate
headers, and instead pass each header line as one credential helper
property. The credential helper is responsible for splitting, merging, and
otherwise parsing these header values.

An alternative option to sending the header fields individually would be to
merge the header values in to one key=value property, for example:

...
wwwauth=Bearer realm="login.example", scope="git.readwrite", Basic realm="login.example"



Future work
===========

In the future we can further expand the protocol to allow credential helpers
decide the best authentication scheme. Today credential helpers are still
only expected to return a username/password pair to Git, meaning the other
authentication schemes that may be offered still need challenge responses
sent via a Basic Authorization header. The changes outlined above still
permit helpers to select and configure an available authentication mode, but
require the remote for example to unpack a bearer token from a basic
challenge.

More careful consideration is required in the handling of custom
authentication schemes which may not have a username, or may require
arbitrary additional request header values be set.

For example imagine a new "FooBar" authentication scheme that is surfaced in
the following response:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: FooBar realm="login.example", algs="ES256 PS256"


With support for arbitrary authentication schemes, Git would call credential
helpers with the following over standard input:

protocol=https
host=example.com
wwwauth[]=FooBar realm="login.example", algs="ES256 PS256", nonce="abc123"


And then an enlightened credential helper could return over standard output:

protocol=https
host=example.com
authtype=FooBar
username=bob@id.example.com
password=<FooBar credential>
header[]=X-FooBar: 12345
header[]=X-FooBar-Alt: ABCDEF


Git would be expected to attach this authorization header to the next
request:

GET /info/refs?service=git-upload-pack HTTP/1.1
Host: git.example
Git-Protocol: version=2
Authorization: FooBar <FooBar credential>
X-FooBar: 12345
X-FooBar-Alt: ABCDEF



Why not SSH?
============

There's nothing wrong with SSH. However, Git's Smart HTTP transport is
widely used, often with OAuth Bearer tokens. Git's Smart HTTP transport
sometimes requires less client setup than SSH transport, and works in
environments when SSH ports may be blocked. As long as Git supports HTTP
transport, it should support common and popular HTTP authentication methods.


References
==========

 * [0] [PATCH 0/8] [RFC] Enhance credential helper protocol to include auth
   headers
   https://lore.kernel.org/git/pull.1352.git.1663097156.gitgitgadget@gmail.com/

 * [1] [PATCH 0/3] Correct credential helper discrepancies handling input
   https://lore.kernel.org/git/pull.1363.git.1663865974.gitgitgadget@gmail.com/

 * [2] Git on the Server - The Protocols
   https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols

 * [3] HTTP Authentication: Basic and Digest Access Authentication
   https://datatracker.ietf.org/doc/html/rfc2617

 * [4] The Simple and Protected GSS-API Negotiation Mechanism
   https://datatracker.ietf.org/doc/html/rfc2478

 * [5] Git Credentials - Custom Helpers
   https://git-scm.com/docs/gitcredentials#_custom_helpers

 * [6] The OAuth 2.0 Authorization Framework
   https://datatracker.ietf.org/doc/html/rfc6749

 * [7] Hypertext Transfer Protocol (HTTP/1.1): Authentication
   https://datatracker.ietf.org/doc/html/rfc7235

 * [8] The OAuth 2.0 Authorization Framework: Bearer Token Usage
   https://datatracker.ietf.org/doc/html/rfc6750

 * [9] OpenID Connect Core 1.0
   https://openid.net/specs/openid-connect-core-1_0.html

 * [10] OpenID Connect Discovery 1.0
   https://openid.net/specs/openid-connect-discovery-1_0.html

 * [11] OpenID Connect Dynamic Client Registration 1.0
   https://openid.net/specs/openid-connect-registration-1_0.html

 * [12] Hypertext Transfer Protocol (HTTP/1.1)
   https://datatracker.ietf.org/doc/html/rfc2616


Updates from RFC
================

 * Submitted first three patches as separate submission:
   https://lore.kernel.org/git/pull.1363.git.1663865974.gitgitgadget@gmail.com/

 * Various style fixes and updates to- and addition of comments.

 * Drop the explicit integer index in new 'array' style credential helper
   attrbiutes ("key[n]=value" becomes just "key[]=value").

 * Added test helper; a mini HTTP server, and several tests.


Updates in v3
=============

 * Split final patch that added the test-http-server in to several, easier
   to review patches.

 * Updated wording in git-credential.txt to clarify which side of the
   credential helper protocol is sending/receiving the new wwwauth and
   authtype attributes.


Updates in v4
=============

 * Drop authentication scheme selection authtype attribute patches to
   greatly simplify the series; auth scheme selection is punted to a future
   series. This series still allows credential helpers to generate
   credentials and intelligently select correct identities for a given auth
   challenge.


Updates in v5
=============

 * Libify parts of daemon.c and share implementation with test-http-server.

 * Clarify test-http-server Git request regex pattern and auth logic
   comments.

 * Use STD*_FILENO in place of 'magic' file descriptor numbers.

 * Use strbuf_* functions in continuation header parsing.

 * Use configuration file to configure auth for test-http-server rather than
   command-line arguments. Add ability to specify arbitrary extra headers
   that is useful for testing 'malformed' server responses.

 * Use st_mult over unchecked multiplication in http.c curl callback
   functions.

 * Fix some documentation line break issues.

 * Reorder some commits to bring in the tests and test-http-server helper
   first and, then the WWW-Authentication changes, alongside tests to cover.

 * Expose previously static strvec_push_nodup function.

 * Merge the two timeout args for test-http-server (--timeout and
   --init-timeout) that were a hang-over from the original daemon.c but are
   no longer required here.

 * Be more careful around continuation headers where they may be empty
   strings. Add more tests to cover these header types.

 * Include standard trace2 tracing calls at start of test-http-server
   helper.

Matthew John Cheetham (10):
  daemon: libify socket setup and option functions
  daemon: libify child process handling functions
  daemon: rename some esoteric/laboured terminology
  test-http-server: add stub HTTP server test helper
  test-http-server: add HTTP error response function
  test-http-server: add simple authentication
  http: replace unsafe size_t multiplication with st_mult
  strvec: expose strvec_push_nodup for external use
  http: read HTTP WWW-Authenticate response headers
  credential: add WWW-Authenticate header to cred requests

 Documentation/git-credential.txt          |  19 +-
 Makefile                                  |   2 +
 contrib/buildsystems/CMakeLists.txt       |  11 +-
 credential.c                              |  13 +
 credential.h                              |  15 +
 daemon-utils.c                            | 286 +++++++
 daemon-utils.h                            |  38 +
 daemon.c                                  | 306 +------
 http.c                                    | 102 ++-
 strvec.c                                  |   2 +-
 strvec.h                                  |   3 +
 t/helper/.gitignore                       |   1 +
 t/helper/test-credential-helper-replay.sh |  14 +
 t/helper/test-http-server.c               | 920 ++++++++++++++++++++++
 t/t5556-http-auth.sh                      | 372 +++++++++
 15 files changed, 1801 insertions(+), 303 deletions(-)
 create mode 100644 daemon-utils.c
 create mode 100644 daemon-utils.h
 create mode 100755 t/helper/test-credential-helper-replay.sh
 create mode 100644 t/helper/test-http-server.c
 create mode 100755 t/t5556-http-auth.sh


base-commit: c48035d29b4e524aed3a32f0403676f0d9128863
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1352%2Fmjcheetham%2Femu-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1352/mjcheetham/emu-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1352

Range-diff vs v4:

  -:  ----------- >  1:  74b0de14185 daemon: libify socket setup and option functions
  -:  ----------- >  2:  bc972fc8d3d daemon: libify child process handling functions
  -:  ----------- >  3:  8f176d5955d daemon: rename some esoteric/laboured terminology
  3:  07a1845ea56 !  4:  706fb3781bd test-http-server: add stub HTTP server test helper
     @@ Commit message
          Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
      
       ## Makefile ##
     -@@ Makefile: else
     - 	endif
     - 	BASIC_CFLAGS += $(CURL_CFLAGS)
     +@@ Makefile: TEST_BUILTINS_OBJS += test-xml-encode.o
     + # Do not add more tests here unless they have extra dependencies. Add
     + # them in TEST_BUILTINS_OBJS above.
     + TEST_PROGRAMS_NEED_X += test-fake-ssh
     ++TEST_PROGRAMS_NEED_X += test-http-server
     + TEST_PROGRAMS_NEED_X += test-tool
       
     -+	TEST_PROGRAMS_NEED_X += test-http-server
     -+
     - 	REMOTE_CURL_PRIMARY = git-remote-http$X
     - 	REMOTE_CURL_ALIASES = git-remote-https$X git-remote-ftp$X git-remote-ftps$X
     - 	REMOTE_CURL_NAMES = $(REMOTE_CURL_PRIMARY) $(REMOTE_CURL_ALIASES)
     + TEST_PROGRAMS = $(patsubst %,t/helper/%$X,$(TEST_PROGRAMS_NEED_X))
      
       ## contrib/buildsystems/CMakeLists.txt ##
     +@@ contrib/buildsystems/CMakeLists.txt: if(BUILD_TESTING)
     + add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c)
     + target_link_libraries(test-fake-ssh common-main)
     + 
     ++add_executable(test-http-server ${CMAKE_SOURCE_DIR}/t/helper/test-http-server.c)
     ++target_link_libraries(test-http-server common-main)
     ++
     + #reftable-tests
     + parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS")
     + list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
     +@@ contrib/buildsystems/CMakeLists.txt: if(MSVC)
     + 				PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
     + 	set_target_properties(test-fake-ssh test-tool
     + 				PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
     ++
     ++	set_target_properties(test-http-server
     ++			PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
     ++	set_target_properties(test-http-server
     ++			PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
     + endif()
     + 
     + #wrapper scripts
      @@ contrib/buildsystems/CMakeLists.txt: set(wrapper_scripts
     - set(wrapper_test_scripts
     - 	test-fake-ssh test-tool)
     + 	git git-upload-pack git-receive-pack git-upload-archive git-shell git-remote-ext scalar)
       
     -+if(CURL_FOUND)
     -+       list(APPEND wrapper_test_scripts test-http-server)
     -+
     -+       add_executable(test-http-server ${CMAKE_SOURCE_DIR}/t/helper/test-http-server.c)
     -+       target_link_libraries(test-http-server common-main)
     -+
     -+       if(MSVC)
     -+               set_target_properties(test-http-server
     -+                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}/t/helper)
     -+               set_target_properties(test-http-server
     -+                                       PROPERTIES RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}/t/helper)
     -+       endif()
     -+endif()
     + set(wrapper_test_scripts
     +-	test-fake-ssh test-tool)
     +-
     ++	test-http-server test-fake-ssh test-tool)
       
       foreach(script ${wrapper_scripts})
       	file(STRINGS ${CMAKE_SOURCE_DIR}/wrap-for-bin.sh content NEWLINE_CONSUME)
     @@ t/helper/.gitignore
      
       ## t/helper/test-http-server.c (new) ##
      @@
     ++#include "daemon-utils.h"
      +#include "config.h"
      +#include "run-command.h"
      +#include "strbuf.h"
     @@ t/helper/test-http-server.c (new)
      +
      +static const char test_http_auth_usage[] =
      +"http-server [--verbose]\n"
     -+"           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
     ++"           [--timeout=<n>] [--max-connections=<n>]\n"
      +"           [--reuseaddr] [--pid-file=<file>]\n"
      +"           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
      +;
      +
     -+/* Timeout, and initial timeout */
      +static unsigned int timeout;
     -+static unsigned int init_timeout;
      +
      +static void logreport(const char *label, const char *err, va_list params)
      +{
     @@ t/helper/test-http-server.c (new)
      +	va_end(params);
      +}
      +
     -+static void set_keep_alive(int sockfd)
     -+{
     -+	int ka = 1;
     -+
     -+	if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &ka, sizeof(ka)) < 0) {
     -+		if (errno != ENOTSOCK)
     -+			logerror("unable to set SO_KEEPALIVE on socket: %s",
     -+				strerror(errno));
     -+	}
     -+}
     -+
      +/*
      + * The code in this section is used by "worker" instances to service
      + * a single connection from a client.  The worker talks to the client
     @@ t/helper/test-http-server.c (new)
      +	 * Close the socket and clean up.  Does not imply an error.
      +	 */
      +	WR_HANGUP   = 1<<1,
     -+
     -+	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
      +};
      +
      +static enum worker_result worker(void)
     @@ t/helper/test-http-server.c (new)
      +	if (client_addr)
      +		loginfo("Connection from %s:%s", client_addr, client_port);
      +
     -+	set_keep_alive(0);
     ++	set_keep_alive(0, logerror);
      +
      +	while (1) {
     -+		if (write_in_full(1, response, strlen(response)) < 0) {
     ++		if (write_in_full(STDOUT_FILENO, response, strlen(response)) < 0) {
      +			logerror("unable to write response");
      +			wr = WR_IO_ERROR;
      +		}
      +
     -+		if (wr & WR_STOP_THE_MUSIC)
     ++		if (wr != WR_OK)
      +			break;
      +	}
      +
     -+	close(0);
     -+	close(1);
     ++	close(STDIN_FILENO);
     ++	close(STDOUT_FILENO);
      +
      +	return !!(wr & WR_IO_ERROR);
      +}
      +
     -+/*
     -+ * This section contains the listener and child-process management
     -+ * code used by the primary instance to accept incoming connections
     -+ * and dispatch them to async child process "worker" instances.
     -+ */
     -+
     -+static int addrcmp(const struct sockaddr_storage *s1,
     -+		   const struct sockaddr_storage *s2)
     -+{
     -+	const struct sockaddr *sa1 = (const struct sockaddr*) s1;
     -+	const struct sockaddr *sa2 = (const struct sockaddr*) s2;
     -+
     -+	if (sa1->sa_family != sa2->sa_family)
     -+		return sa1->sa_family - sa2->sa_family;
     -+	if (sa1->sa_family == AF_INET)
     -+		return memcmp(&((struct sockaddr_in *)s1)->sin_addr,
     -+		    &((struct sockaddr_in *)s2)->sin_addr,
     -+		    sizeof(struct in_addr));
     -+#ifndef NO_IPV6
     -+	if (sa1->sa_family == AF_INET6)
     -+		return memcmp(&((struct sockaddr_in6 *)s1)->sin6_addr,
     -+		    &((struct sockaddr_in6 *)s2)->sin6_addr,
     -+		    sizeof(struct in6_addr));
     -+#endif
     -+	return 0;
     -+}
     -+
      +static int max_connections = 32;
      +
      +static unsigned int live_children;
      +
     -+static struct child {
     -+	struct child *next;
     -+	struct child_process cld;
     -+	struct sockaddr_storage address;
     -+} *firstborn;
     -+
     -+static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen)
     -+{
     -+	struct child *newborn, **cradle;
     -+
     -+	newborn = xcalloc(1, sizeof(*newborn));
     -+	live_children++;
     -+	memcpy(&newborn->cld, cld, sizeof(*cld));
     -+	memcpy(&newborn->address, addr, addrlen);
     -+	for (cradle = &firstborn; *cradle; cradle = &(*cradle)->next)
     -+		if (!addrcmp(&(*cradle)->address, &newborn->address))
     -+			break;
     -+	newborn->next = *cradle;
     -+	*cradle = newborn;
     -+}
     -+
     -+/*
     -+ * This gets called if the number of connections grows
     -+ * past "max_connections".
     -+ *
     -+ * We kill the newest connection from a duplicate IP.
     -+ */
     -+static void kill_some_child(void)
     -+{
     -+	const struct child *blanket, *next;
     -+
     -+	if (!(blanket = firstborn))
     -+		return;
     -+
     -+	for (; (next = blanket->next); blanket = next)
     -+		if (!addrcmp(&blanket->address, &next->address)) {
     -+			kill(blanket->cld.pid, SIGTERM);
     -+			break;
     -+		}
     -+}
     -+
     -+static void check_dead_children(void)
     -+{
     -+	int status;
     -+	pid_t pid;
     -+
     -+	struct child **cradle, *blanket;
     -+	for (cradle = &firstborn; (blanket = *cradle);)
     -+		if ((pid = waitpid(blanket->cld.pid, &status, WNOHANG)) > 1) {
     -+			const char *dead = "";
     -+			if (status)
     -+				dead = " (with error)";
     -+			loginfo("[%"PRIuMAX"] Disconnected%s", (uintmax_t)pid, dead);
     -+
     -+			/* remove the child */
     -+			*cradle = blanket->next;
     -+			live_children--;
     -+			child_process_clear(&blanket->cld);
     -+			free(blanket);
     -+		} else
     -+			cradle = &blanket->next;
     -+}
     ++static struct child *first_child;
      +
      +static struct strvec cld_argv = STRVEC_INIT;
      +static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
     @@ t/helper/test-http-server.c (new)
      +	struct child_process cld = CHILD_PROCESS_INIT;
      +
      +	if (max_connections && live_children >= max_connections) {
     -+		kill_some_child();
     ++		kill_some_child(first_child);
      +		sleep(1);  /* give it some time to die */
     -+		check_dead_children();
     ++		check_dead_children(first_child, &live_children, loginfo);
      +		if (live_children >= max_connections) {
      +			close(incoming);
      +			logerror("Too many children, dropping connection");
     @@ t/helper/test-http-server.c (new)
      +	else if (start_command(&cld))
      +		logerror("unable to fork");
      +	else
     -+		add_child(&cld, addr, addrlen);
     ++		add_child(&cld, addr, addrlen, first_child, &live_children);
      +}
      +
      +static void child_handler(int signo)
     @@ t/helper/test-http-server.c (new)
      +	signal(SIGCHLD, child_handler);
      +}
      +
     -+static int set_reuse_addr(int sockfd)
     -+{
     -+	int on = 1;
     -+
     -+	if (!reuseaddr)
     -+		return 0;
     -+	return setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR,
     -+			  &on, sizeof(on));
     -+}
     -+
     -+struct socketlist {
     -+	int *list;
     -+	size_t nr;
     -+	size_t alloc;
     -+};
     -+
     -+static const char *ip2str(int family, struct sockaddr *sin, socklen_t len)
     -+{
     -+#ifdef NO_IPV6
     -+	static char ip[INET_ADDRSTRLEN];
     -+#else
     -+	static char ip[INET6_ADDRSTRLEN];
     -+#endif
     -+
     -+	switch (family) {
     -+#ifndef NO_IPV6
     -+	case AF_INET6:
     -+		inet_ntop(family, &((struct sockaddr_in6*)sin)->sin6_addr, ip, len);
     -+		break;
     -+#endif
     -+	case AF_INET:
     -+		inet_ntop(family, &((struct sockaddr_in*)sin)->sin_addr, ip, len);
     -+		break;
     -+	default:
     -+		xsnprintf(ip, sizeof(ip), "<unknown>");
     -+	}
     -+	return ip;
     -+}
     -+
     -+#ifndef NO_IPV6
     -+
     -+static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
     -+{
     -+	int socknum = 0;
     -+	char pbuf[NI_MAXSERV];
     -+	struct addrinfo hints, *ai0, *ai;
     -+	int gai;
     -+	long flags;
     -+
     -+	xsnprintf(pbuf, sizeof(pbuf), "%d", listen_port);
     -+	memset(&hints, 0, sizeof(hints));
     -+	hints.ai_family = AF_UNSPEC;
     -+	hints.ai_socktype = SOCK_STREAM;
     -+	hints.ai_protocol = IPPROTO_TCP;
     -+	hints.ai_flags = AI_PASSIVE;
     -+
     -+	gai = getaddrinfo(listen_addr, pbuf, &hints, &ai0);
     -+	if (gai) {
     -+		logerror("getaddrinfo() for %s failed: %s", listen_addr, gai_strerror(gai));
     -+		return 0;
     -+	}
     -+
     -+	for (ai = ai0; ai; ai = ai->ai_next) {
     -+		int sockfd;
     -+
     -+		sockfd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
     -+		if (sockfd < 0)
     -+			continue;
     -+		if (sockfd >= FD_SETSIZE) {
     -+			logerror("Socket descriptor too large");
     -+			close(sockfd);
     -+			continue;
     -+		}
     -+
     -+#ifdef IPV6_V6ONLY
     -+		if (ai->ai_family == AF_INET6) {
     -+			int on = 1;
     -+			setsockopt(sockfd, IPPROTO_IPV6, IPV6_V6ONLY,
     -+				   &on, sizeof(on));
     -+			/* Note: error is not fatal */
     -+		}
     -+#endif
     -+
     -+		if (set_reuse_addr(sockfd)) {
     -+			logerror("Could not set SO_REUSEADDR: %s", strerror(errno));
     -+			close(sockfd);
     -+			continue;
     -+		}
     -+
     -+		set_keep_alive(sockfd);
     -+
     -+		if (bind(sockfd, ai->ai_addr, ai->ai_addrlen) < 0) {
     -+			logerror("Could not bind to %s: %s",
     -+				 ip2str(ai->ai_family, ai->ai_addr, ai->ai_addrlen),
     -+				 strerror(errno));
     -+			close(sockfd);
     -+			continue;	/* not fatal */
     -+		}
     -+		if (listen(sockfd, 5) < 0) {
     -+			logerror("Could not listen to %s: %s",
     -+				 ip2str(ai->ai_family, ai->ai_addr, ai->ai_addrlen),
     -+				 strerror(errno));
     -+			close(sockfd);
     -+			continue;	/* not fatal */
     -+		}
     -+
     -+		flags = fcntl(sockfd, F_GETFD, 0);
     -+		if (flags >= 0)
     -+			fcntl(sockfd, F_SETFD, flags | FD_CLOEXEC);
     -+
     -+		ALLOC_GROW(socklist->list, socklist->nr + 1, socklist->alloc);
     -+		socklist->list[socklist->nr++] = sockfd;
     -+		socknum++;
     -+	}
     -+
     -+	freeaddrinfo(ai0);
     -+
     -+	return socknum;
     -+}
     -+
     -+#else /* NO_IPV6 */
     -+
     -+static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
     -+{
     -+	struct sockaddr_in sin;
     -+	int sockfd;
     -+	long flags;
     -+
     -+	memset(&sin, 0, sizeof sin);
     -+	sin.sin_family = AF_INET;
     -+	sin.sin_port = htons(listen_port);
     -+
     -+	if (listen_addr) {
     -+		/* Well, host better be an IP address here. */
     -+		if (inet_pton(AF_INET, listen_addr, &sin.sin_addr.s_addr) <= 0)
     -+			return 0;
     -+	} else {
     -+		sin.sin_addr.s_addr = htonl(INADDR_ANY);
     -+	}
     -+
     -+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
     -+	if (sockfd < 0)
     -+		return 0;
     -+
     -+	if (set_reuse_addr(sockfd)) {
     -+		logerror("Could not set SO_REUSEADDR: %s", strerror(errno));
     -+		close(sockfd);
     -+		return 0;
     -+	}
     -+
     -+	set_keep_alive(sockfd);
     -+
     -+	if (bind(sockfd, (struct sockaddr *)&sin, sizeof sin) < 0) {
     -+		logerror("Could not bind to %s: %s",
     -+			 ip2str(AF_INET, (struct sockaddr *)&sin, sizeof(sin)),
     -+			 strerror(errno));
     -+		close(sockfd);
     -+		return 0;
     -+	}
     -+
     -+	if (listen(sockfd, 5) < 0) {
     -+		logerror("Could not listen to %s: %s",
     -+			 ip2str(AF_INET, (struct sockaddr *)&sin, sizeof(sin)),
     -+			 strerror(errno));
     -+		close(sockfd);
     -+		return 0;
     -+	}
     -+
     -+	flags = fcntl(sockfd, F_GETFD, 0);
     -+	if (flags >= 0)
     -+		fcntl(sockfd, F_SETFD, flags | FD_CLOEXEC);
     -+
     -+	ALLOC_GROW(socklist->list, socklist->nr + 1, socklist->alloc);
     -+	socklist->list[socklist->nr++] = sockfd;
     -+	return 1;
     -+}
     -+
     -+#endif
     -+
     -+static void socksetup(struct string_list *listen_addr, int listen_port, struct socketlist *socklist)
     -+{
     -+	if (!listen_addr->nr)
     -+		setup_named_sock("127.0.0.1", listen_port, socklist);
     -+	else {
     -+		int i, socknum;
     -+		for (i = 0; i < listen_addr->nr; i++) {
     -+			socknum = setup_named_sock(listen_addr->items[i].string,
     -+						   listen_port, socklist);
     -+
     -+			if (socknum == 0)
     -+				logerror("unable to allocate any listen sockets for host %s on port %u",
     -+					 listen_addr->items[i].string, listen_port);
     -+		}
     -+	}
     -+}
     -+
      +static int service_loop(struct socketlist *socklist)
      +{
      +	struct pollfd *pfd;
     @@ t/helper/test-http-server.c (new)
      +		int nr_ready;
      +		int timeout = (pid_file ? 100 : -1);
      +
     -+		check_dead_children();
     ++		check_dead_children(first_child, &live_children, loginfo);
      +
      +		nr_ready = poll(pfd, socklist->nr, timeout);
      +		if (nr_ready < 0) {
     @@ t/helper/test-http-server.c (new)
      +{
      +	struct socketlist socklist = { NULL, 0, 0 };
      +
     -+	socksetup(listen_addr, listen_port, &socklist);
     ++	socksetup(listen_addr, listen_port, &socklist, reuseaddr, logerror);
      +	if (socklist.nr == 0)
      +		die("unable to allocate any listen sockets on port %u",
      +		    listen_port);
     @@ t/helper/test-http-server.c (new)
      +	int i;
      +
      +	trace2_cmd_name("test-http-server");
     ++	trace2_cmd_list_config();
     ++	trace2_cmd_list_env_vars();
      +	setup_git_directory_gently(NULL);
      +
      +	for (i = 1; i < argc; i++) {
     @@ t/helper/test-http-server.c (new)
      +			timeout = atoi(v);
      +			continue;
      +		}
     -+		if (skip_prefix(arg, "--init-timeout=", &v)) {
     -+			init_timeout = atoi(v);
     -+			continue;
     -+		}
      +		if (skip_prefix(arg, "--max-connections=", &v)) {
      +			max_connections = atoi(v);
      +			if (max_connections < 0)
  5:  5c4e36e23ee !  5:  6f66bf146b4 test-http-server: add HTTP request parsing
     @@ Metadata
      Author: Matthew John Cheetham <mjcheetham@outlook.com>
      
       ## Commit message ##
     -    test-http-server: add HTTP request parsing
     +    test-http-server: add HTTP error response function
      
     -    Add ability to parse HTTP requests to the test-http-server test helper.
     +    Introduce a function to the test-http-server test helper to write more
     +    full and valid HTTP error responses, including all the standard response
     +    headers like `Server` and `Date`.
      
          Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
      
       ## t/helper/test-http-server.c ##
      @@ t/helper/test-http-server.c: enum worker_result {
     - 	WR_STOP_THE_MUSIC = (WR_IO_ERROR | WR_HANGUP),
     + 	WR_HANGUP   = 1<<1,
       };
       
      +/*
     @@ t/helper/test-http-server.c: enum worker_result {
      +	string_list_clear(&req->header_list, 0);
      +}
      +
     - static enum worker_result send_http_error(
     - 	int fd,
     - 	int http_code, const char *http_code_name,
     -@@ t/helper/test-http-server.c: done:
     - 	return wr;
     - }
     - 
     ++static enum worker_result send_http_error(
     ++	int fd,
     ++	int http_code, const char *http_code_name,
     ++	int retry_after_seconds, struct string_list *response_headers,
     ++	enum worker_result wr_in)
     ++{
     ++	struct strbuf response_header = STRBUF_INIT;
     ++	struct strbuf response_content = STRBUF_INIT;
     ++	struct string_list_item *h;
     ++	enum worker_result wr;
     ++
     ++	strbuf_addf(&response_content, "Error: %d %s\r\n",
     ++		    http_code, http_code_name);
     ++	if (retry_after_seconds > 0)
     ++		strbuf_addf(&response_content, "Retry-After: %d\r\n",
     ++			    retry_after_seconds);
     ++
     ++	strbuf_addf  (&response_header, "HTTP/1.1 %d %s\r\n", http_code, http_code_name);
     ++	strbuf_addstr(&response_header, "Cache-Control: private\r\n");
     ++	strbuf_addstr(&response_header,	"Content-Type: text/plain\r\n");
     ++	strbuf_addf  (&response_header,	"Content-Length: %d\r\n", (int)response_content.len);
     ++	if (retry_after_seconds > 0)
     ++		strbuf_addf(&response_header, "Retry-After: %d\r\n", retry_after_seconds);
     ++	strbuf_addf(  &response_header,	"Server: test-http-server/%s\r\n", git_version_string);
     ++	strbuf_addf(  &response_header, "Date: %s\r\n", show_date(time(NULL), 0, DATE_MODE(RFC2822)));
     ++	if (response_headers)
     ++		for_each_string_list_item(h, response_headers)
     ++			strbuf_addf(&response_header, "%s\r\n", h->string);
     ++	strbuf_addstr(&response_header, "\r\n");
     ++
     ++	if (write_in_full(fd, response_header.buf, response_header.len) < 0) {
     ++		logerror("unable to write response header");
     ++		wr = WR_IO_ERROR;
     ++		goto done;
     ++	}
     ++
     ++	if (write_in_full(fd, response_content.buf, response_content.len) < 0) {
     ++		logerror("unable to write response content body");
     ++		wr = WR_IO_ERROR;
     ++		goto done;
     ++	}
     ++
     ++	wr = wr_in;
     ++
     ++done:
     ++	strbuf_release(&response_header);
     ++	strbuf_release(&response_content);
     ++
     ++	return wr;
     ++}
     ++
      +/*
      + * Read the HTTP request up to the start of the optional message-body.
      + * We do this byte-by-byte because we have keep-alive turned on and
     @@ t/helper/test-http-server.c: done:
      +		hp = strbuf_detach(&h, NULL);
      +		string_list_append(&req->header_list, hp);
      +
     -+		/* store common request headers separately */
     ++		/* also store common request headers as struct req members */
      +		if (skip_prefix(hp, "Content-Type: ", &hv)) {
      +			req->content_type = hv;
      +		} else if (skip_prefix(hp, "Content-Length: ", &hv)) {
     @@ t/helper/test-http-server.c: done:
      +	return result;
      +}
      +
     ++static int is_git_request(struct req *req)
     ++{
     ++	static regex_t *smart_http_regex;
     ++	static int initialized;
     ++
     ++	if (!initialized) {
     ++		smart_http_regex = xmalloc(sizeof(*smart_http_regex));
     ++		/*
     ++		 * This regular expression matches all dumb and smart HTTP
     ++		 * requests that are currently in use, and defined in
     ++		 * Documentation/gitprotocol-http.txt.
     ++		 *
     ++		 */
     ++		if (regcomp(smart_http_regex, "^/(HEAD|info/refs|"
     ++			    "objects/info/[^/]+|git-(upload|receive)-pack)$",
     ++			    REG_EXTENDED)) {
     ++			warning("could not compile smart HTTP regex");
     ++			smart_http_regex = NULL;
     ++		}
     ++		initialized = 1;
     ++	}
     ++
     ++	return smart_http_regex &&
     ++		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
     ++}
     ++
     ++static enum worker_result do__git(struct req *req)
     ++{
     ++	const char *ok = "HTTP/1.1 200 OK\r\n";
     ++	struct child_process cp = CHILD_PROCESS_INIT;
     ++	int res;
     ++
     ++	/*
     ++	 * Note that we always respond with a 200 OK response even if the
     ++	 * http-backend process exits with an error. This helper is intended
     ++	 * only to be used to exercise the HTTP auth handling in the Git client,
     ++	 * and specifically around authentication (not handled by http-backend).
     ++	 *
     ++	 * If we wanted to respond with a more 'valid' HTTP response status then
     ++	 * we'd need to buffer the output of http-backend, wait for and grok the
     ++	 * exit status of the process, then write the HTTP status line followed
     ++	 * by the http-backend output. This is outside of the scope of this test
     ++	 * helper's use at time of writing.
     ++	 *
     ++	 * The important auth responses (401) we are handling prior to getting
     ++	 * to this point.
     ++	 */
     ++	if (write(STDOUT_FILENO, ok, strlen(ok)) < 0)
     ++		return error(_("could not send '%s'"), ok);
     ++
     ++	strvec_pushf(&cp.env, "REQUEST_METHOD=%s", req->method);
     ++	strvec_pushf(&cp.env, "PATH_TRANSLATED=%s",
     ++			req->uri_path.buf);
     ++	strvec_push(&cp.env, "SERVER_PROTOCOL=HTTP/1.1");
     ++	if (req->query_args.len)
     ++		strvec_pushf(&cp.env, "QUERY_STRING=%s",
     ++				req->query_args.buf);
     ++	if (req->content_type)
     ++		strvec_pushf(&cp.env, "CONTENT_TYPE=%s",
     ++				req->content_type);
     ++	if (req->content_length >= 0)
     ++		strvec_pushf(&cp.env, "CONTENT_LENGTH=%" PRIdMAX,
     ++				(intmax_t)req->content_length);
     ++	cp.git_cmd = 1;
     ++	strvec_push(&cp.args, "http-backend");
     ++	res = run_command(&cp);
     ++	close(STDOUT_FILENO);
     ++	close(STDIN_FILENO);
     ++	return !!res;
     ++}
     ++
      +static enum worker_result dispatch(struct req *req)
      +{
     -+	return send_http_error(1, 501, "Not Implemented", -1, NULL,
     ++	if (is_git_request(req))
     ++		return do__git(req);
     ++
     ++	return send_http_error(STDOUT_FILENO, 501, "Not Implemented", -1, NULL,
      +			       WR_OK | WR_HANGUP);
      +}
      +
       static enum worker_result worker(void)
       {
     +-	const char *response = "HTTP/1.1 501 Not Implemented\r\n";
      +	struct req req = REQ__INIT;
       	char *client_addr = getenv("REMOTE_ADDR");
       	char *client_port = getenv("REMOTE_PORT");
       	enum worker_result wr = WR_OK;
      @@ t/helper/test-http-server.c: static enum worker_result worker(void)
     - 	set_keep_alive(0);
     + 	set_keep_alive(0, logerror);
       
       	while (1) {
     --		wr = send_http_error(1, 501, "Not Implemented", -1, NULL,
     --			WR_OK | WR_HANGUP);
     +-		if (write_in_full(STDOUT_FILENO, response, strlen(response)) < 0) {
     +-			logerror("unable to write response");
     +-			wr = WR_IO_ERROR;
     +-		}
      +		req__release(&req);
      +
     -+		alarm(init_timeout ? init_timeout : timeout);
     ++		alarm(timeout);
      +		wr = req__read(&req, 0);
      +		alarm(0);
      +
     -+		if (wr & WR_STOP_THE_MUSIC)
     ++		if (wr != WR_OK)
      +			break;
     -+
     + 
      +		wr = dispatch(&req);
     - 		if (wr & WR_STOP_THE_MUSIC)
     + 		if (wr != WR_OK)
       			break;
       	}
     +
     + ## t/t5556-http-auth.sh (new) ##
     +@@
     ++#!/bin/sh
     ++
     ++test_description='test http auth header and credential helper interop'
     ++
     ++TEST_NO_CREATE_REPO=1
     ++. ./test-lib.sh
     ++
     ++test_set_port GIT_TEST_HTTP_PROTOCOL_PORT
     ++
     ++# Setup a repository
     ++#
     ++REPO_DIR="$TRASH_DIRECTORY"/repo
     ++
     ++# Setup some lookback URLs where test-http-server will be listening.
     ++# We will spawn it directly inside the repo directory, so we avoid
     ++# any need to configure directory mappings etc - we only serve this
     ++# repository from the root '/' of the server.
     ++#
     ++HOST_PORT=127.0.0.1:$GIT_TEST_HTTP_PROTOCOL_PORT
     ++ORIGIN_URL=http://$HOST_PORT/
     ++
     ++# The pid-file is created by test-http-server when it starts.
     ++# The server will shutdown if/when we delete it (this is easier than
     ++# killing it by PID).
     ++#
     ++PID_FILE="$TRASH_DIRECTORY"/pid-file.pid
     ++SERVER_LOG="$TRASH_DIRECTORY"/OUT.server.log
     ++
     ++PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
     ++
     ++test_expect_success 'setup repos' '
     ++	test_create_repo "$REPO_DIR" &&
     ++	git -C "$REPO_DIR" branch -M main
     ++'
     ++
     ++stop_http_server () {
     ++	if ! test -f "$PID_FILE"
     ++	then
     ++		return 0
     ++	fi
     ++	#
     ++	# The server will shutdown automatically when we delete the pid-file.
     ++	#
     ++	rm -f "$PID_FILE"
     ++	#
     ++	# Give it a few seconds to shutdown (mainly to completely release the
     ++	# port before the next test start another instance and it attempts to
     ++	# bind to it).
     ++	#
     ++	for k in 0 1 2 3 4
     ++	do
     ++		if grep -q "Starting graceful shutdown" "$SERVER_LOG"
     ++		then
     ++			return 0
     ++		fi
     ++		sleep 1
     ++	done
     ++
     ++	echo "stop_http_server: timeout waiting for server shutdown"
     ++	return 1
     ++}
     ++
     ++start_http_server () {
     ++	#
     ++	# Launch our server into the background in repo_dir.
     ++	#
     ++	(
     ++		cd "$REPO_DIR"
     ++		test-http-server --verbose \
     ++			--listen=127.0.0.1 \
     ++			--port=$GIT_TEST_HTTP_PROTOCOL_PORT \
     ++			--reuseaddr \
     ++			--pid-file="$PID_FILE" \
     ++			"$@" \
     ++			2>"$SERVER_LOG" &
     ++	)
     ++	#
     ++	# Give it a few seconds to get started.
     ++	#
     ++	for k in 0 1 2 3 4
     ++	do
     ++		if test -f "$PID_FILE"
     ++		then
     ++			return 0
     ++		fi
     ++		sleep 1
     ++	done
     ++
     ++	echo "start_http_server: timeout waiting for server startup"
     ++	return 1
     ++}
     ++
     ++per_test_cleanup () {
     ++	stop_http_server &&
     ++	rm -f OUT.*
     ++}
     ++
     ++test_expect_success 'http auth anonymous no challenge' '
     ++	test_when_finished "per_test_cleanup" &&
     ++	start_http_server &&
     ++
     ++	# Attempt to read from a protected repository
     ++	git ls-remote $ORIGIN_URL
     ++'
     ++
     ++test_done
  7:  794256754c1 !  6:  c3c3d17a688 test-http-server: add simple authentication
     @@ Commit message
      
          Add simple authentication to the test-http-server test helper.
          Authentication schemes and sets of valid tokens can be specified via
     -    command-line arguments. Incoming requests are compared against the set
     -    of valid schemes and tokens and only approved if a matching token is
     -    found, or if no auth was provided and anonymous auth is enabled.
     +    a configuration file (in the normal gitconfig file format).
     +    Incoming requests are compared against the set of valid schemes and
     +    tokens and only approved if a matching token is found, or if no auth
     +    was provided and anonymous auth is enabled.
      
          Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
      
       ## t/helper/test-http-server.c ##
     +@@
     + #include "version.h"
     + #include "dir.h"
     + #include "date.h"
     ++#include "config.h"
     + 
     + #define TR2_CAT "test-http-server"
     + 
      @@ t/helper/test-http-server.c: static const char test_http_auth_usage[] =
     - "           [--timeout=<n>] [--init-timeout=<n>] [--max-connections=<n>]\n"
     + "           [--timeout=<n>] [--max-connections=<n>]\n"
       "           [--reuseaddr] [--pid-file=<file>]\n"
       "           [--listen=<host_or_ipaddr>]* [--port=<n>]\n"
     -+"           [--anonymous-allowed]\n"
     -+"           [--auth=<scheme>[:<params>] [--auth-token=<scheme>:<token>]]*\n"
     ++"           [--auth-config=<file>]\n"
       ;
       
     - /* Timeout, and initial timeout */
     -@@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req, const char *user)
     + static unsigned int timeout;
     +@@ t/helper/test-http-server.c: static int is_git_request(struct req *req)
     + 		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
     + }
     + 
     +-static enum worker_result do__git(struct req *req)
     ++static enum worker_result do__git(struct req *req, const char *user)
     + {
     + 	const char *ok = "HTTP/1.1 200 OK\r\n";
     + 	struct child_process cp = CHILD_PROCESS_INIT;
     +@@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req)
     + 	if (write(STDOUT_FILENO, ok, strlen(ok)) < 0)
     + 		return error(_("could not send '%s'"), ok);
     + 
     ++	if (user)
     ++		strvec_pushf(&cp.env, "REMOTE_USER=%s", user);
     ++
     + 	strvec_pushf(&cp.env, "REQUEST_METHOD=%s", req->method);
     + 	strvec_pushf(&cp.env, "PATH_TRANSLATED=%s",
     + 			req->uri_path.buf);
     +@@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req)
       	return !!res;
       }
       
     @@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req,
      +static struct auth_module **auth_modules = NULL;
      +static size_t auth_modules_nr = 0;
      +static size_t auth_modules_alloc = 0;
     ++static struct strvec extra_headers = STRVEC_INIT;
     ++
     ++static struct auth_module *create_auth_module(const char *scheme,
     ++					      const char *challenge)
     ++{
     ++	struct auth_module *mod = xmalloc(sizeof(struct auth_module));
     ++	mod->scheme = xstrdup(scheme);
     ++	mod->challenge_params = challenge ? xstrdup(challenge) : NULL;
     ++	CALLOC_ARRAY(mod->tokens, 1);
     ++	string_list_init_dup(mod->tokens);
     ++	return mod;
     ++}
      +
      +static struct auth_module *get_auth_module(const char *scheme)
      +{
     @@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req,
      +	return NULL;
      +}
      +
     -+static void add_auth_module(struct auth_module *mod)
     ++static int add_auth_module(struct auth_module *mod)
      +{
     ++	if (get_auth_module(mod->scheme))
     ++		return error("duplicate auth scheme '%s'\n", mod->scheme);
     ++
      +	ALLOC_GROW(auth_modules, auth_modules_nr + 1, auth_modules_alloc);
      +	auth_modules[auth_modules_nr++] = mod;
     ++
     ++	return 0;
      +}
      +
      +static int is_authed(struct req *req, const char **user, enum worker_result *wr)
     @@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req,
      +
      +	/*
      +	 * Check all auth modules and try to validate the request.
     -+	 * The first module that matches a valid token approves the request.
     ++	 * The first Authorization header that matches a known auth module
     ++	 * scheme will be consulted to either approve or deny the request.
      +	 * If no module is found, or if there is no valid token, then 401 error.
      +	 * Otherwise, only permit the request if anonymous auth is enabled.
     ++	 * It's atypical for user agents/clients to send multiple Authorization
     ++	 * headers, but not explicitly forbidden or defined.
      +	 */
      +	for_each_string_list_item(hdr, &req->header_list) {
      +		if (skip_iprefix(hdr->string, "Authorization: ", &v)) {
     @@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req,
      +	case AUTH_UNKNOWN:
      +		if (result != AUTH_DENY && allow_anonymous)
      +			break;
     ++
      +		for (i = 0; i < auth_modules_nr; i++) {
      +			mod = auth_modules[i];
      +			if (mod->challenge_params)
     @@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req,
      +						    mod->scheme);
      +			string_list_append(&hdrs, challenge);
      +		}
     -+		*wr = send_http_error(1, 401, "Unauthorized", -1, &hdrs, *wr);
     ++
     ++		for (i = 0; i < extra_headers.nr; i++)
     ++			string_list_append(&hdrs, extra_headers.v[i]);
     ++
     ++		*wr = send_http_error(STDOUT_FILENO, 401, "Unauthorized", -1,
     ++				      &hdrs, *wr);
      +	}
      +
      +	strbuf_list_free(split);
     @@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req,
      +	return result == AUTH_ALLOW ||
      +	      (result == AUTH_UNKNOWN && allow_anonymous);
      +}
     ++
     ++static int split_auth_param(const char *str, char **scheme, char **val, int required_val)
     ++{
     ++	struct strbuf **p = strbuf_split_str(str, ':', 2);
     ++
     ++	if (!p[0])
     ++		return -1;
     ++
     ++	/* trim trailing ':' */
     ++	if (p[1])
     ++		strbuf_setlen(p[0], p[0]->len - 1);
     ++
     ++	if (required_val && !p[1])
     ++		return -1;
     ++
     ++	*scheme = strbuf_detach(p[0], NULL);
     ++
     ++	if (p[1])
     ++		*val = strbuf_detach(p[1], NULL);
     ++
     ++	strbuf_list_free(p);
     ++	return 0;
     ++}
     ++
     ++static int read_auth_config(const char *name, const char *val, void *data)
     ++{
     ++	int ret = 0;
     ++	char *scheme = NULL;
     ++	char *token = NULL;
     ++	char *challenge = NULL;
     ++	struct auth_module *mod = NULL;
     ++
     ++	if (!strcmp(name, "auth.challenge")) {
     ++		if (split_auth_param(val, &scheme, &challenge, 0)) {
     ++			ret = error("invalid auth challenge '%s'", val);
     ++			goto cleanup;
     ++		}
     ++
     ++		mod = create_auth_module(scheme, challenge);
     ++		if (add_auth_module(mod)) {
     ++			ret = error("failed to add auth module '%s'", val);
     ++			goto cleanup;
     ++		}
     ++	}
     ++	if (!strcmp(name, "auth.token")) {
     ++		if (split_auth_param(val, &scheme, &token, 1)) {
     ++			ret = error("invalid auth token '%s'", val);
     ++			goto cleanup;
     ++		}
     ++
     ++		mod = get_auth_module(scheme);
     ++		if (!mod) {
     ++			ret = error("auth scheme not defined '%s'\n", scheme);
     ++			goto cleanup;
     ++		}
     ++
     ++		string_list_append(mod->tokens, token);
     ++	}
     ++	if (!strcmp(name, "auth.allowanonymous")) {
     ++		allow_anonymous = git_config_bool(name, val);
     ++	}
     ++	if (!strcmp(name, "auth.extraheader")) {
     ++		strvec_push(&extra_headers, val);
     ++	}
     ++
     ++cleanup:
     ++	free(scheme);
     ++	free(token);
     ++	free(challenge);
     ++
     ++	return ret;
     ++}
      +
       static enum worker_result dispatch(struct req *req)
       {
     @@ t/helper/test-http-server.c: static enum worker_result do__git(struct req *req,
      +		return wr;
      +
       	if (is_git_request(req))
     --		return do__git(req, NULL);
     +-		return do__git(req);
      +		return do__git(req, user);
       
     - 	return send_http_error(1, 501, "Not Implemented", -1, NULL,
     + 	return send_http_error(STDOUT_FILENO, 501, "Not Implemented", -1, NULL,
       			       WR_OK | WR_HANGUP);
     -@@ t/helper/test-http-server.c: int cmd_main(int argc, const char **argv)
     - 	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
     - 	int worker_mode = 0;
     - 	int i;
     -+	struct auth_module *mod = NULL;
     - 
     - 	trace2_cmd_name("test-http-server");
     - 	setup_git_directory_gently(NULL);
      @@ t/helper/test-http-server.c: int cmd_main(int argc, const char **argv)
       			pid_file = v;
       			continue;
       		}
     -+		if (skip_prefix(arg, "--allow-anonymous", &v)) {
     -+			allow_anonymous = 1;
     -+			continue;
     -+		}
     -+		if (skip_prefix(arg, "--auth=", &v)) {
     -+			struct strbuf **p = strbuf_split_str(v, ':', 2);
     -+
     -+			if (!p[0]) {
     -+				error("invalid argument '%s'", v);
     ++		if (skip_prefix(arg, "--auth-config=", &v)) {
     ++			if (!strlen(v)) {
     ++				error("invalid argument - missing file path");
      +				usage(test_http_auth_usage);
      +			}
      +
     -+			/* trim trailing ':' */
     -+			if (p[1])
     -+				strbuf_setlen(p[0], p[0]->len - 1);
     -+
     -+			if (get_auth_module(p[0]->buf)) {
     -+				error("duplicate auth scheme '%s'\n", p[0]->buf);
     -+				usage(test_http_auth_usage);
     -+			}
     -+
     -+			mod = xmalloc(sizeof(struct auth_module));
     -+			mod->scheme = xstrdup(p[0]->buf);
     -+			mod->challenge_params = p[1] ? xstrdup(p[1]->buf) : NULL;
     -+			CALLOC_ARRAY(mod->tokens, 1);
     -+			string_list_init_dup(mod->tokens);
     -+
     -+			add_auth_module(mod);
     -+
     -+			strbuf_list_free(p);
     -+			continue;
     -+		}
     -+		if (skip_prefix(arg, "--auth-token=", &v)) {
     -+			struct strbuf **p = strbuf_split_str(v, ':', 2);
     -+			if (!p[0]) {
     -+				error("invalid argument '%s'", v);
     -+				usage(test_http_auth_usage);
     -+			}
     -+
     -+			if (!p[1]) {
     -+				error("missing token value '%s'\n", v);
     -+				usage(test_http_auth_usage);
     -+			}
     -+
     -+			/* trim trailing ':' */
     -+			strbuf_setlen(p[0], p[0]->len - 1);
     -+
     -+			mod = get_auth_module(p[0]->buf);
     -+			if (!mod) {
     -+				error("auth scheme not defined '%s'\n", p[0]->buf);
     ++			if (git_config_from_file(read_auth_config, v, NULL)) {
     ++				error("failed to read auth config file '%s'", v);
      +				usage(test_http_auth_usage);
      +			}
      +
     -+			string_list_append(mod->tokens, p[1]->buf);
     -+			strbuf_list_free(p);
      +			continue;
      +		}
       
  -:  ----------- >  7:  9c4d25945dd http: replace unsafe size_t multiplication with st_mult
  -:  ----------- >  8:  65a620b08ef strvec: expose strvec_push_nodup for external use
  1:  b5b56ccd941 !  9:  bcfec529d95 http: read HTTP WWW-Authenticate response headers
     @@ http.c: size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buff
       
      +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
      +{
     -+	size_t size = eltsize * nmemb;
     ++	size_t size = st_mult(eltsize, nmemb);
      +	struct strvec *values = &http_auth.wwwauth_headers;
      +	struct strbuf buf = STRBUF_INIT;
      +	const char *val;
     -+	const char *z = NULL;
      +
      +	/*
      +	 * Header lines may not come NULL-terminated from libcurl so we must
     @@ http.c: size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buff
      +	 * This line could be a continuation of the previously matched header
      +	 * field. If this is the case then we should append this value to the
      +	 * end of the previously consumed value.
     ++	 * Continuation lines start with at least one whitespace, maybe more,
     ++	 * so we should collapse these down to a single SP (valid per the spec).
      +	 */
      +	if (http_auth.header_is_last_match && isspace(*buf.buf)) {
     -+		const char **v = values->v + values->nr - 1;
     -+		char *append = xstrfmt("%s%.*s", *v, (int)(size - 1), ptr + 1);
     ++		/* Trim leading whitespace from this continuation hdr line. */
     ++		strbuf_ltrim(&buf);
      +
     -+		free((void*)*v);
     -+		*v = append;
     ++		/*
     ++		 * At this point we should always have at least one existing
     ++		 * value, even if it is empty. Do not bother appending the new
     ++		 * value if this continuation header is itself empty.
     ++		 */
     ++		if (!values->nr) {
     ++			BUG("should have at least one existing header value");
     ++		} else if (buf.len) {
     ++			const char *prev = values->v[values->nr - 1];
     ++			struct strbuf append = STRBUF_INIT;
     ++			strbuf_addstr(&append, prev);
     ++
     ++			/* Join two non-empty values with a single space. */
     ++			if (append.len)
     ++				strbuf_addch(&append, ' ');
     ++
     ++			strbuf_addbuf(&append, &buf);
     ++
     ++			strvec_pop(values);
     ++			strvec_push_nodup(values, strbuf_detach(&append, NULL));
     ++		}
      +
      +		goto exit;
      +	}
     @@ http.c: size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buff
      +	 * We only care about the last HTTP request response's headers so clear
      +	 * the existing array.
      +	 */
     -+	if (skip_iprefix(buf.buf, "http/", &z))
     ++	if (istarts_with(buf.buf, "http/"))
      +		strvec_clear(values);
      +
      +exit:
  2:  d02875dda7c <  -:  ----------- credential: add WWW-Authenticate header to cred requests
  4:  98dd286db7c <  -:  ----------- test-http-server: add HTTP error response function
  6:  0a0f4fd10c8 <  -:  ----------- test-http-server: pass Git requests to http-backend
  8:  8ecf6383522 ! 10:  af66d2d2ede t5556: add HTTP authentication tests
     @@ Metadata
      Author: Matthew John Cheetham <mjcheetham@outlook.com>
      
       ## Commit message ##
     -    t5556: add HTTP authentication tests
     +    credential: add WWW-Authenticate header to cred requests
      
     -    Add a series of tests to exercise the HTTP authentication header parsing
     +    Add the value of the WWW-Authenticate response header to credential
     +    requests. Credential helpers that understand and support HTTP
     +    authentication and authorization can use this standard header (RFC 2616
     +    Section 14.47 [1]) to generate valid credentials.
     +
     +    WWW-Authenticate headers can contain information pertaining to the
     +    authority, authentication mechanism, or extra parameters/scopes that are
     +    required.
     +
     +    The current I/O format for credential helpers only allows for unique
     +    names for properties/attributes, so in order to transmit multiple header
     +    values (with a specific order) we introduce a new convention whereby a
     +    C-style array syntax is used in the property name to denote multiple
     +    ordered values for the same property.
     +
     +    In this case we send multiple `wwwauth[]` properties where the order
     +    that the repeated attributes appear in the conversation reflects the
     +    order that the WWW-Authenticate headers appeared in the HTTP response.
     +
     +    Add a set of tests to exercise the HTTP authentication header parsing
          and the interop with credential helpers. Credential helpers will receive
          WWW-Authenticate information in credential requests.
      
     +    [1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47
     +
          Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
      
     + ## Documentation/git-credential.txt ##
     +@@ Documentation/git-credential.txt: separated by an `=` (equals) sign, followed by a newline.
     + The key may contain any bytes except `=`, newline, or NUL. The value may
     + contain any bytes except newline or NUL.
     + 
     +-In both cases, all bytes are treated as-is (i.e., there is no quoting,
     ++Attributes with keys that end with C-style array brackets `[]` can have
     ++multiple values. Each instance of a multi-valued attribute forms an
     ++ordered list of values - the order of the repeated attributes defines
     ++the order of the values. An empty multi-valued attribute (`key[]=\n`)
     ++acts to clear any previous entries and reset the list.
     ++
     ++In all cases, all bytes are treated as-is (i.e., there is no quoting,
     + and one cannot transmit a value with newline or NUL in it). The list of
     + attributes is terminated by a blank line or end-of-file.
     + 
     +@@ Documentation/git-credential.txt: empty string.
     + Components which are missing from the URL (e.g., there is no
     + username in the example above) will be left unset.
     + 
     ++`wwwauth[]`::
     ++
     ++	When an HTTP response is received by Git that includes one or more
     ++	'WWW-Authenticate' authentication headers, these will be passed by Git
     ++	to credential helpers.
     +++
     ++Each 'WWW-Authenticate' header value is passed as a multi-valued
     ++attribute 'wwwauth[]', where the order of the attributes is the same as
     ++they appear in the HTTP response. This attribute is 'one-way' from Git
     ++to pass additional information to credential helpers.
     ++
     + Unrecognised attributes are silently discarded.
     + 
     + GIT
     +
     + ## credential.c ##
     +@@ credential.c: static void credential_write_item(FILE *fp, const char *key, const char *value,
     + 	fprintf(fp, "%s=%s\n", key, value);
     + }
     + 
     ++static void credential_write_strvec(FILE *fp, const char *key,
     ++				    const struct strvec *vec)
     ++{
     ++	int i = 0;
     ++	const char *full_key = xstrfmt("%s[]", key);
     ++	for (; i < vec->nr; i++) {
     ++		credential_write_item(fp, full_key, vec->v[i], 0);
     ++	}
     ++	free((void*)full_key);
     ++}
     ++
     + void credential_write(const struct credential *c, FILE *fp)
     + {
     + 	credential_write_item(fp, "protocol", c->protocol, 1);
     +@@ credential.c: void credential_write(const struct credential *c, FILE *fp)
     + 	credential_write_item(fp, "path", c->path, 0);
     + 	credential_write_item(fp, "username", c->username, 0);
     + 	credential_write_item(fp, "password", c->password, 0);
     ++	credential_write_strvec(fp, "wwwauth", &c->wwwauth_headers);
     + }
     + 
     + static int run_credential_helper(struct credential *c,
     +
       ## t/helper/test-credential-helper-replay.sh (new) ##
      @@
      +cmd=$1
     @@ t/helper/test-credential-helper-replay.sh (new)
      +fi
      
       ## t/t5556-http-auth.sh ##
     -@@ t/t5556-http-auth.sh: PID_FILE="$(pwd)"/pid-file.pid
     - SERVER_LOG="$(pwd)"/OUT.server.log
     +@@ t/t5556-http-auth.sh: PID_FILE="$TRASH_DIRECTORY"/pid-file.pid
     + SERVER_LOG="$TRASH_DIRECTORY"/OUT.server.log
       
       PATH="$GIT_BUILD_DIR/t/helper/:$PATH" && export PATH
      +CREDENTIAL_HELPER="$GIT_BUILD_DIR/t/helper/test-credential-helper-replay.sh" \
     @@ t/t5556-http-auth.sh: start_http_server () {
       	stop_http_server &&
      -	rm -f OUT.*
      +	rm -f OUT.* &&
     -+	rm -f *.cred
     ++	rm -f *.cred &&
     ++	rm -f auth.config
       }
       
       test_expect_success 'http auth anonymous no challenge' '
     -@@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
     + 	test_when_finished "per_test_cleanup" &&
     +-	start_http_server &&
     ++
     ++	cat >auth.config <<-EOF &&
     ++	[auth]
     ++	    allowAnonymous = true
     ++	EOF
     ++
     ++	start_http_server --auth-config="$TRASH_DIRECTORY/auth.config" &&
     + 
     + 	# Attempt to read from a protected repository
       	git ls-remote $ORIGIN_URL
       '
       
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
      +	export USERPASS64 &&
      +
     -+	start_http_server \
     -+		--auth=basic:realm=\"example.com\" \
     -+		--auth-token=basic:$USERPASS64 &&
     ++	cat >auth.config <<-EOF &&
     ++	[auth]
     ++	    challenge = basic:realm=\"example.com\"
     ++	    token = basic:$USERPASS64
     ++	EOF
     ++
     ++	start_http_server --auth-config="$TRASH_DIRECTORY/auth.config" &&
      +
      +	cat >get-expected.cred <<-EOF &&
      +	protocol=http
     @@ t/t5556-http-auth.sh: test_expect_success 'http auth anonymous no challenge' '
      +	test_cmp store-expected.cred store-actual.cred
      +'
      +
     ++test_expect_success 'http auth www-auth headers to credential helper ignore case valid' '
     ++	test_when_finished "per_test_cleanup" &&
     ++	# base64("alice:secret-passwd")
     ++	USERPASS64=YWxpY2U6c2VjcmV0LXBhc3N3ZA== &&
     ++	export USERPASS64 &&
     ++
     ++	cat >auth.config <<-EOF &&
     ++	[auth]
     ++	    challenge = basic:realm=\"example.com\"
     ++	    token = basic:$USERPASS64
     ++	    extraHeader = wWw-aUtHeNtIcAtE: bEaRer auThoRiTy=\"id.example.com\"
     ++	EOF
     ++
     ++	start_http_server --auth-config="$TRASH_DIRECTORY/auth.config" &&
     ++
     ++	cat >get-expected.cred <<-EOF &&
     ++	protocol=http
     ++	host=$HOST_PORT
     ++	wwwauth[]=basic realm="example.com"
     ++	wwwauth[]=bEaRer auThoRiTy="id.example.com"
     ++	EOF
     ++
     ++	cat >store-expected.cred <<-EOF &&
     ++	protocol=http
     ++	host=$HOST_PORT
     ++	username=alice
     ++	password=secret-passwd
     ++	EOF
     ++
     ++	cat >get-response.cred <<-EOF &&
     ++	protocol=http
     ++	host=$HOST_PORT
     ++	username=alice
     ++	password=secret-passwd
     ++	EOF
     ++
     ++	git -c credential.helper="$CREDENTIAL_HELPER" ls-remote $ORIGIN_URL &&
     ++
     ++	test_cmp get-expected.cred get-actual.cred &&
     ++	test_cmp store-expected.cred store-actual.cred
     ++'
     ++
     ++test_expect_success 'http auth www-auth headers to credential helper continuation hdr' '
     ++	test_when_finishe