git@vger.kernel.org mailing list mirror (one of many)
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Daniel Stenberg <daniel@haxx.se>
Subject: Re: [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle
Date: Wed, 3 Apr 2024 08:34:37 +0200
Message-ID: <Zgz4fTJg2iL07W_h@tanuki>
In-Reply-To: <Zgz3nvMLg4ts2rRI@tanuki>

On Wed, Apr 03, 2024 at 08:30:54AM +0200, Patrick Steinhardt wrote:
> On Tue, Apr 02, 2024 at 04:05:17PM -0400, Jeff King wrote:
> > In get_active_slot(), we return a CURL handle that may have been used
> > before (reusing them is good because it lets curl reuse the same
> > connection across many requests). We set a few curl options back to
> > defaults that may have been modified by previous requests.
> > 
> > We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which
> > defaults to "-1"). This usually doesn't matter because most POSTs will
> > set both fields together anyway. But there is one exception: when
> > handling a large request in remote-curl's post_rpc(), we don't set
> > _either_, and instead set a READFUNCTION to stream data into libcurl.
> > 
> > This can interact weirdly with a stale POSTFIELDSIZE setting, because
> > curl will assume it should read only some set number of bytes from our
> > READFUNCTION. However, it has worked in practice because we also
> > manually set a "Transfer-Encoding: chunked" header, which libcurl uses
> > as a clue to set the POSTFIELDSIZE to -1 itself.
> > 
> > So everything works, but we're better off resetting the size manually
> > for a few reasons:
> > 
> >   - there was a regression in curl 8.7.0 where the chunked header
> >     detection didn't kick in, causing any large HTTP requests made by
> >     Git to fail. This has since been fixed (but not yet released). In
> >     the issue, curl folks recommended setting it explicitly to -1:
> > 
> >       https://github.com/curl/curl/issues/13229#issuecomment-2029826058
> > 
> >     and it indeed works around the regression. So even though it won't
> >     be strictly necessary after the fix there, this will help folks who
> >     end up using the affected libcurl versions.
> > 
> >   - it's consistent with what a new curl handle would look like. Since
> >     get_active_slot() may or may not return a used handle, this reduces
> >     the possibility of heisenbugs that only appear with certain request
> >     patterns.
> > 
> > Note that the recommendation in the curl issue is to actually drop the
> > manual Transfer-Encoding header. Modern libcurl will add the header
> > itself when streaming from a READFUNCTION. However, that code wasn't
> > added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST
> > if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to
> > support back to 7.19.5, so those older versions still need the manual
> > header.
> > 
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> >  http.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/http.c b/http.c
> > index e73b136e58..3d80bd6116 100644
> > --- a/http.c
> > +++ b/http.c
> > @@ -1452,6 +1452,7 @@ struct active_request_slot *get_active_slot(void)
> >  	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL);
> >  	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL);
> >  	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL);
> > +	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, -1L);
> >  	curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0);
> >  	curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
> >  	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1);
> 
> Can't we refactor this code to instead use `curl_easy_reset()`? That
> function already resets most of the data we want to reset and would also
> end up setting `POSTFIELDSIZE = -1` via `Curl_init_userdefined()`. So
> wouldn't the following be a more sensible fix?
> 
> diff --git a/http.c b/http.c
> index e73b136e58..e5f5bc23db 100644
> --- a/http.c
> +++ b/http.c
> @@ -1442,20 +1442,14 @@ struct active_request_slot *get_active_slot(void)
>  	slot->finished = NULL;
>  	slot->callback_data = NULL;
>  	slot->callback_func = NULL;
> +	curl_easy_reset(slot->curl);
>  	curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file);
>  	if (curl_save_cookies)
>  		curl_easy_setopt(slot->curl, CURLOPT_COOKIEJAR, curl_cookie_file);
>  	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, pragma_header);
>  	curl_easy_setopt(slot->curl, CURLOPT_RESOLVE, host_resolutions);
>  	curl_easy_setopt(slot->curl, CURLOPT_ERRORBUFFER, curl_errorstr);
> -	curl_easy_setopt(slot->curl, CURLOPT_CUSTOMREQUEST, NULL);
> -	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL);
> -	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL);
> -	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL);
> -	curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0);
> -	curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
>  	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1);
> -	curl_easy_setopt(slot->curl, CURLOPT_RANGE, NULL);
>  
>  	/*
>  	 * Default following to off unless "ALWAYS" is configured; this gives

Oh well, the answer is "no", or at least not as easily as this, as the
failing tests tell us. I guess it resets more data than we actually want
it to reset, but I didn't dig any deeper than that.

Patrick
