From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Daniel Stenberg <daniel@haxx.se>
Subject: Re: [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle
Date: Wed, 3 Apr 2024 08:34:37 +0200 [thread overview]
Message-ID: <Zgz4fTJg2iL07W_h@tanuki> (raw)
In-Reply-To: <Zgz3nvMLg4ts2rRI@tanuki>
[-- Attachment #1: Type: text/plain, Size: 5054 bytes --]
On Wed, Apr 03, 2024 at 08:30:54AM +0200, Patrick Steinhardt wrote:
> On Tue, Apr 02, 2024 at 04:05:17PM -0400, Jeff King wrote:
> > In get_active_slot(), we return a CURL handle that may have been used
> > before (reusing them is good because it lets curl reuse the same
> > connection across many requests). We set a few curl options back to
> > defaults that may have been modified by previous requests.
> >
> > We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which
> > defaults to "-1"). This usually doesn't matter because most POSTs will
> > set both fields together anyway. But there is one exception: when
> > handling a large request in remote-curl's post_rpc(), we don't set
> > _either_, and instead set a READFUNCTION to stream data into libcurl.
> >
> > This can interact weirdly with a stale POSTFIELDSIZE setting, because
> > curl will assume it should read only some set number of bytes from our
> > READFUNCTION. However, it has worked in practice because we also
> > manually set a "Transfer-Encoding: chunked" header, which libcurl uses
> > as a clue to set the POSTFIELDSIZE to -1 itself.
> >
> > So everything works, but we're better off resetting the size manually
> > for a few reasons:
> >
> > - there was a regression in curl 8.7.0 where the chunked header
> > detection didn't kick in, causing any large HTTP requests made by
> > Git to fail. This has since been fixed (but not yet released). In
> > the issue, curl folks recommended setting it explicitly to -1:
> >
> > https://github.com/curl/curl/issues/13229#issuecomment-2029826058
> >
> > and it indeed works around the regression. So even though it won't
> > be strictly necessary after the fix there, this will help folks who
> > end up using the affected libcurl versions.
> >
> > - it's consistent with what a new curl handle would look like. Since
> > get_active_slot() may or may not return a used handle, this reduces
> > the possibility of heisenbugs that only appear with certain request
> > patterns.
> >
> > Note that the recommendation in the curl issue is to actually drop the
> > manual Transfer-Encoding header. Modern libcurl will add the header
> > itself when streaming from a READFUNCTION. However, that code wasn't
> > added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST
> > if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to
> > support back to 7.19.5, so those older versions still need the manual
> > header.
> >
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> > http.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/http.c b/http.c
> > index e73b136e58..3d80bd6116 100644
> > --- a/http.c
> > +++ b/http.c
> > @@ -1452,6 +1452,7 @@ struct active_request_slot *get_active_slot(void)
> > curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL);
> > curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL);
> > curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL);
> > + curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, -1L);
> > curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0);
> > curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
> > curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1);
>
> Can't we refactor this code to instead use `curl_easy_reset()`? That
> function already resets most of the data we want to reset and would also
> end up setting `POSFIELDSIZE = -1` via `Curl_init_userdefined()`. So
> wouldn't the following be a more sensible fix?
>
> diff --git a/http.c b/http.c
> index e73b136e58..e5f5bc23db 100644
> --- a/http.c
> +++ b/http.c
> @@ -1442,20 +1442,14 @@ struct active_request_slot *get_active_slot(void)
> slot->finished = NULL;
> slot->callback_data = NULL;
> slot->callback_func = NULL;
> + curl_easy_reset(slot->curl);
> curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file);
> if (curl_save_cookies)
> curl_easy_setopt(slot->curl, CURLOPT_COOKIEJAR, curl_cookie_file);
> curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, pragma_header);
> curl_easy_setopt(slot->curl, CURLOPT_RESOLVE, host_resolutions);
> curl_easy_setopt(slot->curl, CURLOPT_ERRORBUFFER, curl_errorstr);
> - curl_easy_setopt(slot->curl, CURLOPT_CUSTOMREQUEST, NULL);
> - curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL);
> - curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL);
> - curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL);
> - curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0);
> - curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
> curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1);
> - curl_easy_setopt(slot->curl, CURLOPT_RANGE, NULL);
>
> /*
> * Default following to off unless "ALWAYS" is configured; this gives
Oh well, the answer is "no", or at least not as easily as this, as the
failing tests tell us. I guess it resets more data than we actually want
it to reset, but I didn't dig any deeper than that.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2024-04-03 6:35 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-30 0:02 tests broken with curl-8.7.0 Jeff King
2024-03-30 8:54 ` Daniel Stenberg
2024-04-02 20:02 ` [PATCH 0/2] git+curl 8.7.0 workaround Jeff King
2024-04-02 20:05 ` [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle Jeff King
2024-04-02 20:27 ` Junio C Hamano
2024-04-03 3:20 ` Jeff King
2024-04-03 6:30 ` Patrick Steinhardt
2024-04-03 6:34 ` Patrick Steinhardt [this message]
2024-04-03 20:18 ` Jeff King
2024-04-02 20:06 ` [PATCH 2/2] INSTALL: bump libcurl version to 7.21.3 Jeff King
2024-04-02 20:21 ` [PATCH 0/2] git+curl 8.7.0 workaround rsbecker
2024-04-02 20:31 ` Jeff King
2024-04-05 20:04 ` [PATCH 3/2] remote-curl: add Transfer-Encoding header only for older curl Jeff King
2024-04-05 21:30 ` Daniel Stenberg
2024-04-05 21:49 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zgz4fTJg2iL07W_h@tanuki \
--to=ps@pks.im \
--cc=daniel@haxx.se \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).