From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
"brian m. carlson" <sandals@crustytoothpaste.net>,
Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v4 06/13] remote-curl: fix parsing of detached SHA256 heads
Date: Tue, 7 May 2024 06:53:10 +0200
Message-ID: <e8876052ada5845e9a5ea4e910ff9da2c0837c07.1715057362.git.ps@pks.im>
In-Reply-To: <cover.1715057362.git.ps@pks.im>
The dumb HTTP transport tries to read the remote HEAD reference by
downloading the "HEAD" file and then parsing it via `http_fetch_ref()`.
This function will either parse the file as an object ID in case it is
exactly `the_hash_algo->hexsz` long, or otherwise it will check whether
the reference starts with "ref: " and parse it as a symbolic ref.
This is broken when parsing detached HEADs of a remote SHA256 repository
because we never update `the_hash_algo` to the discovered remote object
hash. Consequently, `the_hash_algo` will always be the fallback SHA1
hash algorithm, which will cause us to fail parsing HEAD altogether when
it contains a SHA256 object ID.
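To illustrate the failure mode, here is a minimal standalone sketch of the
check described above. It is not the code from `http_fetch_ref()`; the
`classify_head()` helper, the `head_kind` enum and the two `*_HEXSZ` macros
are made up for this example, and it assumes trailing whitespace has already
been stripped from the downloaded file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hex lengths of the two object ID formats Git supports. */
#define SHA1_HEXSZ   40
#define SHA256_HEXSZ 64

enum head_kind { HEAD_INVALID, HEAD_DETACHED, HEAD_SYMREF };

/*
 * Hypothetical stand-in for the check described above: `buf` holds the
 * downloaded HEAD contents with trailing whitespace already stripped and
 * `hexsz` is the hex length of the hash that is currently assumed.
 */
static enum head_kind classify_head(const char *buf, size_t hexsz)
{
	size_t len;

	assert(buf != NULL);
	len = strlen(buf);

	/* Exactly one object ID worth of hex digits: a detached HEAD. */
	if (len == hexsz && strspn(buf, "0123456789abcdef") == len)
		return HEAD_DETACHED;

	/* Otherwise the only valid form left is a symbolic ref. */
	if (!strncmp(buf, "ref: ", 5))
		return HEAD_SYMREF;

	return HEAD_INVALID;
}
```

With `hexsz` stuck at `SHA1_HEXSZ`, a 64-character SHA256 object ID matches
neither branch and is classified as `HEAD_INVALID`. Passing `SHA256_HEXSZ`
instead makes the same input a `HEAD_DETACHED`.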
Fix this issue by setting up `the_hash_algo` via `repo_set_hash_algo()`.
While at it, let's make the expected SHA1 fallback explicit in our code,
which also addresses an upcoming issue where we are going to remove the
SHA1 fallback for `the_hash_algo`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
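For readers who do not have `remote-curl.c` at hand, the length-based
detection touched by this patch can be sketched roughly as below. This is a
simplified, self-contained approximation rather than the actual
implementation: `detect_hash()` and the `hash_kind` enum are hypothetical
names, and the hex lengths are hard-coded instead of going through
`hash_algo_by_length()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum hash_kind { HASH_UNKNOWN, HASH_SHA1, HASH_SHA256 };

/*
 * Hypothetical simplification of the detection in the patch: `buf` is the
 * info/refs payload ("<hex-oid>\t<refname>\n" per line), `len` its length.
 */
static enum hash_kind detect_hash(const char *buf, size_t len)
{
	const char *tab;

	assert(buf != NULL || len == 0);
	tab = len ? memchr(buf, '\t', len) : NULL;

	/* No refs advertised: nothing to measure, so fall back to SHA1. */
	if (!tab)
		return HASH_SHA1;

	/* Map the hex length of the first object ID to a hash function. */
	switch ((size_t)(tab - buf)) {
	case 40:
		return HASH_SHA1;
	case 64:
		return HASH_SHA256;
	default:
		return HASH_UNKNOWN;
	}
}
```

The empty-payload case is the one the patch makes explicit: without any
advertised ref there is no object ID to measure, so the SHA1 answer is a
guess rather than a detection.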
remote-curl.c | 19 ++++++++++++++++++-
t/t5550-http-fetch-dumb.sh | 15 +++++++++++++++
2 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/remote-curl.c b/remote-curl.c
index 0b6d7815fd..004b707fdf 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -265,14 +265,25 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
return list;
}
+/*
+ * Try to detect the hash algorithm used by the remote repository when using
+ * the dumb HTTP transport. As dumb transports cannot tell us the object hash
+ * directly, we have to derive it from the advertised ref lengths.
+ */
static const struct git_hash_algo *detect_hash_algo(struct discovery *heads)
{
const char *p = memchr(heads->buf, '\t', heads->len);
int algo;
+
+ /*
+ * In case the remote has no refs we have no way to reliably determine
+ * the object hash used by that repository. In that case we simply fall
+ * back to SHA1, which may or may not be correct.
+ */
if (!p)
- return the_hash_algo;
+ return &hash_algos[GIT_HASH_SHA1];
algo = hash_algo_by_length((p - heads->buf) / 2);
if (algo == GIT_HASH_UNKNOWN)
return NULL;
@@ -294,8 +305,14 @@ static struct ref *parse_info_refs(struct discovery *heads)
die("%sinfo/refs not valid: could not determine hash algorithm; "
"is this a git repository?",
transport_anonymize_url(url.buf));
+ /*
+ * Set the repository's hash algo to whatever we have just detected.
+ * This ensures that we can correctly parse the remote references.
+ */
+ repo_set_hash_algo(the_repository, hash_algo_by_ptr(options.hash_algo));
+
data = heads->buf;
start = NULL;
mid = data;
while (i < heads->len) {
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 4c3b32785d..5f16cbc58d 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -54,8 +54,23 @@ test_expect_success 'list refs from outside any repository' '
nongit git ls-remote "$HTTPD_URL/dumb/repo.git" >actual &&
test_cmp expect actual
'
+
+test_expect_success 'list detached HEAD from outside any repository' '
+ git clone --mirror "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
+ "$HTTPD_DOCUMENT_ROOT_PATH/repo-detached.git" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo-detached.git" \
+ update-ref --no-deref HEAD refs/heads/main &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo-detached.git" update-server-info &&
+ cat >expect <<-EOF &&
+ $(git rev-parse main) HEAD
+ $(git rev-parse main) refs/heads/main
+ EOF
+ nongit git ls-remote "$HTTPD_URL/dumb/repo-detached.git" >actual &&
+ test_cmp expect actual
+'
+
test_expect_success 'create password-protected repository' '
mkdir -p "$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/" &&
cp -Rf "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
"$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/repo.git"
--
2.45.0