* [BUG] fetching sometimes doesn't update refs @ 2018-07-29 12:19 Jeff King 2018-07-30 17:53 ` Brandon Williams ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Jeff King @ 2018-07-29 12:19 UTC (permalink / raw) To: Brandon Williams; +Cc: git I've noticed for the past couple of weeks that some of my fetches don't seem to actually update refs, but a follow-up fetch will. I finally managed to catch it in the act and track it down. It bisects to your 989b8c4452 (fetch-pack: put shallow info in output parameter, 2018-06-27). A reproduction recipe is below. I can't imagine why this repo in particular triggers it, but it was the one where I initially saw the problem (and doing a tiny reproduction does not seem to work). I'm guessing it has something to do with the refs, since the main change in the offending commit is that we recompute the refmap. -- >8 -- # clone the repo as it is today git clone https://github.com/cmcaine/tridactyl.git cd tridactyl # roll back the refs so that there is something to fetch for i in refs/heads/master refs/remotes/origin/master; do git update-ref $i $i^ done # and delete the now-unreferenced objects, pretending we are an earlier # clone that had not yet fetched rm -rf .git/logs git repack -ad # now fetch; this will get the objects but fail to update refs git fetch # and fetching again will actually update the refs git fetch -- 8< -- -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] fetching sometimes doesn't update refs 2018-07-29 12:19 [BUG] fetching sometimes doesn't update refs Jeff King @ 2018-07-30 17:53 ` Brandon Williams 2018-07-30 22:56 ` [PATCH] transport: report refs only if transport does Jonathan Tan 2018-08-01 20:13 ` [PATCH] fetch-pack: unify ref in and out param Jonathan Tan 2 siblings, 0 replies; 13+ messages in thread From: Brandon Williams @ 2018-07-30 17:53 UTC (permalink / raw) To: Jeff King; +Cc: git On 07/29, Jeff King wrote: > I've noticed for the past couple of weeks that some of my fetches don't > seem to actually update refs, but a follow-up fetch will. I finally > managed to catch it in the act and track it down. It bisects to your > 989b8c4452 (fetch-pack: put shallow info in output parameter, > 2018-06-27). > > A reproduction recipe is below. I can't imagine why this repo in > particular triggers it, but it was the one where I initially saw the > problem (and doing a tiny reproduction does not seem to work). I'm > guessing it has something to do with the refs, since the main change in > the offending commit is that we recompute the refmap. I've noticed this behavior sporadically as well, though I've never been able to reliably reproduce it, so thanks for creating a reproduction recipe. I suspected that it had to do with the ref-in-want series so thanks for tracking that down too. We'll take a look. > > -- >8 -- > # clone the repo as it is today > git clone https://github.com/cmcaine/tridactyl.git > cd tridactyl > > # roll back the refs so that there is something to fetch > for i in refs/heads/master refs/remotes/origin/master; do > git update-ref $i $i^ > done > > # and delete the now-unreferenced objects, pretending we are an earlier > # clone that had not yet fetched > rm -rf .git/logs > git repack -ad > > # now fetch; this will get the objects but fail to update refs > git fetch > > # and fetching again will actually update the refs > git fetch > -- 8< -- > > -Peff -- Brandon Williams ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] transport: report refs only if transport does 2018-07-29 12:19 [BUG] fetching sometimes doesn't update refs Jeff King 2018-07-30 17:53 ` Brandon Williams @ 2018-07-30 22:56 ` Jonathan Tan 2018-07-31 19:24 ` Jeff King 2018-08-01 20:13 ` [PATCH] fetch-pack: unify ref in and out param Jonathan Tan 2 siblings, 1 reply; 13+ messages in thread From: Jonathan Tan @ 2018-07-30 22:56 UTC (permalink / raw) To: git; +Cc: Jonathan Tan, peff Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", 2018-06-28) allows transports to report the refs that they have fetched in a new out-parameter "fetched_refs". If they do so, transport_fetch_refs() makes this information available to its caller. Because transport_fetch_refs() filters the refs sent to the transport, it cannot just report the transport's result directly, but first needs to readd the excluded refs, pretending that they are fetched. However, this results in a wrong result if the transport did not report the refs that they have fetched in "fetched_refs" - the excluded refs would be added and reported, presenting an incomplete picture to the caller. Instead, readd the excluded refs only if the transport reported fetched refs. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- Thanks for the reproduction recipe, Peff. Here's a fix. It can be reproduced with something using a remote helper's fetch command (and not using "connect" or "stateless-connect"), fetching at least one ref that requires a ref update and at least one that does not (as you can see from the included test). --- t/t5551-http-fetch-smart.sh | 18 ++++++++++++++++++ transport.c | 32 ++++++++++++++++++++++++-------- 2 files changed, 42 insertions(+), 8 deletions(-) diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh index 913089b144..989d034acc 100755 --- a/t/t5551-http-fetch-smart.sh +++ b/t/t5551-http-fetch-smart.sh @@ -369,6 +369,24 @@ test_expect_success 'custom http headers' ' submodule update sub ' +test_expect_success 'using fetch command in remote-curl updates refs' ' + SERVER="$HTTPD_DOCUMENT_ROOT_PATH/twobranch" && + rm -rf "$SERVER" client && + + git init "$SERVER" && + test_commit -C "$SERVER" foo && + git -C "$SERVER" update-ref refs/heads/anotherbranch foo && + + git clone $HTTPD_URL/smart/twobranch client && + + test_commit -C "$SERVER" bar && + git -C client -c protocol.version=0 fetch && + + git -C "$SERVER" rev-parse master >expect && + git -C client rev-parse origin/master >actual && + test_cmp expect actual +' + test_expect_success 'GIT_REDACT_COOKIES redacts cookies' ' rm -rf clone && echo "Set-Cookie: Foo=1" >cookies && diff --git a/transport.c b/transport.c index fdd813f684..2a2415d79c 100644 --- a/transport.c +++ b/transport.c @@ -1230,17 +1230,18 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs, struct ref **heads = NULL; struct ref *nop_head = NULL, **nop_tail = &nop_head; struct ref *rm; + struct ref *fetched_by_transport = NULL; for (rm = refs; rm; rm = rm->next) { nr_refs++; if (rm->peer_ref && !is_null_oid(&rm->old_oid) && !oidcmp(&rm->peer_ref->old_oid, &rm->old_oid)) { - /* - * These need to be reported as fetched, but we don't - * actually need to fetch them. - */ if (fetched_refs) { + /* + * These may need to be reported as fetched, + * but we don't actually need to fetch them. + */ struct ref *nop_ref = copy_ref(rm); *nop_tail = nop_ref; nop_tail = &nop_ref->next; @@ -1264,10 +1265,25 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs, heads[nr_heads++] = rm; } - rc = transport->vtable->fetch(transport, nr_heads, heads, fetched_refs); - if (fetched_refs && nop_head) { - *nop_tail = *fetched_refs; - *fetched_refs = nop_head; + rc = transport->vtable->fetch(transport, nr_heads, heads, + fetched_refs ? &fetched_by_transport : NULL); + if (fetched_refs) { + if (fetched_by_transport) { + /* + * The transport reported its fetched refs. Pretend + * that we also fetched the ones that we didn't need to + * fetch. + */ + *nop_tail = fetched_by_transport; + *fetched_refs = nop_head; + } else if (!fetched_by_transport) { + /* + * The transport didn't report its fetched refs, so + * this function will not report them either. We have + * no use for nop_head. + */ + free_refs(nop_head); + } } free(heads); -- 2.18.0.345.g5c9ce644c3-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] transport: report refs only if transport does 2018-07-30 22:56 ` [PATCH] transport: report refs only if transport does Jonathan Tan @ 2018-07-31 19:24 ` Jeff King 2018-07-31 21:38 ` Junio C Hamano 2018-07-31 23:23 ` Jonathan Tan 0 siblings, 2 replies; 13+ messages in thread From: Jeff King @ 2018-07-31 19:24 UTC (permalink / raw) To: Jonathan Tan; +Cc: git On Mon, Jul 30, 2018 at 03:56:01PM -0700, Jonathan Tan wrote: > Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", > 2018-06-28) allows transports to report the refs that they have fetched > in a new out-parameter "fetched_refs". If they do so, > transport_fetch_refs() makes this information available to its caller. > > Because transport_fetch_refs() filters the refs sent to the transport, > it cannot just report the transport's result directly, but first needs > to readd the excluded refs, pretending that they are fetched. However, > this results in a wrong result if the transport did not report the refs > that they have fetched in "fetched_refs" - the excluded refs would be > added and reported, presenting an incomplete picture to the caller. This part leaves me confused. If we are not fetching them, then why do we need to pretend that they are fetched? I think I am showing my lack of understanding about the reason for this whole "return the fetched refs" scheme from 989b8c4452, and probably reading the rest of that series would make it more clear. But from the perspective of somebody digging into history and finding just this commit, it probably needs to lay out a little more of the reasoning. > Thanks for the reproduction recipe, Peff. Here's a fix. It can be > reproduced with something using a remote helper's fetch command (and not > using "connect" or "stateless-connect"), fetching at least one ref that > requires a ref update and at least one that does not (as you can see > from the included test). Ah, that explains why I couldn't reproduce it with another repository; I was using a direct git-upload-pack fetch, which wouldn't trigger the remote helper code. > diff --git a/transport.c b/transport.c > index fdd813f684..2a2415d79c 100644 > --- a/transport.c > +++ b/transport.c > @@ -1230,17 +1230,18 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs, > struct ref **heads = NULL; > struct ref *nop_head = NULL, **nop_tail = &nop_head; > struct ref *rm; > + struct ref *fetched_by_transport = NULL; > > for (rm = refs; rm; rm = rm->next) { > nr_refs++; > if (rm->peer_ref && > !is_null_oid(&rm->old_oid) && > !oidcmp(&rm->peer_ref->old_oid, &rm->old_oid)) { > - /* > - * These need to be reported as fetched, but we don't > - * actually need to fetch them. > - */ > if (fetched_refs) { > + /* > + * These may need to be reported as fetched, > + * but we don't actually need to fetch them. > + */ So it's really this comment that leaves me the most puzzled. > @@ -1264,10 +1265,25 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs, > heads[nr_heads++] = rm; > } > > - rc = transport->vtable->fetch(transport, nr_heads, heads, fetched_refs); > - if (fetched_refs && nop_head) { > - *nop_tail = *fetched_refs; > - *fetched_refs = nop_head; > + rc = transport->vtable->fetch(transport, nr_heads, heads, > + fetched_refs ? &fetched_by_transport : NULL); > + if (fetched_refs) { > + if (fetched_by_transport) { > + /* > + * The transport reported its fetched refs. Pretend > + * that we also fetched the ones that we didn't need to > + * fetch. > + */ > + *nop_tail = fetched_by_transport; > + *fetched_refs = nop_head; > + } else if (!fetched_by_transport) { > + /* > + * The transport didn't report its fetched refs, so > + * this function will not report them either. We have > + * no use for nop_head. > + */ > + free_refs(nop_head); > + } This part makes sense to me based on the description (and on the assumption that reporting those nop refs is useful in the first place ;) ). So I think your fix here is probably the right thing, but I'm just left confused by the background a bit. -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] transport: report refs only if transport does 2018-07-31 19:24 ` Jeff King @ 2018-07-31 21:38 ` Junio C Hamano 2018-07-31 23:29 ` Jonathan Tan 2018-07-31 23:23 ` Jonathan Tan 1 sibling, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2018-07-31 21:38 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Tan, git Jeff King <peff@peff.net> writes: > On Mon, Jul 30, 2018 at 03:56:01PM -0700, Jonathan Tan wrote: > >> Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", >> 2018-06-28) allows transports to report the refs that they have fetched >> in a new out-parameter "fetched_refs". If they do so, >> transport_fetch_refs() makes this information available to its caller. >> >> Because transport_fetch_refs() filters the refs sent to the transport, >> it cannot just report the transport's result directly, but first needs >> to readd the excluded refs, pretending that they are fetched. However, >> this results in a wrong result if the transport did not report the refs >> that they have fetched in "fetched_refs" - the excluded refs would be >> added and reported, presenting an incomplete picture to the caller. > > This part leaves me confused. If we are not fetching them, then why do > we need to pretend that they are fetched? What leaves me even more confused is that the entire log message does not make it clear what the end-user observable problem the patch is trying to solve. Is this "we sometimes follow and sometimes fail to follow refs while fetching"? Does it affect all protocol versions and transports, or only just selected few (and if so which ones)? In minds of those who reported an issue and wrote the fix, the issue may be fresh, but let's write the commit log message for ourselves 6 months down the road. Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] transport: report refs only if transport does 2018-07-31 21:38 ` Junio C Hamano @ 2018-07-31 23:29 ` Jonathan Tan 0 siblings, 0 replies; 13+ messages in thread From: Jonathan Tan @ 2018-07-31 23:29 UTC (permalink / raw) To: gitster; +Cc: peff, jonathantanmy, git > What leaves me even more confused is that the entire log message > does not make it clear what the end-user observable problem the > patch is trying to solve. > > Is this "we sometimes follow and sometimes fail to follow refs while > fetching"? Does it affect all protocol versions and transports, or > only just selected few (and if so which ones)? Normally I would respond by creating a new patch with the answer in its commit message, but I'm now not sure about whether it's better to revert back to the non-"fetched_refs" API entirely (as I explained in the reply to Peff I just sent [1]), so I'll answer your questions here for now: - Yes. We fail to follow when we fetch at least one ref that is up-to-date and one ref that is not, and when we're using the "fetch" command in a remote helper (for example, HTTP protocol v0). - I haven't checked exhaustively, but as far as I know, affects HTTP protocol v0, and does not affect anything using connect or stateless-connect (e.g. HTTP protocol v2, ssh). When I create a new patch, I'll also include these answers in its commit message. [1] https://public-inbox.org/git/20180731232343.184463-1-jonathantanmy@google.com/ ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] transport: report refs only if transport does 2018-07-31 19:24 ` Jeff King 2018-07-31 21:38 ` Junio C Hamano @ 2018-07-31 23:23 ` Jonathan Tan 2018-08-01 17:18 ` Brandon Williams 2018-08-02 16:30 ` Jeff King 1 sibling, 2 replies; 13+ messages in thread From: Jonathan Tan @ 2018-07-31 23:23 UTC (permalink / raw) To: peff; +Cc: jonathantanmy, git, bmwill > On Mon, Jul 30, 2018 at 03:56:01PM -0700, Jonathan Tan wrote: > > > Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", > > 2018-06-28) allows transports to report the refs that they have fetched > > in a new out-parameter "fetched_refs". If they do so, > > transport_fetch_refs() makes this information available to its caller. > > > > Because transport_fetch_refs() filters the refs sent to the transport, > > it cannot just report the transport's result directly, but first needs > > to readd the excluded refs, pretending that they are fetched. However, > > this results in a wrong result if the transport did not report the refs > > that they have fetched in "fetched_refs" - the excluded refs would be > > added and reported, presenting an incomplete picture to the caller. > > This part leaves me confused. If we are not fetching them, then why do > we need to pretend that they are fetched? The short answer is that we need: (1) the complete list of refs that was passed to transport_fetch_refs(), (2) with shallow information (REF_STATUS_REJECT_SHALLOW set if relevant), and (3) with updated OIDs if ref-in-want was used. The fetched_refs out param already fulfils (2) and (3), and this patch makes it fulfil (1). As for calling them fetched_refs, perhaps that is a misnomer, but they do appear in FETCH_HEAD even though they are not truly fetched. Which raises the question...if completeness is so important, why not reuse the input list of refs and document that transport_fetch_refs() can mutate the input list? You ask the same question below, so I'll put the answer after quoting your paragraph. > I think I am showing my lack of understanding about the reason for this > whole "return the fetched refs" scheme from 989b8c4452, and probably > reading the rest of that series would make it more clear. But from the > perspective of somebody digging into history and finding just this > commit, it probably needs to lay out a little more of the reasoning. I think it's because 989b8c4452 is based on my earlier work [1] which also had a fetched_refs out param. Its main reason is to enable the invoker of transport_fetch_refs() to specify ref patterns (as you can see in a later commit in the same patch set [2]) - and if we specify patterns, the invoker of transport_fetch_refs() needs the resulting refs (which are provided through fetched_refs). In the version that made it to master, however, there was some debate about whether ref patterns need to be allowed. In the end, ref patterns were not allowed [3], but the fetched_refs out param was still left in. I think that reverting the API might work, but am on the fence about it. It would reduce the number of questions about the code (and would probably automatically fix the issue that I was fixing in the first place), but if we were to revert the API and then decide that we do want ref patterns in "want-ref" (or expand transport_fetch_refs in some similar way), we would need to revert our revert, causing code churn. [1] https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathantanmy@google.com/ [2] https://public-inbox.org/git/eef2b77d88df0db08e4a1505b06e0af2d40143d5.1485381677.git.jonathantanmy@google.com/ [3] https://public-inbox.org/git/20180620213235.10952-1-bmwill@google.com/ ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] transport: report refs only if transport does 2018-07-31 23:23 ` Jonathan Tan @ 2018-08-01 17:18 ` Brandon Williams 2018-08-02 16:30 ` Jeff King 1 sibling, 0 replies; 13+ messages in thread From: Brandon Williams @ 2018-08-01 17:18 UTC (permalink / raw) To: Jonathan Tan; +Cc: peff, git On 07/31, Jonathan Tan wrote: > > On Mon, Jul 30, 2018 at 03:56:01PM -0700, Jonathan Tan wrote: > > > > > Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", > > > 2018-06-28) allows transports to report the refs that they have fetched > > > in a new out-parameter "fetched_refs". If they do so, > > > transport_fetch_refs() makes this information available to its caller. > > > > > > Because transport_fetch_refs() filters the refs sent to the transport, > > > it cannot just report the transport's result directly, but first needs > > > to readd the excluded refs, pretending that they are fetched. However, > > > this results in a wrong result if the transport did not report the refs > > > that they have fetched in "fetched_refs" - the excluded refs would be > > > added and reported, presenting an incomplete picture to the caller. > > > > This part leaves me confused. If we are not fetching them, then why do > > we need to pretend that they are fetched? > > The short answer is that we need: > (1) the complete list of refs that was passed to > transport_fetch_refs(), > (2) with shallow information (REF_STATUS_REJECT_SHALLOW set if > relevant), and > (3) with updated OIDs if ref-in-want was used. > > The fetched_refs out param already fulfils (2) and (3), and this patch > makes it fulfil (1). As for calling them fetched_refs, perhaps that is a > misnomer, but they do appear in FETCH_HEAD even though they are not > truly fetched. > > Which raises the question...if completeness is so important, why not > reuse the input list of refs and document that transport_fetch_refs() > can mutate the input list? You ask the same question below, so I'll put > the answer after quoting your paragraph. > > > I think I am showing my lack of understanding about the reason for this > > whole "return the fetched refs" scheme from 989b8c4452, and probably > > reading the rest of that series would make it more clear. But from the > > perspective of somebody digging into history and finding just this > > commit, it probably needs to lay out a little more of the reasoning. > > I think it's because 989b8c4452 is based on my earlier work [1] which > also had a fetched_refs out param. Its main reason is to enable the > invoker of transport_fetch_refs() to specify ref patterns (as you can > see in a later commit in the same patch set [2]) - and if we specify > patterns, the invoker of transport_fetch_refs() needs the resulting refs > (which are provided through fetched_refs). > > In the version that made it to master, however, there was some debate > about whether ref patterns need to be allowed. In the end, ref patterns > were not allowed [3], but the fetched_refs out param was still left in. > > I think that reverting the API might work, but am on the fence about it. > It would reduce the number of questions about the code (and would > probably automatically fix the issue that I was fixing in the first If you believe the API is difficult to work with (which given this bug it is) then perhaps we go with your suggestion and revert the API back to only providing a list of input refs and having the fetch operation mutate that input list. > place), but if we were to revert the API and then decide that we do want > ref patterns in "want-ref" (or expand transport_fetch_refs in some > similar way), we would need to revert our revert, causing code churn. I haven't thought too much about what we would need to do in the event we add patterns to ref-in-want, but couldn't we possible mutate the input list again in this case and just simply add the resulting refs to the input list? > > [1] https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathantanmy@google.com/ > [2] https://public-inbox.org/git/eef2b77d88df0db08e4a1505b06e0af2d40143d5.1485381677.git.jonathantanmy@google.com/ > [3] https://public-inbox.org/git/20180620213235.10952-1-bmwill@google.com/ -- Brandon Williams ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] transport: report refs only if transport does 2018-07-31 23:23 ` Jonathan Tan 2018-08-01 17:18 ` Brandon Williams @ 2018-08-02 16:30 ` Jeff King 1 sibling, 0 replies; 13+ messages in thread From: Jeff King @ 2018-08-02 16:30 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, bmwill On Tue, Jul 31, 2018 at 04:23:43PM -0700, Jonathan Tan wrote: > > > Because transport_fetch_refs() filters the refs sent to the transport, > > > it cannot just report the transport's result directly, but first needs > > > to readd the excluded refs, pretending that they are fetched. However, > > > this results in a wrong result if the transport did not report the refs > > > that they have fetched in "fetched_refs" - the excluded refs would be > > > added and reported, presenting an incomplete picture to the caller. > > > > This part leaves me confused. If we are not fetching them, then why do > > we need to pretend that they are fetched? > > The short answer is that we need: > (1) the complete list of refs that was passed to > transport_fetch_refs(), > (2) with shallow information (REF_STATUS_REJECT_SHALLOW set if > relevant), and > (3) with updated OIDs if ref-in-want was used. > > The fetched_refs out param already fulfils (2) and (3), and this patch > makes it fulfil (1). As for calling them fetched_refs, perhaps that is a > misnomer, but they do appear in FETCH_HEAD even though they are not > truly fetched. Thanks for this explanation. It does make more sense to me now, and I agree that a lot of my confusion was from calling it "fetched_refs" (and the comment saying "reported as fetched, but not actually fetched"). > Which raises the question...if completeness is so important, why not > reuse the input list of refs and document that transport_fetch_refs() > can mutate the input list? You ask the same question below, so I'll put > the answer after quoting your paragraph. > [...] Thanks, the answer here was enlightening as well. I see you posted a patch to go back to mutating the list, and that seems reasonable to me. I'm fine with a separate "out" list, too. Its purpose and expectations just need to be reflected in the name (and possibly in a comment). -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] fetch-pack: unify ref in and out param 2018-07-29 12:19 [BUG] fetching sometimes doesn't update refs Jeff King 2018-07-30 17:53 ` Brandon Williams 2018-07-30 22:56 ` [PATCH] transport: report refs only if transport does Jonathan Tan @ 2018-08-01 20:13 ` Jonathan Tan 2018-08-01 21:38 ` Brandon Williams 2018-08-02 16:40 ` Jeff King 2 siblings, 2 replies; 13+ messages in thread From: Jonathan Tan @ 2018-08-01 20:13 UTC (permalink / raw) To: git; +Cc: Jonathan Tan, peff, gitster, bmwill When a user fetches: - at least one up-to-date ref and at least one non-up-to-date ref, - using HTTP with protocol v0 (or something else that uses the fetch command of a remote helper) some refs might not be updated after the fetch. This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", 2018-06-28) which allowed transports to report the refs that they have fetched in a new out-parameter "fetched_refs". If they do so, transport_fetch_refs() makes this information available to its caller. Users of "fetched_refs" rely on the following 3 properties: (1) it is the complete list of refs that was passed to transport_fetch_refs(), (2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if relevant), and (3) it has updated OIDs if ref-in-want was used (introduced after 989b8c4452). In an effort to satisfy (1), whenever transport_fetch_refs() filters the refs sent to the transport, it re-adds the filtered refs to whatever the transport supplies before returning it to the user. However, the implementation in 989b8c4452 unconditionally re-adds the filtered refs without checking if the transport refrained from reporting anything in "fetched_refs" (which it is allowed to do), resulting in an incomplete list, no longer satisfying (1). An earlier effort to resolve this [1] solved the issue by readding the filtered refs only if the transport did not refrain from reporting in "fetched_refs", but after further discussion, it seems that the better solution is to revert the API change that introduced "fetched_refs". This API change was first suggested as part of a ref-in-want implementation that allowed for ref patterns and, thus, there could be drastic differences between the input refs and the refs actually fetched [2]; we eventually decided to only allow exact ref names, but this API change remained even though its necessity was decreased. Therefore, revert this API change by reverting commit 989b8c4452, and make receive_wanted_refs() update the OIDs in the sought array (like how update_shallow() updates shallow information in the sought array) instead. A test is also included to show that the user-visible bug discussed at the beginning of this commit message no longer exists. [1] https://public-inbox.org/git/20180801171806.GA122458@google.com/ [2] https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathantanmy@google.com/ Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- I now think that it's better to revert the API change introducing "fetched_refs" (or as Peff describes it, "this whole 'return the fetched refs' scheme from 989b8c4452"), so here is a patch doing so. I hope to have covered all of Peff's and Junio's questions in the commit message. As for Brandon's question: > I haven't thought too much about what we would need to do in the event > we add patterns to ref-in-want, but couldn't we possible mutate the > input list again in this case and just simply add the resulting refs to > the input list? If we support ref patterns, we would need to support deletion of refs, not just addition (because a ref might have existed in the initial ref advertisement, but not when the packfile is delivered). But it should be possible to add a flag stating "don't use this" to the ref, and document that transport_fetch_refs() can append additional refs to the tail of the input list. Upon hindsight, maybe this should have been the original API change instead of the "fetched_refs" mechanism. --- builtin/clone.c | 4 ++-- builtin/fetch.c | 28 ++++------------------------ fetch-object.c | 2 +- fetch-pack.c | 30 +++++++++++++++--------------- t/t5551-http-fetch-smart.sh | 18 ++++++++++++++++++ transport-helper.c | 6 ++---- transport-internal.h | 9 +-------- transport.c | 34 ++++++---------------------------- transport.h | 3 +-- 9 files changed, 50 insertions(+), 84 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index 5c439f139..76f7db47e 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -1156,7 +1156,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } if (!is_local && !complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs, NULL); + transport_fetch_refs(transport, mapped_refs); remote_head = find_ref_by_name(refs, "HEAD"); remote_head_points_at = @@ -1198,7 +1198,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) if (is_local) clone_local(path, git_dir); else if (refs && complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs, NULL); + transport_fetch_refs(transport, mapped_refs); update_remote_refs(refs, mapped_refs, remote_head_points_at, branch_top.buf, reflog_msg.buf, transport, diff --git a/builtin/fetch.c b/builtin/fetch.c index ac06f6a57..d136ace87 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -936,13 +936,11 @@ static int quickfetch(struct ref *ref_map) return check_connected(iterate_ref_map, &rm, &opt); } -static int fetch_refs(struct transport *transport, struct ref *ref_map, - struct ref **updated_remote_refs) +static int fetch_refs(struct transport *transport, struct ref *ref_map) { int ret = quickfetch(ref_map); if (ret) - ret = transport_fetch_refs(transport, ref_map, - updated_remote_refs); + ret = transport_fetch_refs(transport, ref_map); if (!ret) /* * Keep the new pack's ".keep" file around to allow the caller @@ -1107,7 +1105,7 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL); transport_set_option(transport, TRANS_OPT_DEPTH, "0"); transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL); - if (!fetch_refs(transport, ref_map, NULL)) + if (!fetch_refs(transport, ref_map)) consume_refs(transport, ref_map); if (gsecondary) { @@ -1123,7 +1121,6 @@ static int do_fetch(struct transport *transport, int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; const struct ref *remote_refs; - struct ref *updated_remote_refs = NULL; struct argv_array ref_prefixes = ARGV_ARRAY_INIT; if (tags == TAGS_DEFAULT) { @@ -1174,24 +1171,7 @@ static int do_fetch(struct transport *transport, transport->url); } } - - if (fetch_refs(transport, ref_map, &updated_remote_refs)) { - free_refs(ref_map); - retcode = 1; - goto cleanup; - } - if (updated_remote_refs) { - /* - * Regenerate ref_map using the updated remote refs. This is - * to account for additional information which may be provided - * by the transport (e.g. shallow info). - */ - free_refs(ref_map); - ref_map = get_ref_map(transport->remote, updated_remote_refs, rs, - tags, &autotags); - free_refs(updated_remote_refs); - } - if (consume_refs(transport, ref_map)) { + if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) { free_refs(ref_map); retcode = 1; goto cleanup; diff --git a/fetch-object.c b/fetch-object.c index 48fe63dd6..853624f81 100644 --- a/fetch-object.c +++ b/fetch-object.c @@ -19,7 +19,7 @@ static void fetch_refs(const char *remote_name, struct ref *ref) transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1"); - transport_fetch_refs(transport, ref, NULL); + transport_fetch_refs(transport, ref); fetch_if_missing = original_fetch_if_missing; } diff --git a/fetch-pack.c b/fetch-pack.c index 7ccb9c0d4..b80c13124 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -1339,25 +1339,26 @@ static void receive_shallow_info(struct fetch_pack_args *args, args->deepen = 1; } -static void receive_wanted_refs(struct packet_reader *reader, struct ref *refs) +static void receive_wanted_refs(struct packet_reader *reader, + struct ref **sought, int nr_sought) { process_section_header(reader, "wanted-refs", 0); while (packet_reader_read(reader) == PACKET_READ_NORMAL) { struct object_id oid; const char *end; - struct ref *r = NULL; + int i; if (parse_oid_hex(reader->line, &oid, &end) || *end++ != ' ') die("expected wanted-ref, got '%s'", reader->line); - for (r = refs; r; r = r->next) { - if (!strcmp(end, r->name)) { - oidcpy(&r->old_oid, &oid); + for (i = 0; i < nr_sought; i++) { + if (!strcmp(end, sought[i]->name)) { + oidcpy(&sought[i]->old_oid, &oid); break; } } - if (!r) + if (i == nr_sought) die("unexpected wanted-ref: '%s'", reader->line); } @@ -1440,7 +1441,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, receive_shallow_info(args, &reader); if (process_section_header(&reader, "wanted-refs", 1)) - receive_wanted_refs(&reader, ref); + receive_wanted_refs(&reader, sought, nr_sought); /* get the pack */ process_section_header(&reader, "packfile", 0); @@ -1504,13 +1505,12 @@ static int remove_duplicates_in_refs(struct ref **ref, int nr) } static void update_shallow(struct fetch_pack_args *args, - struct ref *refs, + struct ref **sought, int nr_sought, struct shallow_info *si) { struct oid_array ref = OID_ARRAY_INIT; int *status; int i; - struct ref *r; if (args->deepen && alternate_shallow_file) { if (*alternate_shallow_file == '\0') { /* --unshallow */ @@ -1552,8 +1552,8 @@ static void update_shallow(struct fetch_pack_args *args, remove_nonexistent_theirs_shallow(si); if (!si->nr_ours && !si->nr_theirs) return; - for (r = refs; r; r = r->next) - oid_array_append(&ref, &r->old_oid); + for (i = 0; i < nr_sought; i++) + oid_array_append(&ref, &sought[i]->old_oid); si->ref = &ref; if (args->update_shallow) { @@ -1587,12 +1587,12 @@ static void update_shallow(struct fetch_pack_args *args, * remote is also shallow, check what ref is safe to update * without updating .git/shallow */ - status = xcalloc(ref.nr, sizeof(*status)); + status = xcalloc(nr_sought, sizeof(*status)); assign_shallow_commits_to_refs(si, NULL, status); if (si->nr_ours || si->nr_theirs) { - for (r = refs, i = 0; r; r = r->next, i++) + for (i = 0; i < nr_sought; i++) if (status[i]) - r->status = REF_STATUS_REJECT_SHALLOW; + sought[i]->status = REF_STATUS_REJECT_SHALLOW; } free(status); oid_array_clear(&ref); @@ -1655,7 +1655,7 @@ struct ref *fetch_pack(struct fetch_pack_args *args, args->connectivity_checked = 1; } - update_shallow(args, ref_cpy, &si); + update_shallow(args, sought, nr_sought, &si); cleanup: clear_shallow_info(&si); return ref_cpy; diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh index 913089b14..989d034ac 100755 --- a/t/t5551-http-fetch-smart.sh +++ b/t/t5551-http-fetch-smart.sh @@ -369,6 +369,24 @@ test_expect_success 'custom http headers' ' submodule update sub ' +test_expect_success 'using fetch command in remote-curl updates refs' ' + SERVER="$HTTPD_DOCUMENT_ROOT_PATH/twobranch" && + rm -rf "$SERVER" client && + + git init "$SERVER" && + test_commit -C "$SERVER" foo && + git -C "$SERVER" update-ref refs/heads/anotherbranch foo && + + git clone $HTTPD_URL/smart/twobranch client && + + test_commit -C "$SERVER" bar && + git -C client -c protocol.version=0 fetch && + + git -C "$SERVER" rev-parse master >expect && + git -C client rev-parse origin/master >actual && + test_cmp expect actual +' + test_expect_success 'GIT_REDACT_COOKIES redacts cookies' ' rm -rf clone && echo "Set-Cookie: Foo=1" >cookies && diff --git a/transport-helper.c b/transport-helper.c index 8b5abca29..1f8ff7e94 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -651,16 +651,14 @@ static int connect_helper(struct transport *transport, const char *name, } static int fetch(struct transport *transport, - int nr_heads, struct ref **to_fetch, - struct ref **fetched_refs) + int nr_heads, struct ref **to_fetch) { struct helper_data *data = transport->data; int i, count; if (process_connect(transport, 0)) { do_take_over(transport); - return transport->vtable->fetch(transport, nr_heads, to_fetch, - fetched_refs); + return transport->vtable->fetch(transport, nr_heads, to_fetch); } count = 0; diff --git a/transport-internal.h b/transport-internal.h index eeb6c340e..1cde6258a 100644 --- a/transport-internal.h +++ b/transport-internal.h @@ -36,18 +36,11 @@ struct transport_vtable { * Fetch the objects for the given refs. Note that this gets * an array, and should ignore the list structure. * - * The transport *may* provide, in fetched_refs, the list of refs that - * it fetched. If the transport knows anything about the fetched refs - * that the caller does not know (for example, shallow status), it - * should provide that list of refs and include that information in the - * list. - * * If the transport did not get hashes for refs in * get_refs_list(), it should set the old_sha1 fields in the * provided refs now. **/ - int (*fetch)(struct transport *transport, int refs_nr, struct ref **refs, - struct ref **fetched_refs); + int (*fetch)(struct transport *transport, int refs_nr, struct ref **refs); /** * Push the objects and refs. Send the necessary objects, and diff --git a/transport.c b/transport.c index fdd813f68..af6e692db 100644 --- a/transport.c +++ b/transport.c @@ -151,8 +151,7 @@ static struct ref *get_refs_from_bundle(struct transport *transport, } static int fetch_refs_from_bundle(struct transport *transport, - int nr_heads, struct ref **to_fetch, - struct ref **fetched_refs) + int nr_heads, struct ref **to_fetch) { struct bundle_transport_data *data = transport->data; return unbundle(&data->header, data->fd, @@ -288,8 +287,7 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus } static int fetch_refs_via_pack(struct transport *transport, - int nr_heads, struct ref **to_fetch, - struct ref **fetched_refs) + int nr_heads, struct ref **to_fetch) { int ret = 0; struct git_transport_data *data = transport->data; @@ -357,12 +355,8 @@ static int fetch_refs_via_pack(struct transport *transport, if (report_unmatched_refs(to_fetch, nr_heads)) ret = -1; - if (fetched_refs) - *fetched_refs = refs; - else - free_refs(refs); - free_refs(refs_tmp); + free_refs(refs); free(dest); return ret; } @@ -1222,31 +1216,19 @@ const struct ref *transport_get_remote_refs(struct transport *transport, return transport->remote_refs; } -int transport_fetch_refs(struct transport *transport, struct ref *refs, - struct ref **fetched_refs) +int transport_fetch_refs(struct transport *transport, struct ref *refs) { int rc; int nr_heads = 0, nr_alloc = 0, nr_refs = 0; struct ref **heads = NULL; - struct ref *nop_head = NULL, **nop_tail = &nop_head; struct ref *rm; for (rm = refs; rm; rm = rm->next) { nr_refs++; if (rm->peer_ref && !is_null_oid(&rm->old_oid) && - !oidcmp(&rm->peer_ref->old_oid, &rm->old_oid)) { - /* - * These need to be reported as fetched, but we don't - * actually need to fetch them. - */ - if (fetched_refs) { - struct ref *nop_ref = copy_ref(rm); - *nop_tail = nop_ref; - nop_tail = &nop_ref->next; - } + !oidcmp(&rm->peer_ref->old_oid, &rm->old_oid)) continue; - } ALLOC_GROW(heads, nr_heads + 1, nr_alloc); heads[nr_heads++] = rm; } @@ -1264,11 +1246,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs, heads[nr_heads++] = rm; } - rc = transport->vtable->fetch(transport, nr_heads, heads, fetched_refs); - if (fetched_refs && nop_head) { - *nop_tail = *fetched_refs; - *fetched_refs = nop_head; - } + rc = transport->vtable->fetch(transport, nr_heads, heads); free(heads); return rc; diff --git a/transport.h b/transport.h index 7a9a7fcaf..c057c44d3 100644 --- a/transport.h +++ b/transport.h @@ -229,8 +229,7 @@ int transport_push(struct transport *connection, const struct ref *transport_get_remote_refs(struct transport *transport, const struct argv_array *ref_prefixes); -int transport_fetch_refs(struct transport *transport, struct ref *refs, - struct ref **fetched_refs); +int transport_fetch_refs(struct transport *transport, struct ref *refs); void transport_unlock_pack(struct transport *transport); int transport_disconnect(struct transport *transport); char *transport_anonymize_url(const char *url); -- 2.18.0.597.ga71716f1ad-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] fetch-pack: unify ref in and out param 2018-08-01 20:13 ` [PATCH] fetch-pack: unify ref in and out param Jonathan Tan @ 2018-08-01 21:38 ` Brandon Williams 2018-08-01 22:23 ` Junio C Hamano 2018-08-02 16:40 ` Jeff King 1 sibling, 1 reply; 13+ messages in thread From: Brandon Williams @ 2018-08-01 21:38 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, peff, gitster On 08/01, Jonathan Tan wrote: > When a user fetches: > - at least one up-to-date ref and at least one non-up-to-date ref, > - using HTTP with protocol v0 (or something else that uses the fetch > command of a remote helper) > some refs might not be updated after the fetch. > > This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow > info in output parameter", 2018-06-28) which allowed transports to > report the refs that they have fetched in a new out-parameter > "fetched_refs". If they do so, transport_fetch_refs() makes this > information available to its caller. > > Users of "fetched_refs" rely on the following 3 properties: > (1) it is the complete list of refs that was passed to > transport_fetch_refs(), > (2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if > relevant), and > (3) it has updated OIDs if ref-in-want was used (introduced after > 989b8c4452). > > In an effort to satisfy (1), whenever transport_fetch_refs() > filters the refs sent to the transport, it re-adds the filtered refs to > whatever the transport supplies before returning it to the user. > However, the implementation in 989b8c4452 unconditionally re-adds the > filtered refs without checking if the transport refrained from reporting > anything in "fetched_refs" (which it is allowed to do), resulting in an > incomplete list, no longer satisfying (1). > > An earlier effort to resolve this [1] solved the issue by readding the > filtered refs only if the transport did not refrain from reporting in > "fetched_refs", but after further discussion, it seems that the better > solution is to revert the API change that introduced "fetched_refs". > This API change was first suggested as part of a ref-in-want > implementation that allowed for ref patterns and, thus, there could be > drastic differences between the input refs and the refs actually fetched > [2]; we eventually decided to only allow exact ref names, but this API > change remained even though its necessity was decreased. > > Therefore, revert this API change by reverting commit 989b8c4452, and > make receive_wanted_refs() update the OIDs in the sought array (like how > update_shallow() updates shallow information in the sought array) > instead. A test is also included to show that the user-visible bug > discussed at the beginning of this commit message no longer exists. > > [1] https://public-inbox.org/git/20180801171806.GA122458@google.com/ > [2] https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathantanmy@google.com/ > > Signed-off-by: Jonathan Tan <jonathantanmy@google.com> > --- > I now think that it's better to revert the API change introducing > "fetched_refs" (or as Peff describes it, "this whole 'return the fetched > refs' scheme from 989b8c4452"), so here is a patch doing so. I hope to > have covered all of Peff's and Junio's questions in the commit message. > > As for Brandon's question: > > > I haven't thought too much about what we would need to do in the event > > we add patterns to ref-in-want, but couldn't we possible mutate the > > input list again in this case and just simply add the resulting refs to > > the input list? > > If we support ref patterns, we would need to support deletion of refs, > not just addition (because a ref might have existed in the initial ref > advertisement, but not when the packfile is delivered). But it should > be possible to add a flag stating "don't use this" to the ref, and > document that transport_fetch_refs() can append additional refs to the > tail of the input list. Upon hindsight, maybe this should have been the > original API change instead of the "fetched_refs" mechanism. Thanks for getting this out, it looks good to me. If we end up adding patterns to ref-in-want then we can explore what changes would need to be made then, I expect we may need to do a bit more work on the whole fetching stack to get what we'd want in that case (because we would want to avoid this issue again). -- Brandon Williams ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] fetch-pack: unify ref in and out param 2018-08-01 21:38 ` Brandon Williams @ 2018-08-01 22:23 ` Junio C Hamano 0 siblings, 0 replies; 13+ messages in thread From: Junio C Hamano @ 2018-08-01 22:23 UTC (permalink / raw) To: Brandon Williams; +Cc: Jonathan Tan, git, peff Brandon Williams <bmwill@google.com> writes: > ..., I expect we may need to do a bit more work on the whole > fetching stack to get what we'd want in that case (because we would want > to avoid this issue again). Amen. Thanks all. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] fetch-pack: unify ref in and out param 2018-08-01 20:13 ` [PATCH] fetch-pack: unify ref in and out param Jonathan Tan 2018-08-01 21:38 ` Brandon Williams @ 2018-08-02 16:40 ` Jeff King 1 sibling, 0 replies; 13+ messages in thread From: Jeff King @ 2018-08-02 16:40 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, gitster, bmwill On Wed, Aug 01, 2018 at 01:13:20PM -0700, Jonathan Tan wrote: > When a user fetches: > - at least one up-to-date ref and at least one non-up-to-date ref, > - using HTTP with protocol v0 (or something else that uses the fetch > command of a remote helper) > some refs might not be updated after the fetch. > > This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow > info in output parameter", 2018-06-28) which allowed transports to > report the refs that they have fetched in a new out-parameter > "fetched_refs". If they do so, transport_fetch_refs() makes this > information available to its caller. > > Users of "fetched_refs" rely on the following 3 properties: > (1) it is the complete list of refs that was passed to > transport_fetch_refs(), > (2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if > relevant), and > (3) it has updated OIDs if ref-in-want was used (introduced after > 989b8c4452). > [...] Thanks, this is a very clear and well-organized commit message. It answers my questions, and I agree with the general notion of "we can figure out the right API for ref patterns later" approach. > builtin/clone.c | 4 ++-- > builtin/fetch.c | 28 ++++------------------------ > fetch-object.c | 2 +- > fetch-pack.c | 30 +++++++++++++++--------------- > t/t5551-http-fetch-smart.sh | 18 ++++++++++++++++++ > transport-helper.c | 6 ++---- > transport-internal.h | 9 +-------- > transport.c | 34 ++++++---------------------------- > transport.h | 3 +-- The patch itself looks sane to me, and obviously fixes the problem. I cannot offhand think of any reason that munging the existing list would be a problem (though it has been a while since I have dealt with this code, so take that with the appropriate grain of salt). -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2018-08-02 16:40 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-07-29 12:19 [BUG] fetching sometimes doesn't update refs Jeff King 2018-07-30 17:53 ` Brandon Williams 2018-07-30 22:56 ` [PATCH] transport: report refs only if transport does Jonathan Tan 2018-07-31 19:24 ` Jeff King 2018-07-31 21:38 ` Junio C Hamano 2018-07-31 23:29 ` Jonathan Tan 2018-07-31 23:23 ` Jonathan Tan 2018-08-01 17:18 ` Brandon Williams 2018-08-02 16:30 ` Jeff King 2018-08-01 20:13 ` [PATCH] fetch-pack: unify ref in and out param Jonathan Tan 2018-08-01 21:38 ` Brandon Williams 2018-08-01 22:23 ` Junio C Hamano 2018-08-02 16:40 ` Jeff King
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).