git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Bug with Git shallow clones and submodules
@ 2021-08-11 14:59 Kevin Phillips
  2021-08-15 14:31 ` Philippe Blain
  0 siblings, 1 reply; 5+ messages in thread
From: Kevin Phillips @ 2021-08-11 14:59 UTC (permalink / raw)
  To: git

We recently came across a bug in one of our Git projects which make
use of submodules. The basic problem is we have a Git repo that has 1
submodule within it which is set up to check out a specific branch,
and when we attempt to perform a shallow clone on the parent repo the
checkout fails with a "Server does not allow request for unadvertised
object" error. After doing a lot of testing and reading online I found
this StackOverflow thread (https://stackoverflow.com/a/61492323) that
explains our problem in great detail, which indicates to me that this
is likely not a problem specific to us, so I thought I'd report it to
see if it could be fixed.

In our specific use case, we found the problem specifically when
running a build of our project using Gitlab CI. This tool
automatically performs a shallow checkout of projects being built by
using a "--depth=50" parameter. This caused odd Git checkout errors to
appear on our CI builds which weren't directly reproducible by our
developers local builds. Upon closer examination we were able to
reproduce outside of the CI environment by performing a checkout as
follows:

    git clone --recurse-submodules --depth=50  git@<path_to_repo>.git .

Based on my initial research, which was helped greatly by the
StackOverflow thread I linked to above, I was able to deduce the
following contributing factors:

The submodule that was causing us grief was the dpdk library located
here: http://git.dpdk.org/dpdk-stable/

This repository has a default branch named "default" which has only 2
commits on it: one for initializing the repo and one additional commit
All releases of this library are created from branches named by the
release version, and they all seem disjoint from one another, and they
are all disjoint from the default branch (ie: changes from the release
branches are never merged into the default branch)

our parent repo, which is an internal / non-public repository that can
not be shared, has a single submodule defined in it with the following
.gitmodule definition:


[submodule "dpdk-stable"]
path = dpdk-stable
url = https://git@dpdk.org/git/dpdk-stable
branch = 19.11

the specific revision we are checking out has the OID
78bc27846101e481438a98c68cac47e4476085c0 which corresponds to the
19.11.8 tag
the following git clone operation succeeds:

git clone --recurse-submodules git@<path_to_repo>.git .

the following git clone operation fails:

git clone --recurse-submodules --depth=50  git@<path_to_repo>.git .

we can work around the problem by doing a shallow clone with the
no-single-branch parameter, as in:

git clone --recurse-submodules --depth=50 --no-single-branch
git@<path_to_repo>.git .

Based on my current level of understanding, Git is automatically
setting the --single-branch flag when cloning submodules but only when
doing a shallow clone of the parent. This seems wrong to me. In
addition, even if Git was going to enforce the single branch option on
submodules it stands to reason that it should be respecting the branch
defined in the .gitmodules file (ie: 19.11 in our case) instead of
using the default branch (ie: "default" in this case). My guess is, if
it were respecting the branch name then we wouldn't be getting the
"unadvertised object" error because the referenced OID would exist in
the submodule checkout because it would have the correct branch
checked out.

For reference, this problem appears to be reproducible on several
different platforms (ie: several different Linux distros and Mac OSX
at least) and several different Git versions up to v2.32.0 which is
the latest version at the moment. Hopefully this problem can be fixed
in a future release of the command line tools. In the meantime we've
had to hack around the problem in our Gitlab CI builds by disabling
the shallow checkout features, which is less than ideal, so again I am
hopeful that this can be fixed directly in Git itself.

Kevin Phillips
Lead Developer, Configuration Management Team, IBM

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bug with Git shallow clones and submodules
  2021-08-11 14:59 Bug with Git shallow clones and submodules Kevin Phillips
@ 2021-08-15 14:31 ` Philippe Blain
  2021-09-02 13:49   ` Weird interaction between Git protocol versions and git:// and https:// URLs (was: Re: Bug with Git shallow clones and submodules) Philippe Blain
  0 siblings, 1 reply; 5+ messages in thread
From: Philippe Blain @ 2021-08-15 14:31 UTC (permalink / raw)
  To: Kevin Phillips, git, Jonathan Tan, Emily Shaffer

Hi Kevin,

Le 2021-08-11 à 10:59, Kevin Phillips a écrit :
> We recently came across a bug in one of our Git projects which make
> use of submodules. The basic problem is we have a Git repo that has 1
> submodule within it which is set up to check out a specific branch,

Just to make sure we are on the same page: you can't have a submodule
"set up to check out a specific branch", at least
not in the sense that the submodule is always "checked out" on a branch,
and the superproject knows that the submodule should always be "on that branch";
that's not how submodules work. As specified in [1], the config `submodule.<name>.branch`
which your repo has in '.gitmodules' is only ever used by
`git submodule update --remote`. The superproject itself only records at which *commit* the
submodule is at.

> and when we attempt to perform a shallow clone on the parent repo the
> checkout fails with a "Server does not allow request for unadvertised
> object" error. After doing a lot of testing and reading online I found
> this StackOverflow thread (https://stackoverflow.com/a/61492323) that
> explains our problem in great detail, which indicates to me that this
> is likely not a problem specific to us, so I thought I'd report it to
> see if it could be fixed.

I think your situation is related, but not exactly the same as the SO
post. But torek's answer sure explains a lot of important points that
are relevant in your situation.

> 
> In our specific use case, we found the problem specifically when
> running a build of our project using Gitlab CI. This tool
> automatically performs a shallow checkout of projects being built by
> using a "--depth=50" parameter. This caused odd Git checkout errors to
> appear on our CI builds which weren't directly reproducible by our
> developers local builds. Upon closer examination we were able to
> reproduce outside of the CI environment by performing a checkout as
> follows:
> 
>      git clone --recurse-submodules --depth=50  git@<path_to_repo>.git .
> 
> Based on my initial research, which was helped greatly by the
> StackOverflow thread I linked to above, I was able to deduce the
> following contributing factors:
> 
> The submodule that was causing us grief was the dpdk library located
> here: http://git.dpdk.org/dpdk-stable/
> 
> This repository has a default branch named "default" which has only 2
> commits on it: one for initializing the repo and one additional commit
> All releases of this library are created from branches named by the
> release version, and they all seem disjoint from one another, and they
> are all disjoint from the default branch (ie: changes from the release
> branches are never merged into the default branch)
> 
> our parent repo, which is an internal / non-public repository that can
> not be shared, has a single submodule defined in it with the following
> .gitmodule definition:
> 
> 
> [submodule "dpdk-stable"]
> path = dpdk-stable
> url = https://git@dpdk.org/git/dpdk-stable
> branch = 19.11
> 
> the specific revision we are checking out has the OID
> 78bc27846101e481438a98c68cac47e4476085c0 which corresponds to the
> 19.11.8 tag
> the following git clone operation succeeds:
> 
> git clone --recurse-submodules git@<path_to_repo>.git .
> 
> the following git clone operation fails:
> 
> git clone --recurse-submodules --depth=50  git@<path_to_repo>.git .

Thanks for the details here. I came up with this short reproducer script,
which I think shows the behaviour you are seeing:

~~~bash
#!/bin/bash

set -e

rm -rf test && mkdir test && cd test
rm -rf project clone

# Create project
git init project && cd project
echo "content">file
git add file
git commit -m "initial in proj"
git submodule add -b 19.11 https://dpdk.org/git/dpdk-stable
git -C dpdk-stable -q checkout 78bc27846101e481438a98c68cac47e4476085c0
git add dpdk-stable
git commit -m "add lib to project"
cd ../

# Clone project
GIT_TRACE2=3 git -c protocol.version=2 clone --no-local --recurse-submodules --depth=50 project clone 3> >(grep -E 'git clone|git fetch|git submodule|git rev-list' )
~~~

Note that in the "Create project" phase, I use the HTTPS URL
https://dpdk.org/git/dpdk-stable, which is the recommended HTTPS URL at [2].

> 
> we can work around the problem by doing a shallow clone with the
> no-single-branch parameter, as in:
> 
> git clone --recurse-submodules --depth=50 --no-single-branch
> git@<path_to_repo>.git .
> 
> Based on my current level of understanding, Git is automatically
> setting the --single-branch flag when cloning submodules but only when
> doing a shallow clone of the parent. This seems wrong to me. 

In fact, the logic in the code is twofold:

1. In 'git clone', '--depth' implies '--single-branch' [3]
2. In 'git clone', '--single-branch' is used to clone submodules if
    '--recurse-submodules' is used. This is the case since 132f600b06
    (clone: pass --single-branch during --recurse-submodules, 2020-02-20)

I'm not sure why you think this is wrong (in theory)...

> In
> addition, even if Git was going to enforce the single branch option on
> submodules it stands to reason that it should be respecting the branch
> defined in the .gitmodules file (ie: 19.11 in our case) instead of
> using the default branch (ie: "default" in this case). 

... in practice, here you are right: it is indeed unfortunate that the
branch specified in '.gitmodules' is not used for the initial clone of
the submodule if they are cloned with '--single-branch'.

> My guess is, if
> it were respecting the branch name then we wouldn't be getting the
> "unadvertised object" error because the referenced OID would exist in
> the submodule checkout because it would have the correct branch
> checked out.

I think your are right, since the following command does work:

     git clone --single-branch --branch 19.11 https://dpdk.org/git/dpdk-stable &&
     git -C dpdk-stable checkout 78bc27846101e481438a98c68cac47e4476085c0

> 
> For reference, this problem appears to be reproducible on several
> different platforms (ie: several different Linux distros and Mac OSX
> at least) and several different Git versions up to v2.32.0 which is
> the latest version at the moment. Hopefully this problem can be fixed
> in a future release of the command line tools. In the meantime we've
> had to hack around the problem in our Gitlab CI builds by disabling
> the shallow checkout features, which is less than ideal

If you run my reproducer script above, you should see the steps that the code
does to try to retrieve the commit you need, 78bc27846101e481438a98c68cac47e4476085c0:

1. It invokes 'git clone' for the submodule:

     child_start[0] git clone --no-checkout --progress --separate-git-dir $PWD/test/clone/.git/modules/dpdk-stable --single-branch -- https://dpdk.org/git/dpdk-stable $PWD/test/clone/dpdk-stable

2. It checks if the clone downloaded the needed object:

     start git rev-list -n 1 78bc27846101e481438a98c68cac47e4476085c0 --not --all

    We know that it did not, because as noted above, the branch from '.gitmodules'
    is not used in the single-branch clone, so only object from the 'default' branch
    are available.
3. It realizes the object is not there, so it runs "git fetch" in the submodule
4. It checks again with the same 'git rev-list command' if the fetch downloaded
    the needed object. Of course, it did not, because since the branch in '.gitmodules'
    is not used for the initial single-branch clone, at that point the fetch refspec
    in the submodule is '+refs/heads/default:refs/remotes/origin/default', and so a
    simple 'git fetch' does not bring any new object.
5. It falls back to fetching the needed commit by its hash:

     start git fetch origin 78bc27846101e481438a98c68cac47e4476085c0

6. The server at https://dpdk.org/git/dpdk-stable refuses this request and you get

     error: Server does not allow request for unadvertised object 78bc27846101e481438a98c68cac47e4476085c0

OK so why does the server refuses the request ? If you take a look at [4], you will
understand that it's probably because the configs 'uploadpack.allowReachableSHA1InWant'
and 'uploadpack.allowAnySHA1InWant' are both false (i.e. their default value) on the
https://dpdk.org/git/dpdk-stable server. So the behaviour here is not a bug, it's
working as it should (but leads to a bad experience for you).

What is not unfortunately not mentioned in the documentation at [4]
is that the two configs that I mention above only apply if version 0 of the Git transfer
protocol is in effect. So this would mean that even though we specify '-c protocol.version=2'
in the 'git clone' command in my script, the server uses protocol v0. Protocol v2 works
starting in Git 2.18, so let's check what version is running at dpdk.org:

$ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote https://dpdk.org/git/dpdk-stable 2>&1 | grep agent
packet:          git< 6ee0521feb765d9105241a3f6693762c471655cf HEAD\0multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed no-done symref=HEAD:refs/heads/default agent=git/2.20.1

OK, it's running Git 2.20.1. And the list of capabilities above implies that it's responding
with protocol v0, even if my local Git version if 2.29, which defaults to protocol v2. If
it were responding with protocol v2, we would see:

$ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote https://github.com/git/git 2>&1 | grep agent
packet:          git< agent=git/github-g78b452413e8b  # server version
packet:          git< agent=git/github-g78b452413e8b
packet:          git> agent=git/2.29.2  # client version
packet:          git< agent=git/2.29.2

And grepping for "version" would confirm it:

$ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote https://github.com/git/git 2>&1 | grep version
packet:          git< version 2
packet:          git< version 2

I can't explain why the server refuses to talk with protocol v2. Maybe Jonathan (CC-ed),
who worked on the implementation of protocol v2, would be able to explain that.

And it gets weirder. If you instead use the Git URL (instead of the HTTPS URL)
for the dpdk-stable submodule [2], then the server responds with protocol v2!

$ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote git://dpdk.org/dpdk-stable 2>&1 | grep agent
packet:          git< agent=git/2.20.1
packet:          git> agent=git/2.29.2
$ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote git://dpdk.org/dpdk-stable 2>&1 | grep version
packet:          git> git-upload-pack /dpdk-stable\0host=dpdk.org\0\0version=2\0
packet:          git< version 2

And indeed, if you use the 'git://' URL in the 'git submodule add' command in my reproducer
script, it succeeds !

It don't understand this behaviour either.

So a workaround for you would be to use the 'git://' URL in your '.gitmodules',
if the Git port (9418) is not blocked in your infrastructure.

I hope this helps,

Philippe.

P.S. for Jonathan I noticed some additional weirdness while debugging this:
Some servers respond with a different Git version depending on the protocol:

$ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote  git://gcc.gnu.org/git/gcc.git 2>&1 | grep -E 'agent|version '
packet:          git< version 2
packet:          git< agent=git/2.26.2
packet:          git> agent=git/2.29.2
$ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote  https://gcc.gnu.org/git/gcc.git 2>&1 | grep agent -E 'agent|version '
packet:          git< 882f1d58bfa56737ff2de84c3cd1e0acfc318b86 HEAD\0multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed no-done symref=HEAD:refs/heads/master agent=git/2.27.0

Here again, with HTTPS we fall back to protocol v0. The same behaviour happens
with git://sourceware.org/git/binutils-gdb.git...

1. https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt--bltbranchgt
2. https://core.dpdk.org/contribute/#clone
3. https://git-scm.com/docs/git-clone#Documentation/git-clone.txt---depthltdepthgt
4. https://git-scm.com/docs/git-config#Documentation/git-config.txt-uploadpackallowReachableSHA1InWant

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Weird interaction between Git protocol versions and git:// and https:// URLs (was: Re: Bug with Git shallow clones and submodules)
  2021-08-15 14:31 ` Philippe Blain
@ 2021-09-02 13:49   ` Philippe Blain
  2021-09-04 12:57     ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Philippe Blain @ 2021-09-02 13:49 UTC (permalink / raw)
  To: Kevin Phillips, git, Jonathan Tan, Emily Shaffer
  Cc: Jeff King, Brandon Williams, brian m . carlson

Hi everyone,

Le 2021-08-15 à 10:31, Philippe Blain a écrit :

> 5. It falls back to fetching the needed commit by its hash:
> 
>      start git fetch origin 78bc27846101e481438a98c68cac47e4476085c0
> 
> 6. The server at https://dpdk.org/git/dpdk-stable refuses this request and you get
> 
>      error: Server does not allow request for unadvertised object 78bc27846101e481438a98c68cac47e4476085c0
> 
> OK so why does the server refuses the request ? If you take a look at [4], you will
> understand that it's probably because the configs 'uploadpack.allowReachableSHA1InWant'
> and 'uploadpack.allowAnySHA1InWant' are both false (i.e. their default value) on the
> https://dpdk.org/git/dpdk-stable server. So the behaviour here is not a bug, it's
> working as it should (but leads to a bad experience for you).
> 
> What is not unfortunately not mentioned in the documentation at [4]
> is that the two configs that I mention above only apply if version 0 of the Git transfer
> protocol is in effect. So this would mean that even though we specify '-c protocol.version=2'
> in the 'git clone' command in my script, the server uses protocol v0. Protocol v2 works
> starting in Git 2.18, so let's check what version is running at dpdk.org:
> 
> $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote https://dpdk.org/git/dpdk-stable 2>&1 | grep agent
> packet:          git< 6ee0521feb765d9105241a3f6693762c471655cf HEAD\0multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed no-done symref=HEAD:refs/heads/default agent=git/2.20.1
> 
> OK, it's running Git 2.20.1. And the list of capabilities above implies that it's responding
> with protocol v0, even if my local Git version if 2.29, which defaults to protocol v2. If
> it were responding with protocol v2, we would see:
> 
> $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote https://github.com/git/git 2>&1 | grep agent
> packet:          git< agent=git/github-g78b452413e8b  # server version
> packet:          git< agent=git/github-g78b452413e8b
> packet:          git> agent=git/2.29.2  # client version
> packet:          git< agent=git/2.29.2
> 
> And grepping for "version" would confirm it:
> 
> $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote https://github.com/git/git 2>&1 | grep version
> packet:          git< version 2
> packet:          git< version 2
> 
> I can't explain why the server refuses to talk with protocol v2. Maybe Jonathan (CC-ed),
> who worked on the implementation of protocol v2, would be able to explain that.

I'm sorry to bump this, but I would really like to understand what is going on here.
I'm not familiar at all with the protocol code, but my understanding was that after
Git 2.18, if a client requested protocol v2, the server would answer with v2 (at least
for git:// and http[s]://, and also for ssh:// if AcceptEnv is correctly configured).
Here, Git 2.20.1 at https://dpdk.org/git/dpdk-stable answers with v0 for https://,
but with v2 for git:// !

> 
> And it gets weirder. If you instead use the Git URL (instead of the HTTPS URL)
> for the dpdk-stable submodule [2], then the server responds with protocol v2!
> 
> $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote git://dpdk.org/dpdk-stable 2>&1 | grep agent
> packet:          git< agent=git/2.20.1
> packet:          git> agent=git/2.29.2
> $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote git://dpdk.org/dpdk-stable 2>&1 | grep version
> packet:          git> git-upload-pack /dpdk-stable\0host=dpdk.org\0\0version=2\0
> packet:          git< version 2
> 
> And indeed, if you use the 'git://' URL in the 'git submodule add' command in my reproducer
> script, it succeeds !
> 
> It don't understand this behaviour either.
> 
--- snip ---
> 
> P.S. for Jonathan I noticed some additional weirdness while debugging this:
> Some servers respond with a different Git version depending on the protocol:
> 
> $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote  git://gcc.gnu.org/git/gcc.git 2>&1 | grep -E 'agent|version '
> packet:          git< version 2
> packet:          git< agent=git/2.26.2
> packet:          git> agent=git/2.29.2
> $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote  https://gcc.gnu.org/git/gcc.git 2>&1 | grep agent -E 'agent|version '
> packet:          git< 882f1d58bfa56737ff2de84c3cd1e0acfc318b86 HEAD\0multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed no-done symref=HEAD:refs/heads/master agent=git/2.27.0
> 
> Here again, with HTTPS we fall back to protocol v0. The same behaviour happens
> with git://sourceware.org/git/binutils-gdb.git...

This is also puzzling to me...

Thanks,

Philippe.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Weird interaction between Git protocol versions and git:// and https:// URLs (was: Re: Bug with Git shallow clones and submodules)
  2021-09-02 13:49   ` Weird interaction between Git protocol versions and git:// and https:// URLs (was: Re: Bug with Git shallow clones and submodules) Philippe Blain
@ 2021-09-04 12:57     ` Jeff King
  2021-09-04 14:51       ` Konstantin Ryabitsev
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2021-09-04 12:57 UTC (permalink / raw)
  To: Philippe Blain
  Cc: Kevin Phillips, git, Jonathan Tan, Emily Shaffer,
	Brandon Williams, brian m . carlson

On Thu, Sep 02, 2021 at 09:49:38AM -0400, Philippe Blain wrote:

> > OK, it's running Git 2.20.1. And the list of capabilities above implies that it's responding
> > with protocol v0, even if my local Git version if 2.29, which defaults to protocol v2. If
> > it were responding with protocol v2, we would see:
> > 
> > $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote https://github.com/git/git 2>&1 | grep agent
> > packet:          git< agent=git/github-g78b452413e8b  # server version
> > packet:          git< agent=git/github-g78b452413e8b
> > packet:          git> agent=git/2.29.2  # client version
> > packet:          git< agent=git/2.29.2
> > 
> > And grepping for "version" would confirm it:
> > 
> > $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote https://github.com/git/git 2>&1 | grep version
> > packet:          git< version 2
> > packet:          git< version 2
> > 
> > I can't explain why the server refuses to talk with protocol v2. Maybe Jonathan (CC-ed),
> > who worked on the implementation of protocol v2, would be able to explain that.
> 
> I'm sorry to bump this, but I would really like to understand what is going on here.
> I'm not familiar at all with the protocol code, but my understanding was that after
> Git 2.18, if a client requested protocol v2, the server would answer with v2 (at least
> for git:// and http[s]://, and also for ssh:// if AcceptEnv is correctly configured).
> Here, Git 2.20.1 at https://dpdk.org/git/dpdk-stable answers with v0 for https://,
> but with v2 for git:// !

The server has to be configured correctly at the HTTP level, too. The v2
probe for HTTP is a "Git-Protocol: version=2" header in the HTTP
request. The webserver has to be configured to communicate that to the
rest of Git somehow. In our test setup for Apache, we do:

  SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0

If that doesn't happen, then the server doesn't see the header at all,
and just speaks v0 as usual.

I suspect this could be better documented. I have to admit that I have
not actually run a regular HTTP server with Git in a long time (at
GitHub, we have our own custom layer that terminates the HTTP
connections).

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Weird interaction between Git protocol versions and git:// and https:// URLs (was: Re: Bug with Git shallow clones and submodules)
  2021-09-04 12:57     ` Jeff King
@ 2021-09-04 14:51       ` Konstantin Ryabitsev
  0 siblings, 0 replies; 5+ messages in thread
From: Konstantin Ryabitsev @ 2021-09-04 14:51 UTC (permalink / raw)
  To: Jeff King
  Cc: Philippe Blain, Kevin Phillips, git, Jonathan Tan, Emily Shaffer,
	Brandon Williams, brian m . carlson

On Sat, Sep 04, 2021 at 08:57:56AM -0400, Jeff King wrote:
> The server has to be configured correctly at the HTTP level, too. The v2
> probe for HTTP is a "Git-Protocol: version=2" header in the HTTP
> request. The webserver has to be configured to communicate that to the
> rest of Git somehow. In our test setup for Apache, we do:
> 
>   SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0
> 
> If that doesn't happen, then the server doesn't see the header at all,
> and just speaks v0 as usual.
> 
> I suspect this could be better documented.

Indeed, I was surprised to find out that all this time we were serving v0 for
http requests. /o\

For the record, the magic nginx+uwsgi incantation is:

    uwsgi_param GIT_PROTOCOL $http_git_protocol;

that gets us what we want:

    $ GIT_TRACE_PACKET=1 GIT_TRACE_BARE=1 git ls-remote https://git.kernel.org/pub/scm/git/git.git 2>&1 | grep version
    packet:          git< version 2
    packet:          git< version 2
    packet:          git< version 2

I will submit a patch for the docs to indicate that this is required, because
right now the only mention of GIT_PROTOCOL env variable are for the file://
and ssh:// schemes.

-K

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-09-04 14:51 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-11 14:59 Bug with Git shallow clones and submodules Kevin Phillips
2021-08-15 14:31 ` Philippe Blain
2021-09-02 13:49   ` Weird interaction between Git protocol versions and git:// and https:// URLs (was: Re: Bug with Git shallow clones and submodules) Philippe Blain
2021-09-04 12:57     ` Jeff King
2021-09-04 14:51       ` Konstantin Ryabitsev

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).