* Regarding the depreciation of ssh+git/git+ssh protocols @ 2021-03-15 16:27 Drew DeVault 2021-03-15 17:56 ` Jonathan Nieder 0 siblings, 1 reply; 28+ messages in thread From: Drew DeVault @ 2021-03-15 16:27 UTC (permalink / raw) To: git c05186cc38ca4605bff1f275619d7d0faeaf2fa5 introduced ssh+git, and 07c7782cc8e1f37c7255dfc69c5d0e3f4d4d728c admitted this was a mistake. I argue that it was not a mistake. The main use-case for the git-specific protocol is to disambiguate with other version control systems which also use SSH (or HTTPS), such as Mercurial, or simply downloading a tarball over HTTP. Some things that are affected by this include package manager source lists and configurations for CI tooling (the latter being my main interest in this). A lot of software already recognizes ssh+git or https+git for this purpose, and in the latter case, rewrites it to https before handing it off to git. I would like to see this feature un-disowned, and https+git support added as well. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-15 16:27 Regarding the depreciation of ssh+git/git+ssh protocols Drew DeVault @ 2021-03-15 17:56 ` Jonathan Nieder 2021-03-15 18:14 ` Drew DeVault 0 siblings, 1 reply; 28+ messages in thread From: Jonathan Nieder @ 2021-03-15 17:56 UTC (permalink / raw) To: Drew DeVault; +Cc: git Hi, Drew DeVault wrote: > c05186cc38ca4605bff1f275619d7d0faeaf2fa5 introduced ssh+git, and > 07c7782cc8e1f37c7255dfc69c5d0e3f4d4d728c admitted this was a mistake. I > argue that it was not a mistake. > > The main use-case for the git-specific protocol is to disambiguate with > other version control systems which also use SSH (or HTTPS), such as > Mercurial, or simply downloading a tarball over HTTP. Following the trail of links, I reach https://public-inbox.org/git/CA+55aFyWqK0bu2V1SYagrYCBGpj0=2orobK2vT-KRkqpq=kgtw@mail.gmail.com/, but that email mostly just makes assertions rather than explaining the rationale. So it's probably worth talking it through now. > Some things that are affected by this include package manager source > lists and configurations for CI tooling (the latter being my main > interest in this). The original idea of URI schemes like svn+https is that we can treat these version control URLs as part of the general category of uniform resource identifiers --- in other words, you might be able to type them in a browser's URL bar, browse the content of a repository, use an <img> tag to point to a file within a version control repository, and so on. _That_ idea, at least, does not work all that well. There's not an equivalent to a fragment identifier to refer to a particular file within a repository. Further, if I have an https URL referring to a Git repository, I'm better off viewing it without a "git+" prefix because then I can see the content of the repository using a web based repository browser. 
In other words, a "Git URL" is not a URI at all; it's simply the identifier that Git uses to clone a repository. A package manager or CI tool is perfectly within its rights to provide its own naming scheme for sources, such as "git::https://example.com/path/to/repo" or even the same with "git+" prefix; or it can use an https URL and infer from the content it gets there what version control system it uses. The missing piece is an HTTP header to unambiguously mark that URL as being usable by Git. I'm not aware of a standard way to do that; e.g. golang's "go get" tool[*] uses a custom 'meta name="go-import"' HTML element. Thanks and hope that helps, Jonathan [*] https://golang.org/cmd/go/#hdr-Remote_import_paths ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-15 17:56 ` Jonathan Nieder @ 2021-03-15 18:14 ` Drew DeVault 2021-03-15 22:01 ` brian m. carlson 0 siblings, 1 reply; 28+ messages in thread From: Drew DeVault @ 2021-03-15 18:14 UTC (permalink / raw) To: Jonathan Nieder; +Cc: git On Mon Mar 15, 2021 at 1:56 PM EDT, Jonathan Nieder wrote: > The original idea of URI schemes like svn+https is that we can treat > these version control URLs as part of the general category of uniform > resource identifiers --- in other words, you might be able to type > them in a browser's URL bar, browse the content of a repository, use > an <img> tag to point to a file within a version control repository, > and so on. That was indeed the original idea, but I think it's fair to assume that it's evolved well beyond this. There are many schemes in common use which don't meet these criteria, such as mailto:, magnet:, bitcoin:, postgresql:, and so on. None of these examples make productive use of all of the URI, such as your fragment example, but they still make productive use of parts of the URI. To my mind, the contemporary purpose of a URI is to: 1. Identify a resource 2. Identify the protocol used to access it 3. Store domain-specific information that an implementation of that protocol can use to accomplish something > The missing piece is an HTTP header to unambiguously mark that URL as > being usable by Git. I'm not aware of a standard way to do that; e.g. > golang's "go get" tool[*] uses a custom 'meta name="go-import"' HTML > element. I don't agree that this is the case. It would be much better to be able to identify a URL as being useful for git without having to perform a network request to find out. A standard approach to the go-import kind of deal is also a meritorious idea, but a separate matter - and one I'm also involved in trying to address! ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-15 18:14 ` Drew DeVault @ 2021-03-15 22:01 ` brian m. carlson 2021-03-16 0:52 ` Drew DeVault 2021-03-16 0:54 ` Drew DeVault 0 siblings, 2 replies; 28+ messages in thread From: brian m. carlson @ 2021-03-15 22:01 UTC (permalink / raw) To: Drew DeVault; +Cc: Jonathan Nieder, git [-- Attachment #1: Type: text/plain, Size: 1883 bytes --] On 2021-03-15 at 18:14:31, Drew DeVault wrote: > On Mon Mar 15, 2021 at 1:56 PM EDT, Jonathan Nieder wrote: > > The missing piece is an HTTP header to unambiguously mark that URL as > > being usable by Git. I'm not aware of a standard way to do that; e.g. > > golang's "go get" tool[*] uses a custom 'meta name="go-import"' HTML > > element. > > I don't agree that this is the case. It would be much better to be able > to identify a URL as being useful for git without having to perform a > network request to find out. But you can't find whether a URL is useful for a particular purpose in general. For example, if I see an HTTPS URL, that tells me nothing about the resources that one might find at that URL. One might find: * A plain dumb Git remote. * A plain smart Git remote. * A smart Git remote and Git LFS support. * A human-readable text response. * A machine-readable JSON response. * A binary document which is intended to be human intelligible. * Something else. * Nothing at all. In addition, it's possible that the data you want exists, but is not suitable for you in whatever way (not in a language you understand, in an unsuitable format, is illegal or offensive, etc.), or you are not authorized to access it. You can't know any of this without making some sort of request. All a URL can tell you is literally where a resource is located. 
Even if we saw a URL that used the hypothetical https+git as the scheme, we couldn't determine whether we could access the data, whether the data even still exists, or, even if we knew all of those things, whether it was using the smart or dumb protocol, without making a request. So I don't think this is a thing we can do, simply because in general URLs aren't suitable for sharing this kind of information. -- brian m. carlson (he/him or they/them) Houston, Texas, US [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 263 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-15 22:01 ` brian m. carlson @ 2021-03-16 0:52 ` Drew DeVault 2021-03-16 1:02 ` Jonathan Nieder 2021-03-16 0:54 ` Drew DeVault 1 sibling, 1 reply; 28+ messages in thread From: Drew DeVault @ 2021-03-16 0:52 UTC (permalink / raw) To: brian m. carlson; +Cc: Jonathan Nieder, git On Mon Mar 15, 2021 at 6:01 PM EDT, brian m. carlson wrote: > But you can't find whether a URL is useful for a particular purpose in > general. For example, if I see an HTTPS URL, that tells me nothing > about the resources that one might find at that URL. > > In addition, it's possible that the data you want exists, but is not > suitable for you in whatever way (not in a language you understand, in > an unsuitable format, is illegal or offensive, etc.), or you are not > authorized to access it. You can't know any of this without making some > sort of request. > > All a URL can tell you is literally where a resource is located. Even > if we saw a URL that used the hypothetical https+git as the scheme, we > couldn't determine whether we could access the data, whether the data > even still exists, or, even if we knew all of those things, whether it > was using the smart or dumb protocol, without making a request. What we know is that we can pass it to git to deal with, and then git will determine the next steps. It will negotiate dumb or smart HTTP in-band, deal with errors that arise, and so on. It signals that git is the tool best equipped to deal with the situation, and without that we'd end up guessing. > So I don't think this is a thing we can do, simply because in general > URLs aren't suitable for sharing this kind of information. That's simply not true. They are quite capable at this task, and are fulfilling this duty for a wide variety of applications today. I don't really understand the disconnect here. No, URLs are not magic, but they are perfectly sufficient for this use-case.
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 0:52 ` Drew DeVault @ 2021-03-16 1:02 ` Jonathan Nieder 2021-03-16 1:05 ` Drew DeVault 2021-03-16 4:38 ` Eli Schwartz 0 siblings, 2 replies; 28+ messages in thread From: Jonathan Nieder @ 2021-03-16 1:02 UTC (permalink / raw) To: Drew DeVault; +Cc: brian m. carlson, git Drew DeVault wrote: > On Mon Mar 15, 2021 at 6:01 PM EDT, brian m. carlson wrote: >> So I don't think this is a thing we can do, simply because in general >> URLs aren't suitable for sharing this kind of information. > > That's simply not true. They are quite capable at this task, and are > fulfilling this duty for a wide variety of applications today. > > I don't really understand the disconnect here. No, URLs are not magic, > but they are perfectly sufficient for this use-case. I'm not sure it's a disconnect; instead, it just looks like we disagree. That said, with more details about the use case it might be possible to sway me in another direction. To maintain the URI analogy: the URI does not tell me the content-type of what I can access from there. Until I know that content-type, I may not know what the best tool is to access it. The root of the disagreement, though, is "Git URLs" looking like a URI in the first place. They're not meant to be universal at all. They are specifically for Git. Thanks, Jonathan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 1:02 ` Jonathan Nieder @ 2021-03-16 1:05 ` Drew DeVault 2021-03-16 21:23 ` Jeff King 2021-03-16 4:38 ` Eli Schwartz 1 sibling, 1 reply; 28+ messages in thread From: Drew DeVault @ 2021-03-16 1:05 UTC (permalink / raw) To: Jonathan Nieder; +Cc: brian m. carlson, git On Mon Mar 15, 2021 at 9:02 PM EDT, Jonathan Nieder wrote: > I'm not sure it's a disconnect; instead, it just looks like we > disagree. That said, with more details about the use case it might be > possible to sway me in another direction. > > To maintain the URI analogy: the URI does not tell me the content-type > of what I can access from there. Until I know that content-type, I > may not know what the best tool is to access it. git isn't a content type, it's a protocol. git over HTTP or git over SSH is a protocol in its own right, distinct from these base protocols, in the same sense that SSH lives on top of TCP which lives on top of IP which is transmitted to your computer over ethernet or 802.11. It's turtles all the way down. > The root of the disagreement, though, is "Git URLs" looking like a URI > in the first place. They're not meant to be universal at all. They > are specifically for Git. At worst I would call this a happy coincidence. We have this convenient universal format at our disposal, and we would be wise to take advantage of it. Rejecting it on the premise that we never wanted to have it doesn't make sense when we consider that (1) we do have it and (2) it can be of good use to us. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 1:05 ` Drew DeVault @ 2021-03-16 21:23 ` Jeff King 2021-03-17 14:49 ` Drew DeVault 2021-03-18 21:30 ` Junio C Hamano 0 siblings, 2 replies; 28+ messages in thread From: Jeff King @ 2021-03-16 21:23 UTC (permalink / raw) To: Drew DeVault; +Cc: Jonathan Nieder, brian m. carlson, git On Mon, Mar 15, 2021 at 09:05:34PM -0400, Drew DeVault wrote: > On Mon Mar 15, 2021 at 9:02 PM EDT, Jonathan Nieder wrote: > > I'm not sure it's a disconnect; instead, it just looks like we > > disagree. That said, with more details about the use case it might be > > possible to sway me in another direction. > > > > To maintain the URI analogy: the URI does not tell me the content-type > > of what I can access from there. Until I know that content-type, I > > may not know what the best tool is to access it. > > git isn't a content type, it's a protocol. git over HTTP or git over SSH > is a protocol in its own right, distinct from these base protocols, in > the same sense that SSH lives on top of TCP which lives on top of IP > which is transmitted to your computer over ethernet or 802.11. It's > turtles all the way down. I think this is the key observation. A browser can access an HTTP URL, and then based on the content type, decide what to do with the result. But one cannot do so with a git-over-http URL. Git will not even directly access the resource specified in the URL! It will construct a related one (by appending "info/refs" and a "service" field) and request that. So you definitely need to "somehow" know that a URL is meant to be used with Git. And that makes me somewhat sympathetic to your request. The downsides I see are: - one of the advantages of straight http:// URLs is that they can be accessed by multiple tools. Most "forge" tools let you use the same URL both for getting a human-readable page in a browser, as well as accessing the repository with the Git CLI. 
I'd hate to see https+git URLs become common, because they add friction there (though simply supporting them at all gives people the choice of whether to use them). - I'm also sympathetic to brian's point that there's a wider ecosystem. It's not just "git" that needs to learn them. It's jgit, and libgit2, and many tools that work with git remotes. -Peff ^ permalink raw reply [flat|nested] 28+ messages in thread
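The URL rewriting Jeff describes can be sketched as follows. This is a minimal illustration under stated assumptions, not Git's own code: the function name `info_refs_url` and the example.com repository are made up for the example, while the `info/refs` path and the `service=git-upload-pack` query parameter are what a smart-HTTP fetch requests during ref discovery.

```python
from urllib.parse import urlencode


def info_refs_url(remote_url: str, service: str = "git-upload-pack") -> str:
    """Build the ref-discovery URL a smart-HTTP client requests.

    `info_refs_url` is a hypothetical helper for illustration; it is not
    part of Git or of any library.
    """
    # The client never fetches the remote URL itself. It appends
    # "info/refs" and names the service it wants in the query string.
    base = remote_url.rstrip("/")
    return f"{base}/info/refs?{urlencode({'service': service})}"


# Hypothetical repository location.
print(info_refs_url("https://example.com/path/to/repo.git"))
```

A browser pointed at the same base URL would request the base URL as given, which is why nothing in a plain https:// URL says which of the two requests is the intended one.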
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 21:23 ` Jeff King @ 2021-03-17 14:49 ` Drew DeVault 2021-03-18 21:30 ` Junio C Hamano 1 sibling, 0 replies; 28+ messages in thread From: Drew DeVault @ 2021-03-17 14:49 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Nieder, brian m. carlson, git On Tue Mar 16, 2021 at 5:23 PM EDT, Jeff King wrote: > - one of the advantages of straight http:// URLs is that they can > be accessed by multiple tools. Most "forge" tools let you use the same > URL both for getting a human-readable page in a browser, as well as > accessing the repository with the Git CLI. I'd hate to see https+git > URLs become common, because they add friction there (though simply > supporting them at all gives people the choice of whether to use > them). I think their main use-cases would be limited to places where the distinction is necessary, such as for those packaging or CI tools. I don't expect us to end up in a situation where users are passing each other git+https URLs in everyday conversation. > - I'm also sympathetic to brian's point that there's a wider > ecosystem. It's not just "git" that needs to learn them. It's jgit, > and libgit2, and many tools that work with git remotes. I would be happy to write the necessary patch for libgit2, at least. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 21:23 ` Jeff King 2021-03-17 14:49 ` Drew DeVault @ 2021-03-18 21:30 ` Junio C Hamano 2021-03-18 21:53 ` Drew DeVault 1 sibling, 1 reply; 28+ messages in thread From: Junio C Hamano @ 2021-03-18 21:30 UTC (permalink / raw) To: Jeff King; +Cc: Drew DeVault, Jonathan Nieder, brian m. carlson, git Jeff King <peff@peff.net> writes: > So you definitely need to "somehow" know that a URL is meant to be used > with Git. And that makes me somewhat sympathetic to your request. Nicely summarized. I am also sympathetic to the cause, but I do not see upside in tucking the information to the URL syntax. Even if we limit ourselves to the CI context, I do not see how the repository location alone is sufficient (e.g. "build the tip of this branch of that repository every time it gets updated" already needs more than the repository location). > The downsides I see are: > > - one of the advantages of straight http:// URLs is that they can > be accessed by multiple tools. Most "forge" tools let you use the same > URL both for getting a human-readable page in a browser, as well as > accessing the repository with the Git CLI. I'd hate to see https+git > URLs become common, because they add friction there (though simply > supporting them at all gives people the choice of whether to use > them). > > - I'm also sympathetic to brian's point that there's a wider > ecosystem. It's not just "git" that needs to learn them. It's jgit, > and libgit2, and many tools that work with git remotes. Yup. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-18 21:30 ` Junio C Hamano @ 2021-03-18 21:53 ` Drew DeVault 0 siblings, 0 replies; 28+ messages in thread From: Drew DeVault @ 2021-03-18 21:53 UTC (permalink / raw) To: Junio C Hamano, Jeff King; +Cc: Jonathan Nieder, brian m. carlson, git The status quo is similarly frustrating. We have no choice but to allow these strange unofficial +git URLs to proliferate among package managers and build systems. It has already caused confusion with users, and it can only cause more the longer it remains unaddressed upstream. There are two options: 1. We make the change, users are confused for a while, and software has to be updated, but the confusion gradually diminishes over time as the ecosystem adjusts and people learn the change. 2. We don't make the change, and the inconsistency continues to require special cases in new tools, with no central organization for keeping them consistent from one to the next, and users will continue to stub their toe on it indefinitely. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 1:02 ` Jonathan Nieder 2021-03-16 1:05 ` Drew DeVault @ 2021-03-16 4:38 ` Eli Schwartz 2021-03-16 11:54 ` brian m. carlson 1 sibling, 1 reply; 28+ messages in thread From: Eli Schwartz @ 2021-03-16 4:38 UTC (permalink / raw) To: Jonathan Nieder, Drew DeVault; +Cc: brian m. carlson, git [-- Attachment #1.1: Type: text/plain, Size: 2162 bytes --] On 3/15/21 9:02 PM, Jonathan Nieder wrote: > Drew DeVault wrote: >> On Mon Mar 15, 2021 at 6:01 PM EDT, brian m. carlson wrote: > >>> So I don't think this is a thing we can do, simply because in general >>> URLs aren't suitable for sharing this kind of information. >> >> That's simply not true. They are quite capable at this task, and are >> fulfilling this duty for a wide variety of applications today. >> >> I don't really understand the disconnect here. No, URLs are not magic, >> but they are perfectly sufficient for this use-case. > > I'm not sure it's a disconnect; instead, it just looks like we > disagree. That said, with more details about the use case it might be > possible to sway me in another direction. > > To maintain the URI analogy: the URI does not tell me the content-type > of what I can access from there. Until I know that content-type, I > may not know what the best tool is to access it. This is a pretty odd argument. Drew is recommending that the URI "git+https://" tells a person the right tool to obtain the resource ("do I use curl/wget, or git clone"), and now you're arguing that it is somehow insufficient because "git+https://" doesn't tell the person which media viewer application is best suited to display the contents after it's been downloaded and no longer has an associated URI at all (but does exchange that particular variety of metadata for a mimetype). Why does this even matter? 
Again, the point here is the assertion by Drew that, for the purpose of listing a manifest of remotely fetchable resources, he sees a benefit to having some standard format for the URI itself, describing how it's intended to be fetched. - ftp:// -> use the `ftp` tool - scp:// -> use the `scp` tool - http:// -> use the `wget` tool - git+http:// -> use the `git` tool But instead of needing every program with a git integration to reimplement "recognize git+http and do substring prefix removal before passing to git", the suggestion is for git to do this. There is definitely a (strange) disconnect here. -- Eli Schwartz Arch Linux Bug Wrangler and Trusted User [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
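The "recognize git+http and do substring prefix removal before passing to git" step that each integrating tool currently reimplements might look like the sketch below. It is only an illustration: `FETCH_TOOLS`, `GIT_PREFIX`, and `plan_fetch` are hypothetical names, the table contains just the schemes listed in the message above, and the example.com URLs are placeholders.

```python
# Scheme-to-tool table taken from the list in the message above.
# All names in this sketch are hypothetical.
FETCH_TOOLS = {"ftp": "ftp", "scp": "scp", "http": "wget"}
GIT_PREFIX = "git+"


def plan_fetch(source: str) -> tuple[str, str]:
    """Return (tool, url_to_hand_to_that_tool) for one manifest entry."""
    scheme, sep, rest = source.partition("://")
    if not sep:
        raise ValueError(f"not a scheme://... URL: {source!r}")
    if scheme.startswith(GIT_PREFIX):
        # Strip the marker so git receives a scheme it already understands.
        return "git", scheme[len(GIT_PREFIX):] + "://" + rest
    if scheme not in FETCH_TOOLS:
        raise ValueError(f"no fetch tool known for scheme {scheme!r}")
    return FETCH_TOOLS[scheme], source


print(plan_fetch("git+http://example.com/repo.git"))
print(plan_fetch("http://example.com/release.tar.gz"))
```

If git accepted the prefixed scheme itself, the `startswith` branch could return the source unchanged, and tools would no longer need to agree on how the rewrite is done.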
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 4:38 ` Eli Schwartz @ 2021-03-16 11:54 ` brian m. carlson 2021-03-16 14:21 ` Drew DeVault 2021-03-16 18:03 ` Eli Schwartz 0 siblings, 2 replies; 28+ messages in thread From: brian m. carlson @ 2021-03-16 11:54 UTC (permalink / raw) To: Eli Schwartz; +Cc: Jonathan Nieder, Drew DeVault, git [-- Attachment #1: Type: text/plain, Size: 2145 bytes --] On 2021-03-16 at 04:38:08, Eli Schwartz wrote: > Why does this even matter? Again, the point here is the assertion by > Drew that, for the purpose of listing a manifest of remotely fetchable > resources, he sees a benefit to having some standard format for the URI > itself, describing how it's intended to be fetched. > > - ftp:// -> use the `ftp` tool > - scp:// -> use the `scp` tool > - http:// -> use the `wget` tool > - git+http:// -> use the `git` tool > > But instead of needing every program with a git integration to > reimplement "recognize git+http and do substring prefix removal before > passing to git", the suggestion is for git to do this. I believe this construct is nonstandard. It is better to use standard URL syntax when possible because it makes it much, much easier for people to use standard tooling to parse and handle URLs. Such tooling may have special cases for the HTTP syntax that it doesn't use in MAILTO syntax, so it's important to pick something that works automatically. It's difficult enough to handle parsing of SSH specifications and distinguish them uniformly from Windows paths (think of an alias named "c"), so I'd prefer we didn't add additional complexity to handle this case. Lest you think that only Git has to handle parsing these, the Git LFS project (and every other implementation compatible with Git) has to handle parsing them as well (and related things like url.*.insteadOf), and providing bug-for-bug compatible behavior is generally a hassle. 
We've run into numerous problems where things aren't exactly the same, and making things more complex by adding an esoteric syntax that few users are likely to use isn't helping. Despite the fact that ssh+git is specified as deprecated, we had people expect it to magically work and had to support it in Git LFS. So I'm very much opposed to adding, expanding, or giving any sort of official blessing to this syntax, especially when there are perfectly valid and equivalent schemes that are already blessed and registered with IANA. -- brian m. carlson (he/him or they/them) Houston, Texas, US [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 263 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 11:54 ` brian m. carlson @ 2021-03-16 14:21 ` Drew DeVault 2021-03-16 21:28 ` Jeff King ` (2 more replies) 2021-03-16 18:03 ` Eli Schwartz 1 sibling, 3 replies; 28+ messages in thread From: Drew DeVault @ 2021-03-16 14:21 UTC (permalink / raw) To: brian m. carlson, Eli Schwartz; +Cc: Jonathan Nieder, git On Tue Mar 16, 2021 at 7:54 AM EDT, brian m. carlson wrote: > I believe this construct is nonstandard. It is better to use standard > URL syntax when possible because it makes it much, much easier for > people to use standard tooling to parse and handle URLs. Such tooling > may have special cases for the HTTP syntax that it doesn't use in MAILTO > syntax, so it's important to pick something that works automatically. It is standard - RFC 3986 section 3.1 permits the + character in URI schemes. The use of protocol "composition", e.g. git+https, is a convention, but not a standard. > > So I'm very much opposed to adding, expanding, or giving any sort of > official blessing to this syntax, especially when there are perfectly > valid and equivalent schemes that are already blessed and registered > with IANA. This convention is blessed by the IANA, given that they have accepted protocol registrations which use this convention: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml > It's difficult enough to handle parsing of SSH specifications and > distinguish them uniformly from Windows paths (think of an alias named > "c"), so I'd prefer we didn't add additional complexity to handle this > case. There's no additional complexity here: git remotes are URIs, and any implementation which parses them as such already deals with this case correctly. Any implementation which doesn't may face all kinds of problems as a consequence: SSH without a user specified, HTTPS with Basic auth in the URI username/password fields (or just the password, which is also allowed), and so on. 
Any sane and correct implementation is pulling in a URI parser here, and if not, I don't think it's fair for git to constrain itself in order to work around some other project's bugs. > Lest you think that only Git has to handle parsing these I don't, given that my argument stems from making it easier for third-party applications to deal with git URIs :) > Despite the fact that ssh+git is specified as deprecated, we had > people expect it to magically work and had to support it in Git LFS. Aye, people do expect it to work. The problem is not going to go away. ^ permalink raw reply [flat|nested] 28+ messages in thread
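The scheme grammar in RFC 3986 section 3.1 is short enough to check directly: a letter followed by any mix of letters, digits, "+", "-", and ".". The sketch below encodes that rule as a regular expression and shows how Python's standard `urllib.parse.urlsplit` treats a composed scheme; `SCHEME_RE` and `is_valid_scheme` are names made up for the example, and the URL is a placeholder.

```python
import re
from urllib.parse import urlsplit

# RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme: str) -> bool:
    """Check a candidate scheme against the RFC 3986 grammar."""
    return SCHEME_RE.fullmatch(scheme) is not None


for candidate in ("https", "git+https", "ssh+git", "+git", "git https"):
    print(candidate, is_valid_scheme(candidate))

# A general-purpose URL parser splits a composed scheme like any other.
parts = urlsplit("git+https://example.com/path/to/repo.git")
print(parts.scheme, parts.netloc, parts.path)
```

This only shows that the syntax is well-formed and parseable by a generic parser. Whether a given tool assigns any meaning to "git+https" is a separate question, which is the point under dispute in this thread.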
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 14:21 ` Drew DeVault @ 2021-03-16 21:28 ` Jeff King 2021-03-17 14:50 ` Drew DeVault 2021-03-17 0:45 ` Jakub Narębski 2021-03-17 22:06 ` brian m. carlson 2 siblings, 1 reply; 28+ messages in thread From: Jeff King @ 2021-03-16 21:28 UTC (permalink / raw) To: Drew DeVault; +Cc: brian m. carlson, Eli Schwartz, Jonathan Nieder, git On Tue, Mar 16, 2021 at 10:21:13AM -0400, Drew DeVault wrote: > > It's difficult enough to handle parsing of SSH specifications and > > distinguish them uniformly from Windows paths (think of an alias named > > "c"), so I'd prefer we didn't add additional complexity to handle this > > case. > > There's no additional complexity here: git remotes are URIs, and any > implementation which parses them as such already deals with this case > correctly. Any implementation which doesn't may face all kinds of > problems as a consequence: SSH without a user specified, HTTPS with > Basic auth in the URI username/password fields (or just the password, > which is also allowed), and so on. Any sane and correct implementation > is pulling in a URI parser here, and if not, I don't think it's fair for > git to constrain itself in order to work around some other project's > bugs. Git remotes are most definitely not just URIs. Some valid remotes are: ".", "foo", "/tmp/foo", "c:\foo", "example.com:foo". The parser inside Git has rules to distinguish these from actual rfc3986-compliant URIs. Now I don't know much about the parsing code in, say, git-lfs, or how much of a pain it would be to add a new scheme for something that _does_ conform to rfc3986. But it's not necessarily as easy as "you should be using a compliant URI parser". -Peff ^ permalink raw reply [flat|nested] 28+ messages in thread
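To make the list of non-URI remotes above concrete, here is a deliberately simplified classifier. It is a sketch under stated assumptions and not the rules Git's own parser applies: `classify_remote`, `URL_RE`, and `DRIVE_RE` are hypothetical names, and treating a single letter followed by a colon and a slash or backslash as a Windows drive path is a choice made for this example.

```python
import re

# scheme://... using the RFC 3986 scheme grammar.
URL_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://")
# Assumption for this sketch: "c:\foo" or "c:/foo" is a Windows drive path.
DRIVE_RE = re.compile(r"[A-Za-z]:[\\/]")


def classify_remote(remote: str) -> str:
    """Roughly sort a remote string into 'url', 'scp-like', or 'local path'."""
    if URL_RE.match(remote):
        return "url"
    if DRIVE_RE.match(remote):
        return "local path"
    colon = remote.find(":")
    slash = remote.find("/")
    # host:path form: a colon that appears before any slash.
    if colon > 0 and (slash == -1 or colon < slash):
        return "scp-like"
    return "local path"


for remote in (".", "foo", "/tmp/foo", "c:\\foo", "example.com:foo",
               "git+https://example.com/foo"):
    print(repr(remote), "->", classify_remote(remote))
```

Under these rules "c:foo" is classified as scp-like, so an SSH host alias named "c" and a drive letter are told apart only by the character after the colon. That is the ambiguity raised earlier in the thread, and it is one reason a generic URI parser alone does not cover every remote.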
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 21:28 ` Jeff King @ 2021-03-17 14:50 ` Drew DeVault 0 siblings, 0 replies; 28+ messages in thread From: Drew DeVault @ 2021-03-17 14:50 UTC (permalink / raw) To: Jeff King; +Cc: brian m. carlson, Eli Schwartz, Jonathan Nieder, git On Tue Mar 16, 2021 at 5:28 PM EDT, Jeff King wrote: > Git remotes are most definitely not just URIs. Some valid remotes are: > ".", "foo", "/tmp/foo", "c:\foo", "example.com:foo". The parser inside > Git has rules to distinguish these from actual rfc3986-compliant URIs. > > Now I don't know much about the parsing code in, say, git-lfs, or how > much of a pain it would be to add a new scheme for something that _does_ > conform to rfc3986. But it's not necessarily as easy as "you should be > using a compliant URI parser". Sorry, I meant to say that git remotes are a superset of URIs, so a conformant URI parser already has to be involved - I didn't mean that all git remotes are URIs. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 14:21 ` Drew DeVault 2021-03-16 21:28 ` Jeff King @ 2021-03-17 0:45 ` Jakub Narębski 2021-03-17 14:53 ` Drew DeVault 2021-03-17 22:06 ` brian m. carlson 2 siblings, 1 reply; 28+ messages in thread From: Jakub Narębski @ 2021-03-17 0:45 UTC (permalink / raw) To: Drew DeVault; +Cc: brian m. carlson, Eli Schwartz, Jonathan Nieder, git "Drew DeVault" <sir@cmpwn.com> writes: > On Tue Mar 16, 2021 at 7:54 AM EDT, brian m. carlson wrote: >> I believe this construct is nonstandard. It is better to use standard >> URL syntax when possible because it makes it much, much easier for >> people to use standard tooling to parse and handle URLs. Such tooling >> may have special cases for the HTTP syntax that it doesn't use in MAILTO >> syntax, so it's important to pick something that works automatically. > > It is standard - RFC 3986 section 3.1 permits the + character in > URI schemes. The use of protocol "composition", e.g. git+https, is a > convention, but not a standard. All right, that is true... but Git itself and Git-related tools do not usually employ a full-fledged URI parser, as far as I know. They just check for the few schemas they support if the repository location is given as a URI / URL. That said, if the RFC states it, then it is a standard construct. >> So I'm very much opposed to adding, expanding, or giving any sort of >> official blessing to this syntax, especially when there are perfectly >> valid and equivalent schemes that are already blessed and registered >> with IANA. > > This convention is blessed by the IANA, given that they have > accepted protocol registrations which use this convention: > > https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml Well, there is a total of one protocol (CoAP) that uses '+' based schemas, namely: coap+tcp, coap+ws, coaps+tcp, coaps+ws (well at least out of those protocols that made it into IANA). 
Though in this case neither of the parts of the scheme joined by the '+' sign is an application name... >> It's difficult enough to handle parsing of SSH specifications and >> distinguish them uniformly from Windows paths (think of an alias named >> "c"), so I'd prefer we didn't add additional complexity to handle this >> case. > > There's no additional complexity here: git remotes are URIs, and any > implementation which parses them as such already deals with this case > correctly. Any implementation which doesn't may face all kinds of > problems as a consequence: SSH without a user specified, HTTPS with > Basic auth in the URI username/password fields (or just the password, > which is also allowed), and so on. Any sane and correct implementation > is pulling in a URI parser here, and if not, I don't think it's fair for > git to constrain itself in order to work around some other project's > bugs. The Git documentation explicitly enumerates all possible URL types that you can use with Git. On the other hand Git-related tools can support more types of URL, for example ones for AWS S3 buckets. > >> Lest you think that only Git has to handle parsing these > > I don't, given that my argument stems from making it easier for > third-party applications to deal with git URIs :) > >> Despite the fact that ssh+git is specified as deprecated, we had >> people expect it to magically work and had to support it in Git LFS. > > Aye, people do expect it to work. The problem is not going to go away. To reiterate, the idea of "prefixed URLs", that is using git+https:// and git+ssh://, is to denote that said URL is only usable by Git, without any additional out-of-band information (like other attributes on the <a> element or its encompassing element)? Best, -- Jakub Narębski ^ permalink raw reply [flat|nested] 28+ messages in thread
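The point about RFC 3986 section 3.1 can be checked mechanically. The grammar there is `scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )`, so a "+" is permitted anywhere after the first character. A minimal sketch follows; the helper name `is_valid_scheme` is an assumption for illustration, and the check covers syntax only, not IANA registration.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )   (RFC 3986, section 3.1)
_SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme: str) -> bool:
    """True if `scheme` is syntactically valid per RFC 3986 section 3.1."""
    return _SCHEME_RE.fullmatch(scheme) is not None
```

Both `git+https` and `coap+tcp` pass this check, while a scheme that begins with "+" does not.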
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-17 0:45 ` Jakub Narębski @ 2021-03-17 14:53 ` Drew DeVault 0 siblings, 0 replies; 28+ messages in thread From: Drew DeVault @ 2021-03-17 14:53 UTC (permalink / raw) To: Jakub Narębski; +Cc: brian m. carlson, Eli Schwartz, Jonathan Nieder, git On Tue Mar 16, 2021 at 8:45 PM EDT, Jakub Narębski wrote: > Well, there is a total of one protocol (CoAP) that uses '+'-based > schemes, namely: coap+tcp, coap+ws, coaps+tcp, coaps+ws (well at least > out of those protocols that made it into IANA). One is greater than zero! It is blessed, even if only a little. We can just go ask the IANA about it if we want to further entertain the idea that this approach is non-kosher, but, like you said: if the RFC states it, then it is a standard construct. > Though in this case neither of the parts of the scheme joined by the > '+' sign is an application name... git is both an application name and a protocol name. ¯\_(ツ)_/¯ > > Aye, people do expect it to work. The problem is not going to go away. > > To reiterate, the idea of "prefixed URLs", that is using git+https:// > and git+ssh://, is to denote that said URL is only usable by Git, without > any additional out-of-band information (like other attributes on the <a> > element or its encompassing element)? Correct. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 14:21 ` Drew DeVault 2021-03-16 21:28 ` Jeff King 2021-03-17 0:45 ` Jakub Narębski @ 2021-03-17 22:06 ` brian m. carlson 2021-03-18 12:53 ` Drew DeVault 2 siblings, 1 reply; 28+ messages in thread From: brian m. carlson @ 2021-03-17 22:06 UTC (permalink / raw) To: Drew DeVault; +Cc: Eli Schwartz, Jonathan Nieder, git [-- Attachment #1: Type: text/plain, Size: 4184 bytes --] On 2021-03-16 at 14:21:13, Drew DeVault wrote: > On Tue Mar 16, 2021 at 7:54 AM EDT, brian m. carlson wrote: > > So I'm very much opposed to adding, expanding, or giving any sort of > > official blessing to this syntax, especially when there are perfectly > > valid and equivalent schemes that are already blessed and registered > > with IANA. > > This convention is blessed by the IANA, given that they have > accepted protocol registrations which use this convention: > > https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml I assume that you're volunteering to write the RFC to register these with IANA? If not, then they are indeed non-standard and will remain so. I should point out that I don't believe the IANA will accept such a registration, because they will believe it to be duplicative of the existing scheme. But if you want to go this route, we should only proceed if we register them with IANA. > > It's difficult enough to handle parsing of SSH specifications and > > distinguish them uniformly from Windows paths (think of an alias named > > "c"), so I'd prefer we didn't add additional complexity to handle this > > case. > > There's no additional complexity here: git remotes are URIs, and any > implementation which parses them as such already deals with this case > correctly. Any implementation which doesn't may face all kinds of > problems as a consequence: SSH without a user specified, HTTPS with > Basic auth in the URI username/password fields (or just the password, > which is also allowed), and so on. 
Any sane and correct implementation > is pulling in a URI parser here, and if not, I don't think it's fair for > git to constrain itself in order to work around some other project's > bugs. We accept local paths in a variety of situations and SSH specifications, neither of which are URLs. The ultimate problem is that we support Windows paths and need to handle them correctly on Windows but don't support them on other operating systems and need to not handle them there. So, somehow, in portable code which does not vary based on operating system, we need to decide what should be a local path and what should be an SSH specification and do that in a way compatible with Git. Git LFS has also run into the problem that the URL parser we use has gotten stricter in a point release due to CVEs against it which broke various kinds of parsing of our SSH URLs that were previously accepted. This almost certainly bit other Go-based tools that work with Git repositories as well, since everyone uses the standard library URI parser. If we only supported valid URLs, this would be much, much easier. That is not at all the case, and it has never been the case for Git. > > Lest you think that only Git has to handle parsing these > > I don't, given that my argument stems from making it easier for > third-party applications to deal with git URIs :) This does not make my life as a maintainer of said third-party application easier. It complicates it significantly, because people often upgrade Git without upgrading Git LFS and then are unhappy when the five-year old version they use from their distro doesn't support every new feature. Adding this feature which duplicates existing functionality does not improve my life as a user of Git, as a developer of Git, as a maintainer of a number of third-party tools which interact with Git, or as someone who maintains part of a hosting platform. 
It also will inevitably confuse users who will want to know the relevant difference between the URLs and which they should use. They will then see the new type of URL and wonder why it does not work with the version they are using. And many users already don't understand the difference between HTTPS and SSH URLs, which is compounded by the fact that many Windows users have never before and will never otherwise use SSH. In case it was not already clear, I'm very strongly opposed to this proposal. It seems to make a lot of needless work without a clear and convincing benefit. -- brian m. carlson (he/him or they/them) Houston, Texas, US [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 263 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-17 22:06 ` brian m. carlson @ 2021-03-18 12:53 ` Drew DeVault 0 siblings, 0 replies; 28+ messages in thread From: Drew DeVault @ 2021-03-18 12:53 UTC (permalink / raw) To: brian m. carlson; +Cc: Eli Schwartz, Jonathan Nieder, git I feel like the tone here is getting a bit hostile. Let's try to keep things friendly. On Wed Mar 17, 2021 at 6:06 PM EDT, brian m. carlson wrote: > I assume that you're volunteering to write the RFC to register these > with IANA? If not, then they are indeed non-standard and will remain > so. > > I should point out that I don't believe the IANA will accept such a > registration, because they will believe it to be duplicative of the > existing scheme. But if you want to go this route, we should only > proceed if we register them with IANA. This is a needlessly high bar to set, and saying we can only proceed with the IANA's involvement seems like a convenient excuse to shut the conversation down entirely. Registering with IANA is nice, but there are thousands of protocols which don't bother. In any case, this is not quite as high of a bar as you may believe (or hope?). The process is pretty straightforward, and a scheme with "+" in it meets the criteria laid forth in the RFC, and the argument is even stronger given that WHATWG standards make use of the convention these days. If this is truly desirable, we can do it after the feature lands, but given that the git:// protocol was registered as an apparent after-thought by a third-party from Microsoft with zero commits in the git tree, it just seems like a requirement put forth in bad faith. > > > Lest you think that only Git has to handle parsing these > > > > I don't, given that my argument stems from making it easier for > > third-party applications to deal with git URIs :) > > This does not make my life as a maintainer of said third-party > application easier. 
It complicates it significantly, because people > often upgrade Git without upgrading Git LFS and then are unhappy when > the five-year old version they use from their distro doesn't support > every new feature. What third-party software do you represent? Can we make an objective estimation of the complexity of the change for your project in practice? > Adding this feature which duplicates existing functionality What existing method is there to identify a URL as being a git remote? > It also will inevitably confuse users who will want to know the relevant > difference between the URLs and which they should use. They will then > see the new type of URL and wonder why it does not work with the version > they are using. And many users already don't understand the difference > between HTTPS and SSH URLs, which is compounded by the fact that many > Windows users have never before and will never otherwise use SSH. As you explained, this confusion is already happening. If users don't know what a URI is, then they're already confused, and this is unlikely to make it worse. If anything, this could make it easier, as a URL which explicitly represents its relationship with git could hint at its intended usage. And again, I don't expect users to actually be handing these URLs around to each other for regular use. This is specifically necessary in cases where software needs to handle multiple kinds of version control. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 11:54 ` brian m. carlson 2021-03-16 14:21 ` Drew DeVault @ 2021-03-16 18:03 ` Eli Schwartz 2021-03-17 22:15 ` Jonathan Nieder 1 sibling, 1 reply; 28+ messages in thread From: Eli Schwartz @ 2021-03-16 18:03 UTC (permalink / raw) To: brian m. carlson, Jonathan Nieder, Drew DeVault, git [-- Attachment #1.1: Type: text/plain, Size: 2736 bytes --] On 3/16/21 7:54 AM, brian m. carlson wrote: > On 2021-03-16 at 04:38:08, Eli Schwartz wrote: >> Why does this even matter? Again, the point here is the assertion by >> Drew that, for the purpose of listing a manifest of remotely fetchable >> resources, he sees a benefit to having some standard format for the URI >> itself, describing how it's intended to be fetched. >> >> - ftp:// -> use the `ftp` tool >> - scp:// -> use the `scp` tool >> - http:// -> use the `wget` tool >> - git+http:// -> use the `git` tool >> >> But instead of needing every program with a git integration to >> reimplement "recognize git+http and do substring prefix removal before >> passing to git", the suggestion is for git to do this. > > I believe this construct is nonstandard. It is better to use standard > URL syntax when possible because it makes it much, much easier for > people to use standard tooling to parse and handle URLs. Such tooling > may have special cases for the HTTP syntax that it doesn't use in MAILTO > syntax, so it's important to pick something that works automatically. > > It's difficult enough to handle parsing of SSH specifications and > distinguish them uniformly from Windows paths (think of an alias named > "c"), so I'd prefer we didn't add additional complexity to handle this > case. 
> > Lest you think that only Git has to handle parsing these, the Git LFS > project (and every other implementation compatible with Git) has to > handle parsing them as well (and related things like url.*.insteadOf), > and providing bug-for-bug compatible behavior is generally a hassle. > We've run into numerous problems where things aren't exactly the same, > and making things more complex by adding an esoteric syntax that few > users are likely to use isn't helping. Despite the fact that ssh+git is > specified as deprecated, we had people expect it to magically work and > had to support it in Git LFS. > > So I'm very much opposed to adding, expanding, or giving any sort of > official blessing to this syntax, especially when there are perfectly > valid and equivalent schemes that are already blessed and registered > with IANA. Suddenly I'm hearing a much more reasonable response than "but it doesn't give me content-type so I can't know which media application is capable of opening it". (I'm not especially attached to the proposal. I'm a maintainer for one of these package managers that currently special-case git+https?:// and rewrite the url that git sees, which has worked adequately for a long time. However, I figured if you want to reject this proposal, reject it for a good reason...) -- Eli Schwartz Arch Linux Bug Wrangler and Trusted User [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-16 18:03 ` Eli Schwartz @ 2021-03-17 22:15 ` Jonathan Nieder 2021-03-31 4:23 ` Eli Schwartz 2021-04-07 13:46 ` Mark Lodato 0 siblings, 2 replies; 28+ messages in thread From: Jonathan Nieder @ 2021-03-17 22:15 UTC (permalink / raw) To: Eli Schwartz; +Cc: brian m. carlson, Drew DeVault, git Hi, Eli Schwartz wrote: > I'm not especially attached to the proposal. I'm a maintainer for one > of these package managers that currently special-case git+https?:// and > rewrite the url that git sees, which has worked adequately for a long > time. This is useful context. What URL forms does this package manager support (e.g., do you have a link to its documentation)? What would the effect be for the package manager and its users if Git started supporting a git+https:// synonym for https://? Thanks, Jonathan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-17 22:15 ` Jonathan Nieder @ 2021-03-31 4:23 ` Eli Schwartz 2021-04-07 13:46 ` Mark Lodato 1 sibling, 0 replies; 28+ messages in thread From: Eli Schwartz @ 2021-03-31 4:23 UTC (permalink / raw) To: Jonathan Nieder; +Cc: brian m. carlson, Drew DeVault, git [-- Attachment #1.1: Type: text/plain, Size: 1931 bytes --] On 3/17/21 6:15 PM, Jonathan Nieder wrote: > Hi, > > Eli Schwartz wrote: > >> I'm not especially attached to the proposal. I'm a maintainer for one >> of these package managers that currently special-case git+https?:// and >> rewrite the url that git sees, which has worked adequately for a long >> time. > > This is useful context. What URL forms does this package manager > support (e.g., do you have a link to its documentation)? What would > the effect be for the package manager and its users if Git started > supporting a git+https:// synonym for https://? https://archlinux.org/pacman/PKGBUILD.5.html#VCS We support cloning arbitrary version controlled sources via either vcs:// or vcs+proto:// but not proto+vcs:// so that encompasses git:// or git+https:// or git+ssh:// and also permits hg+https or svn+https:// or bzr+http:// or fossil+https:// (ignore the documentation not mentioning fossil, this is a development branch addition and obviously the docs are for the stable release) We then do prefix removal of everything before the plus sign since currently no VCS supports this directly (I think?), but we could remove that pass from our git source plugin if git implemented it internally. Implementing https+git:// as a synonym for https:// is IMO confusing, so I don't intend to implement it even if git does. I think one way to specify the VCS + transport protocol is enough... and prefix removal is easier than removing the middle of the string. 
The net effect would be, I guess, less code in the package manager, and users would be able to go to a public registry of source packages like https://aur.archlinux.org/packages/pacman-git, see the clickable link under "Sources (5)" and copy/paste that into a `git clone` command line without knowing they need to edit the link first. -- Eli Schwartz Arch Linux Bug Wrangler and Trusted User [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
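The prefix removal described above (accept vcs:// or vcs+proto://, but not proto+vcs://, and hand the VCS tool everything after the plus sign) might look roughly like this. It is a sketch only, not the package manager's actual implementation; the function name `split_vcs_url` and the `KNOWN_VCS` set are assumptions, with the set taken from the VCS names listed in the message.

```python
from typing import Optional, Tuple

# VCS names mentioned in the message above; hypothetical constant.
KNOWN_VCS = {"git", "hg", "svn", "bzr", "fossil"}


def split_vcs_url(source: str) -> Tuple[Optional[str], str]:
    """Return (vcs, url_for_the_tool); vcs is None for non-VCS sources."""
    scheme, sep, rest = source.partition("://")
    if not sep:
        return None, source
    vcs, plus, transport = scheme.partition("+")
    if vcs not in KNOWN_VCS:
        # Covers plain downloads and the unsupported proto+vcs:// order.
        return None, source
    if not plus:
        # vcs:// form such as git://, passed through unchanged.
        return vcs, source
    # vcs+proto:// form: drop everything up to and including the "+".
    return vcs, f"{transport}://{rest}"
```

Removing a prefix is a single split on the first "+", which matches the remark that prefix removal is easier than removing the middle of the string.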
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-17 22:15 ` Jonathan Nieder 2021-03-31 4:23 ` Eli Schwartz @ 2021-04-07 13:46 ` Mark Lodato 2021-04-07 19:46 ` Junio C Hamano 1 sibling, 1 reply; 28+ messages in thread From: Mark Lodato @ 2021-04-07 13:46 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Eli Schwartz, brian m. carlson, Drew DeVault, git Jonathan Nieder wrote: > This is useful context. What URL forms does this package manager > support (e.g., do you have a link to its documentation)? What would > the effect be for the package manager and its users if Git started > supporting a git+https:// synonym for https://? Here are two more examples: - pip: https://pip.pypa.io/en/latest/cli/pip_install/#git - SPDX: https://spdx.github.io/spdx-spec/3-package-information/#37-package-download-location The common thread is that systems need a way to uniquely identify a git repository or some object therein. I believe this means some combination of: - VCS type (git) - Transport location (e.g. https://github.com/git/git) - Ref (e.g. master) - Resolved commit ID (e.g. 48bf2fa8bad054d66bd79c6ba903c89c704201f7) - Path (e.g. contrib/diff-highlight) - (possibly) Clone depth As Drew has said, the current state of affairs is that, lacking a standard, multiple systems are all inventing incompatible schemes using the `git+https` name. This is not a good situation because the "URI" is no longer "unique". Given such a URI in isolation, one cannot know how to parse it. It's not clear to me that git itself needs to support this scheme. It would go a long way for git to simply recommend a particular scheme so that all these systems can use a common format. (We could register that with IANA.) The pip format seems to be the closest, but it doesn't support both ref AND resolved commit ID, and it is currently specific to pip (`egg=` could be replaced with `path=`). Best, Mark ^ permalink raw reply [flat|nested] 28+ messages in thread
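To make the list of components above concrete, here is a sketch that splits a pip-style `git+<transport>://host/path@rev` string into VCS type, transport location, and either a ref or a resolved commit ID. The `GitSource` dataclass, the `parse_pip_style` name, and the rule "40 lowercase hex digits means a commit ID" are assumptions for illustration; this is not pip's parser, and it ignores the path-within-repository and clone-depth items as well as any query or fragment.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit

_FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


@dataclass
class GitSource:
    vcs: str                      # e.g. "git"
    location: str                 # transport URL without the "git+" prefix
    ref: Optional[str] = None     # branch, tag, or other ref name
    commit: Optional[str] = None  # full commit ID, if one was given


def parse_pip_style(url: str) -> GitSource:
    parts = urlsplit(url)
    vcs, plus, transport = parts.scheme.partition("+")
    if not plus:
        raise ValueError(f"no vcs+ prefix in scheme: {parts.scheme!r}")
    # Split on "@" in the path only, so "user@host" in the authority is kept.
    path, _, rev = parts.path.partition("@")
    source = GitSource(vcs=vcs, location=f"{transport}://{parts.netloc}{path}")
    if _FULL_SHA1.fullmatch(rev):
        source.commit = rev
    elif rev:
        source.ref = rev
    return source
```

Because a single `@rev` slot holds either a ref or a commit ID, this shape cannot carry both at once, which is the limitation of the pip format noted in the message.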
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-04-07 13:46 ` Mark Lodato @ 2021-04-07 19:46 ` Junio C Hamano 2021-04-13 8:52 ` Kerry, Richard 0 siblings, 1 reply; 28+ messages in thread From: Junio C Hamano @ 2021-04-07 19:46 UTC (permalink / raw) To: Mark Lodato Cc: Jonathan Nieder, Eli Schwartz, brian m. carlson, Drew DeVault, git Mark Lodato <lodato@google.com> writes: > The common thread is that systems need a way to uniquely identify a git > repository or some object therein. I believe this means some combination > of: > > - VCS type (git) > - Transport location (e.g. https://github.com/git/git) > - Ref (e.g. master) > - Resolved commit ID (e.g. 48bf2fa8bad054d66bd79c6ba903c89c704201f7) > - Path (e.g. contrib/diff-highlight) > - (possibly) Clone depth Nice. So there is no reason to expect that these downstream systems can sanely force various version control systems to make the notation they use for "transport location" identify which VCS type uses that location. All the other details (like refs, which other VCSes may not even have), apart from the VCS type, depend on the VCS used. Thanks. ^ permalink raw reply [flat|nested] 28+ messages in thread
* RE: Regarding the depreciation of ssh+git/git+ssh protocols 2021-04-07 19:46 ` Junio C Hamano @ 2021-04-13 8:52 ` Kerry, Richard 0 siblings, 0 replies; 28+ messages in thread From: Kerry, Richard @ 2021-04-13 8:52 UTC (permalink / raw) To: git@vger.kernel.org Cc: Jonathan Nieder, Eli Schwartz, brian m. carlson, Drew DeVault, Mark Lodato, Junio C Hamano s/depreciation/deprecation/ Regards, Richard. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols 2021-03-15 22:01 ` brian m. carlson 2021-03-16 0:52 ` Drew DeVault @ 2021-03-16 0:54 ` Drew DeVault 1 sibling, 0 replies; 28+ messages in thread From: Drew DeVault @ 2021-03-16 0:54 UTC (permalink / raw) To: brian m. carlson; +Cc: Jonathan Nieder, git On Mon Mar 15, 2021 at 6:01 PM EDT, brian m. carlson wrote: > All a URL can tell you is literally where a resource is located. To further clarify: a URL tells you not only where to find a resource, but how to access it. This is the purpose of the scheme field. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Regarding the depreciation of ssh+git/git+ssh protocols @ 2023-10-13 20:49 David Rogers 0 siblings, 0 replies; 28+ messages in thread From: David Rogers @ 2023-10-13 20:49 UTC (permalink / raw) To: git Git repositories have become indispensable resources for citing parts of a development history with links. However, the format of git remote entries is not always distinguishable from other types of citation -- for example a git reference vs. a plain URL. Rather than rely on context to tell me that `https://github.com/git/git` refers to a git repository which I could clone with git over https, it would be nice to use a url like `git+https://github.com/git/git` or even `git+https://github.com/git/git?commit=d0e8084c65cbf949038ae4cc344ac2c2efd77415` to unambiguously specify that the type of data and its method of access are native to git. This issue is extremely important for version control systems which build dependency lists from git, e.g. https://pip.pypa.io/en/stable/topics/vcs-support/ That project lists several invented URL schemes (all beginning with git+) and assigning special reserved characters (https://datatracker.ietf.org/doc/html/rfc3986#section-2.2) git+https://git.example.com/MyProject.git@master git+https://git.example.com/MyProject.git@v1.0 git+https://git.example.com/MyProject.git@da39a3ee5e6b4b0d3255bfef95601890afd80709 git+https://git.example.com/MyProject.git@refs/pull/123/head It would be helpful for the git project itself to define its own URL scheme to codify these use cases and, possibly in addition, provide a standard way to reference within git repositories. 
For reference, some of the ways URLs are already used/defined within git are documented here: - https://github.com/git/git/blob/d0e8084c65cbf949038ae4cc344ac2c2efd77415/connect.c#L107 (alternately, using gitweb syntax not actually available on github, https://github.com/git/git.git/blob/d0e8084c65cbf949038ae4cc344ac2c2efd77415:/git/connect.c) - https://mirrors.edge.kernel.org/pub/software/scm/git/docs/gitremote-helpers.html - https://git-scm.com/docs/git-http-backend - https://git-scm.com/docs/gitweb Currently, a comment in connect.c notes "git+" schemes were deprecated. However, I would argue that at a minimum, these "git+" schemes should be a supported and documented feature of git. Also, something has to be fixed (or better communicated) about URLs of the form "git@github.com:user/project.git" These are implicitly treated as "git+ssh://git@github.com/user/project.git", but the use of ":" is confusing from the perspective of translating between these two forms. In addition, the use of paths, queries, and fragments should be considered to allow (IMHO) at least 3 distinct uses: 1. naming commit-ish objects (and potentially metadata like author and parents within the commit) 2. naming tree-ish objects and paths within them 3. naming blobs (and potentially fragment identifiers like lines or HTML tags within those blobs) These further refinements don't have to be supported by any special functions within git. However, their existence may influence git data structures and api-s in the future. The last discussion I can find of this issue on the git mailing list (https://lore.kernel.org/git/C9Y2DPYH4XO1.3KFD8LT770P2@taiga) indicates that defining conventions like these within git's documentation would be a good place to start. On a separate thread, I will send a draft "git+" URI naming scheme for discussion and eventual submission to IANA (https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml). ~ David M. 
Rogers ^ permalink raw reply [flat|nested] 28+ messages in thread
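The translation mentioned above, from the scp-like form "git@github.com:user/project.git" to a URL with an explicit scheme, can be sketched as follows. The function name `scp_like_to_ssh_url` and the regular expression are assumptions for illustration, and the sketch emits a plain `ssh://` URL rather than the `git+ssh://` spelling used in the message.

```python
import re

# [user@]host:path, where neither user nor host contains "@", "/" or ":".
_SCP_LIKE = re.compile(r"(?:(?P<user>[^@/:]+)@)?(?P<host>[^@/:]+):(?P<path>.+)")


def scp_like_to_ssh_url(remote: str) -> str:
    """Rewrite '[user@]host:path' as 'ssh://[user@]host/path' (illustrative)."""
    if "://" in remote:
        raise ValueError(f"already has a scheme: {remote!r}")
    match = _SCP_LIKE.fullmatch(remote)
    if match is None:
        raise ValueError(f"not an scp-like remote: {remote!r}")
    user = f"{match['user']}@" if match["user"] else ""
    # The ":" separator becomes the "/" that starts the URL path.
    path = match["path"].lstrip("/")
    return f"ssh://{user}{match['host']}/{path}"
```

For example, `scp_like_to_ssh_url("git@github.com:user/project.git")` returns `ssh://git@github.com/user/project.git`. The rewrite is only a textual one: on a general-purpose SSH server the two forms may resolve to different directories, because a relative scp-like path can be interpreted relative to the login directory.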