* submodule support in git-bundle
From: Duy Nguyen @ 2018-11-02 16:09 UTC
To: Git Mailing List

I use git-bundle today and it occurs to me that if I want to use it to
transfer part of a history that involves submodule changes, things
aren't pretty. Has anybody given thought to how to do binary history
transfer that contains changes from submodules?

Since .bundle files are basically .pack files, I'm not sure if it's
easy to bundle multiple pack files (one per repo)...
--
Duy
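The remark that .bundle files are basically .pack files can be made
concrete. A version 2 bundle is a short text header (a signature line,
optional prerequisite lines starting with "-", then "<oid> <refname>"
lines and a blank line) followed by ordinary pack data. The sketch below
is a minimal, illustrative parser for that header only. The function name
parse_bundle_header is made up for this example, it accepts only the v2
signature, and it is not how Git itself reads bundles.

```python
import io


def parse_bundle_header(stream):
    """Parse a v2 bundle header from a binary stream.

    Returns (prerequisites, refs, pack_offset). Hypothetical helper for
    illustration; real bundles may also use a v3 header, which this
    sketch rejects.
    """
    if stream.readline() != b"# v2 git bundle\n":
        raise ValueError("not a v2 git bundle")
    prerequisites, refs = [], {}
    while True:
        line = stream.readline()
        if line == b"":
            raise ValueError("truncated bundle header")
        if line == b"\n":  # blank line ends the header
            break
        text = line.rstrip(b"\n").decode()
        if text.startswith("-"):
            # "-<oid> <comment>": a commit the receiver must already have
            prerequisites.append(text[1:].split(" ", 1)[0])
        else:
            oid, refname = text.split(" ", 1)
            refs[refname] = oid
    # Everything from here on is plain pack data.
    return prerequisites, refs, stream.tell()


# Synthetic example data (not a real bundle): header plus a fake pack body.
example = (
    b"# v2 git bundle\n"
    + b"-" + b"a" * 40 + b" old tip\n"
    + b"b" * 40 + b" refs/heads/master\n"
    + b"\n"
    + b"PACK..."
)
prereqs, refs, pack_offset = parse_bundle_header(io.BytesIO(example))
```

Because the header describes exactly one pack, a bundle in this format
has no obvious place for a second, per-submodule pack, which is the
difficulty raised above.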
* Re: submodule support in git-bundle
From: Stefan Beller @ 2018-11-02 17:08 UTC
To: Duy Nguyen; +Cc: git

On Fri, Nov 2, 2018 at 9:10 AM Duy Nguyen <pclouds@gmail.com> wrote:
>
> I use git-bundle today and it occurs to me that if I want to use it to
> transfer part of a history that involves submodule changes, things
> aren't pretty. Has anybody given thought to how to do binary history
> transfer that contains changes from submodules?
>
> Since .bundle files are basically .pack files, I'm not sure if it's
> easy to bundle multiple pack files (one per repo)...

That is a really good discussion starter!

As bundles are modeled after the fetch protocol, I would
redirect the discussion there.

The new fetch protocol could support sending more than
one pack, which could be for both the superproject as
well as the relevant submodule updates (i.e. what is recorded
in the superproject) based on a new capability.

We at Google have given this idea some thought, but from a
different angle: As you may know, Android currently uses the
repo tool, which we want to replace with Git's native submodules
eventually. For each repository to clone, the repo tool tests whether
there is a bundle file for that repository, so that instead of
cloning the repo, the bundle can be downloaded and then
a catch-up fetch can be performed. (This helps the Git servers
as well as the client: the bundle can be hosted on a CDN,
which is faster and cheaper than a git server for us.)

So we've given some thought to extending the packfiles in the
fetch protocol to make some redirection to a CDN possible,
i.e. instead of sending bytes as is, you get more or less a "todo"
list, which might be
  (a) take the following bytes as is (current pack format)
  (b) download these other bytes from $THERE
      (possibly with a checksum)
Once the stream of bytes is assembled, it will look like a regular
packfile with deltas etc.

This offloading-to-CDN (or "mostly resumable clone" in the
sense that the communication with the server is minimal, and
you get most of your data via resumable http range-requests)
sounds completely off-topic, but it is one of the requirements
for the repo to submodule migration, hence I came to speak of it.

Did you have other things in mind, on a higher level?
e.g. querying the bundle and creating submodule bundles
based off the superproject bundle? 'git bundle create' could
learn the --recurse-submodules option, which then produces
multiple bundle files without changing the file formats.

Stefan
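The "todo" list described in this message can be sketched in a few
lines. Everything concrete here is an assumption made for illustration:
the entry shapes ("inline", data) and ("remote", url, sha256_hex), the
choice of SHA-256 as the checksum, and the fetch callable are not part
of any Git protocol. The only point shown is that inline bytes and
downloaded bytes are concatenated in order, with each download verified,
so that the result is a single byte stream.

```python
import hashlib


def assemble_stream(todo, fetch):
    """Assemble one byte stream from a hypothetical todo list.

    todo  -- list of ("inline", data) or ("remote", url, sha256_hex)
    fetch -- callable taking a URL and returning bytes (assumed; a real
             client might use resumable HTTP range requests here)
    """
    out = bytearray()
    for entry in todo:
        kind = entry[0]
        if kind == "inline":
            # (a) take the following bytes as is
            out += entry[1]
        elif kind == "remote":
            # (b) download these other bytes from $THERE, with a checksum
            _, url, expected = entry
            data = fetch(url)
            if hashlib.sha256(data).hexdigest() != expected:
                raise ValueError(f"checksum mismatch for {url}")
            out += data
        else:
            raise ValueError(f"unknown todo entry: {kind!r}")
    return bytes(out)


# Usage with an in-memory stand-in for the CDN (made-up URL and data).
cdn = {"https://cdn.example/part1": b"bulk-objects"}
todo = [
    ("inline", b"header-and-recent-objects;"),
    ("remote", "https://cdn.example/part1",
     hashlib.sha256(b"bulk-objects").hexdigest()),
]
stream = assemble_stream(todo, cdn.__getitem__)
```

In this arrangement the server only has to produce the small inline
parts and the list itself, while the bulk of the data can come from
static hosting.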
* Re: submodule support in git-bundle
From: Duy Nguyen @ 2018-11-02 18:34 UTC
To: Stefan Beller; +Cc: Git Mailing List

On Fri, Nov 2, 2018 at 6:09 PM Stefan Beller <sbeller@google.com> wrote:
>
> On Fri, Nov 2, 2018 at 9:10 AM Duy Nguyen <pclouds@gmail.com> wrote:
> >
> > I use git-bundle today and it occurs to me that if I want to use it to
> > transfer part of a history that involves submodule changes, things
> > aren't pretty. Has anybody given thought to how to do binary history
> > transfer that contains changes from submodules?
> >
> > Since .bundle files are basically .pack files, I'm not sure if it's
> > easy to bundle multiple pack files (one per repo)...
>
> That is a really good discussion starter!
>
> As bundles are modeled after the fetch protocol, I would
> redirect the discussion there.
>
> The new fetch protocol could support sending more than
> one pack, which could be for both the superproject as
> well as the relevant submodule updates (i.e. what is recorded
> in the superproject) based on a new capability.
>
> We at Google have given this idea some thought, but from a
> different angle: As you may know, Android currently uses the
> repo tool, which we want to replace with Git's native submodules
> eventually. For each repository to clone, the repo tool tests whether
> there is a bundle file for that repository, so that instead of
> cloning the repo, the bundle can be downloaded and then
> a catch-up fetch can be performed. (This helps the Git servers
> as well as the client: the bundle can be hosted on a CDN,
> which is faster and cheaper than a git server for us.)
>
> So we've given some thought to extending the packfiles in the
> fetch protocol to make some redirection to a CDN possible,
> i.e. instead of sending bytes as is, you get more or less a "todo"
> list, which might be
>   (a) take the following bytes as is (current pack format)
>   (b) download these other bytes from $THERE
>       (possibly with a checksum)
> Once the stream of bytes is assembled, it will look like a regular
> packfile with deltas etc.
>
> This offloading-to-CDN (or "mostly resumable clone" in the
> sense that the communication with the server is minimal, and
> you get most of your data via resumable http range-requests)
> sounds completely off-topic, but it is one of the requirements
> for the repo to submodule migration, hence I came to speak of it.

Hm.. so what you're saying is, we could have a pack file that lists
other (real) pack files, and for the bundle case they are all in the
same file. And "download from $THERE" in this case is "download at
this file offset"? That might actually work.

> Did you have other things in mind, on a higher level?
> e.g. querying the bundle and creating submodule bundles
> based off the superproject bundle? 'git bundle create' could
> learn the --recurse-submodules option, which then produces
> multiple bundle files without changing the file formats.

This is probably the simplest way to support submodules. I just
haven't really thought much about it (the problem only came up for me
about 2 hours ago). Two problems with this are convenience (I don't
want to handle multiple files) and submodule info (which pack should
be unbundled on which submodule?). But I suppose if "git bundle"
produces a tarball of these bundle files then you solve both.

But of course there may be other and better options like what you
described above. If in the long term we have "pack with hyperlinks"
anyway for resumable clone and other fancy stuff, then reusing the same
mechanism for bundles makes sense: less maintenance burden.
--
Duy
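The tarball idea from this message, one archive that holds a bundle per
repository plus the information about which bundle belongs to which
submodule, can be sketched as follows. This is only an illustration
under stated assumptions: "git bundle" has no such output mode, and the
manifest.json name, the "<n>.bundle" member names, and the use of an
empty path for the superproject are all invented for this example. The
bundle contents are treated as opaque bytes.

```python
import io
import json
import tarfile


def pack_bundles(bundles):
    """Pack {submodule_path: bundle_bytes} into one tar archive (bytes).

    The superproject is assumed to use the empty path "". A manifest
    maps each archive member back to its submodule path.
    """
    members, manifest = [], {}
    for index, (path, data) in enumerate(sorted(bundles.items())):
        name = f"{index}.bundle"
        manifest[name] = path
        members.append((name, data))
    manifest_bytes = json.dumps(manifest, indent=2).encode()

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Manifest first, so a reader learns the layout before any bundle.
        for name, data in [("manifest.json", manifest_bytes)] + members:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_bundles(tar_bytes):
    """Inverse of pack_bundles: return {submodule_path: bundle_bytes}."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        manifest = json.load(tar.extractfile("manifest.json"))
        return {path: tar.extractfile(name).read()
                for name, path in manifest.items()}


# Placeholder contents stand in for real bundle files.
archive = pack_bundles({"": b"super-bundle", "libs/foo": b"foo-bundle"})
restored = unpack_bundles(archive)
```

The manifest addresses the "which pack goes to which submodule"
question, and the single archive addresses the convenience one.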
* Re: submodule support in git-bundle
From: Stefan Beller @ 2018-11-02 19:00 UTC
To: Duy Nguyen; +Cc: git

> > This offloading-to-CDN (or "mostly resumable clone" in the
> > sense that the communication with the server is minimal, and
> > you get most of your data via resumable http range-requests)
> > sounds completely off-topic, but it is one of the requirements
> > for the repo to submodule migration, hence I came to speak of it.
>
> Hm.. so what you're saying is, we could have a pack file that lists
> other (real) pack files, and for the bundle case they are all in the
> same file. And "download from $THERE" in this case is "download at
> this file offset"? That might actually work.

We're conflating 2 things here.

This idea of CDN offloading has nothing to do with submodules; it's
just a general thing to improve the fetch protocol. And the pointed-at
file doesn't need to be a "real" packfile, as long as the bytestream at
the end looks like a real packfile. For example, the bytes to get from
$THERE would not need to have a pack header (or if they had one, I would
ask you to omit the first bytes containing the header), as I can give
the header myself.

The idea for submodules is more along the lines of having "just"
multiple pack files in the stream. For the bundle case we would
probably not have redirection to $THERE in there, as it should be
completely self-contained (we don't know if the bundle recipient can
access $THERE in a timely manner).

> > Did you have other things in mind, on a higher level?
> > e.g. querying the bundle and creating submodule bundles
> > based off the superproject bundle? 'git bundle create' could
> > learn the --recurse-submodules option, which then produces
> > multiple bundle files without changing the file formats.
>
> This is probably the simplest way to support submodules.

Yep, that sounds simplest, but I think it makes for bad UX.
(Multiple files that need to be kept in some order and applied
correctly.)

> I just
> haven't really thought much about it (the problem only came up for me
> about 2 hours ago). Two problems with this are convenience (I don't
> want to handle multiple files) and submodule info (which pack should
> be unbundled on which submodule?). But I suppose if "git bundle"
> produces a tarball of these bundle files then you solve both.

The tarball makes it one file and would naturally provide some order.
It feels iffy; I'd rather have multiple packs in the bundle.

> But of course there may be other and better options like what you
> described above. If in the long term we have "pack with hyperlinks"
> anyway for resumable clone and other fancy stuff, then reusing the same
> mechanism for bundles makes sense: less maintenance burden.

I think of the hyperlinks in packs as an orthogonal feature, but
one that sits close by in code and implementation, which is why I
brought it up.
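The remark about the sender supplying the pack header itself is easier
to picture with the header layout in view: a packfile starts with 12
bytes, namely the signature "PACK", a 4-byte version, and a 4-byte
object count, both integers in network byte order. The sketch below
builds and strips that header. The function names are invented for
illustration, and the sketch ignores the checksum trailer that ends a
real packfile, which any actual splicing scheme would also have to deal
with.

```python
import struct

PACK_HEADER = struct.Struct(">4sII")  # signature, version, object count


def make_pack_header(num_objects, version=2):
    """Return the 12-byte header for a pack with num_objects entries."""
    return PACK_HEADER.pack(b"PACK", version, num_objects)


def strip_pack_header(data):
    """Split pack bytes into (version, num_objects, remaining_bytes)."""
    if len(data) < PACK_HEADER.size:
        raise ValueError("too short to contain a pack header")
    signature, version, num_objects = PACK_HEADER.unpack_from(data)
    if signature != b"PACK":
        raise ValueError("missing PACK signature")
    return version, num_objects, data[PACK_HEADER.size:]


# A hosted blob with its header removed can be prefixed with a header
# that the server computes for the combined stream (placeholder body).
hosted = make_pack_header(3) + b"<object entries>"
_, count, body = strip_pack_header(hosted)
combined = make_pack_header(count + 2) + body
```

Since the object count lives in the header, whoever assembles the final
stream is the natural party to write it, which fits the comment that
the downloaded bytes need not carry a header of their own.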