* Reducing git size by building libgit.so
From: Elmar Pruesse @ 2019-06-11 19:52 UTC
To: git@vger.kernel.org

Hi!

The total compiled size of libexec/git-core is currently somewhere
around 30 MB. This is largely due to a number of binaries linking
statically against libgit.a. For some folks, every byte counts. I
meddled with the Makefile briefly to make it build and use a libgit.so
instead, which dropped package size down to 5MB.

Are there, beyond the ~20 ms in extra startup time and the slightly
bigger hassle with DSO locations, reasons for the choice to link statically?

best,
Elmar

* Re: Reducing git size by building libgit.so
From: brian m. carlson @ 2019-06-11 23:48 UTC
To: Elmar Pruesse; +Cc: git@vger.kernel.org

On 2019-06-11 at 19:52:18, Elmar Pruesse wrote:
> Hi!
>
> The total compiled size of libexec/git-core is currently somewhere
> around 30 MB. This is largely due to a number of binaries linking
> statically against libgit.a. For some folks, every byte counts. I
> meddled with the Makefile briefly to make it build and use a libgit.so
> instead, which dropped package size down to 5MB.
>
> Are there, beyond the ~20 ms in extra startup time and the slightly
> bigger hassle with DSO locations, reasons for the choice to link statically?

I think the reason is that libgit is not API stable and we definitely
don't want people linking against it. Before libgit2 existed, projects
like cgit built their own libgit and it required pinning to a specific
version of Git.

Also, some people install Git into their home directories, and a shared
library means that they'll have to use LD_LIBRARY_PATH (or equivalent)
to run Git.

Finally, we have support for a runtime relocatable Git which can be run
out of any path and still automatically find its dependent binaries.
That won't work with a shared library.

So if we did allow for building a shared library, it would have to be an
option that defaulted to off, I think.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

* Re: Reducing git size by building libgit.so
From: Duy Nguyen @ 2019-06-12 9:29 UTC
To: brian m. carlson, Elmar Pruesse, git@vger.kernel.org

On Wed, Jun 12, 2019 at 2:11 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2019-06-11 at 19:52:18, Elmar Pruesse wrote:
> > Hi!
> >
> > The total compiled size of libexec/git-core is currently somewhere
> > around 30 MB. This is largely due to a number of binaries linking
> > statically against libgit.a. For some folks, every byte counts. I
> > meddled with the Makefile briefly to make it build and use a libgit.so
> > instead, which dropped package size down to 5MB.
> >
> > Are there, beyond the ~20 ms in extra startup time and the slightly
> > bigger hassle with DSO locations, reasons for the choice to link statically?
>
> I think the reason is that libgit is not API stable and we definitely
> don't want people linking against it.

Having .so files does not mean it's stable API though. If we don't
ever install header files, there's no way for outside people to use it
(people who dlopen() it anyway deserve whatever they get).

I do agree with some hassles from .so files though.

If installation size is a problem I think we can still shrink it a bit
down. Some non-builtin commands (fast-import, sh-i18n--subst...) could
be merged back in "git" binary. Some other for remote side (or
background daemons) could also be bundled together unless there's
security concerns.

We could also have a look at function distribution in libgit.a. I'm
surprised git-credential-store is 5.6 MB on my machine. We probably
pull more stuff than needed somewhere due to dependency between .o
files.

> Before libgit2 existed, projects
> like cgit built their own libgit and it required pinning to a specific
> version of Git.
>
> Also, some people install Git into their home directories, and a shared
> library means that they'll have to use LD_LIBRARY_PATH (or equivalent)
> to run Git.
>
> Finally, we have support for a runtime relocatable Git which can be run
> out of any path and still automatically find its dependent binaries.
> That won't work with a shared library.
>
> So if we did allow for building a shared library, it would have to be an
> option that defaulted to off, I think.
> --
> brian m. carlson: Houston, Texas, US
> OpenPGP: https://keybase.io/bk2204

-- 
Duy

* Re: Reducing git size by building libgit.so
From: Paul Smith @ 2019-06-12 13:57 UTC
To: brian m. carlson, Elmar Pruesse; +Cc: git@vger.kernel.org

On Tue, 2019-06-11 at 23:48 +0000, brian m. carlson wrote:
> Also, some people install Git into their home directories, and a
> shared library means that they'll have to use LD_LIBRARY_PATH (or
> equivalent) to run Git.

I don't have strong feeling about .so's although obviously less disk
space used is always a good thing, everything else being equal.

However, the above concern isn't actually an issue. You can install
the .so in a known location relative to the binaries, then link the
binaries with an RPATH setting using $ORIGIN (or the equivalent on
MacOS which does exist but I forget the name). On Windows, DLLs are
installed in the same directory as the binary, typically.

Allowing relocatable binaries with .so dependencies without requiring
LD_LIBRARY_PATH settings is a solved problem, to the best of my
understanding.

One thing to think about is that runtime loading a .so can take some
time if it has lots of public symbols. If someone really wanted to do
this, the ideal thing would be to make all symbols hidden except those
needed by the binary front-ends and have those be very small shells
that just had a very limited number of entry points into the .so.

Maybe for git this doesn't matter but for some projects I've worked on
the time to dlopen() a library was a blocking issue that the above
procedure solved nicely.
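
[Editorial illustration, not part of the original message.] The $ORIGIN mechanism described above can be sketched as a small path computation: on ELF systems the dynamic loader substitutes $ORIGIN in an RPATH entry with the directory containing the executable, so the library is found relative to the binary wherever the tree is moved. The following is only a minimal model of that substitution, assuming a hypothetical layout with binaries in <prefix>/libexec/git-core and the library in <prefix>/lib; the function name and paths are made up for illustration, and a real loader also handles symlinks, setuid restrictions and other tokens that are ignored here. (On macOS the counterpart is, as far as I know, @loader_path.)

```python
import os


def expand_origin(rpath, executable):
    """Expand $ORIGIN / ${ORIGIN} in a colon-separated RPATH string.

    Simplified model: $ORIGIN becomes the directory that contains
    `executable`. Symlink resolution is deliberately not modelled.
    """
    origin = os.path.dirname(os.path.abspath(executable))
    search_dirs = []
    for entry in rpath.split(":"):
        if not entry:
            continue  # skip empty RPATH components
        entry = entry.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)
        search_dirs.append(os.path.normpath(entry))
    return search_dirs


if __name__ == "__main__":
    # Hypothetical relocatable install under a home directory.
    exe = "/home/me/git/libexec/git-core/git-credential-store"
    print(expand_origin("$ORIGIN/../../lib", exe))
```

Moving the whole hypothetical tree to another prefix changes the result accordingly, which is the property that makes LD_LIBRARY_PATH unnecessary on platforms that support the token.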

* Re: Reducing git size by building libgit.so
From: brian m. carlson @ 2019-06-12 23:31 UTC
To: Paul Smith; +Cc: Elmar Pruesse, git@vger.kernel.org

On 2019-06-12 at 13:57:43, Paul Smith wrote:
> On Tue, 2019-06-11 at 23:48 +0000, brian m. carlson wrote:
> > Also, some people install Git into their home directories, and a
> > shared library means that they'll have to use LD_LIBRARY_PATH (or
> > equivalent) to run Git.
>
> I don't have strong feeling about .so's although obviously less disk
> space used is always a good thing, everything else being equal.
>
> However, the above concern isn't actually an issue. You can install
> the .so in a known location relative to the binaries, then link the
> binaries with an RPATH setting using $ORIGIN (or the equivalent on
> MacOS which does exist but I forget the name). On Windows, DLLs are
> installed in the same directory as the binary, typically.
>
> Allowing relocatable binaries with .so dependencies without requiring
> LD_LIBRARY_PATH settings is a solved problem, to the best of my
> understanding.

This is possible to do, but it's not especially portable. People use
various C toolchains to compile our code, which may or may not have
easy access to linker flags. The proper syntax also varies depending on
whether you're using ELF, Mach-O, PE[0], or another object format. And
Debian tries hard to avoid RPATH settings[1], so we'd need to be sure to
have an option not to set it.

None of these are intractable problems, but there's not simply an easy
solution that we can magically set that will work everywhere. If we
were using autoconf and friends exclusively, this would be easier, but
we're not.

So someone is welcome to attack these problems with a set of patches,
but I expect it to be fairly involved to get all the corner cases right
if we want to make it the default.

[0] AFAIUI, Windows doesn't have RPATH-like functionality, and from what
I've read, the same-directory behavior may be going away due to security
concerns. I don't use Windows, so any solution there is fine as long as
Dscho is happy.
[1] https://wiki.debian.org/RpathIssue
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

* Re: Reducing git size by building libgit.so
From: Johannes Sixt @ 2019-06-13 19:19 UTC
To: brian m. carlson; +Cc: Paul Smith, Elmar Pruesse, git@vger.kernel.org

On 13.06.19 at 01:31, brian m. carlson wrote:
> [0] AFAIUI, Windows doesn't have RPATH-like functionality, and from what
> I've read, the same-directory behavior may be going away due to security
> concerns. I don't use Windows, so any solution there is fine as long as
> Dscho is happy.

The solution is NOT to use DLLs on Windows. They are touchy and slow.

-- Hannes

* Re: Reducing git size by building libgit.so
From: Johannes Schindelin @ 2019-06-13 7:51 UTC
To: Paul Smith; +Cc: brian m. carlson, Elmar Pruesse, git@vger.kernel.org

Hi Paul,

On Wed, 12 Jun 2019, Paul Smith wrote:

> On Tue, 2019-06-11 at 23:48 +0000, brian m. carlson wrote:
> > Also, some people install Git into their home directories, and a
> > shared library means that they'll have to use LD_LIBRARY_PATH (or
> > equivalent) to run Git.
>
> I don't have strong feeling about .so's although obviously less disk
> space used is always a good thing, everything else being equal.
>
> However, the above concern isn't actually an issue. You can install
> the .so in a known location relative to the binaries, then link the
> binaries with an RPATH setting using $ORIGIN (or the equivalent on
> MacOS which does exist but I forget the name).

Hassles aside, you mentioned Linux and macOS. What about literally *all*
the other platforms we support? Like AIX, NonStop, HP/UX, etc?

Sure, you can hunt down all of them, and maybe even come up with a
workaround for platforms that do not have a $ORIGIN equivalent. You can
pile workaround on workaround all you want.

In the end, it seems to be a clear indicator that this is a
complicator's glove, and the only reasonably simple way forward would be
to either leave things as-are, or have an *opt-in* to build a shared
libgit.

But. And this is a really big but.

While you can try to document _all you want_ how libgit.so is not
supposed to be used as a library, how its API is not an API or at least
not a stable one, if you have _some_ experience with software
development you will know that it won't matter one bit. It _will_ be
used, people _will_ complain, and it will turn out to simply not have
been a good idea in the first place.

> On Windows, DLLs are installed in the same directory as the binary,
> typically.
>
> Allowing relocatable binaries with .so dependencies without requiring
> LD_LIBRARY_PATH settings is a solved problem, to the best of my
> understanding.

You're probably right, as long as you restrict your view to mainstream
Operating Systems.

To put things into perspective, you might be interested in reading up on
https://github.com/git/git/commit/0f50c8e32c87 (Makefile: remove the
NO_R_TO_GCC_LINKER flag, 2019-05-17) and related commit history.

Sure, you could still argue that it is a "solved" problem. Where
"solved" is a different term than "desirable".

> One thing to think about is that runtime loading a .so can take some
> time if it has lots of public symbols. If someone really wanted to do
> this, the ideal thing would be to make all symbols hidden except those
> needed by the binary front-ends and have those be very small shells
> that just had a very limited number of entry points into the .so.

That would fall squarely into the "pile on workaround on workaround"
category I mentioned above.

> Maybe for git this doesn't matter but for some projects I've worked on
> the time to dlopen() a library was a blocking issue that the above
> procedure solved nicely.

Sure, sometimes you cannot control whether it is an ill-designed `.so`
you need to consume.

As far as Git is concerned, this is not the case. At least when you look
at libgit. When you look at libcurl, it is a different matter. But then,
we do not need to play RPATH games there: we expect it to be in the
system's preferred location.

BTW Duy hinted at problems with libcurl that made us split apart
`git-remote-https` from the main `git` executable. The full story is
here:

1. The Linus complained about some "crazy" shared library loading
   behavior five months before Christmas 2009:
   https://public-inbox.org/git/alpine.LFD.2.01.0907241349390.3960@localhost.localdomain/

2. Daniel Barkalow was working on some "foreign VCS" support and thought
   that HTTPS/HTTP support could be handled via the same route, to avoid
   having to load libcurl for every Git operation no matter what:
   https://public-inbox.org/git/alpine.LNX.2.00.0907242242310.2147@iabervon.org/

3. Daniel then sent a patch series about two weeks later:
   https://public-inbox.org/git/alpine.LNX.2.00.0908050052390.2147@iabervon.org/

4. Those patches were accepted via cd03eebbfdae (Merge branch
   'db/vcs-helper', 2009-09-13)

So yes, I think that a patch or patch series to turn libgit.a into
libgit.so would need to be crafted *very* carefully, and _in the least_
offer a sound performance analysis in the commit messages.

It would obviously need to be proven beyond doubt that the startup time
does not deteriorate noticeably, otherwise the patch (series) would
likely be rejected.

Ciao,
Johannes
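
[Editorial illustration, not part of the original message.] The kind of startup-time evidence asked for above could be gathered with a small harness along the following lines. This is only a sketch under stated assumptions: the helper names are hypothetical, the stand-in command is the Python interpreter itself so that the example runs anywhere, and a real comparison would substitute a statically linked and a shared-library build of the same Git command and use many more runs on a quiet machine.

```python
import statistics
import subprocess
import sys
import time


def time_startup(argv, runs=20):
    """Launch argv `runs` times and return wall-clock durations in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only process start-to-exit time is measured.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Minimum and median are less noise-sensitive than the mean."""
    return {"min_ms": min(samples), "median_ms": statistics.median(samples)}


if __name__ == "__main__":
    # Stand-in command; replace with the two builds under comparison.
    cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_startup(cmd, runs=5)))
```

Reporting the minimum and median of both builds side by side would make a claim such as "about 20 ms extra" checkable by reviewers.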

* Re: Reducing git size by building libgit.so
From: Paul Smith @ 2019-06-13 17:28 UTC
To: git@vger.kernel.org

On Thu, 2019-06-13 at 09:51 +0200, Johannes Schindelin wrote:
> Hassles aside, you mentioned Linux and macOS. What about literally
> *all* the other platforms we support? Like AIX, NonStop, HP/UX, etc?

I assumed that we were discussing providing an _option_ of building
with shared libraries, rather than removing support for static
libraries and only supporting shared libraries. The former is the
typical model in portable projects.

So, the answer to most of the (important) issues you and Brian raise
is, "if it doesn't work, can't be made to work, is too slow, or is
annoying for ANY other reason, then don't do it".

Regarding things like publish-ability of the API, I don't know what
else to say. It's FOSS, after all: anyone can do whatever they want
(with respect to building and using the code) regardless of the desires
of the development team. All you can do is make clear that the intent
is that the API is not stable, and if they don't listen and their stuff
breaks, well, as the saying goes, they get to keep both halves. Not
adding any header files to the installation rules and packages is also
helpful :).

There's a certain amount of cold, hard reality that every FOSS project,
regardless of how friendly and welcoming they aspire to be, simply
can't avoid while still making progress (and staying sane).


I certainly don't want to minimize the amount of work involved here,
nor do I want to in any way volunteer myself to undertake any of it: as
I said, I don't have strong feelings about it.

I'm just saying, there's no technical reason it can't be done while
maintaining the same features (such as relocatability) as the static
library installs, at least on the major platforms.

Cheers!

* Re: Reducing git size by building libgit.so
From: Junio C Hamano @ 2019-06-13 18:23 UTC
To: Paul Smith; +Cc: git@vger.kernel.org

Paul Smith <paul@mad-scientist.net> writes:

> I assumed that we were discussing providing an _option_ of building
> with shared libraries, rather than removing support for static
> libraries and only supporting shared libraries. The former is the
> typical model in portable projects.
> ...
> So, the answer to most of the (important) issues you and Brian raise
> is, "if it doesn't work, can't be made to work, is too slow, or is
> annoying for ANY other reason, then don't do it".
>
> Regarding things like publish-ability of the API, I don't know what
> else to say. It's FOSS, after all: anyone can do whatever they want
> (with respect to building and using the code) regardless of the desires
> of the development team. All you can do is make clear that the intent
> is that the API is not stable, and if they don't listen and their stuff
> breaks, well, as the saying goes, they get to keep both halves. Not
> adding any header files to the installation rules and packages is also
> helpful :).
>
> There's a certain amount of cold, hard reality that every FOSS project,
> regardless of how friendly and welcoming they aspire to be, simply
> can't avoid while still making progress (and staying sane).
>
>
> I certainly don't want to minimize the amount of work involved here,
> nor do I want to in any way volunteer myself to undertake any of it: as
> I said, I don't have strong feelings about it.
>
> I'm just saying, there's no technical reason it can't be done while
> maintaining the same features (such as relocatability) as the static
> library installs, at least on the major platforms.
>
> Cheers!

* Re: Reducing git size by building libgit.so
From: Ævar Arnfjörð Bjarmason @ 2019-06-12 9:41 UTC
To: Elmar Pruesse; +Cc: git@vger.kernel.org

On Tue, Jun 11 2019, Elmar Pruesse wrote:

> Hi!
>
> The total compiled size of libexec/git-core is currently somewhere
> around 30 MB. This is largely due to a number of binaries linking
> statically against libgit.a. For some folks, every byte counts. I
> meddled with the Makefile briefly to make it build and use a libgit.so
> instead, which dropped package size down to 5MB.
>
> Are there, beyond the ~20 ms in extra startup time and the slightly
> bigger hassle with DSO locations, reasons for the choice to link statically?

brian mentioned API stability. I'd be fine with having a *.so shipped
with git. We'd document the API non-stability, and of course it's GPL so
you can only link other GPL programs to it, but if people would be fine
with still using it and very closely following git development as we
break their API/ABI why not.

Have you looked at INSTALL_SYMLINKS & friends? I.e. maybe you're
measuring size without accounting for most of the binaries being
hardlinks to the same thing.

We still have some stand-alone binaries, but IIRC there's under 5 of
those with INSTALL_SYMLINKS. We could probably also just make those
built-ins to get the rest of the size benefits.

I.e. we'd just have one git binary, everything else symlinking to that,
and we'd route to the right program by inspecting argv, which we mostly
do already.
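
[Editorial illustration, not part of the original message.] The measurement concern raised above, that hard links to one file may be counted once per name, can be checked with a short script. This is a sketch, assuming a POSIX-style filesystem where (st_dev, st_ino) identifies a file; the function name is hypothetical, and the directory to inspect is whatever libexec/git-core path applies on the machine in question.

```python
import os
import stat
import sys


def install_size(root):
    """Return (naive_bytes, deduplicated_bytes) for regular files under root.

    naive_bytes counts every directory entry; deduplicated_bytes counts
    each underlying inode once, so hard links are not double-counted.
    Symlinks and other non-regular files are skipped entirely.
    """
    naive = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue
            naive += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return naive, unique


if __name__ == "__main__":
    # Pass the directory to inspect, e.g. the libexec/git-core of an install.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    naive, unique = install_size(root)
    print("naive: %d bytes, hard links counted once: %d bytes" % (naive, unique))
```

A large gap between the two numbers would suggest that the 30 MB figure is mostly the same binary counted under many names rather than separate statically linked programs.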

* Re: Reducing git size by building libgit.so
From: Duy Nguyen @ 2019-06-12 9:46 UTC
To: Ævar Arnfjörð Bjarmason; +Cc: Elmar Pruesse, git@vger.kernel.org

On Wed, Jun 12, 2019 at 4:42 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> I.e. we'd just have one git binary, everything else symlinking to that,
> and we'd route to the right program by inspecting argv, which we mostly
> do already.

If I remember correctly libcurl.so startup time was the reason it's
split out of "git" binary, so we can't just merge everything into one
(*). But yeah merging some back is not a bad idea.

(*) but maybe "git" binary has gotten much slower overall, or
libcurl.so much faster that it does not matter anymore. That problem
was like 10 years ago.
-- 
Duy

* Re: Reducing git size by building libgit.so
From: SZEDER Gábor @ 2019-06-12 10:25 UTC
To: Ævar Arnfjörð Bjarmason; +Cc: Elmar Pruesse, git@vger.kernel.org

On Wed, Jun 12, 2019 at 11:41:10AM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Tue, Jun 11 2019, Elmar Pruesse wrote:
> > The total compiled size of libexec/git-core is currently somewhere
> > around 30 MB. This is largely due to a number of binaries linking
> > statically against libgit.a. For some folks, every byte counts.

I wonder whether those folks actually need such non-builtin git
binaries like 'git-shell' or 'git-daemon' in the first place.

> We still have some stand-alone binaries, but IIRC there's under 5 of
> those with INSTALL_SYMLINKS. We could probably also just make those
> built-ins to get the rest of the size benefits.
>
> I.e. we'd just have one git binary, everything else symlinking to that,
> and we'd route to the right program by inspecting argv, which we mostly
> do already.

Let's not forget that commands like 'git-daemon' and 'git-shell' are
better left as stand-alone programs.
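
[Editorial illustration, not part of the original message.] The "one binary, route by inspecting argv" idea quoted above can be sketched in a few lines. This is not Git's actual dispatch code: the command table, handler functions and return values below are hypothetical and exist only to show the mechanism, namely that a program invoked through a link named git-<cmd> behaves like "git <cmd>".

```python
import os
import sys


# Hypothetical built-ins; a real table would map names to real commands.
def cmd_status(args):
    return "status called with %r" % (args,)


def cmd_version(args):
    return "version called with %r" % (args,)


BUILTINS = {"status": cmd_status, "version": cmd_version}


def resolve_command(argv):
    """Return (command name, remaining args) from how we were invoked.

    A link named "git-status" yields "status"; a plain "git status"
    takes the command from the first argument instead.
    """
    prog = os.path.basename(argv[0])
    if prog.startswith("git-"):
        return prog[len("git-"):], argv[1:]
    if len(argv) > 1:
        return argv[1], argv[2:]
    return None, []


def main(argv):
    name, args = resolve_command(argv)
    handler = BUILTINS.get(name)
    if handler is None:
        return "unknown command: %r" % (name,)
    return handler(args)


if __name__ == "__main__":
    print(main(sys.argv))
```

Programs that should stay stand-alone, such as the 'git-daemon' and 'git-shell' examples named above, would simply be left out of such a table and shipped as separate executables.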