bug-gnulib@gnu.org mirror (unofficial)
* [PATCH v2] bootstrap: When a commit hash is specified, do a shallow fetch if possible
From: Glenn Washburn @ 2021-10-25 19:44 UTC (permalink / raw)
  To: bug-gnulib; +Cc: Glenn Washburn, Bruno Haible

The gnulib sources are large and, more importantly, have a long history of
changes. So the initial checkout of the repository can take a long time when
network or CPU resources are limited. The latter is especially acute in a
non-KVM QEMU virtual machine (where the checkout can take 40+ minutes,
compared to <30 seconds with this change[1]). The problem is specific to
bootstrap configurations that pin a gnulib revision by commit hash. In this
case, git cannot do a shallow clone with the --depth option because it does
not know ahead of time how deep the revision is from the tip. So git must
clone the whole repository.

However, there is an alternative method that requires support from the git
server[2], namely asking for a specific commit on fetch. Refactor to use
fetch and fall back to fetching the entire repository if fetching by commit
hash fails.

[1] https://savannah.nongnu.org/support/index.php?110553#comment1
[2] https://stackoverflow.com/a/3489576/2108011

Signed-off-by: Glenn Washburn <development@efficientek.com>
---
Changes since v1:
* Updated the commit message to reflect that the official repo now supports
  fetch by commit hash.
* Updated the comment in the source per Bruno's suggestion.
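
For anyone who wants to try the fetch-by-hash behaviour without touching a
real server, here is a minimal sketch. It is not the bootstrap code itself.
It assumes a throwaway local repository standing in for the gnulib server,
with uploadpack.allowAnySHA1InWant enabled on it; the directory names
(work, server, client) and the five dummy commits are made up for
illustration only.

```shell
#!/bin/sh
# Sketch: shallow-fetch a single commit by hash, falling back to a full
# fetch, against a throwaway local repository (hypothetical stand-in for
# the real gnulib server).
set -e

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small "server" repository with five commits.
git init -q "$work/server"
git -C "$work/server" config user.name "Example"
git -C "$work/server" config user.email "example@example.invalid"
for i in 1 2 3 4 5; do
  echo "$i" > "$work/server/file"
  git -C "$work/server" add file
  git -C "$work/server" commit -q -m "commit $i"
done

# Assumption: the server opts in to serving arbitrary commits by hash.
git -C "$work/server" config uploadpack.allowAnySHA1InWant true

# Pick a revision that is not the branch tip, as a pinned revision would be.
revision=$(git -C "$work/server" rev-parse HEAD~2)

# Client side: init, add the remote, then try the shallow fetch by hash.
# If the server refuses, fall back to fetching everything.
mkdir -p "$work/client"
git -C "$work/client" init -q
git -C "$work/client" remote add origin "file://$work/server"
git -C "$work/client" fetch -q --depth 2 origin "$revision" \
  || git -C "$work/client" fetch -q origin
git -C "$work/client" reset -q --hard FETCH_HEAD

fetched_head=$(git -C "$work/client" rev-parse HEAD)
commit_count=$(git -C "$work/client" rev-list --count HEAD)
echo "wanted:  $revision"
echo "HEAD:    $fetched_head"
echo "commits: $commit_count"
```

If the shallow fetch succeeds, HEAD should match the wanted revision and
the client should hold only 2 of the server's 5 commits. The file:// URL is
used so that the fetch goes through the normal transport and --depth is
honoured.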

Glenn

---
Range-diff against v1:
1:  9f6fe7c4a ! 1:  9d980f00c bootstrap: When a commit hash is specified, do a shallow fetch if possible
    @@ Commit message
         fetch and fallback to fetching the entire repository if fetching by commit
         hash fails.
     
    -    Currently the git server hosting the official gnulib git repository does not
    -    support fetch by commit hash[3]. However, there are mirrors which do support
    -    this[4], and can be specified by setting the $GNULIB_URL.
    -
         [1] https://savannah.nongnu.org/support/index.php?110553#comment1
         [2] https://stackoverflow.com/a/3489576/2108011
    -    [3] https://savannah.nongnu.org/support/index.php?110553
    -    [4] https://github.com/coreutils/gnulib
     
      ## build-aux/bootstrap ##
     @@ build-aux/bootstrap: if $use_gnulib; then
    @@ build-aux/bootstrap: if $use_gnulib; then
     +      else
     +        git fetch -h 2>&1 | grep -- --depth > /dev/null && shallow='--depth 2'
     +        mkdir -p "$gnulib_path"
    ++        # Only want a shallow checkout of $GNULIB_REVISION, but git does not
    ++        # support cloning by commit hash. So attempt a shallow fetch by commit
    ++        # hash to minimize the amount of data downloaded and changes needed to
    ++        # be processed, which can drastically reduce download and processing
    ++        # time for checkout. If the fetch by commit fails, a shallow fetch can
    ++        # not be performed because we do not know what the depth of the commit
    ++        # is without fetching all commits. So fallback to fetching all commits.
     +        git -C "$gnulib_path" init
     +        git -C "$gnulib_path" remote add origin ${GNULIB_URL:-$default_gnulib_url}
    -+        # Can not do a shallow fetch if fetch by commit hash fails because we
    -+        # do not know how deep to go to get to $GNULIB_REVISION, so we must get
    -+        # all commits.
     +        git -C "$gnulib_path" fetch $shallow origin "$GNULIB_REVISION" \
     +          || git -C "$gnulib_path" fetch origin \
     +          || cleanup_gnulib

 build-aux/bootstrap | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/build-aux/bootstrap b/build-aux/bootstrap
index 733659850..5315857e0 100755
--- a/build-aux/bootstrap
+++ b/build-aux/bootstrap
@@ -694,9 +694,25 @@ if $use_gnulib; then
       shallow=
       if test -z "$GNULIB_REVISION"; then
         git clone -h 2>&1 | grep -- --depth > /dev/null && shallow='--depth 2'
+        git clone $shallow ${GNULIB_URL:-$default_gnulib_url} "$gnulib_path" \
+          || cleanup_gnulib
+      else
+        git fetch -h 2>&1 | grep -- --depth > /dev/null && shallow='--depth 2'
+        mkdir -p "$gnulib_path"
+        # Only want a shallow checkout of $GNULIB_REVISION, but git does not
+        # support cloning by commit hash. So attempt a shallow fetch by commit
+        # hash to minimize the amount of data downloaded and changes needed to
+        # be processed, which can drastically reduce download and processing
+        # time for checkout. If the fetch by commit fails, a shallow fetch can
+        # not be performed because we do not know what the depth of the commit
+        # is without fetching all commits. So fallback to fetching all commits.
+        git -C "$gnulib_path" init
+        git -C "$gnulib_path" remote add origin ${GNULIB_URL:-$default_gnulib_url}
+        git -C "$gnulib_path" fetch $shallow origin "$GNULIB_REVISION" \
+          || git -C "$gnulib_path" fetch origin \
+          || cleanup_gnulib
+        git -C "$gnulib_path" reset --hard FETCH_HEAD
       fi
-      git clone $shallow ${GNULIB_URL:-$default_gnulib_url} "$gnulib_path" \
-        || cleanup_gnulib
 
       trap - 1 2 13 15
     fi
-- 
2.27.0


