git@vger.kernel.org mailing list mirror (one of many)
From: Son Luong Ngoc <sluongng@gmail.com>
To: wansink@uber.com
Cc: git@vger.kernel.org
Subject: Re: [RFC PATCH] upload_pack.c: make deepen-not more tree-ish
Date: Sun, 12 Feb 2023 15:12:29 +0100
Message-ID: <CAL3xRKcw5KD2xE2Z1hEc-j8WMWknoMTJwfdEVa=h5sGb=SmhNQ@mail.gmail.com>
In-Reply-To: <CAL3xRKdCkAAR0r3jyKFy+TtUi65LQcHaste=2WCqYHtwi8cUhw@mail.gmail.com>

Resending to the Git mailing list: setting a font in Gmail switched the
message from plain text to HTML, so the list rejected the original.

On Sun, Feb 12, 2023 at 3:09 PM Son Luong Ngoc <sluongng@gmail.com> wrote:
>
> Hi Andrew,
>
> On Sat, Feb 11, 2023 at 11:49 PM Andrew Wansink <andy@halogix.com> wrote:
> >
> > This unlocks `git clone --shallow-exclude=<commit-sha1>`
> >
> > git-clone only accepts --shallow-exclude arguments where
> > the argument is a branch or tag because upload_pack only
> > searches deepen-not arguments for branches and tags.
> >
> > Make process_deepen_not search for commit objects if no
> > branch or tag is found, then add them to the deepen_not
> > list.
> >
> > Signed-off-by: Andrew Wansink <wansink@uber.com>
> > ---
> >
> > At Uber we have a lot of patches in CI simultaneously,
> > and the CI jobs frequently clone the monorepo multiple
> > times for each patch.  They do this to calculate diffs
> > between a patch and its parent commit.
> >
>
> Not so long ago, I managed a CI system that supported monorepo use cases.
> We had several hosts (VM/bare metal) on which we spun up containers for CI to run.
>
> We maintained a bare copy of the monorepo at the host level (cron job / systemd / DaemonSet) and mounted it read-only into each of the CI containers.
>
> Each CI container would clone/fetch the monorepo with `--reference-if-able ./path/to/read-only-mount/repo.git` (1),
> so that most of the needed objects were already on disk in the shared bare repo.
>
>
> +-----------+  +-----------+  +-----------+
> | container |  | container |  | container |
> +-----------+  +-----------+  +-----------+
>              \       |       /
>       (mount) \      |      /
>               +------------+                 +--------+
>               | bare-repo  | <-------------- | Remote |
>               +------------+   (git-fetch)   +--------+
>                     |
>                     | (maintain)
>                     |
>               +----------+
>               | cron-job |
>               +----------+
>
> (forgive my horrible drawing)
>
> With this setup, we no longer needed to shallow clone,
> and git-clone in each container was simply a combination of git-ls-remote and a very lightweight git-fetch.
> In some cases, such as a job in the later stages of a CI pipeline,
> the host would already have downloaded all the needed objects into the bare copy of the repository.
> This let us skip git-fetch entirely when the CI container executed.
>
> Compared to the shallow clone approach,
> our "local cache" approach made clones drastically faster
> while making it much easier for developers to interact with git history inside tests.
>
> > One optimisation in this flow is to clone only to a specific
> > depth; this may or may not work, depending on how old the
> > patch is.  When it does not, we have to --unshallow or discard
> > the shallow clone and fully clone the repo.
> >
> > This patch would allow us to clone to exactly the depth we
> > need to find a patch's parent commit.
>
> Hope it helps,
> Son Luong.
>
> (1): https://git-scm.com/docs/git-clone#Documentation/git-clone.txt---reference-if-ableltrepositorygt
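
For anyone who wants to try the host-level cache described above, here is a
minimal sketch under stated assumptions. It is not the setup we actually ran:
every path is hypothetical, a throwaway local repository stands in for the
remote, and a one-off `git clone --mirror` plus `git fetch --prune` stands in
for whatever the cron job would do. Only `--reference-if-able` comes from the
message itself.

```shell
#!/bin/sh
set -eu

# Hypothetical layout: one temp directory holds the stand-in remote,
# the host-level cache, and the "container" checkout.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
remote="$work/seed"
cache="$work/cache/repo.git"
checkout="$work/container/repo"
mkdir -p "$work/cache" "$work/container"

# Stand-in for the real remote: a repository with a single empty commit.
git init --quiet "$remote"
git -C "$remote" -c user.name=ci -c user.email=ci@example.invalid \
    commit --quiet --allow-empty -m "initial commit"

# Host side (cron job / systemd / DaemonSet in the real setup):
# create the bare mirror once, then keep it fresh with periodic fetches.
git clone --quiet --mirror "$remote" "$cache"
git -C "$cache" fetch --quiet --prune

# Container side: clone from the remote while borrowing objects from the
# cache. If the cache path is unusable, git warns and clones normally.
git clone --quiet --reference-if-able "$cache" "$remote" "$checkout"

# The clone records the cache's object directory as an alternate.
cat "$checkout/.git/objects/info/alternates"
```

Because the container only reads objects through the alternates file, the
cache can be mounted read-only. The clone does depend on the cache staying in
place, though, so objects it borrows must not be pruned from the cache while
the clone is in use.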
