git@vger.kernel.org mailing list mirror (one of many)
From: Derrick Stolee <derrickstolee@github.com>
To: phillip.wood@dunelm.org.uk,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, newren@gmail.com,
	avarab@gmail.com, dyroneteng@gmail.com,
	Johannes.Schindelin@gmx.de, "SZEDER Gábor" <szeder.dev@gmail.com>,
	"Matthew John Cheetham" <mjcheetham@outlook.com>,
	"Josh Steadmon" <steadmon@google.com>
Subject: Re: [PATCH v4 0/2] bundle URIs: design doc
Date: Tue, 9 Aug 2022 11:50:44 -0400
Message-ID: <bce18d3e-ac37-7c7a-d411-d0aad87a8f68@github.com>
In-Reply-To: <5b98f1d8-e829-98db-1d13-7aba6c126f8d@gmail.com>

On 8/9/2022 9:49 AM, Phillip Wood wrote:
> Hi Stolee
> 
> On 09/08/2022 14:12, Derrick Stolee via GitGitGadget wrote:
>> This is the first of a series towards building the bundle URI feature as
>> discussed in previous RFCs, specifically pulled directly out of [5]:
>>
>> [1]
>> https://lore.kernel.org/git/RFC-cover-00.13-0000000000-20210805T150534Z-avarab@gmail.com/
>>
>> [2]
>> https://lore.kernel.org/git/cover-0.3-00000000000-20211025T211159Z-avarab@gmail.com/
>>
>> [3]
>> https://lore.kernel.org/git/pull.1160.git.1645641063.gitgitgadget@gmail.com
>>
>> [4]
>> https://lore.kernel.org/git/RFC-cover-v2-00.36-00000000000-20220418T165545Z-avarab@gmail.com/
>>
>> [5]
>> https://lore.kernel.org/git/pull.1234.git.1653072042.gitgitgadget@gmail.com
>>
>> THIS ONLY INCLUDES THE DESIGN DOCUMENT. See "Updates in v3". There are two
>> patches:
>>
>>   1. The main design document that details the bundle URI standard and how
>>      the client interacts with the bundle data.
>>   2. An addendum to the design document that details one strategy for
>>      organizing bundles from the perspective of a bundle provider.
> 
> I thought the document was well written and left me with a good understanding
> of both the problem being addressed and the rationale for the solution.

Thanks for the kind words!

> One small query - the document mentions CI farms as benefiting from this work
> but my impression is that those commonly use shallow clones which are (quite
> reasonably) not supported in this proposal.

There are two different kinds of CI farms.

The more common one is a SaaS CI system that provides machines on demand,
but each run starts from some "clean" state. For example, GitHub Actions
runs CI builds of the Git project across a number of platforms. These
machines need the source at HEAD, but do not need the full history. Further,
they erase the repository entirely at the end of the build and never fetch
into it again. Thus, a shallow clone makes sense to minimize the data
transfer. Bundles don't make sense here for two reasons. First, bundles
must be closed under reachability, so they cannot represent a shallow
clone (see [1]). Second, CI builds are typically triggered immediately
after the commit appears on the origin Git server, so there is no time for
a bundle provider to create a bundle containing it.

[1] https://github.com/git/git/blob/c50926e1f48891e2671e1830dbcd2912a4563450/Documentation/technical/bundle-format.txt#L65-L69
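
A minimal sketch of the shallow-clone pattern described above, assuming a
throwaway local repository stands in for the origin server. The directory
names, file name, and committer identity are illustrative only:

```shell
set -eu
work=$(mktemp -d)

# Stand-in for the origin server: a local repository with three commits.
git init -q "$work/origin"
git -C "$work/origin" symbolic-ref HEAD refs/heads/main
for n in 1 2 3; do
    echo "change $n" >>"$work/origin/file.txt"
    git -C "$work/origin" add file.txt
    git -C "$work/origin" -c user.name=ci -c user.email=ci@example.com \
        commit -q -m "commit $n"
done

# What a throwaway CI machine does: download only the tip commit.
# The file:// URL matters here; --depth is ignored for plain local paths.
git clone -q --depth=1 "file://$work/origin" "$work/ci"

full_count=$(git -C "$work/origin" rev-list --count HEAD)
shallow_count=$(git -C "$work/ci" rev-list --count HEAD)
echo "origin has $full_count commits, CI clone has $shallow_count"
```

The CI clone ends up with a single commit whose parents are missing, which
is exactly the state a bundle cannot describe.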

The less common one is a private build farm. These machines are long-lived
and controlled by the repository owner. They come pre-loaded with all of
the software needed to build the repository. The best practice in this
type of build farm is to keep a full clone of the repository in a
well-known location and use incremental fetches to download the commit
necessary for each build. This type of build farm is typically self-hosted,
but could also be hosted by a cloud provider. The bundle URI design makes
it possible to quickly bootstrap new build machines using a bundle
provider (probably co-located with the build machines) and to improve
fetch times by creating frequent incremental bundles. The new commit being
built is unlikely to exist immediately in the bundles, but it is also
unlikely to be too far ahead of any of them.
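
The bootstrap-then-fetch flow can be sketched with plain `git bundle`
commands. This is an illustration under stated assumptions, not the
mechanism from the design document: local files stand in for the bundle
provider, a local repository stands in for the origin server, and
`add_commit` is a hypothetical helper.

```shell
set -eu
work=$(mktemp -d)

add_commit() {  # hypothetical helper: append a line and commit it on origin
    echo "$1" >>"$work/origin/file.txt"
    git -C "$work/origin" add file.txt
    git -C "$work/origin" -c user.name=dev -c user.email=dev@example.com \
        commit -q -m "$1"
}

git init -q "$work/origin"
git -C "$work/origin" symbolic-ref HEAD refs/heads/main
add_commit "one"; add_commit "two"

# Bundle provider: a full bundle, then an incremental one on top of it.
git -C "$work/origin" bundle create "$work/base.bundle" main
base_tip=$(git -C "$work/origin" rev-parse main)
add_commit "three"
git -C "$work/origin" bundle create "$work/incr.bundle" "$base_tip..main"

# A commit lands after the newest bundle was built.
add_commit "four"

# New build machine: bootstrap from the bundles, then fetch the remainder.
git clone -q -b main "$work/base.bundle" "$work/build"
git -C "$work/build" fetch -q "$work/incr.bundle" main
git -C "$work/build" remote set-url origin "$work/origin"
git -C "$work/build" fetch -q origin

build_tip=$(git -C "$work/build" rev-parse origin/main)
origin_tip=$(git -C "$work/origin" rev-parse main)
echo "build machine is at $build_tip"
```

Only the last commit has to come from the origin itself; everything older
was already supplied by the two bundles.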

While private build farms are less common, they do become necessary for
large projects. Engineering teams that have the resources to self-host a
build farm are likely to also have the resources to self-host a bundle
server. They may not have the connections or desire to advertise those
bundle server URIs from the origin Git server.

I hope this helps clarify my perspective as to why build farms using
long-lived copies of the repository could take advantage of bundle URIs.

Thanks,
-Stolee

