git@vger.kernel.org mailing list mirror (one of many)
From: Andrew Keller <andrew@kellerfarm.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Git List <git@vger.kernel.org>
Subject: Re: Borrowing objects from nearby repositories
Date: Tue, 25 Mar 2014 09:13:56 -0400
Message-ID: <9A24D2D1-DD59-41DC-8237-2B5829695753@kellerfarm.com>
In-Reply-To: <CACBZZX5teZuqtNkPT4PdXJn=g34cOhRH2oNehROT8kJ_M2cgfg@mail.gmail.com>

On Mar 24, 2014, at 5:21 PM, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Wed, Mar 12, 2014 at 4:37 AM, Andrew Keller <andrew@kellerfarm.com> wrote:
>> Hi all,
>> 
>> I am considering developing a new feature, and I'd like to poll the group for opinions.
>> 
>> Background: A couple years ago, I wrote a set of scripts that speed up cloning of frequently used repositories.  The scripts utilize a bare Git repository located at a known location, and automate providing a --reference parameter to `git clone` and `git submodule update`.  Recently, some coworkers of mine expressed an interest in using the scripts, so I published the current version of my scripts, called `git repocache`, described at the bottom of <https://github.com/andrewkeller/ak-git-tools>.
>> 
>> Slowly, it has occurred to me that this feature, or something similar to it, may be worth adding to Git, so I've been thinking about the best approach.  Here's my best idea so far:
>> 
>> 1)  Introduce '--borrow' to `git-fetch`.  This would behave similarly to '--reference', except that it operates on a temporary basis, and does not assume that the reference repository will exist after the operation completes, so any used objects are copied into the local objects database.  In theory, this mechanism would be distinct from '--reference', so if both are used, some objects would be copied, and some objects would be accessible via a reference repository referenced by the alternates file.
> 
> Isn't this the same as git clone --reference <path> --no-hardlinks <url> ?

'--reference' adds an entry to 'info/alternates' inside the objects folder.  When an object is looked up, every objects folder listed in 'objects/info/alternates' is treated as an extension of the local objects folder.  So when fetch, for example, decides whether it already has a blob locally, it may answer "yes" and skip the download entirely, because the blob already exists in one of the reference repositories.  If I clone one of my 80 GB repositories over SSH using a reference repository, the resulting clone is only about 175 KB: it assumes the reference repository will continue to exist, so it owns no objects of its own at all.
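
To illustrate the lookup described above, here is a minimal Python sketch.  It is not Git's implementation: it only considers loose objects (no packfiles, no nested alternates), and the function names, the temporary directory layout, and the object name are all made up for the example.  It shows why a clone that owns no objects can still report that it "has" one once an alternates entry exists.

```python
import os
import tempfile


def object_dirs(objects_dir):
    """Return the local objects dir plus every dir listed in info/alternates."""
    dirs = [objects_dir]
    alternates = os.path.join(objects_dir, "info", "alternates")
    if os.path.exists(alternates):
        with open(alternates) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#"):
                    # Assumption: relative entries resolve against the objects dir;
                    # absolute entries are used unchanged by os.path.join.
                    dirs.append(os.path.normpath(os.path.join(objects_dir, line)))
    return dirs


def has_loose_object(objects_dir, sha):
    """True if a loose object file exists locally or in any alternate."""
    return any(
        os.path.exists(os.path.join(d, sha[:2], sha[2:]))
        for d in object_dirs(objects_dir)
    )


def demo():
    """Build a fake reference repo and an empty clone, then compare lookups."""
    with tempfile.TemporaryDirectory() as root:
        ref = os.path.join(root, "reference.git", "objects")
        local = os.path.join(root, "clone", ".git", "objects")
        sha = "ab" + "0" * 38  # made-up object name

        # The object lives only in the reference repository.
        os.makedirs(os.path.join(ref, sha[:2]))
        open(os.path.join(ref, sha[:2], sha[2:]), "w").close()
        os.makedirs(os.path.join(local, "info"))

        before = has_loose_object(local, sha)
        # Roughly what '--reference' records: one line naming the other objects dir.
        with open(os.path.join(local, "info", "alternates"), "w") as f:
            f.write(ref + "\n")
        after = has_loose_object(local, sha)
        return before, after
```

Calling demo() reports the object as missing before the alternates entry is written and as present afterwards, even though nothing was copied into the clone.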

The '--no-hardlinks' option is only applicable when hard linking is available in the first place - i.e., when cloning from one local folder to another on the same filesystem (assuming the filesystem supports hard links).
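
As a rough illustration of the distinction (a sketch only, with made-up file names, and assuming a filesystem that supports hard links): a hard link is a second name for the same file, whereas a copy is a separate file with its own storage.  Neither case involves an alternates entry.

```python
import os
import shutil
import tempfile


def link_vs_copy():
    """Return (link shares inode with source, copy shares inode with source)."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "object")
        with open(src, "w") as f:
            f.write("payload")

        linked = os.path.join(root, "linked")
        copied = os.path.join(root, "copied")
        os.link(src, linked)          # hard link: another name for the same inode
        shutil.copyfile(src, copied)  # real copy: a new inode with duplicate data

        src_inode = os.stat(src).st_ino
        return (
            os.stat(linked).st_ino == src_inode,
            os.stat(copied).st_ino == src_inode,
        )
```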

Thanks,
 - Andrew

Thread overview: 10+ messages
2014-03-12  3:37 Borrowing objects from nearby repositories Andrew Keller
2014-03-23 18:04 ` Phil Hord
2014-03-24 21:21 ` Ævar Arnfjörð Bjarmason
2014-03-25 13:13   ` Andrew Keller [this message]
2014-03-25 17:02   ` Junio C Hamano
2014-03-25 22:17     ` Junio C Hamano
2014-03-26 13:36       ` Andrew Keller
2014-03-26 17:29         ` Junio C Hamano
2014-03-28 14:52           ` Andrew Keller
2014-03-28 17:02             ` Junio C Hamano
