From: Linus Torvalds <torvalds@osdl.org>
To: Ryan Anderson <ryan@michonline.com>
Cc: Petr Baudis <pasky@ucw.cz>, Russell King <rmk@arm.linux.org.uk>,
	git@vger.kernel.org
Subject: Re: More problems...
Date: Fri, 29 Apr 2005 13:21:21 -0700 (PDT)
Message-ID: <Pine.LNX.4.58.0504291311320.18901@ppc970.osdl.org>
In-Reply-To: <20050429195055.GE1233@mythryan2.michonline.com>



On Fri, 29 Apr 2005, Ryan Anderson wrote:
> 
> Why not just use "rsync" for both remote and local synchronization, and
> provide a "relink" command to scan two .git/objects/ repositories and
> hardlink matching files together?

Absolutely. I use the same "git-pull-script" between two local directories 
on disk. The only issue there is that you have to give the ".git" 
directory, ie you should do

	git-pull-script ~/by/other/repository/.git

instead of pointing to the other repo's root.

Of course, I don't bother with the linking. But that's the trivial part.

> With the SHA1 hash, you can even have a --unsafe option that just
> compares the hash names and does a link based purely off of that and the
> stat(2) results of both files.  (I'd expect that a ... safer variant
> would extract both files and compare them, but the --unsafe should be
> sufficient, in practice, I would think.)

I don't think there is any point to unsafe. The assumption is that if you 
do things this way, the "unlinked" files will be the uncommon case, so 
what you do is

 - remember the list of files you copied when you did the pull (you had to 
   have this list at some point anyway). Sort by name.
 - create a list of names for each of the two repositories, sorted by name
 - do the union of those three lists (cheap, thanks to the sorting)
 - stat each name to see if it's already linked (which it will be, most of 
   the time), continue to the next one..
 - if they aren't linked, just do a "cmp" on them, and warn if they aren't 
   the same, continue to the next one.
 - else link them.
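
As a rough illustration, the stat / cmp / link steps above could be sketched
in Python as follows. This is only a sketch under stated assumptions: the
function names (same_inode, relink), the temporary-file suffix, and the idea
of passing the name list in as relative paths are all hypothetical and not
part of any existing git tool. It assumes both object directories live on
the same filesystem, since hardlinks cannot cross filesystems.

```python
import filecmp
import os


def same_inode(a, b):
    """Return True if paths a and b are already hardlinks to one file."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


def relink(master_objects, clone_objects, names):
    """Hardlink identical files; return (linked, mismatched) name lists.

    names holds paths relative to both object directories (hypothetical
    input: the union of the sorted name lists described above).
    """
    linked, mismatched = [], []
    for name in sorted(set(names)):
        src = os.path.join(master_objects, name)
        dst = os.path.join(clone_objects, name)
        if not (os.path.isfile(src) and os.path.isfile(dst)):
            continue  # present on one side only: nothing to link
        if same_inode(src, dst):
            continue  # already linked, which is the common case
        if not filecmp.cmp(src, dst, shallow=False):
            mismatched.append(name)  # same name, different bytes: warn only
            continue
        tmp = dst + ".relink-tmp"  # hypothetical temporary name
        os.link(src, tmp)     # create the new link next to the old file
        os.replace(tmp, dst)  # then swap it into place atomically
        linked.append(name)
    return linked, mismatched
```

The byte-for-byte compare only runs for files that are not already linked,
so on a repeat run almost all of the work is the stat calls.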

And if you want to, you can skip the first stage, and just relink two
trees without looking at a list of "known new" files - it's going to be
expensive to link two big repositories the _first_ time, but hey even the
"expensive" part is likely to be pretty cheap in the end. If it takes an
hour or two to relink some years of history, big deal. Do it overnight,
you only need it once.
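
For the variant without a "known new" list, the candidate names have to come
from walking both trees. A minimal Python sketch of that scan, assuming
hypothetical helper names (object_names, relink_candidates) and that the two
arguments are the .git/objects directories of the two repositories:

```python
import os


def object_names(objects_dir):
    """Return the set of file paths under objects_dir, relative to it."""
    names = set()
    for dirpath, _dirnames, filenames in os.walk(objects_dir):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            names.add(os.path.relpath(full, objects_dir))
    return names


def relink_candidates(objects_a, objects_b):
    """Sorted names present in both trees that do not yet share an inode."""
    common = sorted(object_names(objects_a) & object_names(objects_b))
    candidates = []
    for name in common:
        sa = os.stat(os.path.join(objects_a, name))
        sb = os.stat(os.path.join(objects_b, name))
        if (sa.st_dev, sa.st_ino) != (sb.st_dev, sb.st_ino):
            candidates.append(name)  # still two separate copies on disk
    return candidates
```

The returned names are only candidates: each pair still needs the compare
step before it is safe to link.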

		Linus
