git@vger.kernel.org mailing list mirror (one of many)
* how to make "full" copy of a repo
@ 2015-03-28  2:56 Christoph Anton Mitterer
  2015-03-28 14:31 ` Kevin D
  2015-03-28 18:52 ` Torsten Bögershausen
  0 siblings, 2 replies; 9+ messages in thread
From: Christoph Anton Mitterer @ 2015-03-28  2:56 UTC (permalink / raw)
  To: git

Hey.

I was looking for an ideally simple way to make a "full" copy of a git
repo. Many howtos on this are floating around the web, often with lots
of voodoo.


First, it shouldn't be just a clone, in other words:
- I want to have all refs (local/remote branches/tags) and of course all
objects from the source repo copied as is.
So its local branches should become my local branches, not remote
branches, and so on.
Basically I want to be able to delete the source afterwards (and all
backups ;) ) without losing anything.

- It shouldn't set the source repo as origin or its branches as remote
tracking branches; as said, it should be identical to the source repo,
just "freshly copied" via the "Git aware transport mechanisms".

- Whether GC or repacking happens, I don't care, as long as nothing that
is still reachable in the source repo gets lost (either during the copy
or once I run a GC in the copied repo).

- Whether anything that other tools have added to .git (e.g. git-svn
stuff) gets lost, I don't care.

- It should work for both bare and non-bare repos, but it's okay if
it doesn't copy anything that is not committed or stashed.



I'd have said that either:
$ git clone --mirror URL-to-source-repo copy
for the direction from "outside" the source to a copy,
or alternatively:
$ cd source-repo
$ git push --mirror URL-to-copy
for the direction from "within" the source to a copy, with copy being an
empty bare or non-bare repo,
would do the job.

But:

a) the git-clone(1) description of --mirror:
   >and sets up a refspec configuration such that all these refs are
   >overwritten by a git remote update in the target repository.
   kinda confuses me, since I wanted to be independent of the source
   repo and this seems to set up a remote to it?

b) do I need --all --tags for the push as well?

c) When following
   https://help.github.com/articles/duplicating-a-repository/
   it doesn't seem as if --mirror is what I want because they seem to
   advertise it rather as having the copy tracking the source repo.
   Of course I read about just using git-clone --bare, but that seems to
   not copy everything that --mirror does (remote-tracking branches,
   notes).

   So I'm a bit confused...


1) Is it working like I assumed above?
2) Does that also copy things like git-config, hooks, etc.?
3) Does it copy the configured remotes from the source?
4) What else is not copied by that? I'd assume anything that is not
   tracked by git and the stash of the source?



Thanks a lot,
Chris.

* Re: how to make "full" copy of a repo
  2015-03-28  2:56 how to make "full" copy of a repo Christoph Anton Mitterer
@ 2015-03-28 14:31 ` Kevin D
  2015-03-29  2:21   ` Christoph Anton Mitterer
  2015-03-30 15:22   ` Duy Nguyen
  2015-03-28 18:52 ` Torsten Bögershausen
  1 sibling, 2 replies; 9+ messages in thread
From: Kevin D @ 2015-03-28 14:31 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: git

Git clone is never going to get you a copy where nothing is lost.

What you are losing on clone is:

* config settings (this includes the configured remotes)
* hooks
* reflog (history of refs, though, by default disabled for bare
  repositories)
* Stashes, because the reference to them is stored in the reflog
* unreferenced objects (though you said those are not a requirement, it
  is still something that is lost)
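
The first two items of the list above can be checked with a throwaway
repository. This is only a minimal sketch: the paths, the config key
demo.setting and the hook body are made up for illustration.

```shell
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/src"
git -C "$tmp/src" commit -q --allow-empty -m "initial"

# Repository-local state that lives outside objects and refs.
git -C "$tmp/src" config demo.setting kept-locally   # hypothetical key
mkdir -p "$tmp/src/.git/hooks"
printf '#!/bin/sh\nexit 0\n' > "$tmp/src/.git/hooks/pre-commit"
chmod +x "$tmp/src/.git/hooks/pre-commit"

git clone -q --mirror "$tmp/src" "$tmp/copy.git"

# Look for the config entry and the hook in the clone.
copied_setting=$(git -C "$tmp/copy.git" config --local --get demo.setting || true)
if [ -e "$tmp/copy.git/hooks/pre-commit" ]; then
    hook_copied=yes
else
    hook_copied=no
fi
echo "setting in copy: '$copied_setting', hook copied: $hook_copied"
```

The same check can be repeated with a plain git clone instead of
--mirror; config and hooks are not part of what is transferred either way.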

git clone --mirror is used for repositories that regularly get updates
from the repositories they were cloned from. Though this is not what you
want, it's not difficult to reset the refspecs to the default refspecs.
Because it fetches all refs, it's not necessary to add --all --tags
(because tags are also refs).

git clone --mirror is the closest you are going to get by only using
git.
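
To see that a mirror clone really carries over every ref without
--all or --tags, one can compare the ref listings of both sides. A small
sketch, assuming a scratch repository with one extra branch, a tag and
a note (all names here are illustrative):

```shell
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/src"
git -C "$tmp/src" commit -q --allow-empty -m "initial"
git -C "$tmp/src" branch topic                    # a second local branch
git -C "$tmp/src" tag v1                          # a tag
git -C "$tmp/src" notes add -m "a note" HEAD      # creates refs/notes/commits

git clone -q --mirror "$tmp/src" "$tmp/copy.git"

# for-each-ref prints object id, type and ref name for everything under refs/.
src_refs=$(git -C "$tmp/src" for-each-ref)
copy_refs=$(git -C "$tmp/copy.git" for-each-ref)

if [ "$src_refs" = "$copy_refs" ]; then
    echo "all refs match"
else
    echo "ref listings differ"
fi
```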

I guess you are aware of this, but if you want to retain more
information, you have to rely on other means, like scp, to get the other
things.

So to summarize, git clone is only used for cloning history, which means
objects and refs, the rest is not part of cloning. To get more, you have
to go outside git.

Hope this helps to clear up some confusion.

Kevin

* Re: how to make "full" copy of a repo
  2015-03-28  2:56 how to make "full" copy of a repo Christoph Anton Mitterer
  2015-03-28 14:31 ` Kevin D
@ 2015-03-28 18:52 ` Torsten Bögershausen
  2015-03-28 20:33   ` Ævar Arnfjörð Bjarmason
  2015-03-29  2:22   ` Christoph Anton Mitterer
  1 sibling, 2 replies; 9+ messages in thread
From: Torsten Bögershausen @ 2015-03-28 18:52 UTC (permalink / raw)
  To: Christoph Anton Mitterer, git

On 2015-03-28 03.56, Christoph Anton Mitterer wrote:
> c) When following
>    https://help.github.com/articles/duplicating-a-repository/
>    it doesn't seem as if --mirror is what I want because they seem to
>    advertise it rather as having the copy tracking the source repo.
>    Of course I read about just using git-clone --bare, but that seems to
>    not copy everything that --mirror does (remote-tracking branches,
>    notes).
> 
>    So I'm a bit confused...
These instructions involve three repos:
the source "old", the destination "new", and a temporary one.
As you only push to "new", "new" should have no information about
"old" or "temp".
> 
> 
> 1) Is it working like I assumed above?
> 2) Does that also copy things like git-config, hooks, etc.?
> 3) Does it copy the configured remotes from the source?
> 4) What else is not copied by that? I'd assume anything that is not
>    tracked by git and the stash of the source?

You didn't say whether this is a bare repository,
whether it is on a local disk, or whether it is reachable by rsync.
Linux or Windows?

For a "full clone" (in the sense of having everything, bit for bit)
I would probably use rsync. (After stopping all activities on the repo)

But I don't know where your repos live, or whether they are bare or not,
so there may be other ways to go.


* Re: how to make "full" copy of a repo
  2015-03-28 18:52 ` Torsten Bögershausen
@ 2015-03-28 20:33   ` Ævar Arnfjörð Bjarmason
  2015-03-29  2:22   ` Christoph Anton Mitterer
  1 sibling, 0 replies; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2015-03-28 20:33 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Christoph Anton Mitterer, Git Mailing List

On Sat, Mar 28, 2015 at 7:52 PM, Torsten Bögershausen <tboegi@web.de> wrote:
> You didn't write if this is a bare repository,
> if it is on a local disc, if it is reachable by rsync ?
> Linux or Windows ?
>
> For a "full clone" (in the sense of having everything, bit for bit)
> I would probably use rsync. (After stopping all activities on the repo)

This warrants more emphasis. If you rsync a repository that's
"active", i.e. getting pushes you *will* get corrupt copies. E.g. you
can easily copy something out of the objects directory that's in the
middle of being written, or copy the "refs" namespace after you copy
"objects" and end up with an unreachable object.

There's unfortunately no good solution to this other than doing both
git clone --mirror backups and rsync backups (for hooks etc.) and
combining the two, or pushing a hook for the duration that bans all updates.
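
A hook that bans updates for the duration of a copy could look roughly
like the following. This is a sketch only: it uses a pre-receive hook in
a bare repository, the repository names and the message are invented,
and it only rejects incoming pushes (it does nothing about local
commits or maintenance running in the repository itself).

```shell
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/work"
git -C "$tmp/work" commit -q --allow-empty -m "initial"
git init -q --bare "$tmp/central.git"

# Temporary pre-receive hook: drain stdin, then reject every push.
mkdir -p "$tmp/central.git/hooks"
cat > "$tmp/central.git/hooks/pre-receive" <<'EOF'
#!/bin/sh
cat >/dev/null
echo "repository is frozen for backup" >&2
exit 1
EOF
chmod +x "$tmp/central.git/hooks/pre-receive"

# Any push arriving while the hook is in place is refused.
if git -C "$tmp/work" push -q "$tmp/central.git" HEAD:refs/heads/incoming 2>/dev/null; then
    push_result=accepted
else
    push_result=rejected
fi
echo "push was $push_result"
refs_after_push=$(git -C "$tmp/central.git" for-each-ref)

# ... copy central.git here, then lift the freeze again:
rm "$tmp/central.git/hooks/pre-receive"
```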

* Re: how to make "full" copy of a repo
  2015-03-28 14:31 ` Kevin D
@ 2015-03-29  2:21   ` Christoph Anton Mitterer
  2015-03-29 11:05     ` Kevin D
  2015-03-30 15:22   ` Duy Nguyen
  1 sibling, 1 reply; 9+ messages in thread
From: Christoph Anton Mitterer @ 2015-03-29  2:21 UTC (permalink / raw)
  To: Kevin D; +Cc: git

On Sat, 2015-03-28 at 15:31 +0100, Kevin D wrote: 
> What you are losing on clone is:
> * config settings (this includes the configured remotes)
> * hooks
that would be okay...


> * reflog (history of refs, though, by default disabled for bare
>   repositories)
is there a way to get this copied?


> * Stashes, because the reference to them is stored in the reflog
> * unreferenced objects (though you said those are not a requirement, it
>   is still something that is lost)
that would be okay for me too.


> git clone --mirror is used for repositories that regularly get updates
> from the repositories they were cloned from. Though this is not what you
> want, it's not difficult to reset the refspecs to the default refspecs.
What do you mean here? What would I need to reset exactly?


> git clone --mirror is the closest you are going to get by only using
> git.
I see, thanks :)

> So to summarize, git clone is only used for cloning history, which means
> objects and refs, the rest is not part of cloning. To get more, you have
> to go outside git.

Thanks :)
Chris.

* Re: how to make "full" copy of a repo
  2015-03-28 18:52 ` Torsten Bögershausen
  2015-03-28 20:33   ` Ævar Arnfjörð Bjarmason
@ 2015-03-29  2:22   ` Christoph Anton Mitterer
  1 sibling, 0 replies; 9+ messages in thread
From: Christoph Anton Mitterer @ 2015-03-29  2:22 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

On Sat, 2015-03-28 at 19:52 +0100, Torsten Bögershausen wrote: 
> As you only push to "new", "new" should have no information about
> "old" or "temp".
Exactly, that would be the goal.
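
For the push direction, a sketch along these lines shows that the
destination ends up with the same refs and no remote configured. The
repository names are placeholders and the destination is an empty bare
repository created just for the test.

```shell
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/old"
git -C "$tmp/old" commit -q --allow-empty -m "initial"
git -C "$tmp/old" branch topic
git -C "$tmp/old" tag v1

# The destination starts out empty and knows nothing about "old".
git init -q --bare "$tmp/new.git"
git -C "$tmp/old" push -q --mirror "$tmp/new.git"

old_refs=$(git -C "$tmp/old" for-each-ref)
new_refs=$(git -C "$tmp/new.git" for-each-ref)
remotes_in_new=$(git -C "$tmp/new.git" remote)

echo "remotes configured in new: '$remotes_in_new'"
```

Pushing only touches refs and objects on the receiving side, so nothing
is written to the destination's config.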

 
> > 1) Is it working like I assumed above?
> > 2) Does that also copy things like git-config, hooks, etc.?
> > 3) Does it copy the configured remotes from the source?
> > 4) What else is not copied by that? I'd assume anything that is not
> >    tracked by git and the stash of the source?
> You didn't write if this is a bare repository,
> if it is on a local disc, if it is reachable by rsync ?
> Linux or Windows ?
Linux.
And in principle I have both cases, but mostly non-bare repos.


Cheers,
Chris.

* Re: how to make "full" copy of a repo
  2015-03-29  2:21   ` Christoph Anton Mitterer
@ 2015-03-29 11:05     ` Kevin D
  0 siblings, 0 replies; 9+ messages in thread
From: Kevin D @ 2015-03-29 11:05 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: git

On Sun, Mar 29, 2015 at 04:21:26AM +0200, Christoph Anton Mitterer wrote:
> On Sat, 2015-03-28 at 15:31 +0100, Kevin D wrote: 
> [..]
> 
> > * reflog (history of refs, though, by default disabled for bare
> >   repositories)
> is there a way to get this copied?
> 
> 

No, the reflog is considered something private to the repository, so
there is no way to get it through git clone.

> [..] 
> 
> > git clone --mirror is used for repositories that regularly get updates
> > from the repositories they were cloned from. Though this is not what you
> > want, it's not difficult to reset the refspecs to the default refspecs.
> What do you mean here? What would I need to reset exactly?

git clone --mirror sets up the fetch refspec in such a way that local
refs would get reset to whatever upstream has:

+refs/*:refs/*

So every time you would fetch / pull, all your branches would reflect
the way they are on the mirrored repo (which is why it's called mirror).

The default refspec is:

+refs/heads/*:refs/remotes/origin/*

Which would only fetch heads (branches), and maps them as remote
tracking branches, so that your local branches are left alone.
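
Resetting a mirror clone to the default refspec could then be done
roughly as below. A sketch under the assumption that the remote is
still called origin; the scratch paths are placeholders.

```shell
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/src"
git -C "$tmp/src" commit -q --allow-empty -m "initial"
git clone -q --mirror "$tmp/src" "$tmp/copy.git"

# What clone --mirror configured.
mirror_refspec=$(git -C "$tmp/copy.git" config --get remote.origin.fetch)
echo "after clone --mirror: $mirror_refspec"

# Switch to the default refspec and drop the mirror flag, so that a later
# fetch only updates refs/remotes/origin/* and leaves local branches alone.
git -C "$tmp/copy.git" config remote.origin.fetch '+refs/heads/*:refs/remotes/origin/*'
git -C "$tmp/copy.git" config --unset remote.origin.mirror

default_refspec=$(git -C "$tmp/copy.git" config --get remote.origin.fetch)
echo "after reset:          $default_refspec"
```

If the copy should not know about the source at all, removing the whole
section with git config --remove-section remote.origin is the other
option; that only edits the config file.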

> > git clone --mirror is the closest you are going to get by only using
> > git.
> I see, thanks :)
> 
> > So to summarize, git clone is only used for cloning history, which means
> > objects and refs, the rest is not part of cloning. To get more, you have
> > to go outside git.
> 
> Thanks :)
> Chris.

* Re: how to make "full" copy of a repo
  2015-03-28 14:31 ` Kevin D
  2015-03-29  2:21   ` Christoph Anton Mitterer
@ 2015-03-30 15:22   ` Duy Nguyen
  2015-03-30 17:37     ` Junio C Hamano
  1 sibling, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2015-03-30 15:22 UTC (permalink / raw)
  To: Kevin D; +Cc: Christoph Anton Mitterer, Git Mailing List

On Sat, Mar 28, 2015 at 9:31 PM, Kevin D <me@ikke.info> wrote:
> Git clone is never going to get you a copy where nothing is lost.
>
> What you are losing on clone is:
>
> * config settings (this includes the configured remotes)
> * hooks
> * reflog (history of refs, though, by default disabled for bare
>   repositories)
> * Stashes, because the reference to them is stored in the reflog
> * unreferenced objects (though you said those are not a requirement, it
>   is still something that is lost)

This is true. But I wonder if we should (and can) support a
--super-mirror option (disabled by default), where reflog and stashes
are kept, for backup purposes. We might keep unreferenced objects as
well if it's not hard to do.
-- 
Duy

* Re: how to make "full" copy of a repo
  2015-03-30 15:22   ` Duy Nguyen
@ 2015-03-30 17:37     ` Junio C Hamano
  0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2015-03-30 17:37 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Kevin D, Christoph Anton Mitterer, Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

> This is true. But I wonder if we should (and can) support
> --super-mirror option (disabled by default), where reflog and stashes
> are kept, for backup purposes. We might keep unreferenced objects as
> well if it's not hard to do.

I doubt that we want to be in the business of filesystem backup.  Is
there a practical use case that is *not* "I am relocating out of
this directory on this machine and will be using the other one I am
making by copying"?

For the "I am relocating" scenario, rsync is the most suitable
option.  The caveat "activity at the original will leave the copied
result incomplete" will apply to whatever transport method you use,
but in the "I am relocating" scenario, you will have a period in which
the original is quiet (i.e. you stop using the original at some
point before you start the copied one, and do not expect the
sequence to take zero downtime).
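
A filesystem-level copy of a quiet repository can be sketched as below.
Note the swap: cp -Rp stands in for rsync here so that the example needs
no extra tools; the file name, the config key demo.setting and the paths
are made up for illustration.

```shell
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/src"
echo one > "$tmp/src/file.txt"
git -C "$tmp/src" add file.txt
git -C "$tmp/src" commit -q -m "initial"

# State that a clone would not carry over: a stash and a local config entry.
echo two > "$tmp/src/file.txt"
git -C "$tmp/src" stash >/dev/null
git -C "$tmp/src" config demo.setting kept    # hypothetical key

# Nothing is touching the source at this point; copy the directory wholesale.
cp -Rp "$tmp/src" "$tmp/moved"

moved_stashes=$(git -C "$tmp/moved" stash list)
moved_setting=$(git -C "$tmp/moved" config --local --get demo.setting)
echo "stashes in copy: $moved_stashes"
echo "setting in copy: $moved_setting"
```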

In a sense, "super-mirror" is even worse, if it is doing some "Git
activity" on the source which we may want to log, which means the
original will never be quiet during the copying.  Sure, send-pack
may currently not do any logging in the original repository, but
depending on the reason why such a copy is being made, the original
may even have custom hook-based logging data left somewhere in the
repository, and when copying such a repository the owner
would want to keep the logged data.

And if what super-mirror does is not considered a "Git activity" and
somehow bypasses all the Git rules in the original repository, then
what is the advantage of having it in Git in the first place, over
using something like rsync?
