git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Paul Smith <paul@mad-scientist.net>
Cc: git@vger.kernel.org
Subject: Re: Fetching everything in another bare repo
Date: Thu, 9 Mar 2023 10:35:46 -0500
Message-ID: <ZAn80gnIFLOF4Gco@coredump.intra.peff.net>
In-Reply-To: <64282d0f99df59085a18585846d2086a652677e2.camel@mad-scientist.net>

On Thu, Mar 09, 2023 at 08:55:27AM -0500, Paul Smith wrote:

> > OK. It's not clear to me if this archive repo retains the old
> > references, or if it simply has a bunch of unreachable objects.
> > That distinction will matter below.
> 
> Sorry; I've been using Git for a long time but am still not totally
> immersed in the terminology :).
> 
> Basically, these bare clones have "gc.pruneExpire=never" set, and have
> never had any GC operations run so all commits are still present (when
> you say "unreachable" I assume you mean, not reachable through any
> reference).

Right, that's what I mean by unreachable. And no, you didn't use any
terminology wrong. I was just not sure if you realized that running
"fetch" would not get the unreachable objects. :)

> There is a separate database of information containing SHAs for these
> commits, that is used to find them, but there is nothing in Git itself
> that references them so they are indeed unreachable as far as Git is
> concerned.

OK, that makes sense (and I've done something like that before, as
well).

> Oh interesting.  I did a quick verification and all of the objects /
> packfiles in the old clone either don't exist in the new one, or are
> identical.  I'm sure you expected that but I needed to reassure myself
> I wouldn't be overwriting anything :).

The files are named after the sha1 of their contents (and that goes for
both loose objects and packfiles). But certainly it's a good idea to
double check that nothing funny is going on.
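
For anyone wanting to repeat that double check, here is a minimal sketch
in plain POSIX shell. The function name and the two directory arguments
are hypothetical, not anything Git ships. It only compares files that
exist under both object directories, byte for byte. Two caveats: files
under info/ (such as info/packs) may legitimately differ, and two loose
object files with the same name could in principle differ in their
compressed bytes while still holding the same object, so a reported
difference is a prompt to look closer (e.g. with "git cat-file"), not
proof of corruption.

```shell
#!/bin/sh
# Hypothetical helper: list files present under both object directories
# whose bytes differ. Prints nothing if every shared file is identical.
# Usage: check_object_collisions OLD/objects NEW/objects
check_object_collisions() {
    old=$1
    new=$2
    # Enumerate files relative to the old directory (paths start with "./").
    (cd "$old" && find . -type f) |
    while IFS= read -r f; do
        # Only files present on both sides can collide.
        if [ -f "$new/$f" ] && ! cmp -s "$old/$f" "$new/$f"; then
            printf 'DIFFERS: %s\n' "$f"
        fi
    done
}
```

Running it as check_object_collisions old.git/objects new.git/objects
before copying anything over should print nothing in the expected case.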

> One question: is the objects/info/packs file anything to be concerned
> about or will git repack (or something) take care of handling it?

You can ignore it.  It will be regenerated by git-repack. But also, it's
pretty useless these days. It's only used for "dumb" fetches (e.g., when
you export a repo via static http, but without using the git-aware CGI).

> > And then you can do any ref updates in the new repository (since it
> > now has all objects from both).
> 
> It's actually possible that I don't care about refs at all.  I might
> only care about objects.  I'm not sure, I can check what exists in the
> old clone.

Yeah, if you have a separate database of branch tips, etc, then the refs
aren't necessary. As long as you are careful not to run "gc" or repack
without "-k".

You may want to try the "preciousObjects" repository extension, which
was designed to prevent accidents for a case like this. Something like:

  [this will cause old versions of Git that don't understand
   extensions.* to bail on all commands for safety]
  $ git config core.repositoryformatversion 1

  [this will tell old versions of Git that don't understand this
   particular extension to bail on all commands for safety. But more
   importantly, it will tell recent versions (> 2.6.3) to allow most
   commands, but not ones that would delete unreachable objects]
  $ git config extensions.preciousObjects true

  [this is it in action]
  $ git repack -ad
  fatal: cannot delete packs in a precious-objects repo
  $ git prune
  fatal: cannot prune in a precious-objects repo

Sadly it's not quite smart enough to realize that "git repack -adk" is
safe. If you want to occasionally repack with that, you'd have to
manually disable the flag for a moment.
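
As a rough sketch of what "disable the flag for a moment" could look
like, assuming a Git new enough to have "git -C" and "repack -k": the
wrapper name below is hypothetical, and nothing here protects against
another process running gc or prune in the window where the flag is
off, so it is only reasonable on a repository you control.

```shell
#!/bin/sh
# Hypothetical wrapper: repack a precious-objects repo while keeping
# unreachable objects. Usage: repack_keep_unreachable /path/to/repo.git
repack_keep_unreachable() {
    repo=$1
    # Lift the guard only for the duration of the repack.
    git -C "$repo" config extensions.preciousObjects false || return 1
    # -k keeps unreachable objects instead of dropping them.
    git -C "$repo" repack -a -d -k
    status=$?
    # Always restore the guard, even if the repack failed.
    git -C "$repo" config extensions.preciousObjects true || return 1
    return "$status"
}
```

Restoring the flag regardless of the repack's exit status is the main
point of wrapping this in a function rather than typing the three
commands by hand.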

I will also say that while I implemented this extension a while back, it
never actually saw production use for my intended case. So I think it's
pretty good (and certainly safer than nothing), but it's not thoroughly
tested in the wild.

-Peff
