git@vger.kernel.org mailing list mirror (one of many)
From: Klas Lindberg <klas.lindberg@gmail.com>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Git Users List <git@vger.kernel.org>
Subject: Re: Fetching SHA id's instead of named references?
Date: Mon, 6 Apr 2009 18:22:15 +0200
Message-ID: <33f4f4d70904060922t5c868ec0x89ed5891cf4b19c2@mail.gmail.com>
In-Reply-To: <20090406144047.GE23604@spearce.org>

On Mon, Apr 6, 2009 at 4:40 PM, Shawn O. Pearce <spearce@spearce.org> wrote:

> The problem is, upload-pack won't perform a reachability analysis
> to determine if a wanted SHA1 is reachable from a current ref.
> Instead it requires that the wanted SHA1 is *exactly* referenced
> by at least one ref.

I probably just don't understand this properly, so please correct me
as needed. My understanding is that:

 * git-fetch-pack looks at the local named reference to figure out the
SHA id "X" for the last locally available commit.
 * git-upload-pack is given "X" as a delimiter for what to include in
the pack to send back to git-fetch-pack.

So if I have "X" and I know which remote "Y" I want (because someone
told me, or it's in a manifest), why shouldn't I be able to let
git-upload-pack search for "X" from "Y" if that is exactly what it
does anyway for named references? I accept that it may fail because
"X" is not reachable from "Y" (just give me a sensible error message).
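To make the question concrete, here is a minimal sketch of the walk described above, in Python rather than anything resembling the real git-upload-pack code. The `parents` dictionary is a toy stand-in for an object store, and the names `commits_to_send` and `UnknownObject` are made up for illustration.

```python
from collections import deque


class UnknownObject(Exception):
    """Raised when the requested commit is not in the (toy) object store."""


def commits_to_send(parents, want, have):
    """Return the commits reachable from `want`, stopping at what `have` covers.

    `parents` maps a commit id to the list of its parent ids.
    """
    if want not in parents:
        raise UnknownObject(f"{want} not found; it may not exist at all")

    # Everything the client already has: `have` and all of its ancestors.
    known = set()
    queue = deque([have] if have in parents else [])
    while queue:
        commit = queue.popleft()
        if commit not in known:
            known.add(commit)
            queue.extend(parents.get(commit, []))

    # Walk back from `want`, stopping at anything the client already has.
    to_send, seen = [], set()
    queue = deque([want])
    while queue:
        commit = queue.popleft()
        if commit in seen or commit in known:
            continue
        seen.add(commit)
        to_send.append(commit)
        queue.extend(parents.get(commit, []))
    return to_send


# Toy history: A <- B <- X <- C <- Y
history = {"A": [], "B": ["A"], "X": ["B"], "C": ["X"], "Y": ["C"]}
result = commits_to_send(history, want="Y", have="X")
print(result)  # ['Y', 'C']
```

The walk is the same whether "Y" was found through a named reference or given directly as a SHA id; the only extra case is the error when "Y" is not in the store.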

> There's no reason to perform the reachability test on the server
> when you can move it onto the client, and that's exactly what
> git-submodule is doing.  It fetches everything, and then assumes
> it's reachable post fetch.  Since the client has fetched everything,
> the client has the object if it's reachable by the server.

Except it will not always be available even when it was reachable at
the source. Here's the real world example that forced me to reject the
use of the submodule command for distributed setups:

 * Bob is located at site S where he sets up tree A with a submodule
B. He uses "submodule init" to initialize B, which will cause it to be
listed relative to S in A.
 * Lisa, at site T, clones A and updates the submodule B. No problem
so far. Her list of submodules is inherited from S and works for
updating B.
 * Lisa commits a new version of B and then a new version of A. Then
she asks Kent to merge her changes.
 * Kent's clone will also have a submodules list that refers to site S
(and not T). Running "submodule update" after fetching from T fails
even though all the material is available at T, because Git is then
trying to fetch the new revision of B from S.
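The failure can be modelled in a few lines of Python. This is only a toy illustration of the scenario above, not how git-submodule is implemented: each site is reduced to the set of revisions of B it can serve, and the ids `b1` and `b2` and the function `update_submodule` are invented for the example.

```python
# Each site is modelled as the set of revisions of B that it can serve.
sites = {
    "S": {"b1"},        # Bob's site: only the original revision of B
    "T": {"b1", "b2"},  # Lisa's site: also has her new revision of B
}

# Location recorded for submodule B; every clone of A inherits it.
submodule_url = "S"


def update_submodule(pinned_commit, url, sites):
    """Return True if the pinned revision of B can be fetched from `url`."""
    return pinned_commit in sites[url]


# Lisa's new revision of A pins B at b2. Kent fetched A from T,
# but the submodule update still goes to the recorded site S.
kent_via_recorded_url = update_submodule("b2", submodule_url, sites)
kent_via_lisas_site = update_submodule("b2", "T", sites)
print(kent_via_recorded_url, kent_via_lisas_site)  # False True
```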

If you try to work around this by not using "submodule init", then you
get a saner tree that can be worked on in a truly serverless fashion,
like with plain git trees, but you have to implement a CM tool on top.

> If the object is no longer reachable by the server's refs (think
> branch rebased) then the object is actually in danger of being GC'd
> off of the server's object store.

That's fine. I would make sure that every commit I want to keep is
reachable from a named reference, so that git-gc doesn't prune it from
my local tree.

In the remote tree, the unnamed commit is either available or it
isn't. If someone made it unreachable and then garbage-collected it,
well, so be it. Just tell the user that the commit can't be found and
may in fact not exist at all, and you're done. No exhaustive search
necessary.
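The three possible server policies under discussion can be compared side by side. The sketch below is a toy model in Python, not upload-pack code; the function names, the `parents` store and the single `refs/heads/master` ref are assumptions made for the example.

```python
def reachable_from(parents, start):
    """Return `start` and all of its ancestors."""
    seen, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def allowed_exact_ref(refs, parents, want):
    """Policy Shawn describes: `want` must be the tip of some ref."""
    return want in refs.values()


def allowed_reachable(refs, parents, want):
    """Reachability analysis: `want` must be reachable from some ref tip."""
    return any(want in reachable_from(parents, tip) for tip in refs.values())


def allowed_if_present(refs, parents, want):
    """Plain lookup: `want` only has to exist in the object store."""
    return want in parents


# Toy store: A <- B <- C on master, plus D (child of A) that no ref points to.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"]}
refs = {"refs/heads/master": "C"}

results = {
    want: (
        allowed_exact_ref(refs, parents, want),
        allowed_reachable(refs, parents, want),
        allowed_if_present(refs, parents, want),
    )
    for want in ["B", "C", "D", "Z"]
}
for want, verdicts in results.items():
    print(want, verdicts)
```

Only the middle policy needs a graph walk on the server. The plain lookup answers "available or not" in one step, at the cost of also serving commits such as D that no ref protects from garbage collection.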

> One way we get away with this sort of thing in repo is, we only
> put SHA1s in our manifest that are published in branches that
> won't ever rewind or delete.  Hence, it's a moot point.

What is the syntax for that?

Anyway, it's not a moot point. I may later want to use that revision of
the manifest to perform a checkout on every component listed by the
manifest. At that point I expect all the work trees to have exactly
the contents they "should" have for that old version of the manifest.
It's all about affordable reproducibility.
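As a rough sketch of what I mean by reproducibility, assume a manifest is just a mapping from component name to pinned SHA id. The component names, the shortened ids and the `verify_checkout` helper below are hypothetical; a real tool would ask each work tree for its current commit instead of reading a dictionary.

```python
def verify_checkout(manifest, head_of):
    """Return components whose checked-out commit differs from the manifest.

    `manifest` maps component name -> pinned commit id.
    `head_of` is a callable returning the commit currently checked out
    for a component (here a dictionary lookup).
    """
    mismatches = {}
    for name, pinned in manifest.items():
        actual = head_of(name)
        if actual != pinned:
            mismatches[name] = (pinned, actual)
    return mismatches


# Hypothetical old manifest revision and the current state of the work trees.
manifest = {"kernel": "1111aaa", "libfoo": "2222bbb"}
worktrees = {"kernel": "1111aaa", "libfoo": "9999fff"}

mismatches = verify_checkout(manifest, worktrees.get)
print(mismatches)  # {'libfoo': ('2222bbb', '9999fff')}
```

An empty result means every work tree has exactly the contents that revision of the manifest says it should have.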

/Klas
