git@vger.kernel.org mailing list mirror (one of many)
From: "Shawn O. Pearce" <spearce@spearce.org>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Klas Lindberg <klas.lindberg@gmail.com>,
	Git Users List <git@vger.kernel.org>
Subject: Re: Fetching SHA id's instead of named references?
Date: Mon, 6 Apr 2009 07:40:47 -0700
Message-ID: <20090406144047.GE23604@spearce.org>
In-Reply-To: <alpine.DEB.1.00.0904061431020.6619@intel-tinevez-2-302>

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> On Mon, 6 Apr 2009, Klas Lindberg wrote:
> 
> > Is there a way to fetch based on SHA id's instead of named references?
> 
> No, out of security concerns;  imagine you included some proprietary 
> source code by mistake, and undo the damage by forcing a push with a 
> branch that does not have the incriminating code.  Usually you do not 
> control the garbage-collection on the server, yet you still do not want 
> other people to fetch "by SHA-1".
> 
> BTW this is really a strong reason not to use HTTP push in such 
> environments.

Err, you mean http:// and rsync:// fetch, don't you?  Because if
you rely on being able to unpublish a ref you have to use only the
native git://, where direct access is otherwise forbidden.

Anyway.

The fetch-pack/upload-pack protocol uses SHA1s in the want commands,
so in theory at the protocol level you can say "git fetch URL SHA1"
and convey your request to the remote peer.

The problem is, upload-pack won't perform a reachability analysis
to determine if a wanted SHA1 is reachable from a current ref.
Instead it requires that the wanted SHA1 is *exactly* referenced
by at least one ref.
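
To illustrate the exact-match rule, a client can check up front
whether an object ID is the tip of some advertised ref.  This is
only a sketch: is_advertised is a made-up helper name, and it
assumes a full (unabbreviated) object ID.

```shell
# Hypothetical helper: succeeds if some ref advertised by the remote
# points exactly at the given full object ID.
is_advertised() {
    remote=$1
    sha=$2
    # git ls-remote prints "<object-id><TAB><refname>" for each ref;
    # keep the first column and look for a whole-line match.
    git ls-remote "$remote" | cut -f1 | grep -qxF "$sha"
}
```

A commit that is merely an ancestor of a branch tip is not listed,
so the helper fails for it even though the commit is reachable.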

I had previously proposed adding a merge base test if SHA1 parses
as a commit, but IIRC Junio rejected the idea, saying it was too
costly to perform on the server.

The thing is, he's right.

There's no reason to perform the reachability test on the server
when you can move it onto the client, and that's exactly what
git-submodule is doing.  It fetches everything, and then assumes
it's reachable post fetch.  Since the client has fetched everything,
the client has the object if it's reachable from the server's refs.
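
A rough sketch of that client-side approach, not the actual
git-submodule code: fetch every branch, then test whether the
wanted commit is now in the local object store.  The name
fetch_and_check and the refs/remotes/probe/ namespace are arbitrary
choices made for this example.

```shell
# Hypothetical helper; run it inside an existing repository.
fetch_and_check() {
    url=$1
    sha=$2
    # Fetch all branches into a scratch namespace (name is arbitrary).
    git fetch -q "$url" '+refs/heads/*:refs/remotes/probe/*' || return 1
    # cat-file -e exits 0 only if the object exists locally and
    # peels to a commit.
    git cat-file -e "$sha^{commit}" 2>/dev/null
}
```

Note this tests for presence, not reachability; an object that was
already in the local store before the fetch also passes.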

If the object is no longer reachable by the server's refs (think
branch rebased) then the object is actually in danger of being GC'd
off of the server's object store.  So you already are going to be
playing with fire, even if we added a server side config to permit
fetching of unreachable data.  A future "git gc" on that server
repository could suddenly wipe out that data entirely.


Klas, one suggestion might be to make a "refs/heads/world" ref which
has a threaded chain of merges of every commit you ever recorded
in the supermodule, and then you can assume post fetch that the
world is reachable.

E.g. every time you want to record a commit in the manifest file,
also shove it into the world:

  C=...commit.to.save... &&
  W=$(git rev-parse refs/heads/world) &&
  git update-ref refs/heads/world \
    $(echo Save $C, save the world | git commit-tree "$W^{tree}" -p $W -p $C) \
    $W &&
  git push URL refs/heads/world
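
On the consuming side, one way to confirm that a recorded commit
really is covered by the world ref after a fetch is an ancestry
test.  This is a sketch under assumptions: in_world is a made-up
name, and it relies on "git merge-base --is-ancestor", which only
exists in later Git releases.

```shell
# Hypothetical helper: succeeds if the commit is the tip of the
# world ref or one of its ancestors.
in_world() {
    commit=$1
    # Second argument is optional; defaults to the local world ref.
    world=${2:-refs/heads/world}
    git merge-base --is-ancestor "$commit" "$world"
}
```

For a world ref fetched into a remote-tracking namespace, pass that
ref name as the second argument instead.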

One way we get away with this sort of thing in repo is, we only
put SHA1s in our manifest that are published in branches that
won't ever rewind or delete.  Hence, it's a moot point.

-- 
Shawn.
