Fetching SHA id's instead of named references?

git@vger.kernel.org mailing list mirror (one of many)

* Fetching SHA id's instead of named references?
@ 2009-04-06 12:13 Klas Lindberg
  2009-04-06 12:33 ` Johannes Schindelin
  0 siblings, 1 reply; 15+ messages in thread
From: Klas Lindberg @ 2009-04-06 12:13 UTC (permalink / raw)
  To: Git Users List

Hello

Is there a way to fetch based on SHA id's instead of named references?

My usage scenario is this: A change management tool based on version
controlled manifest files (somewhat similar to Google's Repo) must be
able to check out exact versions of all 200 trees in the project view.
To support this, tags are used since they specify an exact revision.
But there are two problems with tagging:

 * Tags are not immutable.
 * External components that already have a tagging style get polluted
by our excessive use of tags.

I would really prefer to just list SHA keys in the manifests, but
fetch apparently doesn't support that? Could I use a combination of
lower level commands instead?

BR / Klas


* Re: Fetching SHA id's instead of named references?
  2009-04-06 12:13 Fetching SHA id's instead of named references? Klas Lindberg
@ 2009-04-06 12:33 ` Johannes Schindelin
  2009-04-06 12:41   ` Klas Lindberg
  2009-04-06 14:40   ` Shawn O. Pearce
  0 siblings, 2 replies; 15+ messages in thread
From: Johannes Schindelin @ 2009-04-06 12:33 UTC (permalink / raw)
  To: Klas Lindberg; +Cc: Git Users List

Hi,

On Mon, 6 Apr 2009, Klas Lindberg wrote:

> Is there a way to fetch based on SHA id's instead of named references?

No, out of security concerns;  imagine you included some proprietary 
source code by mistake, and undo the damage by forcing a push with a 
branch that does not have the incriminating code.  Usually you do not 
control the garbage-collection on the server, yet you still do not want 
other people to fetch "by SHA-1".

BTW this is really a strong reason not to use HTTP push in such 
environments.

Ciao,
Dscho


* Re: Fetching SHA id's instead of named references?
  2009-04-06 12:33 ` Johannes Schindelin
@ 2009-04-06 12:41   ` Klas Lindberg
  2009-04-06 12:48     ` Johannes Schindelin
  2009-04-06 12:54     ` Matthieu Moy
  2009-04-06 14:40   ` Shawn O. Pearce
  1 sibling, 2 replies; 15+ messages in thread
From: Klas Lindberg @ 2009-04-06 12:41 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Users List

Hello

Thank you, but I don't understand the answer. If I mistakenly publish
a tree that contains secrets and someone manages to fetch against it
before I correct the mistake; how does the limitation to only fetch
named references help me???

By the way: I don't use push. I'd be perfectly happy if just fetch
supported SHA key references.

BR / Klas


On Mon, Apr 6, 2009 at 2:33 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Mon, 6 Apr 2009, Klas Lindberg wrote:
>
>> Is there a way to fetch based on SHA id's instead of named references?
>
> No, out of security concerns;  imagine you included some proprietary
> source code by mistake, and undo the damage by forcing a push with a
> branch that does not have the incriminating code.  Usually you do not
> control the garbage-collection on the server, yet you still do not want
> other people to fetch "by SHA-1".
>
> BTW this is really a strong reason not to use HTTP push in such
> environments.
>
> Ciao,
> Dscho
>


* Re: Fetching SHA id's instead of named references?
  2009-04-06 12:41   ` Klas Lindberg
@ 2009-04-06 12:48     ` Johannes Schindelin
  2009-04-06 21:50       ` Dmitry Potapov
  2009-04-06 12:54     ` Matthieu Moy
  1 sibling, 1 reply; 15+ messages in thread
From: Johannes Schindelin @ 2009-04-06 12:48 UTC (permalink / raw)
  To: Klas Lindberg; +Cc: Git Users List

Hi,

On Mon, 6 Apr 2009, Klas Lindberg wrote:

> Thank you, but I don't understand the answer. If I mistakenly publish a 
> tree that contains secrets and someone manages to fetch against it 
> before I correct the mistake; how does the limitation to only fetch 
> named references help me???

The issue is not if someone manages to fetch stuff before you repair it.  
The issue is that that someone should not be able to manage _after_ you 
repair it.

Oh, and please do not top-post,
Dscho


* Re: Fetching SHA id's instead of named references?
  2009-04-06 12:41   ` Klas Lindberg
  2009-04-06 12:48     ` Johannes Schindelin
@ 2009-04-06 12:54     ` Matthieu Moy
  2009-04-06 13:06       ` Klas Lindberg
  1 sibling, 1 reply; 15+ messages in thread
From: Matthieu Moy @ 2009-04-06 12:54 UTC (permalink / raw)
  To: Klas Lindberg; +Cc: Johannes Schindelin, Git Users List

Klas Lindberg <klas.lindberg@gmail.com> writes:

> Hello
>
> Thank you, but I don't understand the answer. If I mistakenly publish
> a tree that contains secrets and someone manages to fetch against it
> before I correct the mistake; how does the limitation to only fetch
> named references help me???

What Johannes pointed out is that someone could fetch from your repo
_after_ you correct the mistake (if you don't control garbage
collection).

-- 
Matthieu


* Re: Fetching SHA id's instead of named references?
  2009-04-06 12:54     ` Matthieu Moy
@ 2009-04-06 13:06       ` Klas Lindberg
  2009-04-06 13:16         ` Finn Arne Gangstad
  0 siblings, 1 reply; 15+ messages in thread
From: Klas Lindberg @ 2009-04-06 13:06 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Johannes Schindelin, Git Users List

On Mon, Apr 6, 2009 at 2:54 PM, Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> What Johannes pointed out is that someone could fetch from your repo
> _after_ you correct the mistake (if you don't control garbage
> collection).

Aha, ok. But how then does submodule update work? Git will only see
SHA keys for each submodule in the container tree commit, so how does
it perform fetching of those (unnamed) references?

BR / Klas


* Re: Fetching SHA id's instead of named references?
  2009-04-06 13:06       ` Klas Lindberg
@ 2009-04-06 13:16         ` Finn Arne Gangstad
  0 siblings, 0 replies; 15+ messages in thread
From: Finn Arne Gangstad @ 2009-04-06 13:16 UTC (permalink / raw)
  To: Klas Lindberg; +Cc: Matthieu Moy, Johannes Schindelin, Git Users List

On Mon, Apr 06, 2009 at 03:06:46PM +0200, Klas Lindberg wrote:
> On Mon, Apr 6, 2009 at 2:54 PM, Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> 
> > What Johannes pointed out is that someone could fetch from your repo
> > _after_ you correct the mistake (if you don't control garbage
> > collection).
> 
> Aha, ok. But how then does submodule update work? Git will only see
> SHA keys for each submodule in the container tree commit, so how does
> it perform fetching of those (unnamed) references?

git submodule update just does "git fetch" and hopes that the required
commit appears. In practice this means that you (may) need to invent a
tag or a branch for all the submodules, otherwise they are not
fetchable.

This bit us pretty hard when we tried to use submodules earlier, so we
gave up. Maybe some day...

- Finn Arne


* Re: Fetching SHA id's instead of named references?
  2009-04-06 12:33 ` Johannes Schindelin
  2009-04-06 12:41   ` Klas Lindberg
@ 2009-04-06 14:40   ` Shawn O. Pearce
  2009-04-06 16:22     ` Klas Lindberg
  1 sibling, 1 reply; 15+ messages in thread
From: Shawn O. Pearce @ 2009-04-06 14:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Klas Lindberg, Git Users List

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> On Mon, 6 Apr 2009, Klas Lindberg wrote:
> 
> > Is there a way to fetch based on SHA id's instead of named references?
> 
> No, out of security concerns;  imagine you included some proprietary 
> source code by mistake, and undo the damage by forcing a push with a 
> branch that does not have the incriminating code.  Usually you do not 
> control the garbage-collection on the server, yet you still do not want 
> other people to fetch "by SHA-1".
> 
> BTW this is really a strong reason not to use HTTP push in such 
> environments.

Err, you mean http:// and rsync:// fetch, don't you?  Because if
you rely on being able to unpublish a ref you have to use only the
native git://, where direct access is otherwise forbidden.

Anyway.

The fetch-pack/upload-pack protocol uses SHA1s in the want commands,
so in theory at the protocol level you can say "git fetch URL SHA1"
and convey your request to the remote peer.

The problem is, upload-pack won't perform a reachability analysis
to determine if a wanted SHA1 is reachable from a current ref.
Instead it requires that the wanted SHA1 is *exactly* referenced
by at least one ref.
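
The "exactly referenced by at least one ref" rule can be checked from the client side before attempting a fetch. The following is only a sketch under assumptions: the throwaway repository, the demo identity and the helper name is_advertised are hypothetical and not part of Git. It uses "git ls-remote" to list the ref tips a remote advertises and tests whether a given SHA1 is one of them.

```shell
# Demo identity so that commits work in a clean environment (hypothetical values)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

remote=$(mktemp -d)            # throwaway repository standing in for the server
git init -q "$remote"
git -C "$remote" commit -q --allow-empty -m "first"
git -C "$remote" commit -q --allow-empty -m "second"

tip=$(git -C "$remote" rev-parse HEAD)       # pointed at by a ref
parent=$(git -C "$remote" rev-parse HEAD~1)  # reachable, but not a ref tip

# is_advertised URL SHA1: succeeds only if some ref on URL points exactly at SHA1
is_advertised() {
  git ls-remote "$1" | cut -f1 | grep -qx "$2"
}

if is_advertised "$remote" "$tip"; then tip_status=advertised; else tip_status=hidden; fi
if is_advertised "$remote" "$parent"; then parent_status=advertised; else parent_status=hidden; fi
echo "tip: $tip_status, parent: $parent_status"
```

The parent commit is reachable from the branch but is not itself a ref tip, which is the case the paragraph above describes as not fetchable by SHA1.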

I had previously proposed adding a merge base test if SHA1 parses
as a commit, but IIRC Junio rejected the idea, saying it was too
costly to perform on the server.

The thing is, he's right.

There's no reason to perform the reachability test on the server
when you can move it onto the client, and that's exactly what
git-submodule is doing.  It fetches everything, and then assumes
it's reachable post fetch.  Since the client has fetched everything,
the client has the object if it's reachable by the server.
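
A client-side version of that test might look like the sketch below. It is an illustration only: the two temporary repositories, the demo identity and the variable names are assumptions. The client fetches every branch and then uses "git cat-file -e" to ask whether the commit recorded in the manifest has arrived.

```shell
# Demo identity so that commits work in a clean environment (hypothetical values)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

server=$(mktemp -d)
client=$(mktemp -d)

git init -q "$server"
git -C "$server" commit -q --allow-empty -m "old"
wanted=$(git -C "$server" rev-parse HEAD)   # SHA1 as it would appear in a manifest
git -C "$server" commit -q --allow-empty -m "new"

# Fetch all branches; no SHA1 is named on the wire
git init -q "$client"
git -C "$client" fetch -q "$server" '+refs/heads/*:refs/remotes/origin/*'

# After the fetch, check locally that the wanted object exists and is a commit
if git -C "$client" cat-file -e "$wanted^{commit}" 2>/dev/null; then
  result=present
else
  result=missing
fi
echo "wanted commit is $result"
```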

If the object is no longer reachable by the server's refs (think
branch rebased) then the object is actually in danger of being GC'd
off of the server's object store.  So you already are going to be
playing with fire, even if we added a server side config to permit
fetching of unreachable data.  A future "git gc" on that server
repository could suddenly wipe out that data entirely.

Klas, one suggestion might be to make a "refs/heads/world" ref which
has a threaded chain of merges of every commit you ever recorded
in the supermodule, and then you can assume post fetch that the
world is reachable.

E.g. every time you want to record a commit in the manifest file,
also shove it into the world:

  C=...commit.to.save... &&
  W=$(git rev-parse refs/heads/world) &&
  git update-ref refs/heads/world \
    $(echo Save $C, save the world | git commit-tree "$W^{tree}" -p $W -p $C) \
    $W &&
  git push URL refs/heads/world
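
To try the idea without touching a real repository, the same steps can be wrapped in a function and run against a scratch repository. This is a sketch under assumptions: the function name save_to_world, the demo identity and the handling of a not-yet-existing world ref are hypothetical additions, and the push step is left out.

```shell
# Demo identity so that commits work in a clean environment (hypothetical values)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
git init -q "$repo"

# save_to_world REPO COMMIT: make COMMIT reachable from refs/heads/world
save_to_world() {
  r=$1 c=$2
  if w=$(git -C "$r" rev-parse -q --verify refs/heads/world); then
    # Merge commit reusing the old world tree, with the old world and COMMIT as parents
    new=$(echo "Save $c" | git -C "$r" commit-tree "$w^{tree}" -p "$w" -p "$c") &&
    git -C "$r" update-ref refs/heads/world "$new" "$w"
  else
    # First use: start the world at COMMIT itself
    git -C "$r" update-ref refs/heads/world "$c"
  fi
}

git -C "$repo" commit -q --allow-empty -m "A"
a=$(git -C "$repo" rev-parse HEAD)
git -C "$repo" commit -q --allow-empty -m "B"
b=$(git -C "$repo" rev-parse HEAD)

save_to_world "$repo" "$a"
save_to_world "$repo" "$b"

# Rewind the branch, so B is no longer reachable from it
git -C "$repo" reset -q --hard "$a"

if git -C "$repo" merge-base --is-ancestor "$b" refs/heads/world; then in_world=yes; else in_world=no; fi
if git -C "$repo" merge-base --is-ancestor "$b" HEAD; then on_branch=yes; else on_branch=no; fi
echo "B reachable from world: $in_world, from the rewound branch: $on_branch"
```

The third argument to update-ref is the expected old value, so a concurrent update of the world ref makes the command fail rather than silently dropping a saved commit.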

One way we get away with this sort of thing in repo is, we only
put SHA1s in our manifest that are published in branches that
won't ever rewind or delete.  Hence, it's a moot point.

-- 
Shawn.


* Re: Fetching SHA id's instead of named references?
  2009-04-06 14:40   ` Shawn O. Pearce
@ 2009-04-06 16:22     ` Klas Lindberg
  2009-04-06 16:55       ` Nicolas Pitre
  0 siblings, 1 reply; 15+ messages in thread
From: Klas Lindberg @ 2009-04-06 16:22 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Git Users List

On Mon, Apr 6, 2009 at 4:40 PM, Shawn O. Pearce <spearce@spearce.org> wrote:

> The problem is, upload-pack won't perform a reachability analysis
> to determine if a wanted SHA1 is reachable from a current ref.
> Instead it requires that the wanted SHA1 is *exactly* referenced
> by at least one ref.

I probably just don't understand this properly, so please correct me
as needed. My understanding is that

 * git-fetch-pack looks at the local named reference to figure out the
SHA id "X" for the last locally available commit.
 * git-upload-pack is given "X" as a delimiter for what to include in
the pack to send back to git-fetch-pack.

So if I have "X" and I know which remote "Y" I want (because someone
told me, or it's in a manifest), why shouldn't I be able to let
git-upload-pack search for "X" from "Y" if that is exactly what it
does anyway for named references? I accept that it may fail because
"X" is not reachable from "Y" (just give me a sensible error message).

> There's no reason to perform the reachability test on the server
> when you can move it onto the client, and that's exactly what
> git-submodule is doing.  It fetches everything, and then assumes
> it's reachable post fetch.  Since the client has fetched everything,
> the client has the object if it's reachable by the server.

Except it will not always be available even when it was reachable at
the source. Here's the real world example that forced me to reject the
use of the submodule command for distributed setups:

 * Bob is located at site S where he sets up tree A with a submodule
B. He uses "submodule init" to initialize B, which will cause it to be
listed relative to S in A.
 * Lisa, at site T, clones A and updates the submodule B. No problem
so far. Her list of submodules is inherited from S and works for
updating B.
 * Lisa commits a new version of B and then a new version of A. Then
she asks Kent to merge her changes.
 * Kent's clone will also have a submodules list that refers to site S
(and not T). Running "submodule update" after fetching from T fails
even though all the material is available at T, because Git is then
trying to fetch the new revision of B from S.

If you try to work around this by not using "submodule init", then you
get a saner tree that can be worked on in a truly serverless fashion,
like with plain git trees, but you have to implement a CM tool on top.

> If the object is no longer reachable by the server's refs (think
> branch rebased) then the object is actually in danger of being GC'd
> off of the server's object store.

This is alright and I would make sure all the refs I want to keep are
reachable from named references to keep git-gc from chomping stuff in
my local tree.

In the remote tree, the unnamed reference is either available or it
isn't. If someone made an unnamed reference unreachable and then
garbage-collected it, well so be it. Just tell the user that the
reference can't be found and may in fact not exist at all and you're
done. No exhaustive search necessary.

> One way we get away with this sort of thing in repo is, we only
> put SHA1s in our manifest that are published in branches that
> won't ever rewind or delete.  Hence, it's a moot point.

What is the syntax for that?

Anyway it's not a moot point. I may later want to use that revision of
the manifest to perform a checkout on every component listed by the
manifest. At that point I expect all the work trees to have exactly
the contents they "should" have for that old version of the manifest.
It's all about affordable reproducibility.

/Klas


* Re: Fetching SHA id's instead of named references?
  2009-04-06 16:22     ` Klas Lindberg
@ 2009-04-06 16:55       ` Nicolas Pitre
  2009-04-06 23:40         ` Klas Lindberg
  0 siblings, 1 reply; 15+ messages in thread
From: Nicolas Pitre @ 2009-04-06 16:55 UTC (permalink / raw)
  To: Klas Lindberg; +Cc: Shawn O. Pearce, Git Users List

On Mon, 6 Apr 2009, Klas Lindberg wrote:

> In the remote tree, the unnamed reference is either available or it
> isn't. If someone made an unnamed reference unreachable and then
> garbage-collected it, well so be it. Just tell the user that the
> reference can't be found and may in fact not exist at all and you're
> done. No exhaustive search necessary.

Why can't you simply fetch the remote from its branch tip and then 
figure out / checkout the particular unnamed reference you wish locally?

> I may later want to use that revision of the manifest to perform a 
> checkout on every component listed by the manifest. At that point I 
> expect all the work trees to have exactly the contents they "should" 
> have for that old version of the manifest. It's all about affordable 
> reproducibility.

Unlike with CVS/SVN, you don't need anything from the remote if you want 
to checkout an old version.  In particular, there is no need for you to 
only fetch that old version from the remote.  You just fetch everything 
from the remote and then checkout the particular old version you wish.  
There is just no real advantage to limit yourself to some old version 
from the remote repository because that's what you want locally.  Sure 
you might be getting more data than needed, but usually not that much 
due to git's good delta compression making extra versions almost free.

Nicolas


* Re: Fetching SHA id's instead of named references?
  2009-04-06 12:48     ` Johannes Schindelin
@ 2009-04-06 21:50       ` Dmitry Potapov
  0 siblings, 0 replies; 15+ messages in thread
From: Dmitry Potapov @ 2009-04-06 21:50 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Klas Lindberg, Git Users List

On Mon, Apr 06, 2009 at 02:48:17PM +0200, Johannes Schindelin wrote:
> The issue is not if someone manages to fetch stuff before you repair it.
> The issue is that that someone should not be able to manage _after_ you
> repair it.

But how will this someone know the exact SHA-1 needed to fetch this
commit without having seen this commit already? Guessing SHA-1 by
brute force attack does not sound very promising...

Dmitry


* Re: Fetching SHA id's instead of named references?
  2009-04-06 16:55       ` Nicolas Pitre
@ 2009-04-06 23:40         ` Klas Lindberg
  2009-04-07  2:34           ` Nicolas Pitre
  0 siblings, 1 reply; 15+ messages in thread
From: Klas Lindberg @ 2009-04-06 23:40 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Shawn O. Pearce, Git Users List

On Mon, Apr 6, 2009 at 6:55 PM, Nicolas Pitre <nico@cam.org> wrote:

> Why can't you simply fetch the remote from its branch tip and then
> figure out / checkout the particular unnamed reference you wish locally?

It is a pretty sane thing to do, but it makes me a bit nervous that
branches are not immutable. Let's say I decide on a manifest format
where each tree is listed as a branch name plus the current SHA key on
that branch. The branch name is needed to enable fetch, but if the
branch is later renamed because of a change in naming policy, or its
name simply reused to refer to something completely different (*),
then there is no guarantee that the SHA key is reachable through that
branch name.

(*) These situations cannot be discounted in an organization with,
say, a few thousand employees and several tens of really big projects
with considerable overlap. I have to take into account that the right
hand may not know what the left hand is doing all the time.

> Unlike with CVS/SVN, you don't need anything from the remote if you want
> to checkout an old version. In particular, there is no need for you to
> only fetch that old version from the remote.  You just fetch everything
> from the remote and then checkout the particular old version you wish.

Please consider when you have to recreate some particular forest that
you never worked on before, but now you have to fetch and recreate a 3
year old version so that you can work on that critical error report.
And I may really not want to fetch everything. Some projects are just
very very big.

I think that what I would need is either

 * Immutable tags, or
 * A way to maintain sets of indestructible commits based on SHA id's
and a way to fetch them without going through a named reference.

The second option seems better because it would allow for recursion on
submodules and it doesn't pollute the tag name space.

BR / Klas


* Re: Fetching SHA id's instead of named references?
  2009-04-06 23:40         ` Klas Lindberg
@ 2009-04-07  2:34           ` Nicolas Pitre
  2009-04-08 20:03             ` Klas Lindberg
  0 siblings, 1 reply; 15+ messages in thread
From: Nicolas Pitre @ 2009-04-07  2:34 UTC (permalink / raw)
  To: Klas Lindberg; +Cc: Shawn O. Pearce, Git Users List

On Tue, 7 Apr 2009, Klas Lindberg wrote:

> On Mon, Apr 6, 2009 at 6:55 PM, Nicolas Pitre <nico@cam.org> wrote:
> 
> > Why can't you simply fetch the remote from its branch tip and then
> > figure out / checkout the particular unnamed reference you wish locally?
> 
> It is a pretty sane thing to do, but it makes me a bit nervous that
> branches are not immutable. Let's say I decide on a manifest format
> where each tree is listed as a branch name plus the current SHA key on
> that branch. The branch name is needed to enable fetch, but if the
> branch is later renamed because of a change in naming policy, or its
> name simply reused to refer to something completely different (*),
> then there is no guarantee that the SHA key is reachable through that
> branch name.

In git terms, this is called "history rewriting".  And you really don't 
want to do that if your repository is pulled by other people, unless 
there is an explicit statement about that fact.

Still, if you fetch all branches from the remote (which is the default 
behavior anyway), then the branch you're interested in will always 
contain the particular commit you're looking for, regardless of the name 
of the branch.  While it is true that branches may not be immutable, the 
commits they collectively refer to are still immutable.

If the remote has deleted the only branch through which your particular 
commit of interest was reachable, then of course pulling all branches 
from the remote won't help you.  But nor would a fetch with the 
particular commit's SHA1 because it may well have been pruned from the 
remote repository at that point.

> (*) These situations cannot be discounted in an organization with,
> say, a few thousand employees and several tens of really big projects
> with considerable overlap. I have to take into account that the right
> hand may not know what the left hand is doing all the time.

Thing is, with the distributed nature of git, nothing prevents you from 
keeping a local version of the commit you're interested in.  Unlike with 
a central repository where someone else might delete a branch you need, 
with git you will still have access to that particular commit locally 
regardless if the remote repository has deleted it or not.

> > Unlike with CVS/SVN, you don't need anything from the remote if you want
> > to checkout an old version. In particular, there is no need for you to
> > only fetch that old version from the remote.  You just fetch everything
> > from the remote and then checkout the particular old version you wish.
> 
> Please consider when you have to recreate some particular forest that
> you never worked on before, but now you have to fetch and recreate a 3
> year old version so that you can work on that critical error report.
> And I may really not want to fetch everything. Some projects are just
> very very big.

There is nothing a tool can do for you if someone is determined to be 
stupid with it.  In other words, don't delete a branch if it contains 
important stuff.  You may rename it if you wish.  And if you don't want 
to fetch everything then you may always find out about the right branch 
to pull with "git branch --contains <SHA1>".
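
As a rough illustration of that command, the sketch below builds a scratch "server" with two branches and asks a client which remote-tracking branch contains a given commit. The repository layout, the branch names main and release and the demo identity are assumptions made for the example. Note that it runs in a clone that already has the objects, since "git branch --contains" can only answer for commits present locally.

```shell
# Demo identity so that commits work in a clean environment (hypothetical values)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

server=$(mktemp -d)
client=$(mktemp -d)

git init -q "$server"
git -C "$server" symbolic-ref HEAD refs/heads/main   # fixed branch name for the demo
git -C "$server" commit -q --allow-empty -m "c1"
git -C "$server" checkout -q -b release
git -C "$server" commit -q --allow-empty -m "r1"
wanted=$(git -C "$server" rev-parse HEAD)            # the SHA1 we are looking for
git -C "$server" checkout -q main
git -C "$server" commit -q --allow-empty -m "c2"

git init -q "$client"
git -C "$client" fetch -q "$server" '+refs/heads/*:refs/remotes/origin/*'

# -r restricts the listing to remote-tracking branches; tr strips the indentation
branches=$(git -C "$client" branch -r --contains "$wanted" | tr -d ' ')
echo "$branches"
```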

> I think that what I would need is either
> 
>  * Immutable tags, or
>  * A way to maintain sets of indestructible commits based on SHA id's
> and a way to fetch them without going through a named reference.
> 
> The second option seems better because it would allow for recursion on
> submodules and it doesn't pollute the tag name space.

Maybe.  But as others already explained, there are technical reasons 
that make such a solution undesirable.

Nicolas


* Re: Fetching SHA id's instead of named references?
  2009-04-07  2:34           ` Nicolas Pitre
@ 2009-04-08 20:03             ` Klas Lindberg
  2009-04-08 20:38               ` Nicolas Pitre
  0 siblings, 1 reply; 15+ messages in thread
From: Klas Lindberg @ 2009-04-08 20:03 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Git Users List

On 7 Apr 2009, at 04:34, Nicolas Pitre wrote:

> In git terms, this is called "history rewriting".  And you really don't
> want to do that if your repository is pulled by other people, unless
> there is an explicit statement about that fact.

I thought you had to use filter-branch to qualify for history  
rewriting? Anyway, the scenario I have in mind is when a new branch is  
created from the old one, the old one deleted and then the name of the  
old one gets reused. The deltas are still there, intact, but now you  
have to use a different named reference to reach them  :-(

> Thing is, with the distributed nature of git, nothing prevents you from
> keeping a local version of the commit you're interested in.  Unlike with
> a central repository where someone else might delete a branch you need,
> with git you will still have access to that particular commit locally
> regardless if the remote repository has deleted it or not.

This is true, and Git is indeed very good at saving your ass on the  
client side. Other systems spend much more effort on saving your ass  
on the server side. My problem is that "my" people responsible for the  
overall system are mostly interested in the server side. At least that  
is where they put the tough requirements on perpetual availability.

However, it is good enough if there is some way to somehow guarantee  
that a branch or tag will never be misused as outlined above. This  
could be solved through basic file system mechanisms (like write  
protecting the refs/tags files perhaps?) or a backup mechanism that  
raises an alarm on forbidden manipulations, or a host of other more or  
less weird mechanisms. Git doesn't have to provide the mechanism  
directly, but it would be nice for enterprise users if it did.

> There is nothing a tool can do for you if someone is determined to be
> stupid with it.  In other words, don't delete a branch if it contains
> important stuff.  You may rename it if you wish.  And if you don't want
> to fetch everything then you may always find out about the right branch
> to pull with "git branch --contains <SHA1>".

This is all very true. And I wasn't aware of the --contains switch  
before. That one covers an entire scenario for me. Thanks!!

BR / Klas


* Re: Fetching SHA id's instead of named references?
  2009-04-08 20:03             ` Klas Lindberg
@ 2009-04-08 20:38               ` Nicolas Pitre
  0 siblings, 0 replies; 15+ messages in thread
From: Nicolas Pitre @ 2009-04-08 20:38 UTC (permalink / raw)
  To: Klas Lindberg; +Cc: Git Users List

On Wed, 8 Apr 2009, Klas Lindberg wrote:

> On 7 Apr 2009, at 04:34, Nicolas Pitre wrote:
> 
> > In git terms, this is called "history rewriting".  And you really don't
> > want to do that if your repository is pulled by other people, unless
> > there is an explicit statement about that fact.
> 
> I thought you had to use filter-branch to qualify for history rewriting?

That, or 'git rebase', or 'git reset', or 'git commit --amend' on an 
already published commit, or reusing a branch name for a different line 
of development.  Anything that would look like the past has changed to a 
client fetching from you.

> Anyway, the scenario I have in mind is when a new branch is created from the
> old one, the old one deleted and then the name of the old one gets reused. The
> deltas are still there, intact, but now you have to use a different named
> reference to reach them  :-(

Right.  And normally you would use a good name for the new branch that 
clearly indicates its archiving purpose.

> > Thing is, with the distributed nature of git, nothing prevents you from
> > keeping a local version of the commit you're interested in.  Unlike with
> > a central repository where someone else might delete a branch you need,
> > with git you will still have access to that particular commit locally
> > regardless if the remote repository has deleted it or not.
> 
> This is true, and Git is indeed very good at saving your ass on the client
> side. Other systems spend much more effort on saving your ass on the server
> side. My problem is that "my" people responsible for the overall system are
> mostly interested in the server side. At least that is where they put the
> tough requirements on perpetual availability.

Just never allow for any branch to be deleted nor rewound on the server 
then.
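
One way to get that behaviour, sketched here under assumptions, is the pair of receive.denyDeletes and receive.denyNonFastForwards settings on the server repository. The scratch repositories, the branch name stable and the demo identity below are hypothetical; the example only shows that a deletion and a forced rewind are both refused once the two settings are on.

```shell
# Demo identity so that commits work in a clean environment (hypothetical values)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

server=$(mktemp -d)
work=$(mktemp -d)

git init -q --bare "$server"
git -C "$server" config receive.denyDeletes true          # refuse ref deletion
git -C "$server" config receive.denyNonFastForwards true  # refuse rewinds, even with --force

git init -q "$work"
git -C "$work" commit -q --allow-empty -m "one"
git -C "$work" commit -q --allow-empty -m "two"
git -C "$work" push -q "$server" HEAD:refs/heads/stable

# Try to delete the branch on the server
if git -C "$work" push -q "$server" :refs/heads/stable 2>/dev/null; then
  delete_result=accepted
else
  delete_result=rejected
fi

# Try to rewind the branch by one commit
if git -C "$work" push -q --force "$server" HEAD~1:refs/heads/stable 2>/dev/null; then
  rewind_result=accepted
else
  rewind_result=rejected
fi
echo "delete: $delete_result, rewind: $rewind_result"
```

These settings only apply to pushes; someone with shell access to the server repository can still change refs directly.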

> However, it is good enough if there is some way to somehow guarantee that a
> branch or tag will never be misused as outlined above. This could be solved
> through basic file system mechanisms (like write protecting the refs/tags
> files perhaps?) or a backup mechanism that raises an alarm on forbidden
> manipulations, or a host of other more or less weird mechanisms. Git doesn't
> have to provide the mechanism directly, but it would be nice for enterprise
> users if it did.

Git provides you with hooks.  Have a look here:

   http://www.kernel.org/pub/software/scm/git/docs/githooks.html
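
For the "immutable tags" wish specifically, a server-side update hook could refuse any change to an existing tag. The following is a minimal sketch, not a tested production hook: the function name check_update is hypothetical, and the logic is written as a function so it can be tried directly. Git calls the update hook once per ref with the ref name, the old object name and the new object name, and an all-zero old value means the ref is being created.

```shell
# check_update REFNAME OLD NEW: return non-zero to refuse the update
check_update() {
  case "$1" in
    refs/tags/*)
      case "$2" in
        *[!0]*)
          # OLD is not all zeros, so the tag already exists: refuse change or deletion
          echo "rejected: $1 already exists and may not be changed or deleted" >&2
          return 1
          ;;
      esac
      ;;
  esac
  return 0
}

# In an actual hooks/update script the last line would be:
#   check_update "$1" "$2" "$3"

zero=0000000000000000000000000000000000000000
check_update refs/tags/v1.0 "$zero" 1111111111111111111111111111111111111111 && create_tag=allowed || create_tag=refused
check_update refs/tags/v1.0 1111111111111111111111111111111111111111 2222222222222222222222222222222222222222 2>/dev/null && move_tag=allowed || move_tag=refused
check_update refs/heads/main 1111111111111111111111111111111111111111 2222222222222222222222222222222222222222 && move_branch=allowed || move_branch=refused
echo "create tag: $create_tag, move tag: $move_tag, move branch: $move_branch"
```

Branches pass through untouched here; whether they should also be protected is a policy decision for the site.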


Nicolas

