Re: Track /etc directory using Git

git@vger.kernel.org mailing list mirror (one of many)

* Re: Track /etc directory using Git
       [not found]     ` <20070914091545.GA26432@piper.oerlikon.madduck.net>
@ 2007-09-14 17:31       ` Thomas Harning Jr.
  2007-09-14 21:26         ` Nicolas Vilz
  2007-09-15 13:26         ` metastore (was: Track /etc directory using Git) martin f krafft
  0 siblings, 2 replies; 72+ messages in thread
From: Thomas Harning Jr. @ 2007-09-14 17:31 UTC (permalink / raw)
  To: martin f krafft; +Cc: git, Francis Moreau

On 9/14/07, martin f krafft <madduck@madduck.net> wrote:
> also sprach Francis Moreau <francis.moro@gmail.com> [2007.09.14.1008 +0200]:
> > Did you find an alternative to git in this case ?
>
> No, and I did not look anywhere, but I know of no other VCS that can
> adequately track permissions.
Has anyone checked out metastore?  http://repo.or.cz/w/metastore.git
... there's an XML error in there somewhere, so it's not loading the
'main' page, but http://repo.or.cz/w/metastore.git?a=shortlog should
work.

It looks like it could work.... any thoughts on this?


-- 
Thomas Harning Jr.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Track /etc directory using Git
  2007-09-14 17:31       ` Track /etc directory using Git Thomas Harning Jr.
@ 2007-09-14 21:26         ` Nicolas Vilz
  2007-09-15 14:29           ` Pierre Habouzit
  2007-09-15 13:26         ` metastore (was: Track /etc directory using Git) martin f krafft
  1 sibling, 1 reply; 72+ messages in thread
From: Nicolas Vilz @ 2007-09-14 21:26 UTC (permalink / raw)
  To: Thomas Harning Jr.; +Cc: martin f krafft, git, Francis Moreau

On Fri, Sep 14, 2007 at 01:31:06PM -0400, Thomas Harning Jr. wrote:
> On 9/14/07, martin f krafft <madduck@madduck.net> wrote:
> > also sprach Francis Moreau <francis.moro@gmail.com> [2007.09.14.1008 +0200]:
> > > Did you find an alternative to git in this case ?
> >
> > No, and I did not look anywhere, but I know of no other VCS that can
> > adequately track permissions.
> Has anyone checked out metastore?  http://repo.or.cz/w/metastore.git
> ... there's an XML error in there somewhere, so it's not loading the
> 'main' page, but http://repo.or.cz/w/metastore.git?a=shortlog should
> work.
> 
> It looks like it could work.... any thoughts on this?

I use that tool. If you just have one branch, it works. With the
commit-hook, which also updates the metadata, you have current
permission tracking. 

There is a lack of a checkout-hook, which sets the permissions, so you
have to remember to do a metastore -a after you checked out a revision.

But if you have several branches which fork the master branch and try to
rebase the branches on master, you get into trouble, because the metadata
gets corrupted somehow. I will think about a solution for this sometime.

Nicolas


* metastore (was: Track /etc directory using Git)
  2007-09-14 17:31       ` Track /etc directory using Git Thomas Harning Jr.
  2007-09-14 21:26         ` Nicolas Vilz
@ 2007-09-15 13:26         ` martin f krafft
  2007-09-15 14:10           ` Johannes Schindelin
  1 sibling, 1 reply; 72+ messages in thread
From: martin f krafft @ 2007-09-15 13:26 UTC (permalink / raw)
  To: git; +Cc: Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman


also sprach Thomas Harning Jr. <harningt@gmail.com> [2007.09.14.1931 +0200]:
> > No, and I did not look anywhere, but I know of no other VCS that can
> > adequately track permissions.
> Has anyone checked out metastore?  http://repo.or.cz/w/metastore.git
> ... there's an XML error in there somewhere, so it's not loading the
> 'main' page, but http://repo.or.cz/w/metastore.git?a=shortlog should
> work.

This looks interesting, though I guess getfacl/setfacl and
getfattr/setfattr can pretty much do the same job, especially if you
can call them from shell scripts/hooks (except for mtime). Or did
I misunderstand something?

The problem with metadata getting corrupted, which Nicolas reported,
may well have to do with the use of a single file. It may be worth
considering a shadow hierarchy of files, each containing the
metadata, e.g. for a project with the files foo, bar/foo and bar/baz,
you might have

  .metastore/foo
  .metastore/bar/.<uniqueid>.dir
  .metastore/bar/foo
  .metastore/bar/baz

and each file could just be an rfc822-style file:

  Owner: root
  Group: root
  Mode: 4754
  Mtime: 1234567890
  Fattr-<key1>: <value1>
  Fattr-<key2>: <value2>

This would be my approach, which should probably be a little better
at preventing corruption.
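
A minimal sketch of the shadow-hierarchy idea in Python, assuming
a POSIX system. The function names are made up for illustration, only
regular files are handled, and the .<uniqueid>.dir entries and Fattr-*
fields from the layout above are left out:

```python
import grp
import os
import pwd
import stat
from pathlib import Path


def _name_or_id(lookup, ident):
    """Return the name for a uid/gid, or the number as text if unknown."""
    try:
        return lookup(ident)[0]
    except KeyError:
        return str(ident)


def format_metadata(path):
    """Render owner, group, mode and mtime of `path` as rfc822-style text."""
    st = os.lstat(path)
    fields = [
        ("Owner", _name_or_id(pwd.getpwuid, st.st_uid)),
        ("Group", _name_or_id(grp.getgrgid, st.st_gid)),
        ("Mode", format(stat.S_IMODE(st.st_mode), "04o")),
        ("Mtime", str(int(st.st_mtime))),
    ]
    return "".join(f"{key}: {value}\n" for key, value in fields)


def save_metadata(root, relpath, shadow=".metastore"):
    """Write the metadata of root/relpath to root/<shadow>/relpath."""
    root = Path(root)
    target = root / shadow / relpath
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(format_metadata(root / relpath))
    return target
```

Calling save_metadata("/some/repo", "bar/foo") would then write
/some/repo/.metastore/bar/foo. With one small file per tracked path,
a merge or rebase touches only the entries whose metadata changed.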

Anyway, this *really* should go into git itself!

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
 
"i like wagner's music better than anybody's. it is so loud that one
 can talk the whole time without other people hearing what one says."
                                                        -- oscar wilde
 
spamtraps: madduck.bogus@madduck.net


* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 13:26         ` metastore (was: Track /etc directory using Git) martin f krafft
@ 2007-09-15 14:10           ` Johannes Schindelin
  2007-09-15 14:16             ` metastore David Kastrup
  2007-09-15 14:54             ` metastore (was: Track /etc directory using Git) martin f krafft
  0 siblings, 2 replies; 72+ messages in thread
From: Johannes Schindelin @ 2007-09-15 14:10 UTC (permalink / raw)
  To: martin f krafft
  Cc: git, Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

Hi,

On Sat, 15 Sep 2007, martin f krafft wrote:

> The problem with metadata getting corrupted, which Nicolas reported,
> may well have to do with the use of a single file.

Then the tool is corrupt.  Introducing a shadow hierarchy, as you propose, 
is very inefficient.

> Anyway, this *really* should go into git itself!

No.  Git is a source code management system.  Everything else that you can 
do with it is a bonus, a second class citizen.  Should we really try to 
support your use case, we will invariably affect the primary use case.

Ciao,
Dscho


* Re: metastore
  2007-09-15 14:10           ` Johannes Schindelin
@ 2007-09-15 14:16             ` David Kastrup
  2007-09-15 14:54             ` metastore (was: Track /etc directory using Git) martin f krafft
  1 sibling, 0 replies; 72+ messages in thread
From: David Kastrup @ 2007-09-15 14:16 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: martin f krafft, git, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz, David Härdeman

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi,
>
> On Sat, 15 Sep 2007, martin f krafft wrote:
>
>> The problem with metadata getting corrupted, which Nicolas reported,
>> may well have to do with the use of a single file.
>
> Then the tool is corrupt.  Introducing a shadow hierarchy, as you
> propose, is very inefficient.
>
>> Anyway, this *really* should go into git itself!
>
> No.  Git is a source code management system.  Everything else that
> you can do with it is a bonus, a second class citizen.  Should we
> really try to support your use case, we will invariably affect the
> primary use case.

That's what bad design is all about, after all.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


* Re: Track /etc directory using Git
  2007-09-14 21:26         ` Nicolas Vilz
@ 2007-09-15 14:29           ` Pierre Habouzit
  2007-09-15 15:24             ` martin f krafft
  0 siblings, 1 reply; 72+ messages in thread
From: Pierre Habouzit @ 2007-09-15 14:29 UTC (permalink / raw)
  To: Nicolas Vilz; +Cc: Thomas Harning Jr., martin f krafft, git, Francis Moreau


On Fri, Sep 14, 2007 at 09:26:43PM +0000, Nicolas Vilz wrote:
> On Fri, Sep 14, 2007 at 01:31:06PM -0400, Thomas Harning Jr. wrote:
> > On 9/14/07, martin f krafft <madduck@madduck.net> wrote:
> > > also sprach Francis Moreau <francis.moro@gmail.com> [2007.09.14.1008 +0200]:
> > > > Did you find an alternative to git in this case ?
> > >
> > > No, and I did not look anywhere, but I know of no other VCS that can
> > > > adequately track permissions.
> > Has anyone checked out metastore?  http://repo.or.cz/w/metastore.git
> > > ... there's an XML error in there somewhere, so it's not loading the
> > 'main' page, but http://repo.or.cz/w/metastore.git?a=shortlog should
> > work.
> > 
> > It looks like it could work.... any thoughts on this?
> 
> I use that tool. If you just have one branch, it works. With the
> commit-hook, which also updates the metadata, you have current
> permission tracking. 
> 
> There is a lack of a checkout-hook, which sets the permissions, so you
> have to remember to do a metastore -a after you checked out a revision.

  Note that having metastore run by a hook makes it unsuitable for /etc
versioning, because there may be short periods of time during which
s3kr3t files are readable by more people than they should be.
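
To make that window concrete: it opens whenever a file is first
created with default permissions and only tightened afterwards. The
sketch below, in Python and assuming a POSIX system, shows the
alternative of creating the file with its final mode in one step. The
function name is hypothetical, and this is not how git or metastore
behave:

```python
import os


def write_secret(path, data, mode=0o600):
    """Create `path` with restrictive permissions from the first moment."""
    # O_EXCL refuses to reuse an existing file whose mode might be looser.
    # The mode passed here can only be narrowed further by the umask.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
```

A tool that checks the file out first and runs chmod from a hook
afterwards cannot give this guarantee, which is the issue raised above.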

  The sole sane way to do that would be to track permissions, acls,
whatever _in_ git. Though, I'm still not convinced that it is such a
good idea at all. I mean for source code you absolutely _don't_ want git
to track permissions (apart from the +x bit). You don't want git to
try to chown your files to "madcoder:madcoder" because I was the last
one committing. So that would mean that you sometimes want to track
permissions, and sometimes not. So you need a bunch of tools to list
files whose permissions have to be tracked, and whose permissions don't
need to be.

  I fear that you'll end up with quite a big bloat of git, for a use
case that is fairly limited.

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 14:10           ` Johannes Schindelin
  2007-09-15 14:16             ` metastore David Kastrup
@ 2007-09-15 14:54             ` martin f krafft
  2007-09-15 16:22               ` Grzegorz Kulewski
  2007-09-15 19:56               ` metastore (was: Track /etc directory using Git) Daniel Barkalow
  1 sibling, 2 replies; 72+ messages in thread
From: martin f krafft @ 2007-09-15 14:54 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz, David Härdeman


also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.15.1610 +0200]:
> No.  Git is a source code management system.  Everything else that
> you can do with it is a bonus, a second class citizen.  Should we
> really try to support your use case, we will invariably affect the
> primary use case.

I thought git was primarily a content tracker... so it all comes
down to how to define content, doesn't it? But either way, we need
not discuss that because that definition depends a lot on context
and purpose and thus cannot be answered once and for all.

I understand that for the primary use case, tracking nothing more
than +x makes sense and should not be interfered with. This is why
I was proposing a policy-based approach. The primary use case is
unaffected, it's the default policy. Someone may choose to track
other mode bits or file/inode attributes, according to one of
several policies available with git, or even a custom policy. In
that case, the repository needs to be appropriately configured.

The reason why I say this should be done inside git rather than with
hooks and an external tool, such as metastore, is quite simple: git
knows about every content entity in any tree of a repo and already
has a data node for each object. Rather than introducing a parallel
object database (shadow hierarchy or single file), it would make
a lot more sense and be way more robust to attach additional
information to these object nodes, wouldn't it?

So with "appropriately configured" above, I meant that one should be
able to say

  git-config core.track all

or

  git-config core.track mode+attr

or the default:

  git-config core.track 7666
  (read that as a umask, which masks out everything but the three
  x bits. I made it 7666 instead of 7677 because core.umask and
  core.sharedrepository then override the group and world bits if
  needed)

and have git do the right thing, rather than expecting those who
want to track more than the executable bit to assemble a brittle set
of hooks and metadata collectors+applicators and hope it all works.
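
The umask-style reading of the proposed value can be sketched as
follows, in Python. Note that core.track is only the option proposed
in this message, not an existing git setting, and the names below are
hypothetical:

```python
import stat

TRACK_MASK_DEFAULT = 0o7666  # hypothetical default from the proposal


def tracked_bits(mode, mask=TRACK_MASK_DEFAULT):
    """Return the permission bits that survive the umask-style `mask`."""
    # Like a umask, bits set in `mask` are dropped; the rest are kept.
    return stat.S_IMODE(mode) & ~mask & 0o7777
```

Under the default mask, tracked_bits(0o4754) yields 0o110 (only the
x bits that were set remain), while a mask of 0 keeps the full 0o4754.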

I understand also that this is not top priority for git, which is
why I said earlier in the thread that the real difficulty might be
to get Junio to accept a patch. But I think that the patch would be
rather contained and small, having it all configurable would make it
unintrusive, and if we all test it real well, it should pass as
a bonus. After all, git can e.g. upload patches to IMAP boxes, which
in my world clearly is bonus material as well.

Cheers,

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

"the well-bred contradict other people.
 the wise contradict themselves."
                                                        -- oscar wilde

spamtraps: madduck.bogus@madduck.net


* Re: Track /etc directory using Git
  2007-09-15 14:29           ` Pierre Habouzit
@ 2007-09-15 15:24             ` martin f krafft
  2007-09-15 15:27               ` Pierre Habouzit
  0 siblings, 1 reply; 72+ messages in thread
From: martin f krafft @ 2007-09-15 15:24 UTC (permalink / raw)
  To: git; +Cc: Pierre Habouzit, Nicolas Vilz, Thomas Harning Jr., Francis Moreau


also sprach Pierre Habouzit <madcoder@debian.org> [2007.09.15.1629 +0200]:
>   I fear that you'll end up with quite a big bloat of git, for a use
> case that is fairly limited.

I think it doesn't get bloated until you try to support the model of
tracking different stuff for different files in the same repo. If
you just track one set of data across all files in the repo, I don't
think it'll cause too much bloat.

-- 
 .''`.   martin f. krafft <madduck@debian.org>
: :'  :  proud Debian developer, author, administrator, and user
`. `'`   http://people.debian.org/~madduck - http://debiansystem.info
  `-  Debian - when you have better things to do than fixing systems
 
gentoo: the performance placebo.


* Re: Track /etc directory using Git
  2007-09-15 15:24             ` martin f krafft
@ 2007-09-15 15:27               ` Pierre Habouzit
  2007-09-15 15:42                 ` martin f krafft
  0 siblings, 1 reply; 72+ messages in thread
From: Pierre Habouzit @ 2007-09-15 15:27 UTC (permalink / raw)
  To: martin f krafft; +Cc: git, Nicolas Vilz, Thomas Harning Jr., Francis Moreau


On Sat, Sep 15, 2007 at 03:24:55PM +0000, martin f krafft wrote:
> also sprach Pierre Habouzit <madcoder@debian.org> [2007.09.15.1629 +0200]:
> >   I fear that you'll end up with quite a big bloat of git, for a use
> > case that is fairly limited.
> 
> I think it doesn't get bloated until you try to support the model of
> tracking different stuff for different files in the same repo. If
> you just track one set of data across all files in the repo, I don't
> think it'll cause too much bloat.

  Yeah but if the stuff is opaque to git, you'll definitely end up with
security issues, which makes it also a no-go for /etc versioning.

  Note that I don't specifically care about git being able to deal with
/etc, I was just pointing out some issues I can see with it, but I'm
neither in favor of nor against it.

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: Track /etc directory using Git
  2007-09-15 15:27               ` Pierre Habouzit
@ 2007-09-15 15:42                 ` martin f krafft
  0 siblings, 0 replies; 72+ messages in thread
From: martin f krafft @ 2007-09-15 15:42 UTC (permalink / raw)
  To: git; +Cc: Pierre Habouzit, Nicolas Vilz, Thomas Harning Jr., Francis Moreau


also sprach Pierre Habouzit <madcoder@debian.org> [2007.09.15.1727 +0200]:
> > I think it doesn't get bloated until you try to support the model of
> > tracking different stuff for different files in the same repo. If
> > you just track one set of data across all files in the repo, I don't
> > think it'll cause too much bloat.
> 
>   Yeah but if the stuff is opaque to git, you'll definitely end up with
> security issues, which makes it also a no-go for /etc versioning.

With "opaque to git" do you mean "implemented outside git"?

I'd say if done properly inside git, the security issues could be
prevented.

-- 
 .''`.   martin f. krafft <madduck@debian.org>
: :'  :  proud Debian developer, author, administrator, and user
`. `'`   http://people.debian.org/~madduck - http://debiansystem.info
  `-  Debian - when you have better things to do than fixing systems
 
"this sentence contradicts itself -- no actually it doesn't."
                                                 -- douglas hofstadter


* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 14:54             ` metastore (was: Track /etc directory using Git) martin f krafft
@ 2007-09-15 16:22               ` Grzegorz Kulewski
  2007-09-15 17:43                 ` Johannes Schindelin
  2007-09-15 23:33                 ` metastore Randal L. Schwartz
  2007-09-15 19:56               ` metastore (was: Track /etc directory using Git) Daniel Barkalow
  1 sibling, 2 replies; 72+ messages in thread
From: Grzegorz Kulewski @ 2007-09-15 16:22 UTC (permalink / raw)
  To: martin f krafft
  Cc: git, Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz, David Härdeman

On Sat, 15 Sep 2007, martin f krafft wrote:
> also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.15.1610 +0200]:
>> No.  Git is a source code management system.  Everything else that
>> you can do with it is a bonus, a second class citizen.  Should we
>> really try to support your use case, we will invariably affect the
>> primary use case.
>
> I thought git was primarily a content tracker... so it all comes
> down to how to define content, doesn't it? But either way, we need
> not discuss that because that definition depends a lot on context
> and purpose and thus cannot be answered once and for all.
>
> I understand that for the primary use case, tracking nothing more
> than +x makes sense and should not be interfered with. This is why
> I was proposing a policy-based approach. The primary use case is
> unaffected, it's the default policy. Someone may choose to track
> other mode bits or file/inode attributes, according to one of
> several policies available with git, or even a custom policy. In
> that case, the repository needs to be appropriately configured.
>
> The reason why I say this should be done inside git rather than with
> hooks and an external tool, such as metastore, is quite simple: git
> knows about every content entity in any tree of a repo and already
> has a data node for each object. Rather than introducing a parallel
> object database (shadow hierarchy or single file), it would make
> a lot more sense and be way more robust to attach additional
> information to these object nodes, wouldn't it?
>
> So with "appropriately configured" above, I meant that one should be
> able to say
>
>  git-config core.track all
>
> or
>
>  git-config core.track mode+attr
>
> or the default:
>
>  git-config core.track 7666
>  (read that as a umask, which masks out everything but the three
>  x bits. I made it 7666 instead of 7677 because core.umask and
>  core.sharedrepository then override the group and world bits if
>  needed)
>
> and have git do the right thing, rather than expecting those who
> want to track more than the executable bit to assemble a brittle set
> of hooks and metadata collectors+applicators and hope it all works.
>
> I understand also that this is not top priority for git, which is
> why I said earlier in the thread that the real difficulty might be
> to get Junio to accept a patch. But I think that the patch would be
> rather contained and small, having it all configurable would make it
> unintrusive, and if we all test it real well, it should pass as
> a bonus. After all, git can e.g. upload patches to IMAP boxes, which
> in my world clearly is bonus material as well.

I also think such a configuration option would be cool.

Not only for tracking /etc or /home but also, for example, for "web
applications" (for example in PHP). In that case file and directory
permissions can be as important as the source code tracked, and it is a
pain to chmod (and sometimes chown) all files to different values after
each checkout. Not to mention the potential race.


Thanks,

Grzegorz Kulewski


* Re: Track /etc directory using Git
       [not found]   ` <38b2ab8a0709140120k50f5b474oc8a841ea0a5fda50@mail.gmail.com>
@ 2007-09-15 16:32     ` martin f krafft
  2007-09-15 16:57       ` David Kastrup
  0 siblings, 1 reply; 72+ messages in thread
From: martin f krafft @ 2007-09-15 16:32 UTC (permalink / raw)
  To: git


also sprach Francis Moreau <francis.moro@gmail.com> [2007.09.14.1020 +0200]:
> The funny thing is that this tool is based on git/cogito but the
> scm used to manage it is darcs.

They switched to git after running their heads too many times
against darcs walls.

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
 
if god had meant for us to be naked,
we would have been born that way.
 
spamtraps: madduck.bogus@madduck.net


* Re: Track /etc directory using Git
  2007-09-15 16:32     ` Track /etc directory using Git martin f krafft
@ 2007-09-15 16:57       ` David Kastrup
  0 siblings, 0 replies; 72+ messages in thread
From: David Kastrup @ 2007-09-15 16:57 UTC (permalink / raw)
  To: martin f krafft; +Cc: git

martin f krafft <madduck@madduck.net> writes:

> also sprach Francis Moreau <francis.moro@gmail.com> [2007.09.14.1020 +0200]:
>> The funny thing is that this tool is based on git/cogito but the
>> scm used to manage it is darcs.
>
> They switched to git after running their heads too many times
> against darcs walls.

Since they are not using git as a source code management system nor
darcs as a versioned file system, it is not particularly funny.

It's like being proud when the neighbor watchmaker borrows a hammer:
"ah, I always told him that a hammer is the ultimate device for
repairing a clock".

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 16:22               ` Grzegorz Kulewski
@ 2007-09-15 17:43                 ` Johannes Schindelin
  2007-09-15 23:33                 ` metastore Randal L. Schwartz
  1 sibling, 0 replies; 72+ messages in thread
From: Johannes Schindelin @ 2007-09-15 17:43 UTC (permalink / raw)
  To: Grzegorz Kulewski
  Cc: martin f krafft, git, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz, David Härdeman

Hi,

On Sat, 15 Sep 2007, Grzegorz Kulewski wrote:

> On Sat, 15 Sep 2007, martin f krafft wrote:
> > I understand also that this is not top priority for git, which is why 
> > I said earlier in the thread that the real difficulty might be to get 
> > Junio to accept a patch. But I think that the patch would be rather 
> > contained and small, having it all configurable would make it 
> > unintrusive, and if we all test it real well, it should pass as a 
> > bonus. After all, git can e.g. upload patches to IMAP boxes, which in
> > my world clearly is bonus material as well.
> 
> I also think such a configuration option would be cool.

Why don't you just give it a try?  Hack on git, make it work for what you 
want to do, clean it up, make a nice patch series, post it here.

Then we'll talk.

Ciao,
Dscho


* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 14:54             ` metastore (was: Track /etc directory using Git) martin f krafft
  2007-09-15 16:22               ` Grzegorz Kulewski
@ 2007-09-15 19:56               ` Daniel Barkalow
  2007-09-15 22:14                 ` Johannes Schindelin
                                   ` (2 more replies)
  1 sibling, 3 replies; 72+ messages in thread
From: Daniel Barkalow @ 2007-09-15 19:56 UTC (permalink / raw)
  To: martin f krafft
  Cc: git, Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz, David Härdeman

On Sat, 15 Sep 2007, martin f krafft wrote:

> also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.15.1610 +0200]:
> > No.  Git is a source code management system.  Everything else that
> > you can do with it is a bonus, a second class citizen.  Should we
> > really try to support your use case, we will invariably affect the
> > primary use case.
> 
> I thought git was primarily a content tracker... so it all comes
> down to how to define content, doesn't it? But either way, we need
> not discuss that because that definition depends a lot on context
> and purpose and thus cannot be answered once and for all.
> 
> I understand that for the primary use case, tracking nothing more
> than +x makes sense and should not be interfered with. This is why
> I was proposing a policy-based approach. The primary use case is
> unaffected, it's the default policy. Someone may choose to track
> other mode bits or file/inode attributes, according to one of
> several policies available with git, or even a custom policy. In
> that case, the repository needs to be appropriately configured.

Configuration options only apply to the local aspects of the repository. 
That is, when you clone a repository, you don't get the configuration 
options from it, in general. And changing configuration options on a 
repository does not have any effect on the content it contains. So 
configuration options aren't appropriate.

> The reason why I say this should be done inside git rather than with
> hooks and an external tool, such as metastore, is quite simple: git
> knows about every content entity in any tree of a repo and already
> has a data node for each object. Rather than introducing a parallel
> object database (shadow hierarchy or single file), it would make
> a lot more sense and be way more robust to attach additional
> information to these object nodes, wouldn't it?

Git doesn't have any way to represent owners or groups, and they would 
need to be represented carefully in order to make sense across multiple 
computers. If you're adding support for metadata-as-content (for more than 
"is this a script?"), you should be able to cover all of the common cases 
of extended stuff, like AFS-style ACLs. And if you want to allow 
meaningful development with this mechanism (as opposed to just archival of 
a sequence of states of a live system), the normal case will be that the 
metadata beyond +x is manipulated by ordinary users in some way other than 
modifying their working directory. So the normal case here will be like 
working on a filesystem that doesn't support symlinks or an executable bit 
when this is important content.

	-Daniel
*This .sig left intentionally blank*


* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 19:56               ` metastore (was: Track /etc directory using Git) Daniel Barkalow
@ 2007-09-15 22:14                 ` Johannes Schindelin
  2007-09-16  1:30                   ` david
  2007-09-16  6:14                   ` martin f krafft
  2007-09-16  1:35                 ` david
  2007-09-16  6:08                 ` martin f krafft
  2 siblings, 2 replies; 72+ messages in thread
From: Johannes Schindelin @ 2007-09-15 22:14 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: martin f krafft, git, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz, David Härdeman

Hi,

On Sat, 15 Sep 2007, Daniel Barkalow wrote:

> Git doesn't have any way to represent owners or groups, and they would 
> need to be represented carefully in order to make sense across multiple 
> computers.

[speaking mostly to the proponents of git-as-a-backup-tool]

While at it, you should invent a fallback for when the owner is not
present on the system you check out on.  And a fallback for checking out
on a filesystem that does not support owners.

And a fallback when a non-root user uses it.
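
The first and third fallbacks could look roughly like this in Python
on a POSIX system. This is only a sketch under assumptions: the
function names and the policy (fall back to a chosen uid or to the
current user, silently skip chown when not root) are illustrative
choices, not anything git does:

```python
import os
import pwd


def resolve_uid(owner, fallback_uid=None):
    """Map a recorded owner name to a local uid, or fall back."""
    try:
        return pwd.getpwnam(owner).pw_uid
    except KeyError:
        # Owner unknown on this machine: use the caller's choice,
        # otherwise the user running the checkout.
        return os.getuid() if fallback_uid is None else fallback_uid


def apply_owner(path, owner, fallback_uid=None):
    """Chown `path` if privileged; return True if chown was attempted."""
    if os.geteuid() != 0:
        return False  # non-root users cannot give files away; skip
    os.chown(path, resolve_uid(owner, fallback_uid), -1)
    return True
```

Whether silently skipping is acceptable is exactly the kind of policy
question raised here; a stricter tool might refuse the checkout.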

Oh, and while you're at it (you said that it would be nice not to restrict
git in any way: "it is a content tracker") support the Windows-style
"Group-or-User-or-something:[FRW]" ACLs.

Looking forward to your patches,
Dscho


* Re: metastore
  2007-09-15 16:22               ` Grzegorz Kulewski
  2007-09-15 17:43                 ` Johannes Schindelin
@ 2007-09-15 23:33                 ` Randal L. Schwartz
  2007-09-16  0:37                   ` metastore david
  2007-09-17 13:04                   ` metastore Francis Moreau
  1 sibling, 2 replies; 72+ messages in thread
From: Randal L. Schwartz @ 2007-09-15 23:33 UTC (permalink / raw)
  To: Grzegorz Kulewski
  Cc: martin f krafft, git, Johannes Schindelin, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz, David Härdeman

>>>>> "Grzegorz" == Grzegorz Kulewski <kangur@polcom.net> writes:

Grzegorz> Not only for tracking /etc or /home but also, for example, for "web
Grzegorz> applications" (for example in PHP). In that case file and directory
Grzegorz> permissions can be as important as the source code tracked, and it is a
Grzegorz> pain to chmod (and sometimes chown) all files to different values after
Grzegorz> each checkout. Not to mention the potential race.

Uh, works just fine for me to manage my web site content.  The point is
that I treat git for what it is... a source code management system.
And then I have a Makefile that "installs" my source code into the live
directory, with the right modes during installation.

Why does everyone keep wanting "work dir == live dir"?  Ugh!  The work dir is
the *source*... it gets *copied* into your live dir *somehow*.  And *that* is
where the meta information needs to be.  In that "somehow".
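
A minimal sketch of such an install step, written in Python rather than as a Makefile rule. The table `INSTALL_MODES`, its file names, and the function `install` are hypothetical and only illustrate the idea that the modes live in the "somehow", not in the work dir.

```python
import os
import shutil

# Hypothetical install table: path relative to the work dir -> mode in the live dir.
INSTALL_MODES = {
    "index.html": 0o644,
    "cgi-bin/form.cgi": 0o755,
}


def install(work_dir, live_dir, modes=INSTALL_MODES):
    """Copy each listed file from work_dir into live_dir and set its mode."""
    for rel, mode in modes.items():
        src = os.path.join(work_dir, rel)
        dst = os.path.join(live_dir, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        # Copy to a temporary name, fix the mode, then rename into place,
        # so the file only shows up in the live dir with its final mode.
        tmp = dst + ".tmp"
        shutil.copyfile(src, tmp)
        os.chmod(tmp, mode)
        os.replace(tmp, dst)
```

Ownership could be handled the same way by extending the table, at the cost of running the install step with enough privileges to chown.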

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-15 23:33                 ` metastore Randal L. Schwartz
@ 2007-09-16  0:37                   ` david
  2007-09-16  1:10                     ` metastore Randal L. Schwartz
  2007-09-17 13:04                   ` metastore Francis Moreau
  1 sibling, 1 reply; 72+ messages in thread
From: david @ 2007-09-16  0:37 UTC (permalink / raw)
  To: Randal L. Schwartz
  Cc: Grzegorz Kulewski, martin f krafft, git, Johannes Schindelin,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sat, 15 Sep 2007, Randal L. Schwartz wrote:

>>>>>> "Grzegorz" == Grzegorz Kulewski <kangur@polcom.net> writes:
>
> Grzegorz> Not only for tracking /etc or /home but also for example for "web
> Grzegorz> applications" (for example in PHP). In that case file and directory
> Grzegorz> permissions can be as important as the source code tracked and it is a pain to
> Grzegorz> chmod (and sometimes chown) all files to different values after each
> Grzegorz> checkout. Not to mention the potential race.
>
> Uh, works just fine for me to manage my web site content.  The point is
> that I treat git for what it is... a source code management system.
> And then I have a Makefile that "installs" my source code into the live
> directory, with the right modes during installation.
>
> Why does everyone keep wanting "work dir == live dir".  Ugh!  The work dir is
> the *source*... it gets *copied* into your live dir *somehow*.  And *that* is
> where the meta information needs to be.  In that "somehow".

the problem is that at checkin you need to do the reverse process. the 
other tools that you use on the system work on the live dir, not the 'work 
dir', so it's only a 'work dir' in that git requires it as a staging step 
between the repository and the place where it's going to be used.

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16  0:37                   ` metastore david
@ 2007-09-16  1:10                     ` Randal L. Schwartz
  2007-09-16  1:49                       ` metastore david
  0 siblings, 1 reply; 72+ messages in thread
From: Randal L. Schwartz @ 2007-09-16  1:10 UTC (permalink / raw)
  To: david
  Cc: Grzegorz Kulewski, martin f krafft, git, Johannes Schindelin,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

>>>>> "david" == david  <david@lang.hm> writes:

>> Why does everyone keep wanting "work dir == live dir".  Ugh!  The work dir is
>> the *source*... it gets *copied* into your live dir *somehow*.  And *that* is
>> where the meta information needs to be.  In that "somehow".

david> the problem is that at checkin you need to do the reverse process. the
david> other tools that you use on the system work on the live dir, not the
david> 'work dir', so it's only a 'work dir' in that git requires it as a
david> staging step between the repository and the place where it's going to
david> be used.

Eh?  Are we still talking about a "website", or "/etc"?  I'm talking about the
website case.  I don't do *anything* to the live site.  When I want to add a
file, I add it to my dev repo, possibly modifying my Makefile, and then spit
it out on my staging server.  (You *do* have one of those, right?)  Once I
know it's good, I push it to the live repo, and then "go live" with it.  I
*never* work on the files that are the result of "make install".

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 22:14                 ` Johannes Schindelin
@ 2007-09-16  1:30                   ` david
  2007-09-16  2:48                     ` Johannes Schindelin
                                       ` (2 more replies)
  2007-09-16  6:14                   ` martin f krafft
  1 sibling, 3 replies; 72+ messages in thread
From: david @ 2007-09-16  1:30 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Daniel Barkalow, martin f krafft, git, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz, David Härdeman

On Sat, 15 Sep 2007, Johannes Schindelin wrote:

> On Sat, 15 Sep 2007, Daniel Barkalow wrote:
>
>> Git doesn't have any way to represent owners or groups, and they would
>> need to be represented carefully in order to make sense across multiple
>> computers.
>
> [speaking mostly to the proponents of git-as-a-backup-tool]
>
> While at it, you should invent a fallback for what to do when the owner is not
> present on the system you check out on.  And a fallback when checking out
> on a filesystem that does not support owners.
>
> And a fallback when a non-root user uses it.
>
> Oh, and while you're at it (you said that it would be nice not to restrict
> git in any way: "it is a content tracker") support the Windows style
> "Group-or-User-or-something:[FRW]" ACLs.

git has pre-commit hooks that could be used to gather the permission 
information and store it into a file.
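
A minimal sketch of that gathering step, assuming a hypothetical manifest named `.permissions` with one `mode uid gid path` line per entry. Neither the file name nor the format comes from git or metastore; a pre-commit hook could call `write_manifest` and then `git add` the result.

```python
import os
import stat


def gather_permissions(root, manifest_name=".permissions"):
    """Return manifest lines 'mode uid gid path' for everything under root."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git's own metadata directory.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if rel == manifest_name:
                continue  # the manifest does not describe itself
            st = os.lstat(full)  # lstat: record a symlink itself, not its target
            entries.append((rel, stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid))
    entries.sort()  # stable order keeps diffs and merges of the manifest small
    return ["%04o %d %d %s" % (mode, uid, gid, rel)
            for rel, mode, uid, gid in entries]


def write_manifest(root, manifest_name=".permissions"):
    """Write the manifest into root and return the lines that were written."""
    lines = gather_permissions(root, manifest_name)
    with open(os.path.join(root, manifest_name), "w") as fh:
        fh.write("\n".join(lines) + "\n")
    return lines
```

Putting the path last means names containing spaces still parse with a split limited to three separators; names containing newlines are not handled by this sketch.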

git now has the ability to define custom merge strategies for specific file 
types, which could be used to handle merges for the permission files.

what git lacks the ability to do is to deal with special cases on 
checkout.

the handling of gitattributes came really close, but there are two 
problems remaining.

1. whatever is trying to write the files with the correct permissions
    needs to be able to query the permission store before files are
    written. This needs to either be an API call into git to retrieve the
    information for any file when it's written, or the ability to define a
    specific file to be checked out first so that it can be used for
    everything else.

2. the ability to specify a custom routine/program to write the file out
    (assuming that it's being written to a filesystem not a pipe). this
    routine would be responsible for querying the permission store and
    doing 'the right thing' when the file is written during a checkout

there are some significant advantages of having the permission store be 
just a text file.

1. it doesn't require a special API to a new datastore in git

2. when working in an environment that doesn't allow for implementing the
    permissions (either a filesystem that can't store the permissions or
    when not working as root so that you can't set the ownership) the file
    can just be written and then edited with normal tools.

3. normal merge tools do a reasonable job of merging them.

however to do this git would need to gain the ability to say 'this 
filename is special, it must be checked out before any other file is 
checked out' (either on a per-directory or per-repository level)

if this is acceptable then altering the routines that write the files to 
have the additional option of calling a different routine based on the 
settings in .gitattributes seems relatively simple. there should already be 
logic to detect if it's writing to a pipe or a filesystem (it needs to 
know if it should set the write bit if nothing else), and there's the 
existing passthrough or custom routine logic for the crlf translation from 
.gitattributes. combining the logic of the two should handle the output 
issues.

the ability to handle /etc comes up every few months. it's got to be the 
most common unimplemented request git has seen. Adding the necessary hooks 
for it to be done could end up being less effort than repeatedly telling 
people that they shouldn't use git for that task (or should wrap git in 
their own scripts and use the result instead of using git directly)

so would changes like this be acceptable?

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 19:56               ` metastore (was: Track /etc directory using Git) Daniel Barkalow
  2007-09-15 22:14                 ` Johannes Schindelin
@ 2007-09-16  1:35                 ` david
  2007-09-16  6:08                 ` martin f krafft
  2 siblings, 0 replies; 72+ messages in thread
From: david @ 2007-09-16  1:35 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: martin f krafft, git, Johannes Schindelin, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz, David Härdeman

On Sat, 15 Sep 2007, Daniel Barkalow wrote:

>> The reason why I say this should be done inside git rather than with 
>> hooks and an external tool, such as metastore is quite simple: git 
>> knows about every content entity in any tree of a repo and already has 
>> a data node for each object. Rather than introducing a parallel object 
>> database (shadow hierarchy or single file), it would make a lot more 
>> sense and be way more robust to attach additional information to these 
>> object nodes, wouldn't it?
>
> Git doesn't have any way to represent owners or groups, and they would 
> need to be represented carefully in order to make sense across multiple 
> computers. If you're adding support for metadata-as-content (for more 
> than "is this a script?"), you should be able to cover all of the common 
> cases of extended stuff, like AFS-style ACLs. And if you want to allow 
> meaningful development with this mechanism (as opposed to just archival 
> of a sequence of states of a live system)

don't underestimate the usefulness of the ability to archive and restore 
snapshots of a live system. just that ability would be wonderful to have.

the ability to checkout a copy of things elsewhere and tinker with it 
would be better, but the lack of that doesn't eliminate the utility by any 
means.

David Lang

> , the normal case will be that 
> the metadata beyond +x is manipulated by ordinary users in some way 
> other than modifying their working directory. So the normal case here 
> will be like working on a filesystem that doesn't support symlinks or an 
> executable bit when this is important content.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16  1:10                     ` metastore Randal L. Schwartz
@ 2007-09-16  1:49                       ` david
  0 siblings, 0 replies; 72+ messages in thread
From: david @ 2007-09-16  1:49 UTC (permalink / raw)
  To: Randal L. Schwartz
  Cc: Grzegorz Kulewski, martin f krafft, git, Johannes Schindelin,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sat, 15 Sep 2007, Randal L. Schwartz wrote:

>>>>>> "david" == david  <david@lang.hm> writes:
>
>>> Why does everyone keep wanting "work dir == live dir".  Ugh!  The work dir is
>>> the *source*... it gets *copied* into your live dir *somehow*.  And *that* is
>>> where the meta information needs to be.  In that "somehow".
>
> david> the problem is that at checkin you need to do the reverse process. the
> david> other tools that you use on the system work on the live dir, not the
> david> 'work dir', so it's only a 'work dir' in that git requires it as a
> david> staging step between the repository and the place where it's going to
> david> be used.
>
> Eh?  Are we still talking about a "website", or "/etc"?  I'm talking about the
> website case.  I don't do *anything* to the live site.  When I want to add a
> file, I add it to my dev repo, possibly modifying my Makefile, and then spit
> it out on my staging server.  (You *do* have one of those, right?)  Once I
> know it's good, I push it to the live repo, and then "go live" with it.  I
> *never* work on the files that are the result of "make install".

even when working on a website it can be relevant.

yes, when you are developing html you want to do it on a test server, 
move it to staging, and then move to production. but it's also not 
uncommon to have web based tools that allow other people to make some 
changes as well (for example, a bank's website is mostly maintained by 
their web development company, but the bank administrators want the 
ability to change rate information instantly). sometimes this is 
implemented by writing the info to a database and then querying that 
database for every hit, but a far more efficient way is to store that data 
in a file on the webserver, which can include modifying pages directly.

but yes, I was mostly thinking of /etc instead of the webserver when I 
wrote that.

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-16  1:30                   ` david
@ 2007-09-16  2:48                     ` Johannes Schindelin
  2007-09-16  3:00                       ` david
  2007-09-16  8:06                     ` metastore Junio C Hamano
  2007-09-16 15:59                     ` metastore (was: Track /etc directory using Git) Jan Hudec
  2 siblings, 1 reply; 72+ messages in thread
From: Johannes Schindelin @ 2007-09-16  2:48 UTC (permalink / raw)
  To: david
  Cc: Daniel Barkalow, martin f krafft, git, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz, David Härdeman

Hi,

On Sat, 15 Sep 2007, david@lang.hm wrote:

> so would changes like this be acceptable?

Would they be acceptable for you?  If so, go ahead.  If not, don't.

Hth,
Dscho

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-16  2:48                     ` Johannes Schindelin
@ 2007-09-16  3:00                       ` david
  0 siblings, 0 replies; 72+ messages in thread
From: david @ 2007-09-16  3:00 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Daniel Barkalow, martin f krafft, git, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz, David Härdeman

On Sun, 16 Sep 2007, Johannes Schindelin wrote:

> On Sat, 15 Sep 2007, david@lang.hm wrote:
>
>> so would changes like this be acceptable?
>
> Would they be acceptable for you?  If so, go ahead.  If not, don't.

frankly, unless I am willing to fork git (which I am not) it matters a 
whole lot less if such a change is acceptable to me than if it is 
acceptable to the maintainers.

if it's not acceptable to the maintainers as a concept then it's not worth 
going to the effort of producing the patches as they will just be 
rejected.

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 19:56               ` metastore (was: Track /etc directory using Git) Daniel Barkalow
  2007-09-15 22:14                 ` Johannes Schindelin
  2007-09-16  1:35                 ` david
@ 2007-09-16  6:08                 ` martin f krafft
  2007-09-19 19:16                   ` David Härdeman
  2 siblings, 1 reply; 72+ messages in thread
From: martin f krafft @ 2007-09-16  6:08 UTC (permalink / raw)
  To: git
  Cc: Daniel Barkalow, Johannes Schindelin, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz, David Härdeman

also sprach Daniel Barkalow <barkalow@iabervon.org> [2007.09.15.2156 +0200]:
> Configuration options only apply to the local aspects of the repository. 
> That is, when you clone a repository, you don't get the configuration 
> options from it, in general. And changing configuration options on a 
> repository does not have any effect on the content it contains. So 
> configuration options aren't appropriate.

Sure they are. Just like git-commit figures out your email address 
if user.email is missing from git-config, or core.sharedRepository 
or core.umask deal with permissions only when you tell them to, 
you'd have to enable core.track or else git would just do what it
does right now.

> Git doesn't have any way to represent owners or groups, and they
> would need to be represented carefully in order to make sense
> across multiple computers. If you're adding support for
> metadata-as-content (for more than "is this a script?"), you
> should be able to cover all of the common cases of extended stuff,
> like AFS-style ACLs.

Ideally, git should be able to store an open-ended number of
properties for each object, yes.

> And if you want to allow meaningful development with this
> mechanism (as opposed to just archival of a sequence of states of
> a live system), the normal case will be that the metadata beyond
> +x is manipulated by ordinary users in some way other than
> modifying their working directory.

I have no idea what you mean with that.

> So the normal case here will be like working on a filesystem that
> doesn't support symlinks or an executable bit when this is
> important content.

... and yet, we support symlinks and executable files. But anyway,
I really don't understand what you're trying to say.

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
 
"ist gott eine erfindung des teufels?"
                                                 - friedrich nietzsche
 
spamtraps: madduck.bogus@madduck.net

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-15 22:14                 ` Johannes Schindelin
  2007-09-16  1:30                   ` david
@ 2007-09-16  6:14                   ` martin f krafft
  2007-09-16 15:51                     ` Jan Hudec
  1 sibling, 1 reply; 72+ messages in thread
From: martin f krafft @ 2007-09-16  6:14 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Daniel Barkalow, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz, David Härdeman

also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.16.0014 +0200]:
> While at it, you should invent a fallback for what to do when the
> owner is not present on the system you check out on.  And
> a fallback when checking out on a filesystem that does not support
> owners.

Like rsync, git would use numerical UIDs (which are always present)
by default, but could be told to try to map account names.

If the filesystem does not support owners, chown() would not exist.
I actually tend to think of things the other way around: instead of
a fallback when chown() does not work (what would such a fallback be
other than not chown()ing?), it would only try chown() if such
functionality existed.
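
A minimal sketch of that policy, assuming hypothetical helpers `resolve_uid` and `try_chown`: numeric ids are used by default, account names are mapped only on request, and chown() is attempted only where the platform offers it.

```python
import os

try:
    import pwd  # account database, only available on Unix
except ImportError:
    pwd = None


def resolve_uid(stored_uid, stored_name=None, map_names=False):
    """Pick the uid to apply: the stored number by default, a name lookup on request."""
    if map_names and stored_name and pwd is not None:
        try:
            return pwd.getpwnam(stored_name).pw_uid
        except KeyError:
            pass  # no such account on this system; fall back to the number
    return stored_uid


def try_chown(path, uid, gid=-1):
    """Attempt chown only where it exists; report whether it took effect."""
    if not hasattr(os, "chown"):
        return False
    try:
        os.chown(path, uid, gid)  # gid of -1 leaves the group unchanged
    except OSError:
        return False  # e.g. not root, or the filesystem refuses ownership changes
    return True
```

The boolean returned by `try_chown` lets the caller remember that ownership could not be applied, so a later commit does not record the accidental owner as a change.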

> And a fallback when a non-root user uses it.

That's easy, Unix already provides you with that "fallback": pack up
/etc in a tar and unpack it as a normal user...

> Oh, and while you're at it (you said that it would be nice not to
> restrict git in any way: "it is a content tracker") support the
> Windows style "Group-or-User-or-something:[FRW]" ACLs.

Provided we find a way to implement this in an extensible manner,
this should not be hard to do. I can't do it since I don't have
access to a Windows machine.

Your statement does catch me off-guard though. Does git now
officially target Windows?

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

if you find a spelling mistake in the above, you get to keep it.

spamtraps: madduck.bogus@madduck.net

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16  1:30                   ` david
  2007-09-16  2:48                     ` Johannes Schindelin
@ 2007-09-16  8:06                     ` Junio C Hamano
  2007-09-16  8:30                       ` metastore David Kastrup
                                         ` (2 more replies)
  2007-09-16 15:59                     ` metastore (was: Track /etc directory using Git) Jan Hudec
  2 siblings, 3 replies; 72+ messages in thread
From: Junio C Hamano @ 2007-09-16  8:06 UTC (permalink / raw)
  To: david
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

david@lang.hm writes:

> git has pre-commit hooks that could be used to gather the permission
> information and store it into a file.
>
> git now has the ability to define custom merge strategies for specific
> file types, which could be used to handle merges for the permission
> files.
> ...
> There are some significant advantages of having the permission store
> be just a text file.
>
> 1. it doesn't require a special API to a new datastore in git
>
> 2. when working in an environment that doesn't allow for implementing the
>    permissions (either a filesystem that can't store the permissions or
>    when not working as root so that you can't set the ownership) the file
>    can just be written and then edited with normal tools.
>
> 3. normal merge tools do a reasonable job of merging them.
>
> however to do this git would need to gain the ability to say 'this
> filename is special, it must be checked out before any other file is
> checked out' (either on a per-directory or per-repository level)

I'd rather not implement it at such a low level where a true
"checkout" happens.  For one thing, I am afraid that the special
casing will affect the normal codepath too much and would make
it into a maintenance nightmare.  But more importantly, if you
are switching between commits (this includes switching branches,
checking out a different commit to a detached HEAD, or
pulling/merging updates your HEAD and updates your work tree),
and the contents of a path does not change between the original
commit and the switched-to commit, you may still have to
"checkout" the external information for that path if your
"permission information file" is different between these two
commits.  To the underlying checkout aka "two tree merge"
operation, that kind of change is invisible and it should stay
so for performance reasons, not to harm the normal operation.
IOW, I do not want the core level to even know about the
existence of a "permission information file", even if the code that
implements it is well isolated, ifdefed out or made conditional
based on some config variable.

I however think your idea to have extra "permission information
file" is very interesting.  What would be more palatable, than
mucking with the core level git, would be to have an external
command that takes two tree object names, telling it which old and
new trees our work tree is switching between, and have that
command:

 - inspect the diff-tree output to find out what were checked
   out and might need their permission information tweaked;

 - inspect the differences between the "permission information
   file" in these trees to find out what were _not_ checked out,
   but still need their permission information tweaked.

 - tweak whatever external information you are interested in
   expressing in your "permission information file" in the work
   tree for the paths it discovered in the above two steps.
   This step may involve actions specific to projects and call
   hook scripts with <path, info from "permission information
   file" for that path> tuples to carry out the actual tweaking.
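
The three steps above can be sketched as follows. This assumes a hypothetical manifest format of one `mode uid gid path` line per entry, and assumes the caller has already obtained the list of checked-out paths, for instance from `git diff-tree -r --name-only <old> <new>`. The function names are illustrative only.

```python
def parse_manifest(text):
    """Parse 'mode uid gid path' lines into {path: (mode, uid, gid)}."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        mode, uid, gid, path = line.split(" ", 3)
        table[path] = (int(mode, 8), int(uid), int(gid))
    return table


def paths_to_tweak(changed_paths, old_manifest, new_manifest):
    """Return {path: (mode, uid, gid)} for every path that needs tweaking."""
    old = parse_manifest(old_manifest)
    new = parse_manifest(new_manifest)
    # Step 1: paths the checkout rewrote and the new manifest describes.
    rewritten = {p for p in changed_paths if p in new}
    # Step 2: paths whose recorded metadata differs between the two trees,
    # which includes paths whose contents were not checked out at all.
    meta_changed = {p for p in new if old.get(p) != new[p]}
    # Step 3 input: the <path, info> tuples to hand to a project hook.
    return {p: new[p] for p in sorted(rewritten | meta_changed)}
```

Nothing here touches the core checkout path: the function only needs the two manifests and the diff-tree result, which is the property asked for above.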

If we go that route, I am not deeply opposed to add code to
Porcelains to call that new command after they "checkout" a new
commit at the very end of their processing (namely, git-commit,
git-merge, git-am, and git-rebase).

Yes, I am very well aware that somebody already mentioned "there
is a window between the true checkout and permission tweaking".
If you need to touch the core level in order to close that
window, I am not interested.

> The ability to handle /etc comes up every few months. it's got to be
> the most common unimplemented request git has seen.

Asking for a pony many times does not necessarily make it
right for you to have the pony.  The sane way to implement this
is in your Makefile, as Randal and other people with more
experience have already pointed out, and I happen to agree with
them.

My gut feeling is that the approach to use an external hook that
reads your "permission information file" could be done with
negligible impact to the normal operation of git.  I suspect
that the "new command" I suggested above that would run after
"checkout" actions would perform what people need to do in their
Makefiles' "install" rules (if they have the work tree vs target
tree distinction), or "post-checkout" rules (if they want to use
the work tree in-place), and not having to write/reinvent a
Makefile target for this in every project would hopefully make
it easier to use.  That is the only reason I am writing this
message on this topic.

> so would changes like this be acceptable?

That is a different question.  Is having an extension to help
people who want to manage perm bits a worthy goal?  Perhaps, but
it depends.  Is it worthy enough goal to complicate the really
core parts of the code and add huge maintenance burden?
Absolutely not.  Can it be made in such a way that it does not
have much impact to the core parts?  We need to see how it is
done.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16  8:06                     ` metastore Junio C Hamano
@ 2007-09-16  8:30                       ` David Kastrup
  2007-09-16 20:19                         ` metastore david
  2007-09-16 15:51                       ` metastore Daniel Barkalow
  2007-09-16 21:45                       ` metastore david
  2 siblings, 1 reply; 72+ messages in thread
From: David Kastrup @ 2007-09-16  8:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: david, Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

Junio C Hamano <gitster@pobox.com> writes:

> Yes, I am very well aware that somebody already mentioned "there
> is a window between the true checkout and permission tweaking".
> If you need to touch the core level in order to close that
> window, I am not interested.

Doing this atomically involves creating the file in question by
specifying the permissions on the creat system call already, and
possibly wrapping seteuid calls and similar around it for getting the
right file ownership.

However, it is not really necessary to do this atomically: instead one
can rather create the file using safe permissions (600) at first, then
do fchown and fchmod (or chown/chmod) at some point in time afterwards
as required.
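
A minimal sketch of the non-atomic variant, assuming a hypothetical helper `write_then_widen`: the file is created with mode 600, and ownership and final mode are applied on the open descriptor afterwards.

```python
import os


def write_then_widen(path, data, final_mode, uid=-1, gid=-1):
    """Create path with safe permissions, then apply ownership and final mode."""
    # O_EXCL: refuse to reuse an existing file that may have looser permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
        fh.flush()
        # chown first: changing the owner can clear setuid/setgid bits,
        # so the mode is applied last.
        if uid != -1 or gid != -1:
            os.fchown(fd, uid, gid)
        os.fchmod(fd, final_mode)
```

During the window between creation and `fchmod` the file is readable only by the creating user, so the window is harmless as long as the final permissions are never stricter than 600.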

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16  8:06                     ` metastore Junio C Hamano
  2007-09-16  8:30                       ` metastore David Kastrup
@ 2007-09-16 15:51                       ` Daniel Barkalow
  2007-09-16 21:12                         ` metastore david
  2007-09-16 21:45                       ` metastore david
  2 siblings, 1 reply; 72+ messages in thread
From: Daniel Barkalow @ 2007-09-16 15:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: david, Johannes Schindelin, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Junio C Hamano wrote:

> I however think your idea to have extra "permission information
> file" is very interesting.  What would be more palatable, than
> mucking with the core level git, would be to have an external
> command that takes two tree object names, telling it which old and
> new trees our work tree is switching between, and have that
> command:
> 
>  - inspect the diff-tree output to find out what were checked
>    out and might need their permission information tweaked;
> 
>  - inspect the differences between the "permission information
>    file" in these trees to find out what were _not_ checked out,
>    but still need their permission information tweaked.
> 
>  - tweak whatever external information you are interested in
>    expressing in your "permission information file" in the work
>    tree for the paths it discovered in the above two steps.
>    This step may involve actions specific to projects and call
>    hook scripts with <path, info from "permission information
>    file" for that path> tuples to carry out the actual tweaking.

Why not have the command also responsible for creating the files that need 
to be created (calling back into git to read their contents)? That way, 
there's no window where they've been created without their metadata, and 
there's more that the core git doesn't have to worry about.

I could see the program getting the index, the target tree, and the 
directory to put files in, and being told to do the whole 2-way merge 
(except, perhaps, updating the index to match the tree, which git could do 
afterwards). As far as git would be concerned, it would mostly be like a 
bare repository.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-16  6:14                   ` martin f krafft
@ 2007-09-16 15:51                     ` Jan Hudec
  2007-09-16 19:43                       ` david
  2007-09-17 13:31                       ` martin f krafft
  0 siblings, 2 replies; 72+ messages in thread
From: Jan Hudec @ 2007-09-16 15:51 UTC (permalink / raw)
  To: martin f krafft
  Cc: git, Johannes Schindelin, Daniel Barkalow, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz, David Härdeman

On Sun, Sep 16, 2007 at 08:14:11 +0200, martin f krafft wrote:
> also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.16.0014 +0200]:
> > While at it, you should invent a fallback for what to do when the
> > owner is not present on the system you check out on.  And
> > a fallback when checking out on a filesystem that does not support
> > owners.
> 
> Like rsync, git would use numerical UIDs (which are always present)
> by default, but could be told to try to map account names.
> 
> If the filesystem does not support owners, chown() would not exist.
> I actually tend to think of things the other way around: instead of
> a fallback when chown() does not work (what would such a fallback be
> other than not chown()ing?), it would only try chown() if such
> functionality existed.

There's a problem. You need to know that the functionality is missing and not
try to read attributes back, but instead consider them unchanged. Nothing
that can't be taken care of, but it needs to be handled carefully.

> > And a fallback when a non-root user uses it.
> 
> That's easy, Unix already provides you with that "fallback": pack up
> /etc in a tar and unpack it as a normal user...

But if you tar that up again, the owners will be different. But you don't
want the change.

> > Oh, and while you're at it (you said that it would be nice not to
> > restrict git in any way: "it is a content tracker") support the
> > Windows style "Group-or-User-or-something:[FRW]" ACLs.
> 
> Provided we find a way to implement this in an extensible manner,
> this should not be hard to do. I can't do it since I don't have
> access to a Windows machine.
> 
> Your statement does catch me off-guard though. Does git now
> officially target Windows?

Official git works in cygwin. There is also a port to msys, which is
not official in the sense that it is not merged into mainline.

-- 
						 Jan 'Bulb' Hudec <bulb@ucw.cz>


* Re: metastore (was: Track /etc directory using Git)
  2007-09-16  1:30                   ` david
  2007-09-16  2:48                     ` Johannes Schindelin
  2007-09-16  8:06                     ` metastore Junio C Hamano
@ 2007-09-16 15:59                     ` Jan Hudec
  2007-09-16 20:36                       ` david
  2 siblings, 1 reply; 72+ messages in thread
From: Jan Hudec @ 2007-09-16 15:59 UTC (permalink / raw)
  To: david
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman


On Sat, Sep 15, 2007 at 18:30:53 -0700, david@lang.hm wrote:
> 1. whatever is trying to write the files with the correct permissions
>    needs to be able to query the permission store before files are
>    written. This needs to either be an API call into git to retrieve the
>    information for any file when it's written, or the ability to define a
>    specific file to be checked out first so that it can be used for
>    everything else.

You seem to be forgetting about the index. Git never writes trees directly to
the filesystem, but always goes through the index as an intermediate step. So
the API actually exists: simply read from the index.

-- 
						 Jan 'Bulb' Hudec <bulb@ucw.cz>



* Re: metastore (was: Track /etc directory using Git)
  2007-09-16 15:51                     ` Jan Hudec
@ 2007-09-16 19:43                       ` david
  2007-09-17 13:31                       ` martin f krafft
  1 sibling, 0 replies; 72+ messages in thread
From: david @ 2007-09-16 19:43 UTC (permalink / raw)
  To: Jan Hudec
  Cc: martin f krafft, git, Johannes Schindelin, Daniel Barkalow,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Jan Hudec wrote:

> On Sun, Sep 16, 2007 at 08:14:11 +0200, martin f krafft wrote:
>> also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.16.0014 +0200]:
>>> While at it, you should invent a fallback what to do when the
>>> owner is not present on the system you check out on.  And
>>> a fallback when checking out on a filesystem that does not support
>>> owners.
>>
>> Like rsync, git would use numerical UIDs (which are always present)
>> by default, but could be told to try to map account names.
>>
>> If the filesystem does not support owners, chown() would not exist.
>> I actually tend to think of things the other way around: instead of
>> a fallback when chown() does not work (what would such a fallback be
>> other than not chown()ing?), it would only try chown() if such
>> functionality existed.
>
> There's a problem. You need to know that the functionality is missing and not
> try to read attributes back, but instead consider them unchanged. Nothing
> that can't be taken care of, but it needs to be handled carefully.

but this can be handled by a local config option. yes, you have to be
careful, but it's not that hard.
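
A rough sketch of what such a config option could mean in practice: when ownership is not supported locally, the recording step keeps the owner already in the permission store instead of reading it back from the filesystem. The `owners_supported` flag and the `record_owner` helper are made up for illustration; nothing like them exists in git.

```python
import os


def record_owner(path, previous, owners_supported):
    """Return the (uid, gid) pair to store for path.

    previous is the pair already in the permission store (or None).
    owners_supported stands in for a hypothetical local config option.
    """
    if not owners_supported:
        # Do not read attributes back: treat the stored owner as unchanged.
        return previous
    st = os.stat(path)
    return (st.st_uid, st.st_gid)
```

With the option switched off, a commit made by a non-root user or on an owner-less filesystem would leave the recorded owners alone rather than recording a spurious change.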

David Lang


* Re: metastore
  2007-09-16  8:30                       ` metastore David Kastrup
@ 2007-09-16 20:19                         ` david
  0 siblings, 0 replies; 72+ messages in thread
From: david @ 2007-09-16 20:19 UTC (permalink / raw)
  To: David Kastrup
  Cc: Junio C Hamano, Johannes Schindelin, Daniel Barkalow,
	martin f krafft, git, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz, David Härdeman

On Sun, 16 Sep 2007, David Kastrup wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Yes, I am very well aware that somebody already mentioned "there
>> is a window between the true checkout and permission tweaking".
>> If you need to touch the core level in order to close that
>> window, I am not interested.
>
> Doing this atomically involves creating the file in question by
> specifying the permissions on the creat system call already, and
> possibly wrap seteuid calls and similar around it for getting the
> right file/ownership.
>
> However, it is not really necessary to do this atomically: instead one
> can rather create the file using safe permissions (600) at first, then
> do fchown and fchmod (or chown/chmod) at some point in time afterwards
> as required.

the problem with this in /etc is that if you write the wrong file as 600 you can
cause lots of nasty problems for the system during the window. for some
files/directories you will want to write the file to a temp name and then
move the file atomically to the final location.
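
As a minimal sketch of the temp-name-then-move idea, assuming a POSIX filesystem: the file is written under a temporary name in the same directory, given its final mode, and only then renamed into place. The helper name `write_atomically` is invented for this example and is not part of git.

```python
import os
import tempfile


def write_atomically(path, data, mode):
    """Write data to path so readers never see partial content or a wrong mode."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the temporary file with mode 0600 in the same directory,
    # so the final rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.chmod(tmp, mode)
        # A chown of tmp would go here when running as root.
        os.replace(tmp, path)  # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Readers of the final path see either the old file or the complete new one with its final mode, never the 600 intermediate.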

git itself shouldn't need to worry about this, the external write routine 
I'm talking about is the correct place for this (at least until all the 
bugs get worked out and everyone is comfortable that everything is good, 
and doesn't impact the core git code badly)

David Lang


* Re: metastore (was: Track /etc directory using Git)
  2007-09-16 15:59                     ` metastore (was: Track /etc directory using Git) Jan Hudec
@ 2007-09-16 20:36                       ` david
  0 siblings, 0 replies; 72+ messages in thread
From: david @ 2007-09-16 20:36 UTC (permalink / raw)
  To: Jan Hudec
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Jan Hudec wrote:

> On Sat, Sep 15, 2007 at 18:30:53 -0700, david@lang.hm wrote:
>> 1. whatever is trying to write the files with the correct permissions
>>    needs to be able to query the permission store before files are
>>    written. This needs to either be an API call into git to retrieve the
>>    information for any file when it's written, or the ability to define a
>>    specific file to be checked out first so that it can be used for
>>    everything else.
>
> You seem to be forgetting about the index. Git never writes trees directly to
> filesystem, but always with intermediate step in the index. So the API
> actually exists -- simply read from the index.

Ok, this sounds promising.

looking into one approach here.

assume for the moment that at write time an external program gets called.

this program reads the file contents from stdin and gets its other
information from git as command line parameters
   the parameters I can think of that it would need are
    path to write the file to
    length of file
    name of the permission file
    id of the commit this is part of (possibly)

how does this program access the contents of the permission file in the 
index?
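
One possible answer, sketched under assumptions: git's `:<path>` revision syntax names the blob staged for a path in the index, so an external program could run `git cat-file blob :<path>` to read the permission file without it ever being checked out. The helper names and the `.gitperms` file name in the usage note are hypothetical.

```python
import subprocess


def index_blob_command(path):
    """Build the git command that prints the staged content of path."""
    # ":<path>" refers to the index entry for <path>.
    return ["git", "cat-file", "blob", ":" + path]


def read_from_index(path, repo_dir="."):
    """Return the bytes staged for path in the index of the repo at repo_dir."""
    result = subprocess.run(
        index_blob_command(path),
        cwd=repo_dir,
        check=True,
        capture_output=True,
    )
    return result.stdout
```

A write helper could then call `read_from_index(".gitperms")` once and look up the entry for the path it was asked to write.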

David Lang


* Re: metastore
  2007-09-16 15:51                       ` metastore Daniel Barkalow
@ 2007-09-16 21:12                         ` david
  2007-09-16 21:28                           ` metastore Junio C Hamano
  2007-09-16 22:02                           ` metastore Daniel Barkalow
  0 siblings, 2 replies; 72+ messages in thread
From: david @ 2007-09-16 21:12 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Junio C Hamano, Johannes Schindelin, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Daniel Barkalow wrote:

>> I however think your idea to have extra "permission information
>> file" is very interesting.  What would be more palatable, than
>> mucking with the core level git, would be to have an external
>> command that takes two tree object names that tells it what the
>> old and new trees our work tree is switching between, and have
>> that command to:
>>
>>  - inspect the diff-tree output to find out what were checked
>>    out and might need their permission information tweaked;
>>
>>  - inspect the differences between the "permission information
>>    file" in these trees to find out what were _not_ checked out,
>>    but still need their permission information tweaked.
>>
>>  - tweak whatever external information you are interested in
>>    expressing in your "permission information file" in the work
>>    tree for the paths it discovered in the above two steps.
>>    This step may involve actions specific to projects and call
>>    hook scripts with <path, info from "permission information
>>    file" for that path> tuples to carry out the actual tweaking.
>
> Why not have the command also responsible for creating the files that need
> to be created (calling back into git to read their contents)? That way,
> there's no window where they've been created without their metadata, and
> there's more that the core git doesn't have to worry about.

my initial thoughts were to have git do all its normal work and hook into
git at the point where it's writing the file out (where today it chooses
between writing the data to a file on disk, piping to stdout, or piping
to a pager) by adding the option to pipe into a different program that
would deal with the permission stuff. this program would only have to
write the file and set the permissions, it wouldn't have to know anything
about git other than where to find the permissions it needs to know.

it sounds like you are suggesting that the hook be much earlier in the 
process, and instead of one copy of git running and calling many copies of 
the writing program, you would have one copy of the writing program that 
would call many copies of git.

I'll admit that my initial reaction is that it's probably a lot more 
expensive to do all the calls into git. git just has a lot more complex 
things to do.

> I could see the program getting the index, the target tree, and the
> directory to put files in, and being told to do the whole 2-way merge
> (except, perhaps, updating the index to match the tree, which git could do
> afterwards). As far as git would be concerned, it would mostly be like a
> bare repository.

if this functionality does shift to earlier in the process, how much of 
the git logic needs to be duplicated in this program?

if this program needs to do the merge, won't it have to duplicate the 
merge logic, including the .gitattributes checking for custom merge calls?

I have been thinking primarily in terms of doing a complete checkout, 
overwriting all files, and secondarily how to do a checkout of just a few
files, but again where all files selected overwrite the existing files.

I wasn't thinking of the fact that git optimizes the checkout and avoids 
writing a file that didn't change.

this changes things slightly

prior to this I was thinking that the permission file needed to be handled
differently because writing it out needed to avoid any circular
references where you would need to check the contents of it to write it
out.

it now appears as if what really needs to happen is that if the permission
file changes, a different program needs to be called when it's written out
than when the other files are written out. by itself this isn't hard as
.gitattributes can have a special entry for this filename and that entry 
can specify a different program, and that program fixes all the 
permissions (and/or detects that they can't be fixed due to 
user/filesystem limits, records the error, checks if the repository is set 
appropriately, and screams to the user if it isn't)

it would be a nice optimization to this permission checkout for it to 
compare the old and the new permissions so that it only tries to change 
the permissions where it needs to, but is that really necessary? the
program can look at the permissions of the existing files to see what they 
are and decide if it needs to change them (this would tromp on local 
changes that aren't checked in. how big of a problem is this?) my initial 
reaction is that having to know the two commits and do the comparison 
between them is adding a lot of logic and git interaction that I'd rather 
avoid if I could.

David Lang


* Re: metastore
  2007-09-16 21:12                         ` metastore david
@ 2007-09-16 21:28                           ` Junio C Hamano
  2007-09-16 21:45                             ` metastore Daniel Barkalow
  2007-09-16 21:53                             ` metastore david
  2007-09-16 22:02                           ` metastore Daniel Barkalow
  1 sibling, 2 replies; 72+ messages in thread
From: Junio C Hamano @ 2007-09-16 21:28 UTC (permalink / raw)
  To: david
  Cc: Daniel Barkalow, Johannes Schindelin, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

david@lang.hm writes:

> my initial thoughts were to have git do all its normal work and hook
> into git at the point where it's writing the file out (where today it
> chooses between writing the data to a file on disk, piping to stdout,
> or piping to a pager) by adding the option to pipe into a different
> program that would deal with the permission stuff. this program would
> only have to write the file and set the permissions, it wouldn't have
> to know anything about git other than where to find the permissions it
> needs to know.
>
> it sounds like you are suggesting that the hook be much earlier in the
> process,...

Well, you misread me or what I said was confusing or both.  I
was suggesting the total opposite.  Let git do all its normal
work, and then call your hook to munge the work tree in any way
you want.


* Re: metastore
  2007-09-16  8:06                     ` metastore Junio C Hamano
  2007-09-16  8:30                       ` metastore David Kastrup
  2007-09-16 15:51                       ` metastore Daniel Barkalow
@ 2007-09-16 21:45                       ` david
  2007-09-16 22:11                         ` metastore Junio C Hamano
  2 siblings, 1 reply; 72+ messages in thread
From: david @ 2007-09-16 21:45 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

some of this duplicates thoughts from other messages in this thread.
apologies for the duplication, but I want to make the response to
Junio's concerns clear here as well

On Sun, 16 Sep 2007, Junio C Hamano wrote:

> david@lang.hm writes:
>
>> git has pre-commit hooks that could be used to gather the permission
>> information and store it into a file.
>>
>> git now has the ability to define custom merge strategies for specific
>> file types, which could be used to handle merges for the permission
>> files.
>> ...
>> There are some significant advantages of having the permission store
>> be just a text file.
>>
>> 1. it doesn't require a special API to a new datastore in git
>>
>> 2. when working in an environment that doesn't allow for implementing the
>>    permissions (either a filesystem that can't store the permissions or
>>    when not working as root so that you can't set the ownership) the file
>>    can just be written and then edited with normal tools.
>>
>> 3. normal merge tools do a reasonable job of merging them.
>>
>> however to do this git would need to gain the ability to say 'this
>> filename is special, it must be checked out before any other file is
>> checked out' (either on a per-directory or per-repository level)
>
> I'd rather not implement it at such a low level where a true
> "checkout" happens.  For one thing, I am afraid that the special
> casing will affect the normal codepath too much and would make
> it into a maintenance nightmare.

as I understand it, at this point you already choose between three 
options.

1. write to a file (and set the write bit if needed)
2. write to stdout
3. write to a pager program

I am suggesting adding

4. write to a .gitattributes defined program and pass it some parameters.
    (and only if the .gitattributes tell you to)

this should be a very small change to the codepath

or am I missing something major here?

if this program can get the contents of the permission file out of the 
index, then the requirement I listed before to make sure the permission 
file gets written before anything else goes away, and the only requirement 
left is the ability to specify a different write method

> But more importantly, if you
> are switching between commits (this includes switching branches,
> checking out a different commit to a detached HEAD, or
> pulling/merging updates your HEAD and updates your work tree),
> and the contents of a path does not change between the original
> commit and the switched-to commit, you may still have to
> "checkout" the external information for that path if your
> "permission information file" are different between these two
> commits.  To the underlying checkout aka "two tree merge"
> operation, that kind of change is invisible and it should stay
> so for performance reasons, not to harm the normal operation.

I had not thought of this condition.

however, I think this may be easier than you are thinking

we have two conditions.

1. the permission file hasn't changed.

Solution:  do nothing

2. the permission file has changed

Solution: set all the permissions to match the new file

this could be done by using .gitattributes to specify a different program
for checking out the permission file, and that program goes through the
file and sets the permissions on everything. yes this is a bit inefficient
compared to diffing the two permission files and only touching the files
that have changed, but is the efficiency at this point that critical? if
so then instead of feeding the program the contents of the new file you
could feed it the diff between the old and the new file.

in theory you could do this for any file, and it would be a win for some
files (a large file that has a few changes to it would possibly be more
efficient to modify in place than to re-write), but I'm not sure the
results would be worth the complications. if .gitattributes gains the
ability to specify the program to be used to write the file, it could also
gain the ability to specify feeding that program the diff instead of the full
contents.

the one drawback to just setting all the permissions is that this will 
overrule any local changes to files that weren't otherwise modified. how 
big of a problem is this?
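
The two cases above, plus the diff-based variant, can be sketched roughly as follows. The permission file format (one `mode uid gid path` line per file) and both function names are assumptions made up for this example; the thread does not define a format.

```python
def parse_permissions(text):
    """Map path -> (mode, uid, gid) from lines of 'mode uid gid path'."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        mode, uid, gid, path = line.split(None, 3)
        entries[path] = (int(mode, 8), int(uid), int(gid))
    return entries


def paths_to_fix(old_text, new_text):
    """Return the entries of the new file that are new or differ from the old one."""
    old = parse_permissions(old_text)
    new = parse_permissions(new_text)
    if old == new:
        return {}  # case 1: the permission file has not changed, do nothing
    # case 2: only entries that actually changed need touching
    return {path: entry for path, entry in new.items() if old.get(path) != entry}
```

Passing an empty string as the old file gives the simpler "set everything" behaviour, which is also the variant that would overrule local changes.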

> IOW, I do not want the core level to even know about the
> existence of "permission information file", even the code that
> implements it is well isolated, ifdefed out or made conditional
> based on some config variable.

nobody is suggesting anything that wouldn't be at least conditional based 
on some config variable.

> I however think your idea to have extra "permission information
> file" is very interesting.  What would be more palatable, than
> mucking with the core level git, would be to have an external
> command that takes two tree object names that tells it what the
> old and new trees our work tree is switching between, and have
> that command to:
>
> - inspect the diff-tree output to find out what were checked
>   out and might need their permission information tweaked;
>
> - inspect the differences between the "permission information
>   file" in these trees to find out what were _not_ checked out,
>   but still need their permission information tweaked.
>
> - tweak whatever external information you are interested in
>   expressing in your "permission information file" in the work
>   tree for the paths it discovered in the above two steps.
>   This step may involve actions specific to projects and call
>   hook scripts with <path, info from "permission information
>   file" for that path> tuples to carry out the actual tweaking.

this is an area I wasn't aware of, but it doesn't seem that difficult to 
do. the issue (as I address above) is if this needs to be done as a diff 
or if it can be done simply by setting all the permissions according to 
the new file.

> If we go that route, I am not deeply opposed to add code to
> Porcelains to call that new command after they "checkout" a new
> commit at the very end of their processing (namely, git-commit,
> git-merge, git-am, and git-rebase).

this is saying you want a wrapper around git instead of a hook in git.

> Yes, I am very well aware that somebody already mentioned "there
> is a window between the true checkout and permission tweaking".
> If you need to touch the core level in order to close that
> window, I am not interested.

no matter how small the change? (see the above comments) If so this
conversation isn't worth continuing. if you are just concerned about
maintainability and are willing to consider small changes that won't cause
big maintenance problems then we can continue to discuss if the changes I
am suggesting are small enough. the need to be able to close the
vulnerability window is a showstopper to many uses.

>> The ability to handle /etc comes up every few months. it's got to be
>> the most common unimplemented request git has seen.
>
> Asking for a pony many times does not necessarily make it
> right for you to have the pony.  The sane way to implement this
> is in your Makefile, as Randal and other people with more
> experience have already pointed out, and I happen to agree with
> them.

you don't always have a makefile. if other tools that you use make
modifications to the files in the locations where they reside, having to
pull those changes back before you can do a check-in is a complication as
well

> My gut feeling is that the approach to use an external hook that
> reads your "permission information file" could be done with
> negligible impact to the normal operation of git.  I suspect
> that the "new command" I suggested above that would run after
> "checkout" actions would perform what people need to do in their
> Makefiles' "install" rules (if they have the work tree vs target
> tree distinction), or "post-checkout" rules (if they want to use
> the work tree in-place), and not having to write/reinvent a
> Makefile target for this in every project would hopefully make
> it easier to use.  That is the only reason I am writing this
> message on this topic.

but you are not willing to allow the hook to be created, you are saying 
that there would need to be an external wrapper instead.

at this point it appears that having a hook to be able to specify external 
programs at a point where you are already deciding between different 
options would be sufficient.

>> so would changes like this be acceptable?
>
> That is a different question.  Is having an extension to help
> people who want to manage perm bits a worthy goal?  Perhaps, but
> it depends.  Is it worthy enough goal to complicate the really
> core parts of the code and add huge maintenance burden?
> Absolutely not.  Can it be made in such a way that it does not
> have much impact to the core parts?  We need to see how it is
> done.

this is why I was asking about this approach. do changes like this seem 
small enough to be worth the effort of coding and submitting?

David Lang


* Re: metastore
  2007-09-16 21:28                           ` metastore Junio C Hamano
@ 2007-09-16 21:45                             ` Daniel Barkalow
  2007-09-16 21:53                             ` metastore david
  1 sibling, 0 replies; 72+ messages in thread
From: Daniel Barkalow @ 2007-09-16 21:45 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: david, Johannes Schindelin, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Junio C Hamano wrote:

> david@lang.hm writes:
> 
> > my initial thoughts were to have git do all its normal work and hook
> > into git at the point where it's writing the file out (where today it
> > chooses between writing the data to a file on disk, piping to stdout,
> > or piping to a pager) by adding the option to pipe into a different
> > program that would deal with the permission stuff. this program would
> > only have to write the file and set the permissions, it wouldn't have
> > to know anything about git other than where to find the permissions it
> > needs to know.
> >
> > it sounds like you are suggesting that the hook be much earlier in the
> > process,...
> 
> Well, you misread me or what I said was confusing or both.  I
> > was suggesting the total opposite.  Let git do all its normal
> work, and then call your hook to munge the work tree in any way
> you want.

I think he was replying to me, not you. I was suggesting that git stop at 
the index, and let him take care of deciding how the index relates to the 
work tree. That is, he'd get called instead of check_updates() in 
unpack-trees. (And we might have to funnel more code paths through this 
function, so that checkout-index does what read-tree -m would do, wrt 
changes to the filesystem).

	-Daniel
*This .sig left intentionally blank*


* Re: metastore
  2007-09-16 21:28                           ` metastore Junio C Hamano
  2007-09-16 21:45                             ` metastore Daniel Barkalow
@ 2007-09-16 21:53                             ` david
  1 sibling, 0 replies; 72+ messages in thread
From: david @ 2007-09-16 21:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Daniel Barkalow, Johannes Schindelin, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Junio C Hamano wrote:

> david@lang.hm writes:
>
>> my initial thoughts were to have git do all its normal work and hook
>> into git at the point where it's writing the file out (where today it
>> chooses between writing the data to a file on disk, piping to stdout,
>> or piping to a pager) by adding the option to pipe into a different
>> program that would deal with the permission stuff. this program would
>> only have to write the file and set the permissions, it wouldn't have
>> to know anything about git other than where to find the permissions it
>> needs to know.
>>
>> it sounds like you are suggesting that the hook be much earlier in the
>> process,...
>
> Well, you misread me or what I said was confusing or both.  I
> was suggesting the total opposite.  Let git do all its normal
> work, and then call your hook to munge the work tree in any way
> you want.

so you are saying, have git write everything out as-is and then call a 
program afterwards to do things? essentially a post-checkout hook?

such a hook is useful in many situations, and would allow for the workflow 
where you have /etc, /etc.git, and write scripts to move things back and 
forth between them.

so I do think that this is a capability that would be useful to git 
overall.

however, for the specific use-case of maintaining /etc I don't think that 
it's as good as having a hook at write time.

David Lang


* Re: metastore
  2007-09-16 21:12                         ` metastore david
  2007-09-16 21:28                           ` metastore Junio C Hamano
@ 2007-09-16 22:02                           ` Daniel Barkalow
  2007-09-16 22:37                             ` metastore david
  1 sibling, 1 reply; 72+ messages in thread
From: Daniel Barkalow @ 2007-09-16 22:02 UTC (permalink / raw)
  To: david
  Cc: Junio C Hamano, Johannes Schindelin, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, david@lang.hm wrote:

> On Sun, 16 Sep 2007, Daniel Barkalow wrote:
> 
> > > I however think your idea to have extra "permission information
> > > file" is very interesting.  What would be more palatable, than
> > > mucking with the core level git, would be to have an external
> > > command that takes two tree object names that tells it what the
> > > old and new trees our work tree is switching between, and have
> > > that command to:
> > >
> > >  - inspect the diff-tree output to find out what were checked
> > >    out and might need their permission information tweaked;
> > >
> > >  - inspect the differences between the "permission information
> > >    file" in these trees to find out what were _not_ checked out,
> > >    but still need their permission information tweaked.
> > >
> > >  - tweak whatever external information you are interested in
> > >    expressing in your "permission information file" in the work
> > >    tree for the paths it discovered in the above two steps.
> > >    This step may involve actions specific to projects and call
> > >    hook scripts with <path, info from "permission information
> > >    file" for that path> tuples to carry out the actual tweaking.
> >
> > Why not have the command also responsible for creating the files that need
> > to be created (calling back into git to read their contents)? That way,
> > there's no window where they've been created without their metadata, and
> > there's more that the core git doesn't have to worry about.
> 
> my initial thoughts were to have git do all its normal work and hook into git
> at the point where it's writing the file out (where today it chooses between
> writing the data to a file on disk, piping to stdout, or piping to a pager)
> by adding the option to pipe into a different program that would deal with the
> permission stuff. this program would only have to write the file and set the
> permissions, it wouldn't have to know anything about git other than where to
> find the permissions it needs to know.
> 
> it sounds like you are suggesting that the hook be much earlier in the
> process, and instead of one copy of git running and calling many copies of the
> writing program, you would have one copy of the writing program that would
> call many copies of git.

A lot of the git commands are actually currently shell scripts that call
back to git, so that's not too different. The reason to have a single copy 
of the writing program is that it would be able to get the whole set of 
differences that need to be handled, and first pick out the metadata file, 
process it to figure out the writing instructions once, figure out the 
changes in the writing instructions, and figure out the changes in the 
content, and decide what to do.

> > I could see the program getting the index, the target tree, and the
> > directory to put files in, and being told to do the whole 2-way merge
> > (except, perhaps, updating the index to match the tree, which git could do
> > afterwards). As far as git would be concerned, it would mostly be like a
> > bare repository.
> 
> if this functionality does shift to earlier in the process, how much of the
> git logic needs to be duplicated in this program?
> 
> if this program needs to do the merge, won't it have to duplicate the merge
> logic, including the .gitattributes checking for custom merge calls?

This is two-way merge, not three-way merge. The basic concept is that 
you're in state A, and you want to be in state B. Rather than writing out 
all of state B, you write out all of state B that's different from state 
A. Think of taking a diff of two big trees and then applying it as a 
patch, instead of copying the new tree onto the old tree; the benefit is 
that stuff that doesn't change doesn't get rewritten, and the diff is 
blazingly fast, given how we store our information.
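
To make the A-to-B idea concrete, here is a toy sketch, not git's actual unpack-trees logic: each state is modelled as a plain mapping from path to blob id, and only the entries that differ produce work. The function name `two_way_update` is invented for this example.

```python
def two_way_update(state_a, state_b):
    """Given path -> blob id mappings, list what must change to go from A to B."""
    # Paths that are new in B, or whose content differs from A, get written.
    to_write = sorted(p for p, blob in state_b.items() if state_a.get(p) != blob)
    # Paths present in A but gone from B get removed.
    to_delete = sorted(p for p in state_a if p not in state_b)
    return to_write, to_delete
```

Anything with the same blob id in both states appears in neither list, which is why unchanged files never get rewritten.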

3-way merge will be handled by git, and not in a live /etc directory 
anyway (that is, you'd want to fix up the metadata files as plain text 
files, not as metadata bits on a checked out directory; otherwise, you'll 
be trying to put conflict markers in mode bits, and that's clearly not 
what you want).

> I have been thinking primarily in terms of doing a complete checkout,
> overwriting all files, and secondarily how to do a checkout of just a few
> files, but again where all files selected overwrite the existing files.
> 
> I wasn't thinking of the fact that git optimizes the checkout and avoids
> writing a file that didn't change.
> 
> this changes things slightly
> 
> prior to this I was thinking that the permission file needed to be handled
> differently becouse writing it out needed to avoid doing any circular
> refrences where you would need to check the contents of it to write it out.
> 
> it now appears as if what really needs to happen is that if the permission
> file changes a different program needs to be called when it's written out then
> when the other files are written out. by itself this isn't hard as
> .gitattributes can have a special entry for this filename and that entry can
> specify a different program, and that program fixes all the permissions
> (and/or detects that they can't be fixed due to user/filesystem limits,
> records the error, checks if the repository is set appropriately, and screams
> to the user if it isn't)

While we're at it, you probably don't even want to write the permission 
file to the live filesystem. It's just one more thing that could leak 
information, and changes to the permissions of files that you record by 
committing the live filesystem would presumably be done by changing the 
permissions of files in the filesystem, not by changing the text file.

(Of course, you could check out the same commits as ordinary source, with 
developer-owned 644 files and a 644 "permissions" file, and there you'd 
have the permissions file appear in the work tree, and you could edit it 
and check it in in a totally mundane way.)

> it would be a nice optimization to this permission checkout for it to compare
> the old and the new permissions so that it only tries to change the
> permissions where it needs to, but is that really nessasary? the program can
> look at the permissions of the existing files to see what they are and decide
> if it needs to change them (this would tromp on local changes that aren't
> checked in. how big of a problem is this?) my initial reaction is that having
> to know the two commits and do the comparison between them is adding a lot of
> logic and git interaction that I'd rather avoid if I could.

You probably want to be able to keep local uncommitted changes. People 
like to be able to have things slightly different in their particular 
deployment from the way things are in the repository, for stuff that only 
applies to one system and isn't "how it should be".

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16 21:45                       ` metastore david
@ 2007-09-16 22:11                         ` Junio C Hamano
  2007-09-16 22:52                           ` metastore david
  0 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2007-09-16 22:11 UTC (permalink / raw)
  To: david
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

david@lang.hm writes:

>> I'd rather not implement it at such a low level where a true
>> "checkout" happens.  For one thing, I am afraid that the special
>> casing will affect the normal codepath too much and would make
>> it into a maintenance nightmare.
>
> as I understand it, at this point you already choose between three
> options.
>
> 1. write to a file (and set the write bit if needed)
> 2. write to stdout
> 3. write to a pager program
>
> I am suggesting adding
> ...
> or am I missing something major here?

I do not think we are choosing any option in the codepath at
all.

What I mean by the normal "checkout" is what checkout_entry in
entry.c does.  There is no other option than (1) above.  I would
want to see an extremely good justification if you need to touch
that codepath to implement this fringe use case.

I do not think there is anything that writes file contents to
stdout/pager other than "git cat-file" or "git show"; I do not
think they are what you have in mind when talking about managing
the files under /etc.  So unfortunately I do not understand the
rest of the discussion you made in your message.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16 22:02                           ` metastore Daniel Barkalow
@ 2007-09-16 22:37                             ` david
  2007-09-17 13:30                               ` metastore martin f krafft
  0 siblings, 1 reply; 72+ messages in thread
From: david @ 2007-09-16 22:37 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Junio C Hamano, Johannes Schindelin, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Daniel Barkalow wrote:

> On Sun, 16 Sep 2007, david@lang.hm wrote:
>
>> On Sun, 16 Sep 2007, Daniel Barkalow wrote:
>>
>>>> I however think your idea to have extra "permission information
>>>> file" is very interesting.  What would be more palatable, than
>>>> mucking with the core level git, would be to have an external
>>>> command that takes two tree object names that tells it what the
>>>> old and new trees our work tree is switching between, and have
>>>> that command to:
>>>>
>>>>  - inspect the diff-tree output to find out what were checked
>>>>    out and might need their permission information tweaked;
>>>>
>>>>  - inspect the differences between the "permission information
>>>>    file" in these trees to find out what were _not_ checked out,
>>>>    but still need their permission information tweaked.
>>>>
>>>>  - tweak whatever external information you are interested in
>>>>    expressing in your "permission information file" in the work
>>>>    tree for the paths it discovered in the above two steps.
>>>>    This step may involve actions specific to projects and call
>>>>    hook scripts with <path, info from "permission information
>>>>    file" for that path> tuples to carry out the actual tweaking.
>>>
>>> Why not have the command also responsible for creating the files that need
>>> to be created (calling back into git to read their contents)? That way,
>>> there's no window where they've been created without their metadata, and
>>> there's more that the core git doesn't have to worry about.
>>
>> my initial thoughts were to have git do all it's normal work and hook into git
>> at the point where it's writing the file out (where today it chooses between
>> writing the data to a file on disk, pipeing to stdout, or pipeing to a pager)
>> by adding the option to pipe into a different program that would deal with the
>> permission stuff. this program would only have to write the file and set the
>> permissions, it wouldn't have to know anything about git other then where to
>> find the permissions it needs to know.
>>
>> it sounds like you are suggesting that the hook be much earlier in the
>> process, and instead of one copy of git running and calling many copies of the
>> writing program, you would have one copy of the writing program that would
>> call many copies of git.
>
> A lot of the git commands are actually currently shell scripts  that call
> back to git, so that's not too different. The reason to have a single copy
> of the writing program is that it would be able to get the whole set of
> differences that need to be handled, and first pick out the metadata file,
> process it to figure out the writing instructions once, figure out the
> changes in the writing instructions, and figure out the changes in the
> content, and decide what to do.

I'm still a little unclear on how much work this program would then have 
to do. it's probably my lack of understanding that's making this sound 
much scarier.

>>> I could see the program getting the index, the target tree, and the
>>> directory to put files in, and being told to do the whole 2-way merge
>>> (except, perhaps, updating the index to match the tree, which git could do
>>> afterwards). As far as git would be concerned, it would mostly be like a
>>> bare repository.
>>
>> if this functionality does shift to earlier in the process, how much of the
>> git logic needs to be duplicated in this program?
>>
>> if this program needs to do the merge, won't it have to duplicate the merge
>> logic, including the .gitattributes checking for custom merge calls?
>
> This is two-way merge, not three-way merge. The basic concept is that
> you're in state A, and you want to be in state B. Rather than writing out
> all of state B, you write out all of state B that's different from state
> A. Think of taking a diff of two big trees and then applying it as a
> patch, instead of copying the new tree onto the old tree; the benefit is
> that stuff that doesn't change doesn't get rewritten, and the diff is
> blazingly fast, given how we store our information.

so what would this program be given?

it sounds like it would be called once for the entire tree checkout

would it be handed just the start and end commits and query git for 
everything else it needs?

it sounds like there is more than this; you refer to git fully crafting 
the new index.

so would this program be accessing an old and new index and do the 
comparison between the two?

or would git feed it a list of what's changed and then have it query git 
to find the details of the changes?

> 3-way merge will be handled by git, and not in a live /etc directory
> anyway (that is, you'd want to fix up the metadata files as plain text
> files, not as metadata bits on a checked out directory; otherwise, you'll
> be trying to put conflict markers in mode bits, and that's clearly not
> what you want).

right, we don't want conflict markers on mode bits or other ACL type 
things, that way lies madness ;)

>> I have been thinking primarily in terms of doing a complete checkout,
>> overwriting all files, and secondarily how do do a checkout of just a few
>> files, but again where all files selected overwrite the existing files.
>>
>> I wasn't thinking of the fact that git optimizes the checkout and avoids
>> writing a file that didn't change.
>>
>> this changes things slightly
>>
>> prior to this I was thinking that the permission file needed to be handled
>> differently becouse writing it out needed to avoid doing any circular
>> refrences where you would need to check the contents of it to write it out.
>>
>> it now appears as if what really needs to happen is that if the permission
>> file changes a different program needs to be called when it's written out then
>> when the other files are written out. by itself this isn't hard as
>> .gitattributes can have a special entry for this filename and that entry can
>> specify a different program, and that program fixes all the permissions
>> (and/or detects that they can't be fixed due to user/filesystem limits,
>> records the error, checks if the repository is set appropriately, and screams
>> to the user if it isn't)
>
> While we're at it, you probably don't even want to write the permission
> file to the live filesystem. It's just one more thing that could leak
> information, and changes to the permissions of files that you record by
> committing the live filesystem would presumably be done by changing the
> permissions of files in the filesystem, not by changing the text file.

the permissions and ACLs can be queried directly from the filesystem, so 
I don't see any security problems with writing the permission file to the 
filesystem.

changing the permissions would be done by changing the files themselves 
(when you are running as root on a filesystem that supports the changes, 
otherwise it would need to fall back to writing the file and getting the 
changes there, but that should be able to be a local config option)

I don't like the idea of having a file that doesn't appear on the local 
filesystem at any point, it just makes troubleshooting too hard.

> (Of course, you could check out the same commits as ordinary source, with
> developer-owned 644 files and a 644 "permissions" file, and there you'd
> have the permissions file appear in the work tree, and you could edit it
> and check it in in a totally mundane way.)

right, and the same thing if the filesystem doesn't support something in 
the permission file.

>> it would be a nice optimization to this permission checkout for it to compare
>> the old and the new permissions so that it only tries to change the
>> permissions where it needs to, but is that really nessasary? the program can
>> look at the permissions of the existing files to see what they are and decide
>> if it needs to change them (this would tromp on local changes that aren't
>> checked in. how big of a problem is this?) my initial reaction is that having
>> to know the two commits and do the comparison between them is adding a lot of
>> logic and git interaction that I'd rather avoid if I could.
>
> You probably want to be able to keep local uncommitted changes. People
> like to be able to have things slightly different in their particular
> deployment from the way things are in the repository, for stuff that only
> applies to one system and isn't "how it should be".

if so this means that the permission changing program definitely needs to 
operate on the diff of the permission file, not on the absolute file. this 
complicates things slightly, but it shouldn't be too bad.
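
as a rough sketch of what "operate on the diff" could mean, assuming a
made-up permission file format of one "<octal mode> <owner> <path>" entry per
line (neither the format nor the function names come from git or metastore):

```python
def parse_permissions(text):
    """Parse the hypothetical '<octal mode> <owner> <path>' line format."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        mode, owner, path = line.split(None, 2)
        entries[path] = (int(mode, 8), owner)
    return entries


def permission_changes(old_text, new_text):
    """Return only the entries that differ between the two file versions.

    Paths whose entry is unchanged are not returned, so a local,
    uncommitted chmod on such a path is not trampled.
    """
    old = parse_permissions(old_text)
    new = parse_permissions(new_text)
    return {path: entry for path, entry in new.items() if old.get(path) != entry}


old_file = "0644 root hosts\n0600 root shadow\n"
new_file = "0644 root hosts\n0640 root shadow\n0755 root init.d/foo\n"

changes = permission_changes(old_file, new_file)
print(changes)
```

in this example only "shadow" and "init.d/foo" would be touched; "hosts" is
left alone even if somebody changed its mode locally.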

changing topic slightly.

I know git has pre-commit hooks, but I've never needed to use them.

at what point can you hook in?

can you define a hook that runs when you do a git-add? or only when you do 
a git-commit?

the reason I'm asking is to try and figure out when and how to create the 
permissions file. when I was thinking in terms of dealing with the 
permissions as a single big block it wasn't that bad to say that at 
git-commit time you have to scan every file and check its permissions to 
record them into the file, but with the push for the optimizations that 
you're talking about this is no longer reasonable and it really should be 
done when the file is added to the index.
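
a sketch of the per-file recording step, assuming it runs once for each path
as it is added and that entries are written as "<octal mode> <uid> <path>"
(the function names and the format are hypothetical, only the os/stat calls
are real):

```python
import os
import stat
import tempfile


def record_permission(entries, root, path):
    """Record the mode bits and owner uid of one file at add time."""
    st = os.lstat(os.path.join(root, path))
    entries[path] = (stat.S_IMODE(st.st_mode), st.st_uid)
    return entries


def format_permissions(entries):
    """Serialize entries in the hypothetical one-line-per-path format."""
    return "".join(
        f"{mode:04o} {uid} {path}\n"
        for path, (mode, uid) in sorted(entries.items())
    )


# Demonstration on a throwaway directory instead of a live /etc.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "hosts")
    with open(target, "w") as handle:
        handle.write("127.0.0.1 localhost\n")
    os.chmod(target, 0o640)
    entries = record_permission({}, root, "hosts")
    text = format_permissions(entries)

print(text, end="")
```

this only stats the one file being added, so nothing has to scan the whole
tree at commit time.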

on a related note, if this is implemented as a per-write hook then it 
makes a lot of sense to have the permission file be per-directory, but if 
we do a per-checkout hook like you are suggesting then the permission file 
may make more sense as a single file in the top-level directory.

thoughts?

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16 22:11                         ` metastore Junio C Hamano
@ 2007-09-16 22:52                           ` david
  2007-09-17  0:58                             ` metastore Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: david @ 2007-09-16 22:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Junio C Hamano wrote:

> david@lang.hm writes:
>
>>> I'd rather not implement it at such a low level where a true
>>> "checkout" happens.  For one thing, I am afraid that the special
>>> casing will affect the normal codepath too much and would make
>>> it into a maintenance nightmare.
>>
>> as I understand it, at this point you already choose between three
>> options.
>>
>> 1. write to a file (and set the write bit if needed)
>> 2. write to stdout
>> 3. write to a pager program
>>
>> I am suggesting adding
>> ...
>> or am I missing something major here?
>
> I do not think we are choosing any option in the codepath at
> all.
>
> What I mean by the normal "checkout" is what checkout_entry in
> entry.c does.  There is no other option than (1) above.  I would
> want to see an extremely good justification if you need to touch
> that codepath to implement this fringe use case.
>
> I do not think there is nothing that writes file contents to
> stdout/pager other than "git cat-file" or "git show"; I do not
> think they are what you have in mind when talking about managing
> the files under /etc.  So unfortunately I do not understand the
> rest of the discussion you made in your message.

Ok, I thought that there was common code for these different uses. could 
you re-read the rest of the logic based on the change being done in 
checkout_entry?

if you are unwilling to have any changes made to the checkout_entry code 
then the only remaining question is what you think of Daniel's suggestion to 
have a hook to replace check_updates()?

if it's not acceptable either then we are down to doing a post-checkout 
trigger.

one concern I have with that approach is how to deal with partial 
checkouts. if a user checks out one file how can the post-checkout trigger 
know if it's looking at the correct permissions file as opposed to one 
left over from something else? can/should it go and read the file from the 
index instead of reading the file on the filesystem? (I don't like this 
because it leads to non-obvious behavior), or can/should there be a config 
option to say that whenever any file is checked out the permissions file 
needs to be checked out as well?

a post checkout trigger is useful in enough different situations that the 
answers to the above questions don't eliminate the usefulness of the 
trigger, they just map out the pitfalls of using it.

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16 22:52                           ` metastore david
@ 2007-09-17  0:58                             ` Junio C Hamano
  2007-09-17  2:31                               ` metastore david
  0 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2007-09-17  0:58 UTC (permalink / raw)
  To: david
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

david@lang.hm writes:

> On Sun, 16 Sep 2007, Junio C Hamano wrote:
>>
>> I do not think there is nothing that writes file contents to
>> stdout/pager other than "git cat-file" or "git show"; I do not
>> think they are what you have in mind when talking about managing
>> the files under /etc.  So unfortunately I do not understand the
>> rest of the discussion you made in your message.
>
> Ok, I thought that there was common code for these different
> uses. could you re-read the rest of the logic based on the change
> being done in checkout_entry?
>
> if you are unwilling to have any changes made to the checkout_entry
> code then the only remaing question is what you think of Daniel's
> suggestion to have a hook to replace check_updates()?
>
> if it's not acceptable either then we are down to doing a
> post-checkout trigger.

Post-checkout trigger is something I can say I can live with
without looking at the actual patch, but that does not mean it
would be a better approach at all.

I would not be able to answer the first question right now; that
needs a patch to prove that it can be done with a well contained
set of changes that results in maintainable code.

I haven't tried to assess the potential extent of damage needed
to checkout_entry(), and I have never been interested in this
"keeping track of /etc in place" topic myself.  It is unlikely
I'll try to come up with such a patch on my own to support it at
such a low level near the core.  Somebody who cares about that
feature needs to take the initiative of doing that work before
we can discuss and decide, although old-timers including myself
can help spot potential issues.

So while I admit I am skeptical, consider me neither willing nor
unwilling at this point.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-17  0:58                             ` metastore Junio C Hamano
@ 2007-09-17  2:31                               ` david
  2007-09-17  4:23                                 ` metastore Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: david @ 2007-09-17  2:31 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Junio C Hamano wrote:

> david@lang.hm writes:
>
>> On Sun, 16 Sep 2007, Junio C Hamano wrote:
>>>
>>> I do not think there is nothing that writes file contents to
>>> stdout/pager other than "git cat-file" or "git show"; I do not
>>> think they are what you have in mind when talking about managing
>>> the files under /etc.  So unfortunately I do not understand the
>>> rest of the discussion you made in your message.
>>
>> Ok, I thought that there was common code for these different
>> uses. could you re-read the rest of the logic based on the change
>> being done in checkout_entry?
>>
>> if you are unwilling to have any changes made to the checkout_entry
>> code then the only remaing question is what you think of Daniel's
>> suggestion to have a hook to replace check_updates()?
>>
>> if it's not acceptable either then we are down to doing a
>> post-checkout trigger.
>
> Post-checkout trigger is something I can say I can live with
> without looking at the actual patch, but that does not mean it
> would be a better approach at all.

we agree on this much at least :-)

> I would not be able to answer the first question right now; that
> needs a patch to prove that it can be done with a well contained
> set of changes that results in a maintainable code.

you cannot answer the question in the affirmative, but you could say that 
any changes in that area would be completely unacceptable to you (and for 
a while it sounded like you were saying exactly that). in which case any 
effort put into preparing patches would be a waste of time.

> I haven't tried to assess the potential extent of damage needed
> to checkout_entry(), and I have never been interested in this
> "keeping track of /etc in place" topic myself.  It is unlikely
> I'll try to come up with such a patch on my own to support it at
> such a low level near the core.  Somebody who cares about that
> feature needs to take the initiative of doing that work before
> we can discuss and decide, although older-times including myself
> can help spot potential issues.
>
> So while I admit I am skeptical, consider me neither willing nor
> unwilling at this point.

this is reasonable. thanks for pointing me so clearly at the routine that 
needs to be modified.

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-17  2:31                               ` metastore david
@ 2007-09-17  4:23                                 ` Junio C Hamano
  2007-09-17  4:35                                   ` metastore david
  2007-09-17 17:42                                   ` metastore Daniel Barkalow
  0 siblings, 2 replies; 72+ messages in thread
From: Junio C Hamano @ 2007-09-17  4:23 UTC (permalink / raw)
  To: david
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

david@lang.hm writes:

>> Post-checkout trigger is something I can say I can live with
>> without looking at the actual patch, but that does not mean it
>> would be a better approach at all.
>
> we agree on this much at least :-)
>
>> I would not be able to answer the first question right now; that
>> needs a patch to prove that it can be done with a well contained
>> set of changes that results in a maintainable code.
>
> you cannot answer the question in the affirmitive, but you could say
> that any changes in that area would be completely unacceptable to you
> (and for a while it sounded like you were saying exactly that). in
> which case any effort put into preparing patches would be a waste of
> time

I tend to disagree.  It's far from a waste of time.  While, as I
said, I am skeptical that such a patch would have a small impact, if
it helps people's needs, somebody will pick it up and carry
forward, even if that somebody is not me.  It can then mature
out of tree and later could be merged.  We simply do not know
unless somebody tries.  And I am quite happy that you seem to be
motivated enough to see how it goes.

On the other hand, the experiment could fail and you may end up
with a patch that is too messy to be acceptable, in which case
you might feel it a waste of time, but I do not think it is a
waste even in such a case.  We would learn what works and what
doesn't, and we can lay the "keeping track of /etc" topic to rest.

I also need to rant here a bit.

Fortunately we haven't had this problem too many times on this
list, but sometimes people say "Here is my patch.  If this is
accepted I'll add documentation and tests".  I rarely reply to
such patches without sugarcoating my response, but my internal
reaction is, "Don't you, as the person who proposes that change,
believe in your patch deeply enough to be willing to perfect it,
in order to make it suitable for consumption by the general
public, whether it is included in my tree or not?  A change that
even you do not believe in yourself has very little chance of
benefitting the general public, so thanks but no thanks, I'll
pass."

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-17  4:23                                 ` metastore Junio C Hamano
@ 2007-09-17  4:35                                   ` david
  2007-09-17  6:06                                     ` metastore Junio C Hamano
  2007-09-17 17:42                                   ` metastore Daniel Barkalow
  1 sibling, 1 reply; 72+ messages in thread
From: david @ 2007-09-17  4:35 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Junio C Hamano wrote:

> david@lang.hm writes:
>
>>> Post-checkout trigger is something I can say I can live with
>>> without looking at the actual patch, but that does not mean it
>>> would be a better approach at all.
>>
>> we agree on this much at least :-)
>>
>>> I would not be able to answer the first question right now; that
>>> needs a patch to prove that it can be done with a well contained
>>> set of changes that results in a maintainable code.
>>
>> you cannot answer the question in the affirmitive, but you could say
>> that any changes in that area would be completely unacceptable to you
>> (and for a while it sounded like you were saying exactly that). in
>> which case any effort put into preparing patches would be a waste of
>> time
>
> I tend to disagree.  It's far from a waste of time.  While, as I
> said, I am skeptical that such a patch would be small impact, if
> it helps people's needs, somebody will pick it up and carry
> forward, even if that somebody is not me.  It can then mature
> out of tree and later could be merged.  We simply do not know
> unless somebody tries.  And I am quite happy that you seem to be
> motivated enough to see how it goes.
>
> On the other hand, the experiment could fail and you may end up
> with a patch that is too messy to be acceptable, in which case
> you might feel it a waste of time, but I do not think it is a
> waste even in such a case.  We would learn what works and what
> doesn't, and we can bury "keeping track of /etc" topic to rest.

this is perfectly acceptable to me. I was trying to make very sure that 
this topic fell in this category.

there are other topics that come up repeatedly that do get (and deserve) 
automatic rejections ('patch to explicitly record renames' for example). 
and while I didn't think that 'managing /etc' was in the same category, 
sometimes that category is defined as much by the opinions and goals of 
the core team as it is by technical considerations.

there's a huge difference between 'this patch is rejected because we think 
the implementation is bad' and 'this patch is rejected because we disagree 
with the fundamental goal of the patch'. effort spent on a patch rejected 
for the first reason is never a complete waste (if nothing else it can 
serve as an example of how not to do things for future developers ;-) but 
effort spent on a patch that's rejected for the second reason is usually 
a waste, and as such I make it a point to discuss the objective and basic 
approach before spending much effort on something.

> I also need to rant here a bit.
>
> Fortunately we haven't had this problem too many times on this
> list, but sometimes people say "Here is my patch.  If this is
> accepted I'll add documentation and tests".  I rarely reply to
> such patches without sugarcoating my response, but my internal
> reaction is, "Don't you, as the person who proposes that change,
> believe in your patch deeply enough to be willing to perfect it,
> in order to make it suitable for consumption by the general
> public, whether it is included in my tree or not?  A change that
> even you do not believe in yourself has very little chance of
> benefitting the general public, so thanks but no thanks, I'll
> pass."
>

I hope that my questions did not seem to fall into this category.

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-17  4:35                                   ` metastore david
@ 2007-09-17  6:06                                     ` Junio C Hamano
  0 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2007-09-17  6:06 UTC (permalink / raw)
  To: david
  Cc: Johannes Schindelin, Daniel Barkalow, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

david@lang.hm writes:

> On Sun, 16 Sep 2007, Junio C Hamano wrote:
>
>> I also need to rant here a bit.
>>
>> Fortunately we haven't had this problem too many times on this
>> list, but sometimes people say "Here is my patch.  If this is
>> accepted I'll add documentation and tests".  I rarely reply to
>> such patches without sugarcoating my response, but my internal
>> reaction is, "Don't you, as the person who proposes that change,
>> believe in your patch deeply enough to be willing to perfect it,
>> in order to make it suitable for consumption by the general
>> public, whether it is included in my tree or not?  A change that
>> even you do not believe in yourself has very little chance of
>> benefitting the general public, so thanks but no thanks, I'll
>> pass."
>
> I hope that my questions did not seem to fall into this catagory.

Not at all.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-15 23:33                 ` metastore Randal L. Schwartz
  2007-09-16  0:37                   ` metastore david
@ 2007-09-17 13:04                   ` Francis Moreau
  2007-09-17 15:32                     ` metastore Randal L. Schwartz
  1 sibling, 1 reply; 72+ messages in thread
From: Francis Moreau @ 2007-09-17 13:04 UTC (permalink / raw)
  To: Randal L. Schwartz; +Cc: Grzegorz Kulewski, martin f krafft, git

Hello,

On 9/16/07, Randal L. Schwartz <merlyn@stonehenge.com> wrote:
> >>>>> "Grzegorz" == Grzegorz Kulewski <kangur@polcom.net> writes:
>
> Grzegorz> Not only for tracking /etc or /home but also for example for "web
> Grzegorz> applications" (for example in PHP). In that case file and directory
> Grzegorz> permissions can be as important as the source code tracked and it is pain to
> Grzegorz> chmod (and sometimes chown) all files to different values after each
> Grzegorz> checkout. Not speaking about potential race.
>
> Uh, works just fine for me to manage my web site content.  The point is
> that I treat git for what it is... a source code management system.
> And then I have a Makefile that "installs" my source code into the live
> directory, with the right modes during installation.
>
> Why does everyone keep wanting "work dir == live dir".  Ugh!  The work dir is
> the *source*... it gets *copied* into your live dir *somehow*.  And *that* is
> where the meta information needs to be.  In that "somehow".
>

Interesting. Could you show us what this makefile actually looks like?

How would you create a repo to track /etc? I'm thinking of importing
this directory by using tar; do you think that's correct?

thanks.
-- 
Francis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-16 22:37                             ` metastore david
@ 2007-09-17 13:30                               ` martin f krafft
  2007-09-17 17:17                                 ` metastore david
  0 siblings, 1 reply; 72+ messages in thread
From: martin f krafft @ 2007-09-17 13:30 UTC (permalink / raw)
  To: git
  Cc: david, Daniel Barkalow, Junio C Hamano, Johannes Schindelin,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

[-- Attachment #1: Type: text/plain, Size: 3285 bytes --]

also sprach david@lang.hm <david@lang.hm> [2007.09.17.0037 +0200]:
>> While we're at it, you probably don't even want to write the
>> permission file to the live filesystem. It's just one more thing
>> that could leak information, and changes to the permissions of
>> files that you record by committing the live filesystem would
>> presumably be done by changing the permissions of files in the
>> filesystem, not by changing the text file.
>
> the permissions and ACL's can be queried directly from the
> filesystem, so I don't see any security problems with writing the
> permission file to the filesystem.
>
> changing the permissions would be done by changing the files
> themselves (when you are running as root on a filesystem that
> supports the changes, otherwise it would need to fall back to
> writing the file and getting the changes there, but that should be
> able to be a local config option)
>
> I don't like the idea of having a file that doesn't appear on the
> local filesystem at any point, it just makes troubleshooting too
> hard.

Reading over your thoughts, I get this uneasy feeling about such
a permissions file, because it stores redundant information, and
redundant information has a tendency to get out of sync. If we
cannot attach attributes to objects in the git database, then
I understand the need for such a metastore. But I don't think it
should be checked out and visible, or maybe we should think of it
not in terms of a file anyway, but a metastore. Or how do you want
to resolve the situation when a user might edit the file, changing
a mode from 644 to 640, while in the filesystem, it was changed by
other means to 600.

.gitattributes is a different story since it stores git-specific
attributes, which are present nowhere else in the checkout.

I still maintain it would be best if git allowed extra data to be
attached to object nodes. When you start thinking about
cherry-picking or even simple merges, I think that makes most sense.
And we don't need conflict markers, we could employ an iterative
merge process as e.g. git-rebase uses:

  "a conflict has been found in the file mode of ...
   ... 2750 vs. 2755 ...
   please set the file mode as it should be and do git-merge
   --continue. Or git-merge --abort. ..."
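
A rough sketch of what such an iterative mode merge might decide, in
Python. This is only an illustration: git has no per-file attribute
merge of this kind, and merge_mode and ModeConflict are hypothetical
names. It applies the usual three-way rule (take the side that
changed, stop and ask if both sides changed to different values):

```python
class ModeConflict(Exception):
    """Both sides changed the mode, and to different values."""

    def __init__(self, ours, theirs):
        super().__init__("conflict in file mode: %o vs. %o" % (ours, theirs))
        self.ours = ours
        self.theirs = theirs


def merge_mode(base, ours, theirs):
    """Three-way merge of one file mode (hypothetical helper, not a git API)."""
    if ours == theirs:
        return ours      # both sides agree
    if ours == base:
        return theirs    # only their side changed the mode
    if theirs == base:
        return ours      # only our side changed the mode
    # Both changed it differently: the user has to set the mode and continue.
    raise ModeConflict(ours, theirs)


print("%o" % merge_mode(0o644, 0o644, 0o640))
try:
    merge_mode(0o2700, 0o2750, 0o2755)
except ModeConflict as err:
    print(err)
```

Only the last case would need the "set the mode and continue" step;
the other three resolve without user interaction.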

>> (Of course, you could check out the same commits as ordinary source, with
>> developer-owned 644 files and a 644 "permissions" file, and there you'd
>> have the permissions file appear in the work tree, and you could edit it
>> and check it in in a totally mundane way.)
>
> right, and the same thing if the filesystem doesn't support something in the 
> permission file.

I'd much rather see something like `git-attr chmod 644
file-in-index` to make this change, rather than a file, which
introduces the potential for syntax errors.

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

"to me, vi is zen. to use vi is to practice zen. every command is
 a koan. profound to the user, unintelligible to the uninitiated.
 you discover truth everytime you use it."
                                       -- reddy ät lion.austin.ibm.com

spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-16 15:51                     ` Jan Hudec
  2007-09-16 19:43                       ` david
@ 2007-09-17 13:31                       ` martin f krafft
  1 sibling, 0 replies; 72+ messages in thread
From: martin f krafft @ 2007-09-17 13:31 UTC (permalink / raw)
  To: git
  Cc: Jan Hudec, Johannes Schindelin, Daniel Barkalow,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

[-- Attachment #1: Type: text/plain, Size: 2121 bytes --]

also sprach Jan Hudec <bulb@ucw.cz> [2007.09.16.1751 +0200]:
> > If the filesystem does not support owners, chown() would not
> > exist. I actually tend to think of things the other way around:
> > instead of a fallback when chown() does not work (what would
> > such a fallback be other than not chown()ing?), it would only
> > try chown() if such functionality existed.
>
> There's a problem. You need to know that the functionality is
> missing and not try to read attributes back, but instead consider
> them unchanged. Nothing that can't be taken care of, but it needs
> to be handled carefully.

This is a good consideration. One way of implementing this seems to
be to iterate over all file attributes recorded in the object cache
(or metastore) and try to apply each. For every attribute that was
properly applied to the worktree, a note is attached to the object's
data in the index. Tools identifying differences between index and
worktree would then only pay attention to these attributes.

> But if you tar that up again, the owners will be different. But
> you don't want the change.

As per my above suggestion, this would solve itself. Untarring as
non-root simply means that the chmod/chown/whatever calls would fail
or not be tried at all. Thus, they would not be recorded in the
index and later commits would never consider changes to these
attributes.

One could probably simplify the implementation such that failure to
chmod/chown/whatever a single file would make the attribute be 
ignored when worktree and index are compared. Then, it would all 
boil down to a combination of configuration and functionality: the
attributes the user wants to have tracked (configuration) and those
which can be applied to the worktree when logically and'ed result in
the final mask of attributes to consider when identifying changes.
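
To make the masking idea concrete, here is a minimal sketch in
Python. Nothing in it exists in git: the attribute names, the
TRACKED_BY_CONFIG set and both helper functions are assumptions made
up for the illustration.

```python
# Hypothetical: what the user configured git to track.
TRACKED_BY_CONFIG = {"mode", "owner", "group"}


def effective_mask(tracked, applied):
    """Attributes to compare: wanted by config AND applicable to the worktree."""
    return set(tracked) & set(applied)


def changed_attributes(index_attrs, worktree_attrs, mask):
    """Names in mask whose values differ between index and worktree."""
    return {name for name in mask
            if index_attrs.get(name) != worktree_attrs.get(name)}


# Checked out as non-root: chmod worked, chown did not, so only
# "mode" counts as applied.
applied = {"mode"}
mask = effective_mask(TRACKED_BY_CONFIG, applied)

index_attrs = {"mode": 0o644, "owner": "root", "group": "root"}
worktree_attrs = {"mode": 0o600, "owner": "alice", "group": "users"}

# Owner and group differ too, but they are outside the mask.
print(sorted(changed_attributes(index_attrs, worktree_attrs, mask)))
```

With the values above only the mode shows up as a change; the
differing owner and group are ignored because they could never be
applied to this worktree.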

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

a gourmet concerned about calories
is like a punter eyeing the clock.

spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-17 13:04                   ` metastore Francis Moreau
@ 2007-09-17 15:32                     ` Randal L. Schwartz
  0 siblings, 0 replies; 72+ messages in thread
From: Randal L. Schwartz @ 2007-09-17 15:32 UTC (permalink / raw)
  To: Francis Moreau; +Cc: Grzegorz Kulewski, martin f krafft, git

>>>>> "Francis" == Francis Moreau <francis.moro@gmail.com> writes:

Francis> Interesting. Could you show us what this Makefile actually looks like?

In fact, I wrote a magazine article about it. :)

http://www.stonehenge.com/merlyn/LinuxMag/col38.html
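
For readers without the article at hand, the general "copy the source
into the live dir with the right modes" step could look roughly like
the Python sketch below. It is not the Makefile from the article; the
MANIFEST contents, the paths and the install() helper are all
hypothetical. Each file is copied to a temporary name, given its mode,
and only then renamed into place, so the live file never appears with
the wrong permissions.

```python
import os
import shutil
import stat
import tempfile

# Hypothetical manifest: path relative to the work dir, and the mode
# the installed copy should have.
MANIFEST = [("index.html", 0o644), ("cgi/form.cgi", 0o755)]


def install(src_root, live_root, manifest):
    """Copy each listed file into live_root and set its mode."""
    for rel, mode in manifest:
        src = os.path.join(src_root, rel)
        dst = os.path.join(live_root, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        tmp = dst + ".tmp"
        shutil.copyfile(src, tmp)   # contents only, no metadata
        os.chmod(tmp, mode)         # set the mode before it goes live
        os.replace(tmp, dst)        # atomic rename on the same filesystem


# Small demonstration in throwaway directories.
src_root = tempfile.mkdtemp()
live_root = tempfile.mkdtemp()
for rel, _ in MANIFEST:
    path = os.path.join(src_root, rel)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        fh.write("placeholder\n")

install(src_root, live_root, MANIFEST)
installed_modes = [
    stat.S_IMODE(os.stat(os.path.join(live_root, rel)).st_mode)
    for rel, _ in MANIFEST
]
print(["%o" % m for m in installed_modes])
```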

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-17 13:30                               ` metastore martin f krafft
@ 2007-09-17 17:17                                 ` david
  2007-09-17 19:46                                   ` metastore Josh England
  0 siblings, 1 reply; 72+ messages in thread
From: david @ 2007-09-17 17:17 UTC (permalink / raw)
  To: martin f krafft
  Cc: git, Daniel Barkalow, Junio C Hamano, Johannes Schindelin,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Mon, 17 Sep 2007, martin f krafft wrote:

> also sprach david@lang.hm <david@lang.hm> [2007.09.17.0037 +0200]:
>>> While we're at it, you probably don't even want to write the
>>> permission file to the live filesystem. It's just one more thing
>>> that could leak information, and changes to the permissions of
>>> files that you record by committing the live filesystem would
>>> presumably be done by changing the permissions of files in the
>>> filesystem, not by changing the text file.
>>
>> the permissions and ACL's can be queried directly from the
>> filesystem, so I don't see any security problems with writing the
>> permission file to the filesystem.
>>
>> changing the permissions would be done by changing the files
>> themselves (when you are running as root on a filesystem that
>> supports the changes, otherwise it would need to fall back to
>> writing the file and getting the changes there, but that should be
>> able to be a local config option)
>>
>> I don't like the idea of having a file that doesn't appear on the
>> local filesystem at any point, it just makes troubleshooting too
>> hard.
>
> Reading over your thoughts, I get this uneasy feeling about such
> a permissions file, because it stores redundant information, and
> redundant information has a tendency to get out of sync. If we
> cannot attach attributes to objects in the git database, then
> I understand the need for such a metastore. But I don't think it
> should be checked out and visible, or maybe we should think of it
> not in terms of a file anyway, but a metastore. Or how do you want
> to resolve the situation when a user might edit the file, changing
> a mode from 644 to 640, while in the filesystem, it was changed by
> other means to 600.

each local repository would need to be configured to either recreate the 
permissions file at checkin time or to use the permission file and ignore 
the actual permissions on the file.

while I agree that it would be ideal to store this data inside git, I'm 
more interested in getting a functional implementation, and given the 
reluctance of the git core team to allow any changes to support this 
use-case, anything that can be done to minimize the changes needed to 
support it is a good thing.

> .gitattributes is a different story since it stores git-specific
> attributes, which are present nowhere else in the checkout.
>
> I still maintain it would be best if git allowed extra data to be
> attached to object nodes. When you start thinking about
> cherry-picking or even simple merges, I think that makes most sense.
> And we don't need conflict markers, we could employ an iterative
> merge process as e.g. git-rebase uses:
>
>  "a conflict has been found in the file mode of ...
>   ... 2750 vs. 2755 ...
>   please set the file mode as it should be and do git-merge
>   --continue. Or git-merge --abort. ..."

and there's nothing to prevent the checkin hook from running such a 
comparison if you want it to.

>>> (Of course, you could check out the same commits as ordinary source, with
>>> developer-owned 644 files and a 644 "permissions" file, and there you'd
>>> have the permissions file appear in the work tree, and you could edit it
>>> and check it in in a totally mundane way.)
>>
>> right, and the same thing if the filesystem doesn't support something in the
>> permission file.
>
> I'd much rather see something like `git-attr chmod 644
> file-in-index` to make this change, rather than a file, which
> introduces the potential for syntax errors.

first make this usable, then if it starts getting used widely (which 
would not at all surprise me; many distros are looking for good options 
for doing this sort of thing, and I wouldn't be surprised to see several 
of them start using git if it did the job well) things can be moved from 
external scripts and storage to internal capabilities as appropriate.

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-17  4:23                                 ` metastore Junio C Hamano
  2007-09-17  4:35                                   ` metastore david
@ 2007-09-17 17:42                                   ` Daniel Barkalow
  2007-09-17 19:19                                     ` metastore Junio C Hamano
  1 sibling, 1 reply; 72+ messages in thread
From: Daniel Barkalow @ 2007-09-17 17:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: david, Johannes Schindelin, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

On Sun, 16 Sep 2007, Junio C Hamano wrote:

> david@lang.hm writes:
> 
> >> Post-checkout trigger is something I can say I can live with
> >> without looking at the actual patch, but that does not mean it
> >> would be a better approach at all.
> >
> > we agree on this much at least :-)
> >
> >> I would not be able to answer the first question right now; that
> >> needs a patch to prove that it can be done with a well contained
> >> set of changes that results in a maintainable code.
> >
> > you cannot answer the question in the affirmative, but you could say
> > that any changes in that area would be completely unacceptable to you
> > (and for a while it sounded like you were saying exactly that). in
> > which case any effort put into preparing patches would be a waste of
> > time
> 
> I tend to disagree.  It's far from a waste of time.  While, as I
> said, I am skeptical that such a patch would be small impact, if
> it helps people's needs, somebody will pick it up and carry
> forward, even if that somebody is not me.  It can then mature
> out of tree and later could be merged.  We simply do not know
> unless somebody tries.  And I am quite happy that you seem to be
> motivated enough to see how it goes.

There's certainly the possibility that a changeset could consist of some 
patches that make the index/filesystem handling more clear, some patches 
that make the tree/index handling more clear, and some patches that allow 
a hook to replace one of these entirely. Things can be a lot more 
acceptable if the intrusive changes are improvements for the 
maintainability of the normal case, and the special case code is no longer 
intrusive at all.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-17 17:42                                   ` metastore Daniel Barkalow
@ 2007-09-17 19:19                                     ` Junio C Hamano
  0 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2007-09-17 19:19 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: david, Johannes Schindelin, martin f krafft, git,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz,
	David Härdeman

Daniel Barkalow <barkalow@iabervon.org> writes:

> .... Things can be a lot more 
> acceptable if the intrusive changes are improvements for the 
> maintainability of the normal case, and the special case code is no longer 
> intrusive at all.

Very well said.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-09-17 17:17                                 ` metastore david
@ 2007-09-17 19:46                                   ` Josh England
  0 siblings, 0 replies; 72+ messages in thread
From: Josh England @ 2007-09-17 19:46 UTC (permalink / raw)
  To: david
  Cc: martin f krafft, git, Daniel Barkalow, Junio C Hamano,
	Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz, David Härdeman

I'd like to point out the following two posts, as I think they are
relevant to this thread:

[PATCH] example hook script to save/restore file permissions/ownership
http://marc.info/?l=git&m=118953004817642&w=2

[PATCH] post_merge hook, related documentation, and tests
http://marc.info/?l=git&m=118953004730496&w=2

The hook script above runs in a pre-commit hook to write out file
metadata to a file in the repository.  It can then be run from the
post-merge hook (patch above) to restore permissions.  Running it from a
post-checkout hook may be more appropriate, but post-merge seems to work
well for my purposes.  The script handles merge conflicts and (in my
testing) does the right thing.  I'm using it now to track metadata for
not just /etc, but an entire linux image.

It will handle merge conflicts by recognizing that the metadata file had
a conflict, and will direct the user to resolve the conflict and reset
working dir perms before allowing a commit.
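
For anyone who wants the idea without reading the patch, the
save/restore cycle is roughly the following. This Python sketch is
not the hook script from the patch above; the ".gitmeta" file name,
the one-line-per-file format and both function names are assumptions
chosen for the illustration. It skips symlinks and does not cope with
newlines in file names.

```python
import os
import stat
import tempfile


def save_metadata(root, listing):
    """Pre-commit side: record 'mode uid gid path' for every regular file."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or os.path.abspath(path) == os.path.abspath(listing):
                continue
            st = os.lstat(path)
            rel = os.path.relpath(path, root)
            lines.append("%o %d %d %s"
                         % (stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid, rel))
    with open(listing, "w") as fh:
        fh.write("".join(line + "\n" for line in lines))


def restore_metadata(root, listing):
    """Post-merge (or post-checkout) side: reapply the recorded metadata."""
    with open(listing) as fh:
        for line in fh:
            line = line.rstrip("\n")
            if not line:
                continue
            mode, uid, gid, rel = line.split(" ", 3)
            path = os.path.join(root, rel)
            if not os.path.exists(path):
                continue                      # file is gone in this tree
            os.chmod(path, int(mode, 8))
            try:
                os.chown(path, int(uid), int(gid))
            except (PermissionError, AttributeError):
                pass                          # not root, or no chown here


# Demonstration: record a mode, clobber it, restore it.
root = tempfile.mkdtemp()
target = os.path.join(root, "passwd")
with open(target, "w") as fh:
    fh.write("placeholder\n")
os.chmod(target, 0o640)
listing = os.path.join(root, ".gitmeta")
save_metadata(root, listing)
os.chmod(target, 0o600)
restore_metadata(root, listing)
restored_mode = stat.S_IMODE(os.stat(target).st_mode)
print("%o" % restored_mode)
```

Because the listing is an ordinary tracked text file, a merge conflict
in it shows up like any other conflict and has to be resolved before
the restore step is run.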

-JE

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-16  6:08                 ` martin f krafft
@ 2007-09-19 19:16                   ` David Härdeman
  2007-10-02 19:53                     ` martin f krafft
  0 siblings, 1 reply; 72+ messages in thread
From: David Härdeman @ 2007-09-19 19:16 UTC (permalink / raw)
  To: martin f krafft
  Cc: git, Daniel Barkalow, Johannes Schindelin, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz

On Sun, Sep 16, 2007 at 08:08:59AM +0200, martin f krafft wrote:
>also sprach Daniel Barkalow <barkalow@iabervon.org> [2007.09.15.2156 +0200]:
>> Configuration options only apply to the local aspects of the repository. 
>> That is, when you clone a repository, you don't get the configuration 
>> options from it, in general. And changing configuration options on a 
>> repository does not have any effect on the content it contains. So 
>> configuration options aren't appropriate.
>
>Sure they are. Just like git-commit figures out your email address 
>if user.email is missing from git-config, or core.sharedRepository 
>or core.umask deal with permissions only when you tell them to, 
>you'd have to enable core.track or else git would just do what it
>does right now.
>
>> Git doesn't have any way to represent owners or groups, and they
>> would need to be represented carefully in order to make sense
>> across multiple computers. If you're adding support for
>> metadata-as-content (for more than "is this a script?"), you
>> should be able to cover all of the common cases of extended stuff,
>> like AFS-style ACLs.
>
>Ideally, git should be able to store an open-ended number of
>properties for each object, yes.

I haven't followed the discussion at all I must admit (I wrote metastore 
as a quick hack to store some extended metadata and it works for my 
purposes as long as I don't do anything fancy). But I agree, if any 
changes were made to git, I'd advocate adding arbitrary attributes to 
files (much like xattrs) in name=value pairs, then any extended metadata 
could be stored in those attributes and external scripts/tools could use 
them in some way that makes sense...and also make sure to only update 
them when it makes sense.

-- 
David Härdeman

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-09-19 19:16                   ` David Härdeman
@ 2007-10-02 19:53                     ` martin f krafft
  2007-10-02 19:58                       ` David Härdeman
  0 siblings, 1 reply; 72+ messages in thread
From: martin f krafft @ 2007-10-02 19:53 UTC (permalink / raw)
  To: git
  Cc: David Härdeman, Daniel Barkalow, Johannes Schindelin,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz

[-- Attachment #1: Type: text/plain, Size: 774 bytes --]

also sprach David Härdeman <david@hardeman.nu> [2007.09.19.2016 +0100]:
> But I agree, if any changes were made to git, I'd advocate adding
> arbitrary attributes to files (much like xattrs) in name=value
> pairs, then any extended metadata could be stored in those
> attributes and external scripts/tools could use them in some way
> that makes sense...and also make sure to only update them when it
> makes sense.

So where would those metadata be stored in your opinion?

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
 
seen on an advertising for an elaborate swiss men's watch:
  "almost as complicated as a woman. except it's on time"
 
spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-10-02 19:53                     ` martin f krafft
@ 2007-10-02 19:58                       ` David Härdeman
  2007-10-02 20:04                         ` metastore David Kastrup
  2007-10-02 21:02                         ` metastore (was: Track /etc directory using Git) Daniel Barkalow
  0 siblings, 2 replies; 72+ messages in thread
From: David Härdeman @ 2007-10-02 19:58 UTC (permalink / raw)
  To: martin f krafft
  Cc: git, Daniel Barkalow, Johannes Schindelin, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz

On Tue, Oct 02, 2007 at 08:53:01PM +0100, martin f krafft wrote:
>also sprach David Härdeman <david@hardeman.nu> [2007.09.19.2016 +0100]:
>> But I agree, if any changes were made to git, I'd advocate adding
>> arbitrary attributes to files (much like xattrs) in name=value
>> pairs, then any extended metadata could be stored in those
>> attributes and external scripts/tools could use them in some way
>> that makes sense...and also make sure to only update them when it
>> makes sense.
>
>So where would those metadata be stored in your opinion?

I'm not sufficiently versed in the internals of git to have an informed 
opinion :)

-- 
David Härdeman

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 19:58                       ` David Härdeman
@ 2007-10-02 20:04                         ` David Kastrup
  2007-10-02 20:18                           ` metastore david
  2007-10-02 21:15                           ` metastore David Härdeman
  2007-10-02 21:02                         ` metastore (was: Track /etc directory using Git) Daniel Barkalow
  1 sibling, 2 replies; 72+ messages in thread
From: David Kastrup @ 2007-10-02 20:04 UTC (permalink / raw)
  To: David Härdeman
  Cc: martin f krafft, git, Daniel Barkalow, Johannes Schindelin,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz

David Härdeman <david@hardeman.nu> writes:

> On Tue, Oct 02, 2007 at 08:53:01PM +0100, martin f krafft wrote:
>>also sprach David Härdeman <david@hardeman.nu> [2007.09.19.2016 +0100]:
>>> But I agree, if any changes were made to git, I'd advocate adding
>>> arbitrary attributes to files (much like xattrs) in name=value
>>> pairs, then any extended metadata could be stored in those
>>> attributes and external scripts/tools could use them in some way
>>> that makes sense...and also make sure to only update them when it
>>> makes sense.
>>
>>So where would those metadata be stored in your opinion?
>
> I'm not sufficiently versed in the internals of git to have an
> informed opinion :)

I think we have something like a length count for file names in index
and/or tree.  We could just put the (sorted) attributes after a NUL
byte in the file name and include them in the count.  It would also
make those artificially longer file names work more or less when
sorting them for deltification.

However, this requires implementing _policies_: it must be possible to
specify per repository exactly what will and what won't get tracked,
or one will get conflicts that are not necessary or appropriate.
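
As a rough illustration of the encoding (a Python sketch, not git's
actual index or tree format), the attributes could be appended to the
name like this. Separating the individual name=value pairs with
further NUL bytes is my assumption for the sketch, as are the function
names. Sorting makes equal attribute sets produce identical bytes.

```python
def encode_entry(path, attrs):
    """Return (length, bytes) for 'path NUL name=value NUL name=value ...'."""
    parts = [path.encode("utf-8")]
    for name in sorted(attrs):
        parts.append(("%s=%s" % (name, attrs[name])).encode("utf-8"))
    data = b"\0".join(parts)
    return len(data), data       # the count covers name and attributes


def decode_entry(data):
    """Split an encoded entry back into (path, attrs)."""
    fields = data.split(b"\0")
    path = fields[0].decode("utf-8")
    attrs = dict(f.decode("utf-8").split("=", 1) for f in fields[1:])
    return path, attrs


length, data = encode_entry("etc/shadow", {"owner": "root", "mode": "640"})
print(length, data)
print(decode_entry(data))
```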

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 20:04                         ` metastore David Kastrup
@ 2007-10-02 20:18                           ` david
  2007-10-02 20:23                             ` metastore martin f krafft
  2007-10-02 21:15                           ` metastore David Härdeman
  1 sibling, 1 reply; 72+ messages in thread
From: david @ 2007-10-02 20:18 UTC (permalink / raw)
  To: David Kastrup
  Cc: David Härdeman, martin f krafft, git, Daniel Barkalow,
	Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1461 bytes --]

On Tue, 2 Oct 2007, David Kastrup wrote:

> David Härdeman <david@hardeman.nu> writes:
>
>> On Tue, Oct 02, 2007 at 08:53:01PM +0100, martin f krafft wrote:
>>> also sprach David Härdeman <david@hardeman.nu> [2007.09.19.2016 +0100]:
>>>> But I agree, if any changes were made to git, I'd advocate adding
>>>> arbitrary attributes to files (much like xattrs) in name=value
>>>> pairs, then any extended metadata could be stored in those
>>>> attributes and external scripts/tools could use them in some way
>>>> that makes sense...and also make sure to only update them when it
>>>> makes sense.
>>>
>>> So where would those metadata be stored in your opinion?
>>
>> I'm not sufficiently versed in the internals of git to have an
>> informed opinion :)
>
> I think we have something like a length count for file names in index
> and/or tree.  We could just put the (sorted) attributes after a NUL
> byte in the file name and include them in the count.  It would also
> make those artificially longer file names work more or less when
> sorting them for deltification.

the problem with this is dealing with the attributes outside of git 
(especially when the filesystem can't store the attributes natively, 
specifically including things like owners when not running as root)

this is one of the reasons for talking about using a separate file for 
the attributes (the other being the ability to minimize the impact to 
git-core of tracking attributes)

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 20:18                           ` metastore david
@ 2007-10-02 20:23                             ` martin f krafft
  2007-10-02 20:29                               ` metastore david
  0 siblings, 1 reply; 72+ messages in thread
From: martin f krafft @ 2007-10-02 20:23 UTC (permalink / raw)
  To: david, David Kastrup, David Härdeman, git, Daniel Barkalow

[-- Attachment #1: Type: text/plain, Size: 745 bytes --]

also sprach david@lang.hm <david@lang.hm> [2007.10.02.2118 +0100]:
> the problem with this is dealing with the attributes outside of git 
> (especially when the filesystem can't store the attributes natively, 
> specifically including things like owners when not running as root)

In which case you should not be able to manipulate them (as you
could not test the result) and any commits could not affect them,
meaning they'd just stay unchanged.

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
 
the unix philosophy basically involves
giving you enough rope to hang yourself.
and then some more, just to be sure.
 
spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 20:23                             ` metastore martin f krafft
@ 2007-10-02 20:29                               ` david
  2007-10-02 20:39                                 ` metastore martin f krafft
  0 siblings, 1 reply; 72+ messages in thread
From: david @ 2007-10-02 20:29 UTC (permalink / raw)
  To: martin f krafft
  Cc: David Kastrup, David Härdeman, git, Daniel Barkalow,
	Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz

On Tue, 2 Oct 2007, martin f krafft wrote:

> also sprach david@lang.hm <david@lang.hm> [2007.10.02.2118 +0100]:
>> the problem with this is dealing with the attributes outside of git
>> (especially when the filesystem can't store the attributes natively,
>> specifically including things like owners when not running as root)
>
> In which case you should not be able to manipulate them (as you
> could not test the result) and any commits could not affect them,
> meaning they'd just stay unchanged.

two problems with this

1. you do want to be able to manipulate them

1a. how do you reconcile a conflict during a merge?

2. git is a series of snapshots, what does it mean to 'stay unchanged'?

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 20:29                               ` metastore david
@ 2007-10-02 20:39                                 ` martin f krafft
  2007-10-02 20:54                                   ` metastore david
  0 siblings, 1 reply; 72+ messages in thread
From: martin f krafft @ 2007-10-02 20:39 UTC (permalink / raw)
  To: david, David Kastrup, David Härdeman, git, Daniel Barkalow

[-- Attachment #1: Type: text/plain, Size: 963 bytes --]

also sprach david@lang.hm <david@lang.hm> [2007.10.02.2129 +0100]:
> 1. you do want to be able to manipulate them
>
> 1a. how do you reconcile a conflict during a merge?

How could there be a conflict if you can't make local changes
because you can't represent the attributes locally/natively?

> 2. git is a series of snapshots, what does it mean to 'stay unchanged'?

In simple terms, let (content,A,B) be an object with content
"content" and extended attributes A,B, and B cannot be represented
locally, but a new object is committed with a change to attribute
A (content2,A2), then the result is (content2,A2,B), as B simply
comes from the (corresponding object of the) parent.

Or am I totally misunderstanding?
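
Spelled out as a small Python sketch (purely illustrative; git stores
no such attributes, and commit_attrs is a hypothetical name), the rule
would be: attributes the local system can represent are taken from the
worktree, everything else is carried over from the parent. The sketch
assumes a single parent.

```python
def commit_attrs(parent_attrs, worktree_attrs, representable):
    """Attributes for the new object, given one parent.

    Representable attributes come from the worktree (and vanish if the
    worktree no longer has them); the rest are inherited unchanged.
    """
    result = {}
    for name in set(parent_attrs) | set(worktree_attrs):
        if name in representable:
            if name in worktree_attrs:
                result[name] = worktree_attrs[name]
        elif name in parent_attrs:
            result[name] = parent_attrs[name]
    return result


parent = {"A": "a1", "B": "b1"}   # (content, A, B)
worktree = {"A": "a2"}            # B cannot be represented locally
new_attrs = commit_attrs(parent, worktree, representable={"A"})
print(sorted(new_attrs.items()))  # (content2, A2, B)
```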

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

when compared to windoze, unix is an operating system.

spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 20:39                                 ` metastore martin f krafft
@ 2007-10-02 20:54                                   ` david
  2007-10-02 21:42                                     ` metastore martin f krafft
  0 siblings, 1 reply; 72+ messages in thread
From: david @ 2007-10-02 20:54 UTC (permalink / raw)
  To: martin f krafft
  Cc: David Kastrup, David Härdeman, git, Daniel Barkalow,
	Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz

On Tue, 2 Oct 2007, martin f krafft wrote:

> also sprach david@lang.hm <david@lang.hm> [2007.10.02.2129 +0100]:
>> 1. you do want to be able to manipulate them
>>
>> 1a. how do you reconcile a conflict during a merge?
>
> How could there be a conflict if you can't make local changes
> because you can't represent the attributes locally/natively?

you merge two upstream branches that disagree about the attributes

>> 2. git is a series of snapshots, what does it mean to 'stay unchanged'?
>
> In simple terms, let (content,A,B) be an object with content
> "content" and extended attributes A,B, and B cannot be represented
> locally, but a new object is committed with a change to attribute
> A (content2,A2), then the result is (content2,A2,B), as B simply
> comes from the (corresponding object of the) parent.
>
> Or am I totally misunderstanding?

it's very possible that I am misunderstanding, but do we really want to 
have to go back to the parent to duplicate things when creating a new 
commit?

and aren't you supposed to be able to have more than one parent? if you 
do, which one would you use?

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore (was: Track /etc directory using Git)
  2007-10-02 19:58                       ` David Härdeman
  2007-10-02 20:04                         ` metastore David Kastrup
@ 2007-10-02 21:02                         ` Daniel Barkalow
  1 sibling, 0 replies; 72+ messages in thread
From: Daniel Barkalow @ 2007-10-02 21:02 UTC (permalink / raw)
  To: David Härdeman
  Cc: martin f krafft, git, Johannes Schindelin, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1796 bytes --]

On Tue, 2 Oct 2007, David Härdeman wrote:

> On Tue, Oct 02, 2007 at 08:53:01PM +0100, martin f krafft wrote:
> >also sprach David Härdeman <david@hardeman.nu> [2007.09.19.2016 +0100]:
> > > But I agree, if any changes were made to git, I'd advocate adding
> > > arbitrary attributes to files (much like xattrs) in name=value
> > > pairs, then any extended metadata could be stored in those
> > > attributes and external scripts/tools could use them in some way
> > > that makes sense...and also make sure to only update them when it
> > > makes sense.
> >
> >So where would those metadata be stored in your opinion?
> 
> I'm not sufficiently versed in the internals of git to have an informed
> opinion :)

My theory was that we would provide an API for getting the "current state" 
listing with all of the filenames and matching contents, and leave it up 
to metastore to put things in the filesystem; in the other direction, 
metastore would build up this state, and we'd store it.

People who are using this in practice would set a config option to 
delegate the "working tree" filesystem I/O to metastore, while other 
people could interact with the state as files describing the state, and 
could therefore specify operations that are impossible or prohibited on 
the filesystems that their development is done on.

(This would effectively be like giving people a convenient way of setting 
attributes on entries in a tar file, such that they can edit it to 
represent a state that they can't necessarily create in their own 
filesystems, and version controlling that; but more convenient, since the 
file contents are represented as file contents and the attributes are 
plain text in a listing of some sort)

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 20:04                         ` metastore David Kastrup
  2007-10-02 20:18                           ` metastore david
@ 2007-10-02 21:15                           ` David Härdeman
  2007-10-02 21:44                             ` metastore martin f krafft
  2007-10-02 23:32                             ` metastore Julian Phillips
  1 sibling, 2 replies; 72+ messages in thread
From: David Härdeman @ 2007-10-02 21:15 UTC (permalink / raw)
  To: David Kastrup
  Cc: martin f krafft, git, Daniel Barkalow, Johannes Schindelin,
	Thomas Harning Jr., Francis Moreau, Nicolas Vilz

On Tue, Oct 02, 2007 at 10:04:56PM +0200, David Kastrup wrote:
>David Härdeman <david@hardeman.nu> writes:
>
>> On Tue, Oct 02, 2007 at 08:53:01PM +0100, martin f krafft wrote:
>>>also sprach David Härdeman <david@hardeman.nu> [2007.09.19.2016 +0100]:
>>>> But I agree, if any changes were made to git, I'd advocate adding
>>>> arbitrary attributes to files (much like xattrs) in name=value
>>>> pairs, then any extended metadata could be stored in those
>>>> attributes and external scripts/tools could use them in some way
>>>> that makes sense...and also make sure to only update them when it
>>>> makes sense.
>>>
>>>So where would those metadata be stored in your opinion?
>>
>> I'm not sufficiently versed in the internals of git to have an
>> informed opinion :)
>
>I think we have something like a length count for file names in index
>and/or tree.  We could just put the (sorted) attributes after a NUL
>byte in the file name and include them in the count.  It would also
>make those artificially longer file names work more or less when
>sorting them for deltification.

Or perhaps the index format could be extended to include a new field for 
name=value pairs instead of overloading the name field.

But as I said, I have no idea how feasible it would be to change git to 
support another arbitrary length field in the index/tree file.
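
To make the quoted NUL-byte idea concrete, here is a rough sketch, assuming UTF-8 names and values that contain no NUL bytes. The layout and the function names are illustrative assumptions and do not reflect git's actual index or tree format:

```python
def pack_name(path, attrs):
    """Append sorted name=value pairs to a path, separated by NUL bytes.

    Returns (length, packed bytes); the length covers the attributes too,
    as in the 'include them in the count' suggestion.
    """
    parts = [path.encode("utf-8")]
    for name in sorted(attrs):
        parts.append(f"{name}={attrs[name]}".encode("utf-8"))
    packed = b"\0".join(parts)
    return len(packed), packed


def unpack_name(packed):
    """Split a packed entry back into (path, attribute dict)."""
    path, *pairs = packed.split(b"\0")
    attrs = dict(pair.decode("utf-8").split("=", 1) for pair in pairs)
    return path.decode("utf-8"), attrs
```

Because the path comes first and the attributes are sorted, two entries for the same file with the same attributes produce identical bytes. A separate field would avoid having every consumer of the name learn to stop at the first NUL.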

>However, this requires implementing _policies_: it must be possible to
>specify per repository exactly what will and what won't get tracked,
>or one will get conflicts that are not necessary or appropriate.

I think the opposite approach would be better. Let git provide 
set/get/delete attribute operations and leave it at that. Then external 
programs can do what they want with that data and add/remove/modify tags 
as necessary (and also include the smarts to not, e.g. remove the 
permissions on all files if the git repo is checked out to a FAT fs).
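
A small sketch of that split, assuming an in-memory store. AttrStore and record_mode are hypothetical names and not an existing git interface. The store only offers set/get/delete, and the policy lives in the external tool:

```python
class AttrStore:
    """Dumb storage: name=value attributes per path, no policy at all."""

    def __init__(self):
        self._attrs = {}

    def set(self, path, name, value):
        self._attrs.setdefault(path, {})[name] = value

    def get(self, path, name, default=None):
        return self._attrs.get(path, {}).get(name, default)

    def delete(self, path, name):
        self._attrs.get(path, {}).pop(name, None)


def record_mode(store, path, observed_mode, fs_has_permissions):
    """External-tool policy: only trust the observed mode when the
    filesystem can actually represent permissions (i.e. not FAT)."""
    if fs_has_permissions:
        store.set(path, "mode", observed_mode)
    # Otherwise leave the stored value alone instead of clobbering it.
```

On a checkout to a FAT filesystem the tool would call record_mode with fs_has_permissions=False, so the stored permissions survive untouched.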

-- 
David Härdeman

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 20:54                                   ` metastore david
@ 2007-10-02 21:42                                     ` martin f krafft
  0 siblings, 0 replies; 72+ messages in thread
From: martin f krafft @ 2007-10-02 21:42 UTC (permalink / raw)
  To: git
  Cc: david, David Kastrup, David Härdeman, Daniel Barkalow,
	Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz

[-- Attachment #1: Type: text/plain, Size: 636 bytes --]

also sprach david@lang.hm <david@lang.hm> [2007.10.02.2154 +0100]:
>> How could there be a conflict if you can't make local changes
>> because you can't represent the attributes locally/natively?
>
> you merge two upstream branches that disagree about the attributes

You win. :)

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
 
"mein gott, selbst ein huhn kann debian installieren, wenn du genug
 koerner auf die enter-taste legst."
                       -- thomas koehler in de.alt.sysadmin.recovery
 
spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 21:15                           ` metastore David Härdeman
@ 2007-10-02 21:44                             ` martin f krafft
  2007-10-02 23:32                             ` metastore Julian Phillips
  1 sibling, 0 replies; 72+ messages in thread
From: martin f krafft @ 2007-10-02 21:44 UTC (permalink / raw)
  To: git
  Cc: David Härdeman, David Kastrup, Daniel Barkalow,
	Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz

[-- Attachment #1: Type: text/plain, Size: 482 bytes --]

also sprach David Härdeman <david@hardeman.nu> [2007.10.02.2215 +0100]:
> I think the opposite approach would be better. Let git provide
> set/get/delete attribute operations and leave it at that.

I like that idea.

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
 
"information superhighway"
 is just an anagram for
"i'm on a huge wispy rhino fart".
 
spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 21:15                           ` metastore David Härdeman
  2007-10-02 21:44                             ` metastore martin f krafft
@ 2007-10-02 23:32                             ` Julian Phillips
  2007-10-03  0:52                               ` metastore david
  1 sibling, 1 reply; 72+ messages in thread
From: Julian Phillips @ 2007-10-02 23:32 UTC (permalink / raw)
  To: David Härdeman
  Cc: David Kastrup, martin f krafft, git, Daniel Barkalow,
	Johannes Schindelin, Thomas Harning Jr., Francis Moreau,
	Nicolas Vilz

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3128 bytes --]

On Tue, 2 Oct 2007, David Härdeman wrote:

> On Tue, Oct 02, 2007 at 10:04:56PM +0200, David Kastrup wrote:
>> David Härdeman <david@hardeman.nu> writes:
>> 
>> >  On Tue, Oct 02, 2007 at 08:53:01PM +0100, martin f krafft wrote:
>> > > also sprach David Härdeman <david@hardeman.nu> [2007.09.19.2016 +0100]:
>> > > >  But I agree, if any changes were made to git, I'd advocate adding
>> > > >  arbitrary attributes to files (much like xattrs) in name=value
>> > > >  pairs, then any extended metadata could be stored in those
>> > > >  attributes and external scripts/tools could use them in some way
>> > > >  that makes sense...and also make sure to only update them when it
>> > > >  makes sense.
>> > > 
>> > > So where would those metadata be stored in your opinion?
>> > 
>> >  I'm not sufficiently versed in the internals of git to have an
>> >  informed opinion :)
>> 
>> I think we have something like a length count for file names in index
>> and/or tree.  We could just put the (sorted) attributes after a NUL
>> byte in the file name and include them in the count.  It would also
>> make those artificially longer file names work more or less when
>> sorting them for deltification.
>
> Or perhaps the index format could be extended to include a new field for 
> name=value pairs instead of overloading the name field.
>
> But as I said, I have no idea how feasible it would be to change git to 
> support another arbitrary length field in the index/tree file.
>
>> However, this requires implementing _policies_: it must be possible to
>> specify per repository exactly what will and what won't get tracked,
>> or one will get conflicts that are not necessary or appropriate.
>
> I think the opposite approach would be better. Let git provide set/get/delete 
> attribute operations and leave it at that. Then external programs can do what 
> they want with that data and add/remove/modify tags as necessary (and also 
> include the smarts to not, e.g. remove the permissions on all files if the 
> git repo is checked out to a FAT fs).

You need more than that.  You need to be able to log, blame etc on the 
attributes.  One of the big annoyances of Subversion properties is being 
unable to find out when or why a property value was changed.

I still don't see why the attributes need to be stored in git directly - 
particularly if you are going to use an external program to actually apply 
any settings - why not store the attributes as a normal file (or files) of 
some sort tracked by git?  You could use any number of methods - e.g. use 
an sqlite database stored in the root of your tree, or a .<name>.props 
file alongside each path that you have properties for.  You could even 
write a system that uses such a method and was then SCM agnostic, allowing 
you to keep your attribute tracking system if/when something better than 
git comes along - or simply share it with less-fortunate souls stuck in an 
inferior system.
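
As a rough illustration of the sidecar-file variant, assuming one name=value pair per line and a ".<name>.props" file next to each path. The exact file format and the helper names are assumptions made for this sketch:

```python
from pathlib import Path


def props_path(path):
    """Return the sidecar file for a path: dir/name -> dir/.name.props"""
    p = Path(path)
    return p.with_name(f".{p.name}.props")


def write_props(path, attrs):
    """Write attributes sorted by name so diffs and blame stay stable."""
    lines = [f"{name}={attrs[name]}\n" for name in sorted(attrs)]
    props_path(path).write_text("".join(lines), encoding="utf-8")


def read_props(path):
    """Read attributes back; a missing sidecar means no attributes."""
    sidecar = props_path(path)
    if not sidecar.exists():
        return {}
    text = sidecar.read_text(encoding="utf-8")
    return dict(line.split("=", 1) for line in text.splitlines() if line)
```

Since the sidecar is an ordinary tracked file, the usual log and blame tooling of whatever SCM holds the tree applies to attribute changes without any special support.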

-- 
Julian

  ---
A strong conviction that something must be done is the parent of many
bad measures.
 		-- Daniel Webster

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-03  0:52                               ` metastore david
@ 2007-10-03  0:52                                 ` Johannes Schindelin
  0 siblings, 0 replies; 72+ messages in thread
From: Johannes Schindelin @ 2007-10-03  0:52 UTC (permalink / raw)
  To: david
  Cc: Julian Phillips, David Härdeman, martin f krafft, git,
	Daniel Barkalow, Thomas Harning Jr., Francis Moreau, Nicolas Vilz

Hi,

On Tue, 2 Oct 2007, david@lang.hm wrote:

> in the discussion a few weeks ago I was told that there is a way to look 
> at the contents of a file that hasn't been checked out yet (somehow it 
> exists in a useable form 'in the index') but when I asked for 
> information about how to do this I never got a response.

git show :<filename>

(Note the ":")
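
For a tool that needs this from a script, a minimal wrapper might look like the following. It assumes git is on PATH and that the path is given relative to the top of the repository; the helper names are made up for the sketch:

```python
import subprocess


def index_blob_args(path):
    """Build the command line; ':<path>' names the blob in the index."""
    return ["git", "show", f":{path}"]


def read_from_index(path, repo="."):
    """Return the staged contents of path as bytes without checking it out."""
    done = subprocess.run(
        index_blob_args(path), cwd=repo, check=True, capture_output=True
    )
    return done.stdout
```

For example, read_from_index(".gitattributes") would return the staged contents of that file even when it is not present in the working tree, and raise CalledProcessError if the path is not in the index.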

Hth,
Dscho

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: metastore
  2007-10-02 23:32                             ` metastore Julian Phillips
@ 2007-10-03  0:52                               ` david
  2007-10-03  0:52                                 ` metastore Johannes Schindelin
  0 siblings, 1 reply; 72+ messages in thread
From: david @ 2007-10-03  0:52 UTC (permalink / raw)
  To: Julian Phillips
  Cc: David Härdeman, David Kastrup, martin f krafft, git,
	Daniel Barkalow, Johannes Schindelin, Thomas Harning Jr.,
	Francis Moreau, Nicolas Vilz

[-- Attachment #1: Type: TEXT/PLAIN, Size: 4935 bytes --]

On Wed, 3 Oct 2007, Julian Phillips wrote:

> Subject: Re: metastore
> 
> On Tue, 2 Oct 2007, David Härdeman wrote:
>
>> On Tue, Oct 02, 2007 at 10:04:56PM +0200, David Kastrup wrote:
>>> David Härdeman <david@hardeman.nu> writes:
>>> 
>>> >  On Tue, Oct 02, 2007 at 08:53:01PM +0100, martin f krafft wrote:
>>> > > also sprach David Härdeman <david@hardeman.nu> [2007.09.19.2016 +0100]:
>>> > > >  But I agree, if any changes were made to git, I'd advocate adding
>>> > > >  arbitrary attributes to files (much like xattrs) in name=value
>>> > > >  pairs, then any extended metadata could be stored in those
>>> > > >  attributes and external scripts/tools could use them in some way
>>> > > >  that makes sense...and also make sure to only update them when it
>>> > > >  makes sense.
>>> > > So where would those metadata be stored in your opinion?
>>> >  I'm not sufficiently versed in the internals of git to have an
>>> >  informed opinion :)
>>> 
>>> I think we have something like a length count for file names in index
>>> and/or tree.  We could just put the (sorted) attributes after a NUL
>>> byte in the file name and include them in the count.  It would also
>>> make those artificially longer file names work more or less when
>>> sorting them for deltification.
>> 
>> Or perhaps the index format could be extended to include a new field for 
>> name=value pairs instead of overloading the name field.
>> 
>> But as I said, I have no idea how feasible it would be to change git to 
>> support another arbitrary length field in the index/tree file.
>> 
>>> However, this requires implementing _policies_: it must be possible to
>>> specify per repository exactly what will and what won't get tracked,
>>> or one will get conflicts that are not necessary or appropriate.
>> 
>> I think the opposite approach would be better. Let git provide 
>> set/get/delete attribute operations and leave it at that. Then external 
>> programs can do what they want with that data and add/remove/modify tags as 
>> necessary (and also include the smarts to not, e.g. remove the permissions 
>> on all files if the git repo is checked out to a FAT fs).
>
> You need more than that.  You need to be able to log, blame etc on the 
> attributes.  One of the big annoyances of Subversion properties is being 
> unable to find out when or why a property value was changed.
>
> I still don't see why the attributes need to be stored in git directly - 
> particularly if you are going to use an external program to actually apply 
> any settings - why not store the attributes as normal file (or files) of some 
> sort tracked by git?  You could use any number of methods - e.g. use an 
> sqlite database stored in the root of your tree, or a .<name>.props file 
> alongside each path that you have properties for.  You could even write a 
> system that uses such a method and was then SCM agnostic, allowing you to 
> keep your attribute tracking system if/when something better than git comes 
> along - or simply share it with less-fortunate souls stuck in an inferior 
> system.

one other big advantage of keeping things in a normal file, it's easier to 
get the results accepted into git!

don't forget that the core git maintainers don't really see this as a 
worthwhile effort, so the more intrusive the result is the less likely it 
is to be accepted. It may end up that storing the attributes inside of git 
_is_ the best thing to do, but it's going to be a whole lot easier to get a 
patch to implement this accepted if it's a migration from an existing, 
heavily used, implementation than if it's from the 'outside' with people 
saying "this is a neat thing, we think people would use it if it only had 
this"

and even if an internal implementation does end up being the right thing, 
the exact shape of the API is an item that will require a lot of debate 
(and probably a few false starts) to get right. let's figure out the 
real-world usage patterns first, and then work from there as appropriate.

shifting back onto implementation details

in the discussion a few weeks ago I was told that there is a way to look 
at the contents of a file that hasn't been checked out yet (somehow it 
exists in a useable form 'in the index') but when I asked for information 
about how to do this I never got a response.

the reason for needing this is that the routines writing the files need to 
be able to access this information when they are doing so, but that file 
may not be checked out.

for that matter, .gitattributes should have a similar problem (if 
.gitattributes for a directory hasn't been checked out yet how do you know 
if you could do the line ending conversions on a file or not?). how is the 
problem addressed there? (or is it the case that all the use so far has 
really not used the per-directory files and everything is in the master 
file, and that doesn't change enough to find these problems?)

David Lang

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2007-10-03  0:53 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <38b2ab8a0709130511q7a506c5cvb0f8785a1d7ed7ad@mail.gmail.com>
     [not found] ` <20070913123137.GA31735@piper.oerlikon.madduck.net>
     [not found]   ` <38b2ab8a0709140108v2a9c3569i93b39f351f1d4ec3@mail.gmail.com>
     [not found]     ` <20070914091545.GA26432@piper.oerlikon.madduck.net>
2007-09-14 17:31       ` Track /etc directory using Git Thomas Harning Jr.
2007-09-14 21:26         ` Nicolas Vilz
2007-09-15 14:29           ` Pierre Habouzit
2007-09-15 15:24             ` martin f krafft
2007-09-15 15:27               ` Pierre Habouzit
2007-09-15 15:42                 ` martin f krafft
2007-09-15 13:26         ` metastore (was: Track /etc directory using Git) martin f krafft
2007-09-15 14:10           ` Johannes Schindelin
2007-09-15 14:16             ` metastore David Kastrup
2007-09-15 14:54             ` metastore (was: Track /etc directory using Git) martin f krafft
2007-09-15 16:22               ` Grzegorz Kulewski
2007-09-15 17:43                 ` Johannes Schindelin
2007-09-15 23:33                 ` metastore Randal L. Schwartz
2007-09-16  0:37                   ` metastore david
2007-09-16  1:10                     ` metastore Randal L. Schwartz
2007-09-16  1:49                       ` metastore david
2007-09-17 13:04                   ` metastore Francis Moreau
2007-09-17 15:32                     ` metastore Randal L. Schwartz
2007-09-15 19:56               ` metastore (was: Track /etc directory using Git) Daniel Barkalow
2007-09-15 22:14                 ` Johannes Schindelin
2007-09-16  1:30                   ` david
2007-09-16  2:48                     ` Johannes Schindelin
2007-09-16  3:00                       ` david
2007-09-16  8:06                     ` metastore Junio C Hamano
2007-09-16  8:30                       ` metastore David Kastrup
2007-09-16 20:19                         ` metastore david
2007-09-16 15:51                       ` metastore Daniel Barkalow
2007-09-16 21:12                         ` metastore david
2007-09-16 21:28                           ` metastore Junio C Hamano
2007-09-16 21:45                             ` metastore Daniel Barkalow
2007-09-16 21:53                             ` metastore david
2007-09-16 22:02                           ` metastore Daniel Barkalow
2007-09-16 22:37                             ` metastore david
2007-09-17 13:30                               ` metastore martin f krafft
2007-09-17 17:17                                 ` metastore david
2007-09-17 19:46                                   ` metastore Josh England
2007-09-16 21:45                       ` metastore david
2007-09-16 22:11                         ` metastore Junio C Hamano
2007-09-16 22:52                           ` metastore david
2007-09-17  0:58                             ` metastore Junio C Hamano
2007-09-17  2:31                               ` metastore david
2007-09-17  4:23                                 ` metastore Junio C Hamano
2007-09-17  4:35                                   ` metastore david
2007-09-17  6:06                                     ` metastore Junio C Hamano
2007-09-17 17:42                                   ` metastore Daniel Barkalow
2007-09-17 19:19                                     ` metastore Junio C Hamano
2007-09-16 15:59                     ` metastore (was: Track /etc directory using Git) Jan Hudec
2007-09-16 20:36                       ` david
2007-09-16  6:14                   ` martin f krafft
2007-09-16 15:51                     ` Jan Hudec
2007-09-16 19:43                       ` david
2007-09-17 13:31                       ` martin f krafft
2007-09-16  1:35                 ` david
2007-09-16  6:08                 ` martin f krafft
2007-09-19 19:16                   ` David Härdeman
2007-10-02 19:53                     ` martin f krafft
2007-10-02 19:58                       ` David Härdeman
2007-10-02 20:04                         ` metastore David Kastrup
2007-10-02 20:18                           ` metastore david
2007-10-02 20:23                             ` metastore martin f krafft
2007-10-02 20:29                               ` metastore david
2007-10-02 20:39                                 ` metastore martin f krafft
2007-10-02 20:54                                   ` metastore david
2007-10-02 21:42                                     ` metastore martin f krafft
2007-10-02 21:15                           ` metastore David Härdeman
2007-10-02 21:44                             ` metastore martin f krafft
2007-10-02 23:32                             ` metastore Julian Phillips
2007-10-03  0:52                               ` metastore david
2007-10-03  0:52                                 ` metastore Johannes Schindelin
2007-10-02 21:02                         ` metastore (was: Track /etc directory using Git) Daniel Barkalow
     [not found] ` <20070913122002.GO671@genesis.frugalware.org>
     [not found]   ` <38b2ab8a0709140120k50f5b474oc8a841ea0a5fda50@mail.gmail.com>
2007-09-15 16:32     ` Track /etc directory using Git martin f krafft
2007-09-15 16:57       ` David Kastrup
