git@vger.kernel.org mailing list mirror (one of many)
* GIT_OBJECT_DIRECTORY
@ 2006-04-18 13:38 Jörn Engel
  2006-04-18 15:25 ` GIT_OBJECT_DIRECTORY Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Jörn Engel @ 2006-04-18 13:38 UTC
  To: git

Hi!

I recently noticed GIT_OBJECT_DIRECTORY in the git manpage and wanted
to play with it.  But it looks as if either it doesn't work, the
documentation is wrong or insufficient, or I can't properly read the
documentation.  So let me figure out which one it is.

$ set | grep GIT_OBJECT_DIRECTORY
GIT_OBJECT_DIRECTORY=/home/joern/.git

$ ls -l /home/joern/.git
total 288
drwxrwxr-x  2 joern joern 4096 Apr 16 01:22 0f
[...]

$ git clone rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git foo
[ stored >200M of data under foo/.git/objects ]


The above looks as if new objects are not stored under
/home/joern/.git, as specified by GIT_OBJECT_DIRECTORY.  The manpage
tells me:

       GIT_OBJECT_DIRECTORY
              If  the  object storage directory is specified via this environ-
              ment variable then the sha1 directories are created underneath -
              otherwise the default $GIT_DIR/objects directory is used.

And I would interpret this as "store all new objects under
/home/joern/.git".  So far, nothing suggests that I am simply being
too stupid.  What went wrong?
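
The manpage paragraph quoted above can be tried out with plumbing commands alone, without involving any transport. This is a minimal sketch, assuming a reasonably current git rather than the 1.2.1 release discussed in this thread; the scratch paths and the names `work`, `blob`, `dir` and `file` are made up for the illustration, and it says nothing about how "git clone" treats the variable.

```shell
# Scratch locations; both paths are invented for this sketch.
work=$(mktemp -d)
export GIT_OBJECT_DIRECTORY="$work/shared-objects"

# With the variable exported, "git init" sets up the object store there.
git init -q "$work/repo"
cd "$work/repo" || exit 1

# Write one blob and split its ID into the fan-out directory and file name.
blob=$(echo 'hello' | git hash-object -w --stdin)
dir=$(echo "$blob" | cut -c1-2)
file=$(echo "$blob" | cut -c3-)

# The loose object should sit under the shared store, not under .git/objects.
ls "$GIT_OBJECT_DIRECTORY/$dir/$file"
```

The variable has to stay exported for every later git command run in that repository, otherwise git looks for objects in the default $GIT_DIR/objects again.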

Jörn

-- 
Why do musicians compose symphonies and poets write poems?
They do it because life wouldn't have any meaning for them if they didn't.
That's why I draw cartoons.  It's my life.
-- Charles Schulz


* Re: GIT_OBJECT_DIRECTORY
@ 2006-04-18 14:10 linux
  2006-04-18 14:16 ` GIT_OBJECT_DIRECTORY Jörn Engel
  0 siblings, 1 reply; 14+ messages in thread
From: linux @ 2006-04-18 14:10 UTC
  To: joern; +Cc: git

Just to cover the obvious "is it plugged in?" questions, did you
also "export GIT_OBJECT_DIRECTORY"?  That is, what does
	env | grep GIT_OBJECT_DIRECTORY
produce?
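
The point of the question is that "set" also lists plain shell variables, while "env" only lists what child processes such as git will inherit. A small sketch of the difference, using a made-up variable name (DEMO_VAR) and a made-up path:

```shell
# DEMO_VAR is a hypothetical name used only for this illustration.
unset DEMO_VAR
DEMO_VAR=/tmp/objects        # plain shell variable, not yet exported

# A child process does not see an unexported variable.
before=$(sh -c 'echo "${DEMO_VAR:-unset}"')

# After export, the same child process inherits it.
export DEMO_VAR
after=$(sh -c 'echo "${DEMO_VAR:-unset}"')

echo "child before export: $before"
echo "child after export:  $after"
```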


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 14:10 GIT_OBJECT_DIRECTORY linux
@ 2006-04-18 14:16 ` Jörn Engel
  0 siblings, 0 replies; 14+ messages in thread
From: Jörn Engel @ 2006-04-18 14:16 UTC
  To: linux; +Cc: git

On Tue, 18 April 2006 10:10:50 -0400, linux@horizon.com wrote:
> 
> Just to cover the obvious "is it plugged in?" questions, did you
> also "export GIT_OBJECT_DIRECTORY"?  That is, what does
> 	env | grep GIT_OBJECT_DIRECTORY
> produce?

$ env | grep GIT_OBJECT_DIRECTORY
GIT_OBJECT_DIRECTORY=/home/joern/.git

And maybe for the record, I just use the Debian unstable package:
$ git --version
git version 1.2.1


Jörn

-- 
Anything that can go wrong, will.
-- Finagle's Law


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 13:38 GIT_OBJECT_DIRECTORY Jörn Engel
@ 2006-04-18 15:25 ` Linus Torvalds
  2006-04-18 17:58   ` GIT_OBJECT_DIRECTORY Jörn Engel
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2006-04-18 15:25 UTC
  To: Jörn Engel; +Cc: git


On Tue, 18 Apr 2006, Jörn Engel wrote:
> 
> $ git clone rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git foo
> [ stored >200M of data under foo/.git/objects ]
> 
> The above looks as if new objects are not stored under
> /home/joern/.git, as specified by GIT_OBJECT_DIRECTORY.

The "rsync" protocol really doesn't honor git rules. It's basically just a 
big recursive copy, and it will copy things from the place they were 
before.

I suspect that if you had used a real git-aware protocol instead, you'd 
have been fine, ie

	git clone git://git.kernel.org/... foo/

would probably work. (I say "probably", because very few people likely use 
GIT_OBJECT_DIRECTORY, and it makes a lot less sense with pack-files than 
it did originally, so it's not getting any testing).

		Linus


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 15:25 ` GIT_OBJECT_DIRECTORY Linus Torvalds
@ 2006-04-18 17:58   ` Jörn Engel
  2006-04-18 18:07     ` GIT_OBJECT_DIRECTORY Sam Ravnborg
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jörn Engel @ 2006-04-18 17:58 UTC
  To: Linus Torvalds; +Cc: git

On Tue, 18 April 2006 08:25:47 -0700, Linus Torvalds wrote:
> On Tue, 18 Apr 2006, Jörn Engel wrote:
> > 
> > $ git clone rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git foo
> > [ stored >200M of data under foo/.git/objects ]
> > 
> > The above looks as if new objects are not stored under
> > /home/joern/.git, as specified by GIT_OBJECT_DIRECTORY.
> 
> The "rsync" protocol really doesn't honor git rules. It's basically just a 
> big recursive copy, and it will copy things from the place they were 
> before.
> 
> I suspect that if you had used a real git-aware protocol instead, you'd 
> have been fine, ie
> 
> 	git clone git://git.kernel.org/... foo/

Is it possible for non-owners of a kernel.org account to do this?

> would probably work. (I say "probably", because very few people likely use 
> GIT_OBJECT_DIRECTORY, and it makes a lot less sense with pack-files than 
> it did originally, so it's not getting any testing).

Well, .git/objects for your kernel still consumes 121M.  It's not
gigabytes but I still wouldn't want too many copies of that lying
around.  Right now, I already feel slightly motivated to move the
whole content-addressable idea into the kernel.  It has disadvantages,
but the effect on disk- and pagecache-footprint for people like me
would come in handy.

Jörn

-- 
The cheapest, fastest and most reliable components of a computer
system are those that aren't there.
-- Gordon Bell, DEC laboratories


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 17:58   ` GIT_OBJECT_DIRECTORY Jörn Engel
@ 2006-04-18 18:07     ` Sam Ravnborg
  2006-04-18 18:08     ` GIT_OBJECT_DIRECTORY Linus Torvalds
  2006-04-18 18:20     ` GIT_OBJECT_DIRECTORY Junio C Hamano
  2 siblings, 0 replies; 14+ messages in thread
From: Sam Ravnborg @ 2006-04-18 18:07 UTC
  To: Jörn Engel; +Cc: Linus Torvalds, git

On Tue, Apr 18, 2006 at 07:58:53PM +0200, Jörn Engel wrote:
> 
> > 	git clone git://git.kernel.org/... foo/
> 
> Is it possible for non-owners of a kernel.org account to do this?
Yes - everyone can do so.
I never use rsync myself anymore.

	Sam


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 17:58   ` GIT_OBJECT_DIRECTORY Jörn Engel
  2006-04-18 18:07     ` GIT_OBJECT_DIRECTORY Sam Ravnborg
@ 2006-04-18 18:08     ` Linus Torvalds
  2006-04-18 18:26       ` GIT_OBJECT_DIRECTORY Jörn Engel
  2006-04-18 18:20     ` GIT_OBJECT_DIRECTORY Junio C Hamano
  2 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2006-04-18 18:08 UTC
  To: Jörn Engel; +Cc: git


On Tue, 18 Apr 2006, Jörn Engel wrote:
> > 
> > 	git clone git://git.kernel.org/... foo/
> 
> Is it possible for non-owners of a kernel.org account to do this?

Yes, kernel.org runs the git daemon.

If a repo isn't packed enough, the git protocol can be pretty CPU 
intensive, but I'm hoping that everybody keeps their repos mostly packed, 
at which point the git protocol should actually be a lot faster than 
rsync.

> > GIT_OBJECT_DIRECTORY, and it makes a lot less sense with pack-files than 
> > it did originally, so it's not getting any testing).
> 
> Well, .git/objects for your kernel still consumes 121M.  It's not
> gigabytes but I still wouldn't want too many copies of that lying
> around.

Right. However, these days we have better approaches than 
GIT_OBJECT_DIRECTORY for that.

In particular, if you create local clones, use "git clone -l -s", which 
shares its base objects with the thing you clone from. It makes the clone 
incredibly fast too (the only real cost is the check-out, which can 
obviously be pretty expensive), and you can then use

	git repack -a -d -l

on all of the clones to repack just the _local_ objects to avoid having packs 
duplicate objects unnecessarily.
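
The workflow described above can be sketched end to end on a throwaway repository. This assumes a reasonably current git (the "-C" option is newer than this thread); the directory names, file names and the example identity are all invented for the illustration.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Example identity so the sketch does not depend on global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# A tiny stand-in for the repository being cloned.
git init -q src
echo 'hello' > src/file.txt
git -C src add file.txt
git -C src commit -q -m 'initial'

# -l: local clone, -s: borrow objects from src instead of copying them.
git clone -q -l -s src dst

# The clone records where the borrowed objects live.
cat dst/.git/objects/info/alternates

# Create something that exists only in dst, then pack just the local objects.
echo 'local change' >> dst/file.txt
git -C dst commit -q -a -m 'local change'
git -C dst repack -a -d -l -q
```

After the repack, the pack under dst/.git/objects/pack should hold only the objects created in dst, while the initial commit is still read through the alternates file.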

		Linus


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 17:58   ` GIT_OBJECT_DIRECTORY Jörn Engel
  2006-04-18 18:07     ` GIT_OBJECT_DIRECTORY Sam Ravnborg
  2006-04-18 18:08     ` GIT_OBJECT_DIRECTORY Linus Torvalds
@ 2006-04-18 18:20     ` Junio C Hamano
  2006-04-18 18:45       ` GIT_OBJECT_DIRECTORY Jörn Engel
  2 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2006-04-18 18:20 UTC
  To: Jörn Engel; +Cc: git

Jörn Engel <joern@wohnheim.fh-wedel.de> writes:

> Well, .git/objects for your kernel still consumes 121M.  It's not
> gigabytes but I still wouldn't want too many copies of that lying
> around.

That is what "git clone -l -s" is for.  

The alternates pointer mechanism used with the above largely
makes GIT_OBJECT_DIRECTORY unnecessary for end users these days.
It is a fine mechanism as an implementation detail of the
low-level commands and Porcelains, and that is the reason the
documentation still mentions the environment variable.
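
The alternates pointer is just a text file listing other object directories, so it can also be set up by hand. A minimal sketch, assuming a reasonably current git; the repository names "base" and "borrower" are invented for the illustration.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Store one blob in a repository that will act as the shared object pool.
git init -q base
blob=$(echo 'shared content' | git -C base hash-object -w --stdin)

# A second repository that borrows from it: one object directory per line.
git init -q borrower
echo "$work/base/.git/objects" > borrower/.git/objects/info/alternates

# The borrower can read the blob although it holds no copy of it.
git -C borrower cat-file -p "$blob"
```

Objects borrowed this way disappear from the borrower's view if the pool repository prunes or deletes them, so the pool has to outlive everything that points at it.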


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 18:08     ` GIT_OBJECT_DIRECTORY Linus Torvalds
@ 2006-04-18 18:26       ` Jörn Engel
  2006-04-18 18:47         ` GIT_OBJECT_DIRECTORY Linus Torvalds
  2006-04-19  4:51         ` GIT_OBJECT_DIRECTORY H. Peter Anvin
  0 siblings, 2 replies; 14+ messages in thread
From: Jörn Engel @ 2006-04-18 18:26 UTC
  To: Linus Torvalds; +Cc: git

On Tue, 18 April 2006 11:08:40 -0700, Linus Torvalds wrote:
> On Tue, 18 Apr 2006, Jörn Engel wrote:
> > > 
> > > 	git clone git://git.kernel.org/... foo/
> > 
> > Is it possible for non-owners of a kernel.org account to do this?
> 
> Yes, kernel.org runs the git daemon.

Excellent!  I have a faint memory of hpa recently saying that the git
daemon would be too resource-hungry.  One of the cases where being
wrong is a Good Thing.

> > 
> > Well, .git/objects for your kernel still consumes 121M.  It's not
> > gigabytes but I still wouldn't want too many copies of that lying
> > around.
> 
> Right. However, these days we have better approaches than 
> GIT_OBJECT_DIRECTORY for that.
> 
> In particular, if you create local clones, use "git clone -l -s", which 
> shares its base objects with the thing you clone from. It makes the clone 
> incredibly fast too (the only real cost is the check-out, which can 
> obviously be pretty expensive), and you can then use
> 
> 	git repack -a -d -l
> 
> on all of the clones to repack just the _local_ objects to avoid having packs 
> duplicate objects unnecessarily.

This still isn't good enough for me.  Before git, all my trees were
hard-linked (cowlinked, actually) and another copy barely consumed any
space.  "git clone -l -s"  creates a copy of the currently 311M of
kernel source, quite a bit more expensive.

But it appears as if I could "cp -lr" the git tree and work with that.
The nice thing of having cowlinks is that I don't have to rely on git
breaking the hard links - which it probably won't.  But since the
estimated user base of cowlinks is 1, that won't help too many people.

Jörn

-- 
Good warriors cause others to come to them and do not go to others.
-- Sun Tzu


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 18:20     ` GIT_OBJECT_DIRECTORY Junio C Hamano
@ 2006-04-18 18:45       ` Jörn Engel
  0 siblings, 0 replies; 14+ messages in thread
From: Jörn Engel @ 2006-04-18 18:45 UTC
  To: Junio C Hamano; +Cc: git

On Tue, 18 April 2006 11:20:58 -0700, Junio C Hamano wrote:
> Jörn Engel <joern@wohnheim.fh-wedel.de> writes:
> 
> > Well, .git/objects for your kernel still consumes 121M.  It's not
> > gigabytes but I still wouldn't want too many copies of that lying
> > around.
> 
> That is what "git clone -l -s" is for.  

See my response to Linus.  .git/objects is currently the smaller
problem.  The larger problem is 311M of raw kernel source - without
any SCM overhead of any flavour.  Like many others, I solved the
larger problem with hardlink trees.  "git clone -l -s" is imo nearly
unethical, as it solved the smaller problem and leaves the larger one
unaffected.  It reeks of hypocrisy.

Hardlink trees still aren't perfect.  If I take one tree, "cp -lr" it
twice and apply the same patches to both copies, the changed files
exist twice for both copies.  That sucks, but it is a fairly small
problem and there is no simple solution to it.

If git was able to deal with hardlink trees and properly break the
links when working in one copy, "cp -lr" would be a lot smarter than
"git clone -l -s".  It just happens that I have written some kernel
patches that automatically break hardlinks, even if applications don't
know how to do it.  So for my personal use, git has this ability.
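
What "breaking the link" means can be shown with plain hard links and no kernel patches. In this sketch a single "ln" stands in for what "cp -lr" does per file; the tree and file names are invented. A tool that writes a new file and renames it over the old one breaks the link by itself, while a tool that writes in place changes every tree sharing the inode.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

mkdir tree-a tree-b
echo 'original' > tree-a/file.c
ln tree-a/file.c tree-b/file.c      # what "cp -lr" does for each file

# In-place write: both names share one inode, so tree-a changes as well.
echo 'edited in place' > tree-b/file.c
in_place=$(cat tree-a/file.c)

# Write-new-then-rename: tree-b gets a fresh inode, tree-a is left alone.
echo 'edited via rename' > tree-b/file.c.tmp
mv tree-b/file.c.tmp tree-b/file.c
after_rename=$(cat tree-a/file.c)

echo "tree-a after in-place edit: $in_place"
echo "tree-a after rename edit:   $after_rename"
```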

Now, going one step further, GIT_OBJECT_DIRECTORY could solve another
problem.  Just like source files, git objects can be pulled twice into
two copies of a tree.  But for git objects, there appeared to be an
easy solution: the central object storage.  So we're back where this
thread started.

Except that I get the idea of GIT_OBJECT_DIRECTORY not being a simple
solution to a small problem, so maybe I should just forget about it.

Jörn

-- 
To my face you have the audacity to advise me to become a thief - the worst
kind of thief that is conceivable, a thief of spiritual things, a thief of
ideas! It is insufferable, intolerable!
-- M. Binet in Scaramouche


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 18:26       ` GIT_OBJECT_DIRECTORY Jörn Engel
@ 2006-04-18 18:47         ` Linus Torvalds
  2006-04-18 18:58           ` GIT_OBJECT_DIRECTORY Jörn Engel
  2006-04-19  4:51         ` GIT_OBJECT_DIRECTORY H. Peter Anvin
  1 sibling, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2006-04-18 18:47 UTC
  To: Jörn Engel; +Cc: git


On Tue, 18 Apr 2006, Jörn Engel wrote:
> 
> But it appears as if I could "cp -lr" the git tree and work with that.

That should work. I just personally fear cowlinks, because some things 
will edit the files in place, and then you're screwed.

I _think_ it should be ok for the .git subdirectory, but quite frankly, 
I'm not going to guarantee it. Also, you will break the cow-linking when 
you ever re-pack either the source or the destination, so you'd actually 
be _better_ off with something that does

	# clone the git directories by hand, no checkout (-n).
	git clone -l -s -n src dst

	# cow-link the checked-out state
	(cd src ; git ls-files | cpio -pudml dst)

	# make sure to refresh the index
	git update-index --refresh

or something like that.

TOTALLY UNTESTED!!  (And you need to have made "dst" be an absolute path, 
of course, since we want it to work even after we've done the "cd src" 
thing).
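
One way the untested recipe above might be fleshed out is sketched here, under several assumptions: a reasonably current git, a plain "ln" loop swapped in for the cpio call (so file names containing newlines or needing quoting are not handled), and an extra "git read-tree HEAD" because a clone made with -n starts without an index. The repository contents and the example identity are invented.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Example identity so the sketch does not depend on global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# Tiny stand-in for the source tree.
git init -q src
mkdir src/lib
echo 'int main(void) { return 0; }' > src/main.c
echo '/* helper */' > src/lib/util.c
git -C src add .
git -C src commit -q -m 'initial'

dst="$work/dst"                     # absolute path, as the message asks

# Share objects with src and skip the checkout.
git clone -q -l -s -n src "$dst"

# Hard-link every tracked file into dst (stand-in for "cpio -pudml").
( cd src && git ls-files | while IFS= read -r f; do
    mkdir -p "$dst/$(dirname "$f")"
    ln "$f" "$dst/$f"
  done )

# Populate dst's index from HEAD, then refresh its stat information.
( cd "$dst" && git read-tree HEAD && git update-index -q --refresh )

# Prints nothing when the linked files match what HEAD records.
( cd "$dst" && git status --short )
```

The links are ordinary hard links, so the in-place editing concern raised in the message applies to the result unless something breaks the link on write.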

		Linus


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 18:47         ` GIT_OBJECT_DIRECTORY Linus Torvalds
@ 2006-04-18 18:58           ` Jörn Engel
  0 siblings, 0 replies; 14+ messages in thread
From: Jörn Engel @ 2006-04-18 18:58 UTC
  To: Linus Torvalds; +Cc: git

On Tue, 18 April 2006 11:47:53 -0700, Linus Torvalds wrote:
> On Tue, 18 Apr 2006, Jörn Engel wrote:
> > 
> > But it appears as if I could "cp -lr" the git tree and work with that.
> 
> That should work. I just personally fear cowlinks, because some things 
> will edit the files in place, and then you're screwed.

s/cowlinks/hardlinks/ ?

The reason for me to write the cowlink patches was exactly the fear
you are talking about.  With those patches, links are broken whenever
such a thing happens.

> I _think_ it should be ok for the .git subdirectory, but quite frankly, 
> I'm not going to guarantee it. Also, you will break the cow-linking when 
> you ever re-pack either the source or the destination, so you'd actually 

In that case, cowlinks should still turn a blatant bug into some
wasted space - which is a hell of a lot better.

> 	# cow-link the checked-out state

And this happens to be a problem.  Creating the links when the copy is
created is simple.  Detecting identical files and linking them after
the fact is racy, complicated, racy and, well, racy.  I wouldn't want
to touch it with a ten foot pole.  Not without kernel support.

Jörn

-- 
Linux is more the core point of a concept that surrounds "open source"
which, in turn, is based on a false concept. This concept is that
people actually want to look at source code.
-- Rob Enderle


* Re: GIT_OBJECT_DIRECTORY
  2006-04-18 18:26       ` GIT_OBJECT_DIRECTORY Jörn Engel
  2006-04-18 18:47         ` GIT_OBJECT_DIRECTORY Linus Torvalds
@ 2006-04-19  4:51         ` H. Peter Anvin
  2006-04-19  5:00           ` GIT_OBJECT_DIRECTORY Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: H. Peter Anvin @ 2006-04-19  4:51 UTC
  To: Jörn Engel; +Cc: Linus Torvalds, git

Jörn Engel wrote:
> On Tue, 18 April 2006 11:08:40 -0700, Linus Torvalds wrote:
>> On Tue, 18 Apr 2006, Jörn Engel wrote:
>>>> 	git clone git://git.kernel.org/... foo/
>>> Is it possible for non-owners of a kernel.org account to do this?
>> Yes, kernel.org runs the git daemon.
> 
> Excellent!  I have a faint memory of hpa recently saying that the git
> daemon would be too resource-hungry.  One of the cases where being
> wrong is a Good Thing.
> 

Well, we ended up making some tweaks to the git daemon, and it hasn't 
been a problem since.

	-hpa


* Re: GIT_OBJECT_DIRECTORY
  2006-04-19  4:51         ` GIT_OBJECT_DIRECTORY H. Peter Anvin
@ 2006-04-19  5:00           ` Junio C Hamano
  0 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2006-04-19  5:00 UTC
  To: H. Peter Anvin; +Cc: git

"H. Peter Anvin" <hpa@zytor.com> writes:

> Jörn Engel wrote:
>>
>> Excellent!  I have a faint memory of hpa recently saying that the git
>> daemon would be too resource-hungry.  One of the cases where being
>> wrong is a Good Thing.
>
> Well, we ended up making some tweaks to the git daemon, and it hasn't
> been a problem since.

Ah, I am glad the daemon expert was listening...  Do you have
comments on recent patch from Serge E. Hallyn?  It looks OK to
me, but that standalone daemon part is not something I run
myself, so...

-- >8 --
[PATCH] socksetup: don't return on set_reuse_addr() error

The set_reuse_addr() error case was the only error case in
socklist() where we returned rather than continued.  Not sure
why.  Either we must free the socklist, or continue.  This patch
continues on error.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 daemon.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

0032d548db56eac9ea09b4ba05843365f6325b85
diff --git a/daemon.c b/daemon.c
index a1ccda3..776749e 100644
--- a/daemon.c
+++ b/daemon.c
@@ -535,7 +535,7 @@ #endif
 
 		if (set_reuse_addr(sockfd)) {
 			close(sockfd);
-			return 0;	/* not fatal */
+			continue;
 		}
 
 		if (bind(sockfd, ai->ai_addr, ai->ai_addrlen) < 0) {
-- 
1.3.0.rc4.g5247-dirty

