git@vger.kernel.org mailing list mirror (one of many)
* space compression (again)
@ 2005-04-15 17:19  2% C. Scott Ananian
    0 siblings, 1 reply; 200+ results
From: C. Scott Ananian @ 2005-04-15 17:19 UTC (permalink / raw)
  To: git

I've been reading the archives (a bad idea, I know).  Here's a concrete 
suggestion for GIT space-compression which is (I believe) consistent with 
the philosophy of GIT.

Why are blobs per-file?  [After all, Linus insists that files are an 
illusion.]  Why not just have 'chunks', and assemble *these* 
into blobs (read, 'files')?  A good chunk size would fit evenly into some 
number of disk blocks (no wasted space!).

We already have the rsync algorithm which can scan through a file and 
efficiently tell which existing chunks match (portions of) it, using a 
rolling checksum. (Here's a refresher:
    http://samba.anu.edu.au/rsync/tech_report/node2.html
).  Why not treat the 'chunk' as the fundamental unit, and compose files 
from chunks?
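Since the whole proposal leans on it, here is a minimal sketch of the rsync-style weak rolling checksum the refresher link describes; the code is illustrative (window size and names invented, nothing from git or rsync itself). The point is that the sum over a fixed window slides one byte in O(1), so every offset of a file can be checked against a table of known chunk sums cheaply.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW 16	/* illustrative window size, not a proposed chunk size */

struct rollsum { uint32_t a, b; };

/* Compute the two-part sum over buf[0..WINDOW-1] from scratch. */
static void rollsum_init(struct rollsum *rs, const uint8_t *buf)
{
	rs->a = rs->b = 0;
	for (int i = 0; i < WINDOW; i++) {
		rs->a += buf[i];
		rs->b += (WINDOW - i) * buf[i];
	}
}

/* Slide the window one byte right: drop 'out', take in 'in'.  O(1). */
static void rollsum_slide(struct rollsum *rs, uint8_t out, uint8_t in)
{
	rs->a += in - out;
	rs->b += rs->a - WINDOW * out;
}

static uint32_t rollsum_digest(const struct rollsum *rs)
{
	return (rs->b << 16) | (rs->a & 0xffff);
}
```

Sliding then digesting gives exactly the same value as recomputing from scratch at the new offset, which is the property the chunk-matching scan depends on.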

This should get better space utilization: a small change to file X 
will only require storage to save the changed chunk, plus metadata to 
describe the chunks composing the new file.  I propose keeping this only 
one level deep: we can only specify chunks, not pieces of files.

Unlike xdelta schemes, there is no 'file' dependency.  Chunks for a blob 
can be and are shared among *all the other files and versions in the 
repository*.  Moving pieces from file 'a' to file 'b' "just works".

Best of all, I believe this can be done in a completely layered fashion. 
From git's perspective, it's still 'open this blob' or 'write this blob'. 
It just turns out that the filesystem representation of a blob is slightly 
more fragmented.  Even better, you ought to be able to convert your 
on-disk store from one representation to the other: the named blob doesn't 
change, just 'how to fetch the blob' changes.  So, for example, Linus' 
tree can be unchunked for speed, but the release tree (say) can pull 
pruned history from Linus into a chunked on-disk representation that can 
be efficiently wget'ted (only new chunks need be transferred).

My first concern is possible fragmentation: would we end up with a large 
number of very small chunks, and end up representing files as a list of 
lines (effectively)?  Maybe someone can think of an effective coalescing 
strategy, or maybe it is sufficient just to avoid creating chunks smaller 
than a certain size (i.e., possibly writing redundant data to a new chunk, 
just to improve the possibility of reuse).

I'm also not sure what the best 'chunk' size is.  Smaller chunks save more 
space but cost more to access (# of disk seeks per file/blob).  Picking a 
chunk half the average file size should reduce space by ~50% while only 
requiring ~2 additional seeks per file-read. OTOH, rsync experience 
suggests 500-1000 byte chunk sizes.  Probably empirical testing is best.

Lastly, we want to avoid hitting the dcache to check the existence of 
chunks while encoding.  In a large repository, there will be a very large 
number of chunks.  We don't *have* to index all of them, but our 
compression gets better the more chunks we know about.  The rsync 
algorithm creates hash tables of chunks at different levels of granularity 
to avoid doing a full check at every byte of the input file.  How large 
should this cached-on-disk chunk hash table be to avoid saturating it as 
the repository grows (maybe the standard grow-as-you-go hash table is 
fine; you only need one bit per entry anyway)?
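The "one bit per entry" table can be sketched as a plain bitmap indexed by (part of) a chunk checksum; everything below is invented for illustration. A clear bit means no such chunk is known, so most lookups never touch the dcache; a set bit means one probably is, with false positives from collisions, like a one-hash Bloom filter.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_BITS (1u << 20)	/* 1 Mbit = 128 kB; grow-as-you-go in real life */

static uint8_t seen[TABLE_BITS / 8];	/* zero-initialized: nothing known yet */

static void chunk_mark(uint32_t csum)
{
	uint32_t i = csum % TABLE_BITS;
	seen[i >> 3] |= (uint8_t)(1u << (i & 7));
}

/* 0 means definitely absent; 1 means possibly present (collisions). */
static int chunk_maybe_present(uint32_t csum)
{
	uint32_t i = csum % TABLE_BITS;
	return (seen[i >> 3] >> (i & 7)) & 1;
}
```

A false positive only costs one real lookup, so the table merely needs to stay sparse enough, not exact.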

Thoughts?  Is the constant-factor overhead of indirection-per-blob going 
to kill git's overwhelming speed?
  --scott

JUBILIST explosion MKULTRA HTAUTOMAT Indonesia Shoal Bay RUCKUS ammunition 
GPFLOOR Hager SDI MKDELTA KUBARK Dictionary Soviet  BLUEBIRD Delta Force
                          ( http://cscott.net/ )

^ permalink raw reply	[relevance 2%]

* Re: space compression (again)
@ 2005-04-15 19:33  2% Ray Heasman
  2005-04-16 12:29  0% ` David Lang
  0 siblings, 1 reply; 200+ results
From: Ray Heasman @ 2005-04-15 19:33 UTC (permalink / raw)
  To: git

Sorry for this email not threading properly; I have been lurking on the
mailing list archives and just had to reply to this message.

I was planning to ask exactly this question, and Scott beat me to it. I
even wanted to call them "chunks" too. :-)

It's probably worthwhile for anyone discussing this subject to read this
link: http://www.cs.bell-labs.com/sys/doc/venti/venti.pdf . I know it's
been posted before, but it really is worth reading. :-)

On Fri, 15 Apr 2005, Linus Torvalds wrote:
> On Fri, 15 Apr 2005, C. Scott Ananian wrote:
> > 
> > Why are blobs per-file?  [After all, Linus insists that files are an 
> > illusion.]  Why not just have 'chunks', and assemble *these* 
> > into blobs (read, 'files')?  A good chunk size would fit evenly into some 
> > number of disk blocks (no wasted space!).
>
> I actually considered that. I ended up not doing it, because it's not 
> obvious how to "block" things up (and even more so because while I like 
> the notion, it flies in the face of the other issues I had: performance 
> and simplicity).

I don't think it's as bad as you think.

Let's conceptually have two types of files - Pobs (Proxy Objects, or
Pointer Objects), and chunks. Both are stored and referenced by their
content hash, as usual. Pobs just contain a list of hashes referencing
the chunks in a file. When a file is initially stored, we chunk it so
each chunk fits comfortably in a block, but otherwise we aren't too
critical about sizes. When a file is changed (say, a single line edit),
we update the chunk that contains that line, hash it and store it with
its new name, and update the Pob, which we rehash and re-store. If a
chunk grows to be very large (say > 2 disk blocks), we can rechunk it
and update the Pob to include the new chunks.
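As a rough illustration of the shape this gives (the type and function names below are invented, not proposed git code): a Pob is nothing but an ordered list of chunk hashes, and reading a blob back is concatenating the chunks it names. A toy in-memory table stands in here for the hash-addressed store.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One chunk: in git proper the 20-byte SHA1 would be the lookup key
 * into the object store; here the data just rides along in memory. */
struct chunk {
	unsigned char sha1[20];
	const char *data;
	size_t len;
};

struct pob {			/* "pointer object": just a chunk list */
	unsigned int nr;
	const struct chunk **chunks;
};

/* Concatenate the chunks a pob names into a freshly malloc'd blob. */
static char *pob_read_blob(const struct pob *p, size_t *size)
{
	size_t total = 0, off = 0;
	for (unsigned int i = 0; i < p->nr; i++)
		total += p->chunks[i]->len;
	char *buf = malloc(total + 1);
	for (unsigned int i = 0; i < p->nr; i++) {
		memcpy(buf + off, p->chunks[i]->data, p->chunks[i]->len);
		off += p->chunks[i]->len;
	}
	buf[total] = '\0';
	*size = total;
	return buf;
}
```

Editing one chunk replaces one entry in the list; the pob's own hash then changes, exactly as a tree's hash changes when one file in it does.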

> The problem with chunking is:
>  - it complicates a lot of the routines. Things like "is this file 
>    unchanged" suddenly become "is this file still the same set of chunks",
>    which is just a _lot_ more code and a lot more likely to have bugs.

You're half right; it will be more complex, but I don't think it's as
bad as you think. Pobs are stored by hash just like anything else. If
some chunks are different, the pob is different, which means it has a
different hash. It's exactly the same as dealing with a changed file now.
Sure, when you have to fetch the data, you have to read the pob and get
a list of chunks to concatenate and return, but the check in your example
doesn't change.

>  - you have to find a blocking factor. I thought of just going it fixed 
>    chunks, and that just doesn't help at all. 

Just use the block size of the filesystem. Some filesystems do tail
packing, so space isn't an issue, though speed can be. We don't actually
care how big a chunk is, except to make it easy on the filesystem.
Individual chunks can be any size.

>  - we already have wasted space due to the low-level filesystem (as 
>    opposed to "git") usually being block-based, which means that space 
>    utilization for small objects tends to suck. So you really want to 
>    prefer objects that are several kB (compressed), and a small block just
>    wastes tons of space.

If a chunk is smaller than a disk block, this is true. However, if we
size it right this is no worse than any other file. Small files (less
than a block) can't be made any larger, so they waste space anyway.
Large files end up wasting space in one block unless they are a perfect
multiple of the block size.

When we increase the size of a chunk, it will waste some space, but
without chunking we would have stored an entire new file, so we win
there too.

Admittedly, Pobs will be wasting space too.

On the other hand, I use ReiserFS, so I don't care. ;-)

>  - there _is_ a natural blocking factor already. That's what a file 
>    boundary really is within the project, and finding any other is really 
>    quite hard.

Nah. I think I've made a good case that it isn't.

> So I'm personally 100% sure that it's not worth it. But I'm not opposed to
> the _concept_: it makes total sense in the "filesystem" view, and is 100%
> equivalent to having an inode with pointers to blocks. I just don't think 
> the concept plays out well in reality.

Well, the reason I think this would be worth it is that you really win
when you have multiple parallel copies of a source tree, and changes are
cheaper too. If you store all the chunks for all your git repositories
in one place, and otherwise treat your trees of Pobs as the real
repository, your copied trees only cost you space for the Pobs.
Obviously this also applies for file updates within past revisions of a
tree, but I don't know how much it would save. It fits beautifully into
the current abstraction, and saves space without having to resort to
rolling hashes or xdeltas.

The _real_ reason why I am excited about git is that I have a vision of
using this as the filesystem (in a FUSE wrapper or something) for my
home directory. MP3s and AVIs aside, it will make actual work much
easier for me. I have a dream; a dream where I save files using the same
name, safe in the knowledge that I can get to any version I want. I will
live in a world of autosaves, deletes without confirmation, and /etcs
immune from the vagaries of my package management systems, not to
mention users not asking me leading questions about backups. *sigh*
*sniff* Excuse me, I think I have to go now.

-Ray




* Re: space compression (again)
  2005-04-15 19:33  2% Ray Heasman
@ 2005-04-16 12:29  0% ` David Lang
  0 siblings, 0 replies; 200+ results
From: David Lang @ 2005-04-16 12:29 UTC (permalink / raw)
  To: Ray Heasman; +Cc: git

We already have the concept of objects that contain objects and therefore 
don't need to be re-checked (directories); the chunks inside a file could 
be the same type of thing.

Currently we say that if the hash on a directory is the same, we don't 
need to re-check each of the files in that directory. This would mean that 
if the hash on a file hasn't changed, we don't need to re-check the 
chunks inside that file.
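The short-circuit can be shown with a toy hash tree (invented types, not git code): an equal hash prunes a whole subtree, whether its children are files in a directory or chunks in a file.

```c
#include <assert.h>
#include <string.h>

struct node {
	unsigned char sha1[20];
	unsigned int nr;
	struct node **child;
};

static int nodes_visited;	/* how many nodes a scan actually touches */

static void rescan(const struct node *a, const struct node *b)
{
	nodes_visited++;
	if (!memcmp(a->sha1, b->sha1, 20))
		return;		/* same hash: nothing underneath can differ */
	for (unsigned int i = 0; i < a->nr && i < b->nr; i++)
		rescan(a->child[i], b->child[i]);
}
```

When the root hashes match, the scan touches one node; only where hashes differ does it descend.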

David Lang



-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare


* Re: space compression (again)
  @ 2005-04-16 15:11  3%         ` C. Scott Ananian
  2005-04-16 17:37  0%           ` Martin Uecker
  0 siblings, 1 reply; 200+ results
From: C. Scott Ananian @ 2005-04-16 15:11 UTC (permalink / raw)
  To: Martin Uecker; +Cc: git

On Sat, 16 Apr 2005, Martin Uecker wrote:

> The right thing (TM) is to switch from SHA1 of compressed
> content for the complete monolithic file to a merkle hash tree
> of the uncompressed content. This would make the hash
> independent of the actual storage method (chunked or not).

It would certainly be nice to change to a hash of the uncompressed 
content, rather than a hash of the compressed content, but it's not 
strictly necessary, since files are fetched all at once: there's no 'read 
subrange' operation on blobs.

I assume 'merkle hash tree' is talking about:
   http://www.open-content.net/specs/draft-jchapweske-thex-02.html
..which is very interesting, but not quite what I was thinking.
The merkle hash approach seems to require fixed chunk boundaries.
The rsync approach does not use fixed chunk boundaries; this is necessary 
to ensure good storage reuse for the expected case (i.e., inserting a single 
line at the start or in the middle of the file, which would otherwise change 
all the chunk boundaries).

Further, in the absence of subrange reads on blobs, it's not entirely 
clear what using a merkle hash would buy you.
  --scott

WASHTUB supercomputer security Mk 48 justice ODUNIT radar COBRA JANE 
SSBN 731 BATF KUJUMP SECANT operation class struggle SYNCARP KGB ODACID
                          ( http://cscott.net/ )


* Re: space compression (again)
  2005-04-16 15:11  3%         ` C. Scott Ananian
@ 2005-04-16 17:37  0%           ` Martin Uecker
  0 siblings, 0 replies; 200+ results
From: Martin Uecker @ 2005-04-16 17:37 UTC (permalink / raw)
  To: git


On Sat, Apr 16, 2005 at 11:11:00AM -0400, C. Scott Ananian wrote:
> On Sat, 16 Apr 2005, Martin Uecker wrote:
> 
> >The right thing (TM) is to switch from SHA1 of compressed
> >content for the complete monolithic file to a merkle hash tree
> >of the uncompressed content. This would make the hash
> >independent of the actual storage method (chunked or not).
> 
> It would certainly be nice to change to a hash of the uncompressed 
> content, rather than a hash of the compressed content, but it's not 
> strictly necessary, since files are fetched all at once: there's not 'read 
> subrange' operation on blobs.
> 
> I assume 'merkle hash tree' is talking about:
>   http://www.open-content.net/specs/draft-jchapweske-thex-02.html
> ..which is very interesting, but not quite what I was thinking.
> The merkle hash approach seems to require fixed chunk boundaries.

I don't know what is written there, but I don't
consider fixed chunk boundaries part of the definition.

> The rsync approach does not use fixed chunk boundaries; this is necessary 
> to ensure good storage reuse for the expected case (ie; inserting a single 
> line at the start or in the middle of the file, which changes all the 
> chunk boundaries).

Yes. The chunk boundaries should be determined deterministically
from local properties of the data. Use a rolling checksum over
some small window and split the file where it hits a special value (0).
This is what the rsyncable patch to zlib does.
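A sketch of that boundary rule (window size and mask below are illustrative, not taken from the rsyncable patch): roll a sum over a small window and cut wherever its low bits hit a chosen value. Because a boundary depends only on the bytes near it, an insertion re-cuts only the chunks around the edit and the downstream cuts realign.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CDC_WINDOW 8
#define CDC_MASK   0x3f		/* average chunk ~64 bytes, demo-sized */

/* Length of the first chunk of buf[0..len): the offset just past the
 * first content-defined boundary, or len if no boundary is found. */
static size_t first_chunk_len(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	for (size_t i = 0; i < len; i++) {
		sum += buf[i];
		if (i >= CDC_WINDOW)
			sum -= buf[i - CDC_WINDOW];	/* keep only the window */
		if (i + 1 >= CDC_WINDOW && (sum & CDC_MASK) == 0)
			return i + 1;
	}
	return len;
}
```

Because the test at each position looks at only the last CDC_WINDOW bytes, the same bytes produce the same cut wherever they sit in the file.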

> Further, in the absence of subrange reads on blobs, it's not entirely 
> clear what using a merkle hash would buy you.

The whole design of git is a hash tree. If you extend
this tree structure into files you end up with merkle
hash trees. Everything else is just more complicated.

Martin
 

-- 
One night, when little Giana from Milano was fast asleep,
she had a strange dream.




* [PATCH] Get commits from remote repositories by HTTP
@ 2005-04-16 22:03  7% Daniel Barkalow
  2005-04-16 22:24  3% ` Tony Luck
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-04-16 22:03 UTC (permalink / raw)
  To: git

This adds a program to download a commit, the trees, and the blobs in them
from a remote repository using HTTP. It skips anything you already have.

There are a number of improvements possible, to be done if this catches
on, including, most significantly, checking whether the response was
correct (or even not an error).

It makes fsck-cache and rev-tree give harmless warnings, because it
includes some code that should probably be shared with them in revision.h

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>

Index: Makefile
===================================================================
--- ed4f6e454b40650b904ab72048b2f93a068dccc3/Makefile  (mode:100644 sha1:b39b4ea37586693dd707d1d0750a9b580350ec50)
+++ a65375b46154c90e7499b7e76998d430cd9cd29d/Makefile  (mode:100644 sha1:d41860aed161a14ca61e7b6c7f591f65928bd61f)
@@ -14,7 +14,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-tree
+	check-files ls-tree merge-tree http-get
 
 all: $(PROG)
 
@@ -23,6 +23,9 @@
 
 LIBS= -lssl -lz
 
+http-get:%:%.o read-cache.o
+	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
+
 init-db: init-db.o
 
 update-cache: update-cache.o read-cache.o
Index: http-get.c
===================================================================
--- /dev/null  (tree:ed4f6e454b40650b904ab72048b2f93a068dccc3)
+++ a65375b46154c90e7499b7e76998d430cd9cd29d/http-get.c  (mode:100644 sha1:6a36cfa079519a7a3ad5b1618be8711c5127b531)
@@ -0,0 +1,175 @@
+#include <sys/socket.h>
+#include <netdb.h>
+#include <netinet/in.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "revision.h"
+#include <errno.h>
+
+static struct sockaddr_in sockad;
+static char *url;
+static char *base;
+
+static int target_url(char *target)
+{
+	char *name;
+	struct hostent *entry;
+	if (memcmp(target, "http://", 7))
+		return -1;
+	url = target;
+	base = strchr(target + 7, '/');
+	name = malloc(base - (target + 7) + 1);
+	memcpy(name, target + 7, base - (target + 7));
+	name[base - (target + 7)] = '\0';
+	printf("Connect to %s\n", name);
+	entry = gethostbyname(name);
+	memcpy(&sockad.sin_addr.s_addr,
+	       &((struct in_addr *)entry->h_addr)->s_addr, 4);
+	sockad.sin_port = htons(80);
+	sockad.sin_family = AF_INET;
+}
+
+static int get_connection()
+{
+	int fd = socket(AF_INET, SOCK_STREAM, 0);
+	if (connect(fd, (struct sockaddr*) &sockad,
+		    sizeof(struct sockaddr_in))) {
+		perror(url);
+	}
+	return fd;
+}
+
+static void release_connection(int fd) {
+	close(fd);
+}
+
+static int fetch(unsigned char *sha1)
+{
+	int header_end_posn = 0;
+	int local;
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+	char buffer[4096];
+	int fd;
+	struct stat st;
+
+	if (!stat(filename, &st)) {
+		return 0;
+	}
+
+	fd = get_connection();
+	if (fd < 0) {
+		return 1;
+	}
+
+	write(fd, "GET ", 4);
+	write(fd, base, strlen(base));
+	write(fd, "objects/", 8);
+	write(fd, hex, 2);
+	write(fd, "/", 1);
+	write(fd, hex + 2, 38);
+	write(fd, " HTTP/1.0\r\n", 11);
+	write(fd, "\r\n", 2);
+
+	local = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
+
+	do {
+		int sz = read(fd, buffer, 4096);
+		if (!sz) {
+			break;
+		}
+		if (sz < 0) {
+			perror("Reading from connection");
+			unlink(filename);
+			close(local);
+			return 1;
+		}
+		if (header_end_posn < 4) {
+			int i = 0;
+			char *flag = "\r\n\r\n";
+			while (i < sz && header_end_posn < 4) {
+				if (buffer[i] == flag[header_end_posn]) {
+					header_end_posn++;
+				} else {
+					header_end_posn = 0;
+				}
+				i++;
+			}
+			if (i < sz) {
+				write(local, buffer + i, sz - i);
+			}
+			continue;
+		}
+		write(local, buffer, sz);
+	} while (1);
+
+	close(local);
+	
+	release_connection(fd);
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	void *buffer;
+        unsigned long size;
+        char type[20];
+
+        buffer = read_sha1_file(sha1, type, &size);
+	if (!buffer)
+		return -1;
+	if (strcmp(type, "tree"))
+		return -1;
+	while (size) {
+		int len = strlen(buffer) + 1;
+		unsigned char *sha1 = buffer + len;
+		unsigned int mode;
+		int retval;
+
+		if (size < len + 20 || sscanf(buffer, "%o", &mode) != 1)
+			return -1;
+
+		buffer = sha1 + 20;
+		size -= len + 20;
+
+		retval = fetch(sha1);
+		if (retval)
+			return -1;
+
+		if (S_ISDIR(mode)) {
+			retval = process_tree(sha1);
+			if (retval)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct revision *rev = lookup_rev(sha1);
+	if (parse_commit_object(rev))
+		return -1;
+	
+	fetch(rev->tree);
+	process_tree(rev->tree);
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id = argv[1];
+	char *url = argv[2];
+
+	unsigned char sha1[20];
+
+	get_sha1_hex(commit_id, sha1);
+
+	target_url(url);
+
+	fetch(sha1);
+	return process_commit(sha1);
+}
Index: revision.h
===================================================================
--- ed4f6e454b40650b904ab72048b2f93a068dccc3/revision.h  (mode:100664 sha1:28d0de3261a61f68e4e0948a25a416a515cd2e83)
+++ a65375b46154c90e7499b7e76998d430cd9cd29d/revision.h  (mode:100664 sha1:523bde6e14e18bb0ecbded8f83ad4df93fc467ab)
@@ -24,6 +24,7 @@
 	unsigned int flags;
 	unsigned char sha1[20];
 	unsigned long date;
+	unsigned char tree[20];
 	struct parent *parent;
 };
 
@@ -111,4 +112,29 @@
 	}
 }
 
+static int parse_commit_object(struct revision *rev)
+{
+	if (!(rev->flags & SEEN)) {
+		void *buffer, *bufptr;
+		unsigned long size;
+		char type[20];
+		unsigned char parent[20];
+
+		rev->flags |= SEEN;
+		buffer = bufptr = read_sha1_file(rev->sha1, type, &size);
+		if (!buffer || strcmp(type, "commit"))
+			return -1;
+		get_sha1_hex(bufptr + 5, rev->tree);
+		bufptr += 46; /* "tree " + "hex sha1" + "\n" */
+		while (!memcmp(bufptr, "parent ", 7) && 
+		       !get_sha1_hex(bufptr+7, parent)) {
+			add_relationship(rev, parent);
+			bufptr += 48;   /* "parent " + "hex sha1" + "\n" */
+		}
+		//rev->date = parse_commit_date(bufptr);
+		free(buffer);
+	}
+	return 0;
+}
+
 #endif /* REVISION_H */



* Re: [PATCH] Get commits from remote repositories by HTTP
  2005-04-16 22:03  7% [PATCH] Get commits from remote repositories by HTTP Daniel Barkalow
@ 2005-04-16 22:24  3% ` Tony Luck
  2005-04-16 22:33  0%   ` Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Tony Luck @ 2005-04-16 22:24 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git

On 4/16/05, Daniel Barkalow <barkalow@iabervon.org> wrote:
> +        buffer = read_sha1_file(sha1, type, &size);

You never free this buffer.

It would also be nice if you saved "tree" objects in some temporary file
and did not install them until after you had fetched all the blobs and
trees that this tree references.  Then if your connection is interrupted
you can just restart it.

Otherwise this looks really nice.  I was going to script something
similar using "wget" ... but that would have made zillions of separate
connections.  Not so kind to the server.

-Tony


* Re: [PATCH] Get commits from remote repositories by HTTP
  2005-04-16 22:24  3% ` Tony Luck
@ 2005-04-16 22:33  0%   ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-04-16 22:33 UTC (permalink / raw)
  To: Tony Luck; +Cc: git

On Sat, 16 Apr 2005, Tony Luck wrote:

> On 4/16/05, Daniel Barkalow <barkalow@iabervon.org> wrote:
> > +        buffer = read_sha1_file(sha1, type, &size);
> 
> You never free this buffer.

Ideally, this should all be rearranged to share the code with
read-tree, and it should be fixed in common.

> It would also be nice if you saved "tree" objects in some temporary file
> and did not install them until after you had fetched all the blobs and
> trees that this tree references.  Then if your connection is interrupted
> you can just restart it.

It looks over everything relevant, even if it doesn't need to download
anything, so it should work to continue if it stops in between.

	-Daniel
*This .sig left intentionally blank*



* [PATCH] Use libcurl to use HTTP to get repositories
@ 2005-04-17  0:14  7% Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-04-17  0:14 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds

This enables the use of HTTP to download commits and associated objects
from remote repositories. It now uses libcurl instead of local hack code.

Still causes warnings for fsck-cache and rev-tree, due to unshared code.

Still leaks a bit of memory due to bug copied from read-tree.

Needs libcurl post 7.7 or so.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>

Index: Makefile
===================================================================
--- ed4f6e454b40650b904ab72048b2f93a068dccc3/Makefile  (mode:100644 sha1:b39b4ea37586693dd707d1d0750a9b580350ec50)
+++ d332a8ddffb50c1247491181af458970bf639942/Makefile  (mode:100644 sha1:ca5dfd41b750cb1339128e4431afbbbc21bf57bb)
@@ -14,7 +14,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-tree
+	check-files ls-tree merge-tree http-get
 
 all: $(PROG)
 
@@ -23,6 +23,11 @@
 
 LIBS= -lssl -lz
 
+http-get: LIBS += -lcurl
+
+http-get:%:%.o read-cache.o
+	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
+
 init-db: init-db.o
 
 update-cache: update-cache.o read-cache.o
Index: http-get.c
===================================================================
--- /dev/null  (tree:ed4f6e454b40650b904ab72048b2f93a068dccc3)
+++ d332a8ddffb50c1247491181af458970bf639942/http-get.c  (mode:100644 sha1:106ca31239e6afe6784e7c592234406f5c149e44)
@@ -0,0 +1,126 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "revision.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+
+	char *url;
+	char *posn;
+	FILE *local;
+	struct stat st;
+
+	if (!stat(filename, &st)) {
+		return 0;
+	}
+
+	local = fopen(filename, "w");
+
+	if (!local) {
+		fprintf(stderr, "Couldn't open %s\n", filename);
+		return -1;
+	}
+
+	curl_easy_setopt(curl, CURLOPT_FILE, local);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	curl_easy_perform(curl);
+
+	fclose(local);
+	
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	void *buffer;
+        unsigned long size;
+        char type[20];
+
+        buffer = read_sha1_file(sha1, type, &size);
+	if (!buffer)
+		return -1;
+	if (strcmp(type, "tree"))
+		return -1;
+	while (size) {
+		int len = strlen(buffer) + 1;
+		unsigned char *sha1 = buffer + len;
+		unsigned int mode;
+		int retval;
+
+		if (size < len + 20 || sscanf(buffer, "%o", &mode) != 1)
+			return -1;
+
+		buffer = sha1 + 20;
+		size -= len + 20;
+
+		retval = fetch(sha1);
+		if (retval)
+			return -1;
+
+		if (S_ISDIR(mode)) {
+			retval = process_tree(sha1);
+			if (retval)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct revision *rev = lookup_rev(sha1);
+	if (parse_commit_object(rev))
+		return -1;
+	
+	fetch(rev->tree);
+	process_tree(rev->tree);
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id = argv[1];
+	char *url = argv[2];
+
+	unsigned char sha1[20];
+
+	get_sha1_hex(commit_id, sha1);
+
+	curl_global_init(CURL_GLOBAL_ALL);
+
+	curl = curl_easy_init();
+
+	base = url;
+
+	fetch(sha1);
+	process_commit(sha1);
+
+	curl_global_cleanup();
+	return 0;
+}
Index: revision.h
===================================================================
--- ed4f6e454b40650b904ab72048b2f93a068dccc3/revision.h  (mode:100664 sha1:28d0de3261a61f68e4e0948a25a416a515cd2e83)
+++ d332a8ddffb50c1247491181af458970bf639942/revision.h  (mode:100664 sha1:523bde6e14e18bb0ecbded8f83ad4df93fc467ab)
@@ -24,6 +24,7 @@
 	unsigned int flags;
 	unsigned char sha1[20];
 	unsigned long date;
+	unsigned char tree[20];
 	struct parent *parent;
 };
 
@@ -111,4 +112,29 @@
 	}
 }
 
+static int parse_commit_object(struct revision *rev)
+{
+	if (!(rev->flags & SEEN)) {
+		void *buffer, *bufptr;
+		unsigned long size;
+		char type[20];
+		unsigned char parent[20];
+
+		rev->flags |= SEEN;
+		buffer = bufptr = read_sha1_file(rev->sha1, type, &size);
+		if (!buffer || strcmp(type, "commit"))
+			return -1;
+		get_sha1_hex(bufptr + 5, rev->tree);
+		bufptr += 46; /* "tree " + "hex sha1" + "\n" */
+		while (!memcmp(bufptr, "parent ", 7) && 
+		       !get_sha1_hex(bufptr+7, parent)) {
+			add_relationship(rev, parent);
+			bufptr += 48;   /* "parent " + "hex sha1" + "\n" */
+		}
+		//rev->date = parse_commit_date(bufptr);
+		free(buffer);
+	}
+	return 0;
+}
+
 #endif /* REVISION_H */


^ permalink raw reply	[relevance 7%]

* [3/5] Add http-pull
  @ 2005-04-17 15:31  7% ` Daniel Barkalow
  2005-04-17 18:10  4%   ` Petr Baudis
  2005-04-17 18:58  7%   ` [3.1/5] " Daniel Barkalow
  0 siblings, 2 replies; 200+ results
From: Daniel Barkalow @ 2005-04-17 15:31 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

http-pull is a program that downloads from a (normal) HTTP server a commit
and all of the tree and blob objects it refers to (but not other commits,
etc.). Options could be used to make it download a larger or different
selection of objects. It depends on libcurl, which I forgot to mention in
the README again.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Index: Makefile
===================================================================
--- d662b707e11391f6cfe597fd4d0bf9c41d34d01a/Makefile  (mode:100644 sha1:b2ce7c5b63fffca59653b980d98379909f893d44)
+++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/Makefile  (mode:100644 sha1:940ef8578cf469354002cd8feaec25d907015267)
@@ -14,7 +14,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-base
+	check-files ls-tree http-pull merge-base
 
 SCRIPT=	parent-id tree-id git gitXnormid.sh gitadd.sh gitaddremote.sh \
 	gitcommit.sh gitdiff-do gitdiff.sh gitlog.sh gitls.sh gitlsobj.sh \
@@ -35,6 +35,7 @@
 
 LIBS= -lssl -lz
 
+http-pull: LIBS += -lcurl
 
 $(PROG):%: %.o $(COMMON)
 	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
Index: http-pull.c
===================================================================
--- /dev/null  (tree:d662b707e11391f6cfe597fd4d0bf9c41d34d01a)
+++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/http-pull.c  (mode:100644 sha1:106ca31239e6afe6784e7c592234406f5c149e44)
@@ -0,0 +1,126 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "revision.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+
+	char *url;
+	char *posn;
+	FILE *local;
+	struct stat st;
+
+	if (!stat(filename, &st)) {
+		return 0;
+	}
+
+	local = fopen(filename, "w");
+
+	if (!local) {
+		fprintf(stderr, "Couldn't open %s\n", filename);
+		return -1;
+	}
+
+	curl_easy_setopt(curl, CURLOPT_FILE, local);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	curl_easy_perform(curl);
+
+	fclose(local);
+	
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	void *buffer;
+        unsigned long size;
+        char type[20];
+
+        buffer = read_sha1_file(sha1, type, &size);
+	if (!buffer)
+		return -1;
+	if (strcmp(type, "tree"))
+		return -1;
+	while (size) {
+		int len = strlen(buffer) + 1;
+		unsigned char *sha1 = buffer + len;
+		unsigned int mode;
+		int retval;
+
+		if (size < len + 20 || sscanf(buffer, "%o", &mode) != 1)
+			return -1;
+
+		buffer = sha1 + 20;
+		size -= len + 20;
+
+		retval = fetch(sha1);
+		if (retval)
+			return -1;
+
+		if (S_ISDIR(mode)) {
+			retval = process_tree(sha1);
+			if (retval)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct revision *rev = lookup_rev(sha1);
+	if (parse_commit_object(rev))
+		return -1;
+	
+	fetch(rev->tree);
+	process_tree(rev->tree);
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id = argv[1];
+	char *url = argv[2];
+
+	unsigned char sha1[20];
+
+	get_sha1_hex(commit_id, sha1);
+
+	curl_global_init(CURL_GLOBAL_ALL);
+
+	curl = curl_easy_init();
+
+	base = url;
+
+	fetch(sha1);
+	process_commit(sha1);
+
+	curl_global_cleanup();
+	return 0;
+}



* Re: [3/5] Add http-pull
  2005-04-17 15:31  7% ` [3/5] Add http-pull Daniel Barkalow
@ 2005-04-17 18:10  4%   ` Petr Baudis
  2005-04-17 18:49  0%     ` Daniel Barkalow
  2005-04-17 18:58  7%   ` [3.1/5] " Daniel Barkalow
  1 sibling, 1 reply; 200+ results
From: Petr Baudis @ 2005-04-17 18:10 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git

Dear diary, on Sun, Apr 17, 2005 at 05:31:16PM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> http-pull is a program that downloads from a (normal) HTTP server a commit
> and all of the tree and blob objects it refers to (but not other commits,
> etc.). Options could be used to make it download a larger or different
> selection of objects. It depends on libcurl, which I forgot to mention in
> the README again.
> 
> Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>

So, while you will be resending the patch, please update the README.

> Index: Makefile
> ===================================================================
> --- d662b707e11391f6cfe597fd4d0bf9c41d34d01a/Makefile  (mode:100644 sha1:b2ce7c5b63fffca59653b980d98379909f893d44)
> +++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/Makefile  (mode:100644 sha1:940ef8578cf469354002cd8feaec25d907015267)
> @@ -35,6 +35,7 @@
>  
>  LIBS= -lssl -lz
>  
> +http-pull: LIBS += -lcurl
>  
>  $(PROG):%: %.o $(COMMON)
>  	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)

Whew. Looks like an awful trick, you say this works?! :-)

At times, I wouldn't want to be a GNU make parser.

> Index: http-pull.c
> ===================================================================
> --- /dev/null  (tree:d662b707e11391f6cfe597fd4d0bf9c41d34d01a)
> +++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/http-pull.c  (mode:100644 sha1:106ca31239e6afe6784e7c592234406f5c149e44)
> @@ -0,0 +1,126 @@
> +	if (!stat(filename, &st)) {
> +		return 0;
> +	}

access()

> +	url = malloc(strlen(base) + 50);

Off-by-one. What about the trailing NUL?

> +	strcpy(url, base);
> +	posn = url + strlen(base);
> +	strcpy(posn, "objects/");
> +	posn += 8;
> +	memcpy(posn, hex, 2);
> +	posn += 2;
> +	*(posn++) = '/';
> +	strcpy(posn, hex + 2);


> +static int process_tree(unsigned char *sha1)
> +{
> +	void *buffer;
> +        unsigned long size;
> +        char type[20];
> +
> +        buffer = read_sha1_file(sha1, type, &size);

Something with your whitespaces is wrong here. ;-)

> +	fetch(rev->tree);
> +	process_tree(rev->tree);

> +	fetch(sha1);
> +	process_commit(sha1);

You are ignoring the return codes of your own routines everywhere.
You should use error() instead of plain -1, BTW.


I think you should have at least two disjunct modes - either you are
downloading everything related to the given commit, or you are
downloading all commit records for commit predecessors.

Even if you might not want all the intermediate trees, you definitely
want the intermediate commits, to keep the history graph contiguous.

So in git pull, I'd imagine to do

	http-pull -c $new_head
	http-pull -t $(tree-id $new_head)

So, -c would fetch a given commit and all its predecessors until it hits
what you already have on your side. -t would fetch a given tree with all
files and subtrees and everything. http-pull shouldn't default on
either, since they are mutually exclusive.

What do you think?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [3/5] Add http-pull
  2005-04-17 18:10  4%   ` Petr Baudis
@ 2005-04-17 18:49  0%     ` Daniel Barkalow
  2005-04-17 19:08  0%       ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-04-17 18:49 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

On Sun, 17 Apr 2005, Petr Baudis wrote:

> > Index: Makefile
> > ===================================================================
> > --- d662b707e11391f6cfe597fd4d0bf9c41d34d01a/Makefile  (mode:100644 sha1:b2ce7c5b63fffca59653b980d98379909f893d44)
> > +++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/Makefile  (mode:100644 sha1:940ef8578cf469354002cd8feaec25d907015267)
> > @@ -35,6 +35,7 @@
> >  
> >  LIBS= -lssl -lz
> >  
> > +http-pull: LIBS += -lcurl
> >  
> >  $(PROG):%: %.o $(COMMON)
> >  	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
> 
> Whew. Looks like an awful trick, you say this works?! :-)
> 
> At times, I wouldn't want to be a GNU make parser.

Yup. GNU make is big on the features which do the obvious thing, even when
you can't believe they work. This is probably why nobody's managed to
replace it.

> > Index: http-pull.c
> > ===================================================================
> > --- /dev/null  (tree:d662b707e11391f6cfe597fd4d0bf9c41d34d01a)
> > +++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/http-pull.c  (mode:100644 sha1:106ca31239e6afe6784e7c592234406f5c149e44)
> > +	url = malloc(strlen(base) + 50);
> 
> Off-by-one. What about the trailing NUL?

I get length(base) + "objects/"=8 + 40 SHA1 + 1 for '/' and 1 for NUL = 50.

> I think you should have at least two disjunct modes - either you are
> downloading everything related to the given commit, or you are
> downloading all commit records for commit predecessors.
> 
> > Even if you might not want all the intermediate trees, you definitely
> > want the intermediate commits, to keep the history graph contiguous.
> 
> So in git pull, I'd imagine to do
> 
> 	http-pull -c $new_head
> 	http-pull -t $(tree-id $new_head)
> 
> So, -c would fetch a given commit and all its predecessors until it hits
> what you already have on your side. -t would fetch a given tree with all
> files and subtrees and everything. http-pull shouldn't default on
> either, since they are mutually exclusive.
> 
> What do you think?

I think I'd rather keep the current behavior and add a -c for getting the
history of commits, and maybe a -a for getting the history of commits and
their trees.

There's some trickiness for the history of commits thing for stopping at
the point where you have everything, but also behaving appropriately if
you try once, fail partway through, and then try again. It's on my queue
of things to think about.

	-Daniel
*This .sig left intentionally blank*



* [3.1/5] Add http-pull
  2005-04-17 15:31  7% ` [3/5] Add http-pull Daniel Barkalow
  2005-04-17 18:10  4%   ` Petr Baudis
@ 2005-04-17 18:58  7%   ` Daniel Barkalow
  1 sibling, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-04-17 18:58 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

http-pull is a program that downloads from a (normal) HTTP server a commit
and all of the tree and blob objects it refers to (but not other commits,
etc.). Options could be used to make it download a larger or different
selection of objects.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Index: Makefile
===================================================================
--- 45f926575d2c44072bfcf2317dbf3f0fbb513a4e/Makefile  (mode:100644 sha1:346e3850de026485802e41e16a1180be2df85e4a)
+++ 3eae85f66143160a26f5545d197862c89e2a8fb8/Makefile  (mode:100644 sha1:0e84e3cd12f836602b420c197e08fabefe975493)
@@ -14,7 +17,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-base
+	check-files ls-tree http-pull merge-base
 
 SCRIPT=	parent-id tree-id git gitXnormid.sh gitadd.sh gitaddremote.sh \
 	gitcommit.sh gitdiff-do gitdiff.sh gitlog.sh gitls.sh gitlsobj.sh \
@@ -35,6 +38,7 @@
 
 LIBS= -lssl -lz
 
+http-pull: LIBS += -lcurl
 
 $(PROG):%: %.o $(COMMON)
 	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
Index: README
===================================================================
--- 45f926575d2c44072bfcf2317dbf3f0fbb513a4e/README  (mode:100664 sha1:0170eafb60ad9009ca41c6536cecd6d1fdee5b86)
+++ 3eae85f66143160a26f5545d197862c89e2a8fb8/README  (mode:100664 sha1:921d552d810394e665323ec82b4826914918689c)
@@ -120,7 +120,7 @@
 	diff, patch
 	libssl
 	rsync
-
+	curl (later than 7.7, according to the docs)
 
 
 	The "core GIT"
Index: http-pull.c
===================================================================
--- /dev/null  (tree:45f926575d2c44072bfcf2317dbf3f0fbb513a4e)
+++ 3eae85f66143160a26f5545d197862c89e2a8fb8/http-pull.c  (mode:100644 sha1:7ba4ad67f6dac34addb537ee147ae3de0550a484)
@@ -0,0 +1,139 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "revision.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+
+	char *url;
+	char *posn;
+	FILE *local;
+
+	if (!access(filename, R_OK)) {
+		return 0;
+	}
+
+	local = fopen(filename, "w");
+
+	if (!local) {
+		return error("Couldn't open %s", filename);
+	}
+
+	curl_easy_setopt(curl, CURLOPT_FILE, local);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	if (curl_easy_perform(curl)) {
+		fclose(local);
+		unlink(filename);
+		return error("Error downloading %s from %s",
+			     sha1_to_hex(sha1), url);
+	}
+
+	fclose(local);
+	
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	void *buffer;
+	unsigned long size;
+	char type[20];
+
+	buffer = read_sha1_file(sha1, type, &size);
+	if (!buffer)
+	 	return error("Couldn't read %s.",
+			     sha1_to_hex(sha1));
+	if (strcmp(type, "tree"))
+		return error("Expected %s to be a tree, but was a %s.",
+			     sha1_to_hex(sha1), type);
+	while (size) {
+		int len = strlen(buffer) + 1;
+		unsigned char *sha1 = buffer + len;
+		unsigned int mode;
+		int retval;
+
+		if (size < len + 20 || sscanf(buffer, "%o", &mode) != 1)
+			return error("Invalid tree object");
+
+		buffer = sha1 + 20;
+		size -= len + 20;
+
+		retval = fetch(sha1);
+		if (retval)
+			return retval;
+
+		if (S_ISDIR(mode)) {
+			retval = process_tree(sha1);
+			if (retval)
+				return retval;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	int retval;
+	struct revision *rev = lookup_rev(sha1);
+	if (parse_commit_object(rev))
+		return error("Couldn't parse commit %s\n", sha1_to_hex(sha1));
+
+	retval = fetch(rev->tree);
+	if (retval)
+		return retval;
+	retval = process_tree(rev->tree);
+	return retval;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id = argv[1];
+	char *url = argv[2];
+	int retval;
+
+	unsigned char sha1[20];
+
+	get_sha1_hex(commit_id, sha1);
+
+	curl_global_init(CURL_GLOBAL_ALL);
+
+	curl = curl_easy_init();
+
+	base = url;
+
+	retval = fetch(sha1);
+	if (retval)
+		return 1;
+	retval = process_commit(sha1);
+	if (retval)
+		return 1;
+
+	curl_global_cleanup();
+	return 0;
+}



* Re: [3/5] Add http-pull
  2005-04-17 18:49  0%     ` Daniel Barkalow
@ 2005-04-17 19:08  0%       ` Petr Baudis
  2005-04-17 19:24  3%         ` Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-04-17 19:08 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git

Dear diary, on Sun, Apr 17, 2005 at 08:49:11PM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> On Sun, 17 Apr 2005, Petr Baudis wrote:
> > > Index: http-pull.c
> > > ===================================================================
> > > --- /dev/null  (tree:d662b707e11391f6cfe597fd4d0bf9c41d34d01a)
> > > +++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/http-pull.c  (mode:100644 sha1:106ca31239e6afe6784e7c592234406f5c149e44)
> > > +	url = malloc(strlen(base) + 50);
> > 
> > Off-by-one. What about the trailing NUL?
> 
> I get length(base) + "objects/"=8 + 40 SHA1 + 1 for '/' and 1 for NUL = 50.

Sorry, counted one '/' more. :-)

> > I think you should have at least two disjunct modes - either you are
> > downloading everything related to the given commit, or you are
> > downloading all commit records for commit predecessors.
> > 
> > Even if you might not want all the intermediate trees, you definitely
> > want the intermediate commits, to keep the history graph contiguous.
> > 
> > So in git pull, I'd imagine to do
> > 
> > 	http-pull -c $new_head
> > 	http-pull -t $(tree-id $new_head)
> > 
> > So, -c would fetch a given commit and all its predecessors until it hits
> > what you already have on your side. -t would fetch a given tree with all
> > files and subtrees and everything. http-pull shouldn't default on
> > either, since they are mutually exclusive.
> > 
> > What do you think?
> 
> I think I'd rather keep the current behavior and add a -c for getting the
> history of commits, and maybe a -a for getting the history of commits and
> their trees.

I'm not too keen on this. Either make them totally separate commands, or
add a required switch specifying what to do. Otherwise it implies the
switches merely modify what the command does, when in fact they make it
do something completely different.

-a would be fine too - basically a combination of -c and -t. I'd imagine
that is what Linus would want to use, for example.

> There's some trickiness for the history of commits thing for stopping at
> the point where you have everything, but also behaving appropriately if
> you try once, fail partway through, and then try again. It's on my queue
> of things to think about.

Can't you just stop the recursion when you hit a commit you already
have?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [3/5] Add http-pull
  2005-04-17 19:08  0%       ` Petr Baudis
@ 2005-04-17 19:24  3%         ` Daniel Barkalow
  2005-04-17 19:59  0%           ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-04-17 19:24 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

On Sun, 17 Apr 2005, Petr Baudis wrote:

> Dear diary, on Sun, Apr 17, 2005 at 08:49:11PM CEST, I got a letter
> where Daniel Barkalow <barkalow@iabervon.org> told me that...
> 
> I'm not too keen on this. Either make them totally separate commands, or
> add a required switch specifying what to do. Otherwise it implies the
> switches merely modify what the command does, when in fact they make it
> do something completely different.

That's a good point. I'll require a -t for now, and add more later.

> -a would be fine too - basically a combination of -c and -t. I'd imagine
> that is what Linus would want to use, e.g.

Well, -c -t would give you the current tree and the whole commit log, but
not old trees. -a would additionally give you old trees.

> > There's some trickiness for the history of commits thing for stopping at
> > the point where you have everything, but also behaving appropriately if
> > you try once, fail partway through, and then try again. It's on my queue
> > of things to think about.
> 
> Can't you just stop the recursion when you hit a commit you already
> have?

The problem is that, if you've fetched the final commit already, and then
the server dies, and you try again later, you already have the last one,
and so you think you've got everything.

At this point, I also want to put off doing much further with recursion
and commits until revision.h and such are sorted out.

	-Daniel
*This .sig left intentionally blank*



* Re: Re-done kernel archive - real one?
  @ 2005-04-17 19:33  3%       ` Linus Torvalds
  2005-04-17 19:51  0%         ` Russell King
  0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-04-17 19:33 UTC (permalink / raw)
  To: Russell King; +Cc: Git Mailing List, Peter Anvin



On Sun, 17 Apr 2005, Russell King wrote:
> 
> I still need to work out how to make my noddy script follow different
> branches which may be present though.  However, for my common work
> flow, it fits what I require.

The way to handle that is that you need to 

 - remember (or re-fetch) what the latest HEAD was that you merged with in 
   my tree.

   if you didn't remember, you can just get all my objects and do a

	merge-head $(cat .git/HEAD) $linus_current_head

   or something (using the current git archive that has a "merge-head" 
   program). That gives you the most recent common head.

 - use "rev-tree" to show reachability

	rev-tree $my_current_head $last_merge_head |
		sort -n |		# sort by date rather than sha1
		cut -d' ' -f2 |		# get the sha1 + "flags" mask
		grep :1			# show the ones that are only
					# reachable from $my_current_head

and you now have a nice list of sha1's ordered by date.

Or something. I didn't test the above. Testing is for users.

		Linus


* Re: Re-done kernel archive - real one?
  2005-04-17 19:33  3%       ` Linus Torvalds
@ 2005-04-17 19:51  0%         ` Russell King
  0 siblings, 0 replies; 200+ results
From: Russell King @ 2005-04-17 19:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

(Dropped HPA from the CC line - I think he was only copied for the
master.kernel.org issues.)

On Sun, Apr 17, 2005 at 12:33:22PM -0700, Linus Torvalds wrote:
> On Sun, 17 Apr 2005, Russell King wrote:
> > I still need to work out how to make my noddy script follow different
> > branches which may be present though.  However, for my common work
> > flow, it fits what I require.
> 
> The way to handle that is that you need to 
> 
>  - remember (or re-fetch) what the latest HEAD was that you merged with in 
>    my tree.
> 
>    if you didn't remember, you can just get all my objects and do a
> 
> 	merge-head $(cat .git/HEAD) $linus_current_head
> 
>    or something (using the current git archive that has a "merge-head" 
>    program). That gives you the most recent common head.

My script currently sends between two commit-ids, so...

>  - use "rev-tree" to show reachability
> 
> 	rev-tree $my_current_head $last_merge_head |
> 		sort -n |		# sort by date rather than sha1
> 		cut -d' ' -f2 |		# get the sha1 + "flags" mask
> 		grep :1			# show the ones that are only
> 					# reachable from $my_current_head
> 
> and you now have a nice list of sha1's ordered by date.

This will (and does) do exactly what I want.  I'll also read into the
above a request that you want it in forward date order. 8)

-- 
Russell King



* Re: [3/5] Add http-pull
  2005-04-17 19:24  3%         ` Daniel Barkalow
@ 2005-04-17 19:59  0%           ` Petr Baudis
  2005-04-21  3:27  3%             ` Brad Roberts
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-04-17 19:59 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git

Dear diary, on Sun, Apr 17, 2005 at 09:24:27PM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> On Sun, 17 Apr 2005, Petr Baudis wrote:
> 
> > Dear diary, on Sun, Apr 17, 2005 at 08:49:11PM CEST, I got a letter
> > where Daniel Barkalow <barkalow@iabervon.org> told me that...
> > > There's some trickiness for the history of commits thing for stopping at
> > > the point where you have everything, but also behaving appropriately if
> > > you try once, fail partway through, and then try again. It's on my queue
> > > of things to think about.
> > 
> > Can't you just stop the recursion when you hit a commit you already
> > have?
> 
> The problem is that, if you've fetched the final commit already, and then
> the server dies, and you try again later, you already have the last one,
> and so you think you've got everything.

Hmm, some kind of journaling? ;-)

> At this point, I also want to put off doing much further with recursion
> and commits until revision.h and such are sorted out.

Agreed.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: full kernel history, in patchset format
  @ 2005-04-18  0:06  2%       ` David Woodhouse
  2005-04-18  0:35  0%         ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: David Woodhouse @ 2005-04-18  0:06 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Linus Torvalds, Ingo Molnar, git

On Mon, 2005-04-18 at 01:39 +0200, Petr Baudis wrote:
> I think this is bad, bad, bad. If you don't keep around all the
> _commits_, you get into all sorts of troubles - when merging, when doing
> git log, etc. And the commits themselves are probably actually pretty
> small portion of the thing. I didn't do any actual measurement but I
> would be pretty surprised if it would be much more than few megabytes of
> data for the kernel history.

I'm not sure it's that bad -- and everyone already seems perfectly happy
not to have history going back before 2.6.12-rc2. We're not talking
about doing this by _default_ -- we're talking about allowing people to
keep trees pruned if they _want_ to. So I might want to drop history
before 2.6.0 on my laptop, for example.

> Of course an entirely different thing are _trees_ associated with those
> commits. As long as you stay with a simple three-way merge, you
> basically never want to look at trees which aren't heads and which you
> don't specifically request to look at. And the trees and what they carry
> inside is the main bulk of data.

If the trees are absent and you're trying to merge, what do you gain
from having the commit objects? And for the case of 'git log', I
certainly think it's acceptable that you lose out on those parts of
prehistory which you've explicitly removed from your local tree --
that's a feature, not a bug. 

For the special case of removing history before 2.6.12-rc2 from the
trees, I certainly think we can do it by leaving out all the commits,
not just the trees. We can do that easily, but there's no way we can
_add_ that history retrospectively if we omit it in the first place.

For history older than 2.6.12-rc2 I'd suggest that it would be available
in a different place, and absent from the 'main' working tree that
everyone uses by default. The only difference we'd see in the working
tree is that the 2.6.12-rc2 commit -- the oldest commit in that tree --
would actually have an absentee parent instead of appearing to be an
import. And all the sha1 hashes of all subsequent commits would be
different, of course.

To allow pruning of older objects in the general case would be a little
bit harder than that, because as things stand you'd be re-fetching them
every time you rsync from elsewhere -- but that wouldn't really be hard
to fix if we care.

Either way, I think it can probably be done by omitting the commit
objects as well as the trees -- but the important point is that we
_should_ include a 'parent' pointer in the oldest commit of the tree
we're working with, pointing back to the imported history.

-- 
dwmw2



* Re: full kernel history, in patchset format
  2005-04-18  0:06  2%       ` David Woodhouse
@ 2005-04-18  0:35  0%         ` Petr Baudis
    0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-04-18  0:35 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Linus Torvalds, Ingo Molnar, git

Dear diary, on Mon, Apr 18, 2005 at 02:06:43AM CEST, I got a letter
where David Woodhouse <dwmw2@infradead.org> told me that...
> On Mon, 2005-04-18 at 01:39 +0200, Petr Baudis wrote:
> > Of course an entirely different thing are _trees_ associated with those
> > commits. As long as you stay with a simple three-way merge, you
> > basically never want to look at trees which aren't heads and which you
> > don't specifically request to look at. And the trees and what they carry
> > inside is the main bulk of data.
> 
> If the trees are absent and you're trying to merge, what do you gain
> from having the commit objects?

merge-base

> For the special case of removing history before 2.6.12-rc2 from the
> trees, I certainly think we can do it by leaving out all the commits,
> not just the trees. We can do that easily, but there's no way we can
> _add_ that history retrospectively if we omit it in the first place.

I'm confused by this paragraph, but that might be my English skills
failing somehow.

> For history older than 2.6.12-rc2 I'd suggest that it would be available
> in a different place, and absent from the 'main' working tree that
> everyone uses by default. The only difference we'd see in the working
> tree is that the 2.6.12-rc2 commit -- the oldest commit in that tree --
> would actually have an absentee parent instead of appearing to be an
> import. And all the sha1 hashes of all subsequent commits would be
> different, of course.

Yes, that's what I suggested too.

> To allow pruning of older objects in the general case would be a little
> bit harder than that, because as things stand you'd be re-fetching them
> every time you rsync from elsewhere -- but that wouldn't really be hard
> to fix if we care.

I think http-pull is very promising. :-)

It could actually be much faster than rsync, since you don't need to
build directory listings etc., which already takes a non-trivial amount
of time with the kernel git repository.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: full kernel history, in patchset format
  @ 2005-04-18  0:51  3%               ` David Woodhouse
  2005-04-18  0:59  3%                 ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: David Woodhouse @ 2005-04-18  0:51 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Linus Torvalds, Ingo Molnar, git

On Mon, 2005-04-18 at 02:50 +0200, Petr Baudis wrote:
> I think I will make git-pasky's default behaviour (when we get
> http-pull, that is) to keep the complete commit history but only trees
> you need/want; togglable to both sides.

I think the default behaviour should probably be to fetch everything.

-- 
dwmw2



* Re: full kernel history, in patchset format
  2005-04-18  0:51  3%               ` David Woodhouse
@ 2005-04-18  0:59  3%                 ` Petr Baudis
  0 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-04-18  0:59 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Linus Torvalds, Ingo Molnar, git

Dear diary, on Mon, Apr 18, 2005 at 02:51:59AM CEST, I got a letter
where David Woodhouse <dwmw2@infradead.org> told me that...
> On Mon, 2005-04-18 at 02:50 +0200, Petr Baudis wrote:
> > I think I will make git-pasky's default behaviour (when we get
> > http-pull, that is) to keep the complete commit history but only trees
> > you need/want; togglable to both sides.
> 
> I think the default behaviour should probably be to fetch everything.

I think fetching gigs of data just won't work for many people,
especially if they could do with a fraction of that.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply	[relevance 3%]

* Re: SCSI trees, merges and git status
  @ 2005-04-19  0:10  3%       ` David Woodhouse
  2005-04-19  0:16  0%         ` James Bottomley
  0 siblings, 1 reply; 200+ results
From: David Woodhouse @ 2005-04-19  0:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: James Bottomley, git

On Mon, 2005-04-18 at 17:03 -0700, Linus Torvalds wrote:
> Git does work like BK in the way that you cannot remove history when you
> have distributed it. Once it's there, it's there.

But older history can be pruned, and there's really no reason why an
http-based 'git pull' couldn't simply refrain from fetching commits
older than a certain threshold.

However, we can't _add_ the history if the current commits don't refer
to it. I really think we should take the imported git history and make
our 'current' tree refer to it -- even if just by having an appropriate
'parent' record in what is currently the oldest changeset in our tree;
the 2.6.12-rc2 import.

It doesn't matter that our oldest commit object refers to a nonexistent
parent, but that does allow us to import historical data if we _want_
to, and have it all work properly.

We should have the full historical git repo available within a day or
so, I believe. It would be really useful if we could make the current
trees refer back to that, instead of starting at 2.6.12-rc2.

-- 
dwmw2


^ permalink raw reply	[relevance 3%]

* Re: SCSI trees, merges and git status
  2005-04-19  0:10  3%       ` David Woodhouse
@ 2005-04-19  0:16  0%         ` James Bottomley
  0 siblings, 0 replies; 200+ results
From: James Bottomley @ 2005-04-19  0:16 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Linus Torvalds, git

On Tue, 2005-04-19 at 10:10 +1000, David Woodhouse wrote:
> On Mon, 2005-04-18 at 17:03 -0700, Linus Torvalds wrote:
> > Git does work like BK in the way that you cannot remove history when you
> > have distributed it. Once it's there, it's there.
> 
> But older history can be pruned, and there's really no reason why an
> http-based 'git pull' couldn't simply refrain from fetching commits
> older than a certain threshold.

Yes, that's what I did to get back to the commit just before the merge:

fsck-cache --unreachable 54ff646c589dcc35182d01c5b557806759301aa3 |
    awk '/^unreachable /{print $2}' |
    sed 's:^\(..\):.git/objects/\1/:' |
    xargs rm

removes all the objects from the tree prior to the bogus commit---it's
based on your (Linus') git-prune-script.
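For the record, the quoting in that one-liner is easy to get wrong; a rough
Python equivalent of the same pipeline (just a sketch, assuming fsck-cache
emits lines of the form "unreachable <sha1>") might look like:

```python
import os
import subprocess

def unreachable_paths(fsck_output, objdir=".git/objects"):
    """Map 'unreachable <sha1>' lines to object file paths,
    mirroring the awk/sed steps of the pipeline above."""
    paths = []
    for line in fsck_output.splitlines():
        if line.startswith("unreachable "):
            sha1 = line.split()[1]
            # objects are fanned out as objects/xx/<remaining 38 chars>
            paths.append(os.path.join(objdir, sha1[:2], sha1[2:]))
    return paths

# Usage would be something like:
# out = subprocess.run(["fsck-cache", "--unreachable", commit],
#                      capture_output=True, text=True).stdout
# for p in unreachable_paths(out):
#     os.remove(p)
```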

James



^ permalink raw reply	[relevance 0%]

* More patches
@ 2005-04-19  1:48  3% Daniel Barkalow
  2005-04-19  1:57  8% ` [3/4] Add http-pull Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-04-19  1:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Junio C Hamano

Here are the things I was saving for after the previous set:

 1: Report the actual contents of trees
 2: Add functions for scanning history by date
 3: Add http-pull, a program to fetch the objects you need by HTTP
 4: Change merge-base to find the most recent common ancestor

1 and 2 are core extensions. 3 might be best for the pasky tree. 4 is
mostly a demo of 2 and because Linus thought it was a better algorithm.

	-Daniel
*This .sig left intentionally blank*


^ permalink raw reply	[relevance 3%]

* [3/4] Add http-pull
  2005-04-19  1:48  3% More patches Daniel Barkalow
@ 2005-04-19  1:57  8% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-04-19  1:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Petr Baudis

This adds a command to pull a commit and dependent objects from an HTTP
server.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Index: Makefile
===================================================================
--- 50afb5dd4184842d8da1da8dcb9ca6a591dfc5b0/Makefile  (mode:100644 sha1:803f1d49c436efa570d779db6d350efbceb29ddd)
+++ f7f62e0d2a822ad0937fd98a826f65ac7f938217/Makefile  (mode:100644 sha1:a3d26213c085e8b6bbc1ec352df0996e558e7c38)
@@ -15,7 +15,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-base merge-cache unpack-file
+	check-files ls-tree merge-base merge-cache unpack-file http-pull
 
 all: $(PROG)
 
@@ -81,6 +81,11 @@
 unpack-file: unpack-file.o $(LIB_FILE)
 	$(CC) $(CFLAGS) -o unpack-file unpack-file.o $(LIBS)
 
+http-pull: LIBS += -lcurl
+
+http-pull: http-pull.o $(LIB_FILE)
+	$(CC) $(CFLAGS) -o http-pull http-pull.o $(LIBS)
+
 blob.o: $(LIB_H)
 cat-file.o: $(LIB_H)
 check-files.o: $(LIB_H)
@@ -105,6 +110,7 @@
 usage.o: $(LIB_H)
 unpack-file.o: $(LIB_H)
 write-tree.o: $(LIB_H)
+http-pull.o: $(LIB_H)
 
 clean:
 	rm -f *.o $(PROG) $(LIB_FILE)
Index: http-pull.c
===================================================================
--- /dev/null  (tree:50afb5dd4184842d8da1da8dcb9ca6a591dfc5b0)
+++ f7f62e0d2a822ad0937fd98a826f65ac7f938217/http-pull.c  (mode:100644 sha1:bd251f9e0748784bbd2cd5cf720f126d852fe888)
@@ -0,0 +1,170 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "commit.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int tree = 0;
+static int commits = 0;
+static int all = 0;
+
+static int has(unsigned char *sha1)
+{
+	char *filename = sha1_file_name(sha1);
+	struct stat st;
+
+	if (!stat(filename, &st))
+		return 1;
+	return 0;
+}
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+
+	char *url;
+	char *posn;
+	FILE *local;
+	struct stat st;
+
+	if (!stat(filename, &st)) {
+		return 0;
+	}
+
+	local = fopen(filename, "w");
+
+	if (!local)
+		return error("Couldn't open %s\n", filename);
+
+	curl_easy_setopt(curl, CURLOPT_FILE, local);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	printf("Getting %s\n", hex);
+
+	if (curl_easy_perform(curl))
+		return error("Couldn't get %s for %s\n", url, hex);
+
+	fclose(local);
+	
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	struct tree *tree = lookup_tree(sha1);
+	struct tree_entry_list *entries;
+
+	if (parse_tree(tree))
+		return -1;
+
+	for (entries = tree->entries; entries; entries = entries->next) {
+		if (fetch(entries->item.tree->object.sha1))
+			return -1;
+		if (entries->directory) {
+			if (process_tree(entries->item.tree->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct commit *obj = lookup_commit(sha1);
+
+	if (fetch(sha1))
+		return -1;
+
+	if (parse_commit(obj))
+		return -1;
+
+	if (tree) {
+		if (fetch(obj->tree->object.sha1))
+			return -1;
+		if (process_tree(obj->tree->object.sha1))
+			return -1;
+		if (!all)
+			tree = 0;
+	}
+	if (commits) {
+		struct commit_list *parents = obj->parents;
+		for (; parents; parents = parents->next) {
+			if (has(parents->item->object.sha1))
+				continue;
+			if (fetch(parents->item->object.sha1)) {
+				/* The server might not have it, and
+				 * we don't mind. 
+				 */
+				continue;
+			}
+			if (process_commit(parents->item->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id;
+	char *url;
+	int arg = 1;
+	unsigned char sha1[20];
+
+	while (arg < argc && argv[arg][0] == '-') {
+		if (argv[arg][1] == 't') {
+			tree = 1;
+		} else if (argv[arg][1] == 'c') {
+			commits = 1;
+		} else if (argv[arg][1] == 'a') {
+			all = 1;
+			tree = 1;
+			commits = 1;
+		}
+		arg++;
+	}
+	if (argc < arg + 2) {
+		usage("http-pull [-c] [-t] [-a] commit-id url");
+		return 1;
+	}
+	commit_id = argv[arg];
+	url = argv[arg + 1];
+
+	get_sha1_hex(commit_id, sha1);
+
+	curl_global_init(CURL_GLOBAL_ALL);
+
+	curl = curl_easy_init();
+
+	base = url;
+
+	if (fetch(sha1))
+		return 1;
+	if (process_commit(sha1))
+		return 1;
+
+	curl_global_cleanup();
+	return 0;
+}


^ permalink raw reply	[relevance 8%]

* Re: [GIT PATCH] I2C and W1 bugfixes for 2.6.12-rc2
  @ 2005-04-19 22:27  3%             ` Daniel Jacobowitz
  2005-04-19 22:33  0%               ` Greg KH
  2005-04-19 22:47  0%               ` Linus Torvalds
  0 siblings, 2 replies; 200+ results
From: Daniel Jacobowitz @ 2005-04-19 22:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Greg KH, Git Mailing List

On Tue, Apr 19, 2005 at 03:00:04PM -0700, Linus Torvalds wrote:
> 
> 
> On Tue, 19 Apr 2005, Greg KH wrote:
> > 
> > It looks like your domain name isn't set up properly for your box (which
> > is why it worked for you, but not me before, causing that patch).
> 
> No, I think it's a bug in your domainname changes. I don't think you
> should do the domainname at all if the hostname has a dot in it.
> 
> Most machines I have access to (and that includes machines that are
> professionally maintained, not just my own cruddy setup) says "(none)" to
> domainname and have the full hostname in hostname.
> 
> And even the ones that use domainname tend to not have a fully qualified 
> DNS domain there. You need to use dnsdomainname to get that, and I don't 
> even know how to do it with standard libc.
> 
> So how about something like this?
> 
> (Somebody who actually knows how these things should be done - please feel 
> free to pipe up).

The glibc documentation blows for this, but what getdomainname returns
comes from uname(2), not from any DNS-related configuration.  Debian only
ever sets this if you're using NIS.

There's no really great way to get the current hostname; a lot of
applications do a reverse DNS lookup on the primary network interface,
with appropriate handwaving to define "primary".

Easiest might be to punt to hostname --fqdn, or an equivalent of its
algorithm - which appears to be: fetch the hostname from uname, do a DNS
lookup on that, and a reverse DNS lookup on the result.
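For what it's worth, Python's standard library implements essentially that
algorithm; a small sketch (hedged: which name you actually get back depends
entirely on the local resolver configuration):

```python
import socket

def guess_fqdn():
    """Roughly the 'hostname --fqdn' algorithm: start from the
    uname-style short hostname, then let the resolver qualify it.
    socket.getfqdn() does the forward/reverse lookup dance internally."""
    short = socket.gethostname()   # what uname(2)/gethostname() reports
    fqdn = socket.getfqdn(short)   # resolver-qualified name, if any
    return fqdn if "." in fqdn else short

print(guess_fqdn())
```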

-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[relevance 3%]

* Re: [GIT PATCH] I2C and W1 bugfixes for 2.6.12-rc2
  2005-04-19 22:27  3%             ` Daniel Jacobowitz
@ 2005-04-19 22:33  0%               ` Greg KH
  2005-04-19 22:47  0%               ` Linus Torvalds
  1 sibling, 0 replies; 200+ results
From: Greg KH @ 2005-04-19 22:33 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Linus Torvalds, Git Mailing List

On Tue, Apr 19, 2005 at 06:27:38PM -0400, Daniel Jacobowitz wrote:
> On Tue, Apr 19, 2005 at 03:00:04PM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Tue, 19 Apr 2005, Greg KH wrote:
> > > 
> > > It looks like your domain name isn't set up properly for your box (which
> > > is why it worked for you, but not me before, causing that patch).
> > 
> > No, I think it's a bug in your domainname changes. I don't think you
> > should do the domainname at all if the hostname has a dot in it.
> > 
> > Most machines I have access to (and that includes machines that are
> > professionally maintained, not just my own cruddy setup) says "(none)" to
> > domainname and have the full hostname in hostname.
> > 
> > And even the ones that use domainname tend to not have a fully qualified 
> > DNS domain there. You need to use dnsdomainname to get that, and I don't 
> > even know how to do it with standard libc.
> > 
> > So how about something like this?
> > 
> > (Somebody who actually knows how these things should be done - please feel 
> > free to pipe up).
> 
> The glibc documentation blows for this, but what getdomainname comes
> from uname(2), not from any DNS-related configuration.  Debian only
> ever sets this if you're using NIS.

Well, somehow Gentoo sets this up properly, and I'm not using NIS.  Hm,
my SuSE boxes on the other hand...

> There's no real great way to get the current hostname; a lot of
> applications do a reverse DNS lookup on the primary network interface,
> with appropriate handwaving to define primary.
> 
> Easiest might be to punt to hostname --fqdn, or an equivalent to its
> algorithm - which appears to be fetch the hostname from uname, do a DNS
> lookup on that, and a reverse DNS lookup on the result.

Ick.  Let's stick with Linus's patch for now...

thanks,

greg k-h

^ permalink raw reply	[relevance 0%]

* Re: [GIT PATCH] I2C and W1 bugfixes for 2.6.12-rc2
  2005-04-19 22:27  3%             ` Daniel Jacobowitz
  2005-04-19 22:33  0%               ` Greg KH
@ 2005-04-19 22:47  0%               ` Linus Torvalds
  1 sibling, 0 replies; 200+ results
From: Linus Torvalds @ 2005-04-19 22:47 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Greg KH, Git Mailing List



On Tue, 19 Apr 2005, Daniel Jacobowitz wrote:
> 
> Easiest might be to punt to hostname --fqdn, or an equivalent to its
> algorithm - which appears to be fetch the hostname from uname, do a DNS
> lookup on that, and a reverse DNS lookup on the result.

Hah. I'll just commit my patch, and for any setup where that doesn't work, 
people can set COMMIT_AUTHOR_EMAIL by hand.

		Linus

^ permalink raw reply	[relevance 0%]

* [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
@ 2005-04-20 20:56  2% Petr Baudis
  0 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-04-20 20:56 UTC (permalink / raw)
  To: git

  Hello,

  so I've "released" git-pasky-0.6.2 (my SCMish layer on top of Linus
Torvalds' git tree history storage system), find it at the usual

	http://pasky.or.cz/~pasky/dev/git/

  git-pasky-0.6 has a couple of big changes: mainly an enhanced git diff,
git patch (to be renamed to cg mkpatch), an enhanced git pull, and a
completely reworked git merge - it now uses the git-core facilities for
merging, and does the merges in-tree. Plus plenty of smaller stuff, some
bugfixes and some new bugs, and of course regular merging with Linus.

  The most important change for current users is the change of the object
database's SHA1 keys and a (comparatively minor) directory cache format
change. This makes "pulling up" from older revisions rather difficult.
Linus' instructions _should_ basically work for you too (you should
replace cat .git/HEAD with cat .git/heads/* or equivalent - note that
convert-tree does not accept multiple arguments, so you need to invoke it
multiple times), but I didn't test them well (I did it entirely the
low-level way, since I needed to merge with Linus at the same time).

  But if you can't be bothered with this or fear touching stuff like that,
and you do not have any local commits in your tree (it would be pretty
strange if you had some and were still afraid), just fetch the tarball
(which is preferable to git init for me, since it eats up a
_significantly_ smaller portion of my bandwidth).

  I had to release git-pasky-0.6.1 because Linus changed the directory
cache format while I was releasing git-pasky-0.6. And git-pasky-0.6.2
fixes the gitmerge-file.sh script missing from the list of scripts to
install.


  So, now for the heads-up part. We will undergo at least two major
changes now. First, I'll probably make git-pasky to use the directory
cache for the add/rm queues now that we have diff-cache.

  Second, I've decided to straighten up the naming now that we still
have a chance. There will be no git-pasky-0.7, sorry. You'll get
cogito-0.7 instead. I've decided for it since after some consideration
having it named differently is the right thing (tm).

  The short command version will change from 'git' to 'cg', which should
be shorter to type and free up the 'git' command as a possible eventual
entry gate for the git commands (so that they are more
namespace-friendly, and it might make the most sense anyway if we get
fully libgitized; but this is more of a long-term idea).

  The usage changes:

  cg patch -> cg mkpatch	('patch' is the program which _applies_ it)
  cg apply -> cg patch		(analogically to diff | patch)

  cg pull will now always only pull, never merge.

  cg update will do pull + merge.

  cg track will either just set the default for cg update if you pass it
no parameters, or disappear altogether; I think it could default to the
'origin' branch (or 'master' branch for non-master branches if no 'origin'
branch is around), and I'd rather set up some "cg admin" where you could
set all this stuff - from this to e.g. the committer details [*1*]. You
likely don't need to change the default every day.

  I must say that I'm pretty happy with the Cogito's command set
otherwise, though. I actually think it has now (almost?) all commands
it needs, and it is not too likely that (many) more will be added -
simple means easy to use, which is Cogito's goal. Compare with
the command set of GNU arch clones. ;-)


  [*1*] The committer details in .git would override the environment
variables, to discourage people from trying to alter them based on
whatever, since that's not what they are supposed to do. They can always
just change the .git stuff if they _really_ need to.


  Comments welcomed, as well as new ideas. Persuading me to change what
I sketched here will need some good arguments, though. ;-)

  Thanks,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply	[relevance 2%]

* Re: [3/5] Add http-pull
  2005-04-17 19:59  0%           ` Petr Baudis
@ 2005-04-21  3:27  3%             ` Brad Roberts
  2005-04-21  4:28  3%               ` Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Brad Roberts @ 2005-04-21  3:27 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Daniel Barkalow, git

On Sun, 17 Apr 2005, Petr Baudis wrote:

> Date: Sun, 17 Apr 2005 21:59:00 +0200
> From: Petr Baudis <pasky@ucw.cz>
> To: Daniel Barkalow <barkalow@iabervon.org>
> Cc: git@vger.kernel.org
> Subject: Re: [3/5] Add http-pull
>
> Dear diary, on Sun, Apr 17, 2005 at 09:24:27PM CEST, I got a letter
> where Daniel Barkalow <barkalow@iabervon.org> told me that...
> > On Sun, 17 Apr 2005, Petr Baudis wrote:
> >
> > > Dear diary, on Sun, Apr 17, 2005 at 08:49:11PM CEST, I got a letter
> > > where Daniel Barkalow <barkalow@iabervon.org> told me that...
> > > > There's some trickiness for the history of commits thing for stopping at
> > > > the point where you have everything, but also behaving appropriately if
> > > > you try once, fail partway through, and then try again. It's on my queue
> > > > of things to think about.
> > >
> > > Can't you just stop the recursion when you hit a commit you already
> > > have?
> >
> > The problem is that, if you've fetched the final commit already, and then
> > the server dies, and you try again later, you already have the last one,
> > and so you think you've got everything.
>
> Hmm, some kind of journaling? ;-)

How about fetching in the inverse order, i.e. deepest parents up towards
current?  With that method the repository is always self-consistent, even
if not yet current.

Later,
Brad


^ permalink raw reply	[relevance 3%]

* Re: [3/5] Add http-pull
  2005-04-21  3:27  3%             ` Brad Roberts
@ 2005-04-21  4:28  3%               ` Daniel Barkalow
  2005-04-21 22:05  0%                 ` tony.luck
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-04-21  4:28 UTC (permalink / raw)
  To: Brad Roberts; +Cc: Petr Baudis, git

On Wed, 20 Apr 2005, Brad Roberts wrote:

> How about fetching in the inverse order.  Ie, deepest parents up towards
> current.  With that method the repository is always self consistent, even
> if not yet current.

You don't know the deepest parents to fetch until you've read everything
more recent, since the history you'd have to walk is the history you're
downloading.

	-Daniel
*This .sig left intentionally blank*


^ permalink raw reply	[relevance 3%]

* Re: Linux 2.6.12-rc3
       [not found]           ` <20050421190009.GC475@openzaurus.ucw.cz>
@ 2005-04-21 19:09  3%         ` Petr Baudis
  2005-04-21 21:38  0%           ` Pavel Machek
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-04-21 19:09 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linus Torvalds, kernel list, git

Dear diary, on Thu, Apr 21, 2005 at 09:00:09PM CEST, I got a letter
where Pavel Machek <pavel@ucw.cz> told me that...
> Hi!

Hi,

> > > Well, not sure.
> > > 
> > > I did 
> > > 
> > > git track linus
> > > git cancel
> > > 
> > > but Makefile still contains -rc2. (Is "git cancel" right way to check
> > > out the tree?)
> > 
> > No. git cancel does what it says - cancels your local changes to the
> > working tree. git track will only set that next time you pull from
> > linus, the changes will be automatically merged. (Note that this will
> > change with the big UI change.)
> 
> Is there way to say "forget those changes in my repository, I want
> just plain vanilla" without rm -rf?

git cancel will give you "plain last commit". If you need plain vanilla,
the "hard way" now is to just do

	commit-id >.git/HEAD

but your current HEAD will be lost forever. Or do

	git fork vanilla ~/vanilla linus

and you will have the vanilla tree tracking linus in ~/vanilla.

I'm not yet sure if we should have some Cogito interface for doing this
and what its semantics should be.

> I see quite a lot of problems with fsck-tree. Is that normal?
> (I ran out of disk space few times during different operations...)

Actually, in case your tree is older than about two days, I hope you did
the convert-cache magic or fetched a fresh tree?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply	[relevance 3%]

* Re: Linux 2.6.12-rc3
  2005-04-21 19:09  3%         ` Linux 2.6.12-rc3 Petr Baudis
@ 2005-04-21 21:38  0%           ` Pavel Machek
  2005-04-21 21:41  0%             ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: Pavel Machek @ 2005-04-21 21:38 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Linus Torvalds, kernel list, git

Hi!

It seems that someone should write a "Kernel hacker's guide to
git"... Documentation/git.txt seems like a good place. I guess I'll do
it.

> > just plain vanilla" without rm -rf?
> 
> git cancel will give you "plain last commit". If you need plain vanilla,
> the "hard way" now is to just do
> 
> 	commit-id >.git/HEAD
> 
> but your current HEAD will be lost forever. Or do
> 
> 	git fork vanilla ~/vanilla linus
> 
> and you will have the vanilla tree tracking linus in ~/vanilla.

Ok, thanks.

> I'm not yet sure if we should have some Cogito interface for doing this
> and what its semantics should be.

What is Cogito, BTW?

> > I see quite a lot of problems with fsck-tree. Is that normal?
> > (I ran out of disk space few times during different operations...)
> 
> Actually, in case your tree is older than about two days, I hope you did
> the convert-cache magic or fetched a fresh tree?

No, I didn't do anything like that. I guess it is rm -rf time, then...

									Pavel
-- 
Boycott Kodak -- for their patent abuse against Java.

^ permalink raw reply	[relevance 0%]

* Re: Linux 2.6.12-rc3
  2005-04-21 21:38  0%           ` Pavel Machek
@ 2005-04-21 21:41  0%             ` Petr Baudis
  0 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-04-21 21:41 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linus Torvalds, kernel list, git

Dear diary, on Thu, Apr 21, 2005 at 11:38:11PM CEST, I got a letter
where Pavel Machek <pavel@ucw.cz> told me that...
> Hi!
> 
> It seems that someone should write "Kernel hacker's guide to
> git"... Documentation/git.txt seems like good place. I guess I'll do
> it.

I've also started writing some tutorial-like guide to Cogito on my
notebook, but I have time for that only during lectures. :^)

> > I'm not yet sure if we should have some Cogito interface for doing this
> > and what its semantics should be.
> 
> What is Cogito, BTW?

New name for git-pasky. Everyone will surely rejoice as the usage will
change significantly. But better let's clean it up now.

(For more details, check git@ archives for git-pasky-0.6 announcement.)

> > > I see quite a lot of problems with fsck-tree. Is that normal?
> > > (I ran out of disk space few times during different operations...)
> > 
> > Actually, in case your tree is older than about two days, I hope you did
> > the convert-cache magic or fetched a fresh tree?
> 
> No, I did not anything like that. I guess it is rm -rf time, then...

That's the root of all your problems then.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply	[relevance 0%]

* Re: [3/5] Add http-pull
  2005-04-21  4:28  3%               ` Daniel Barkalow
@ 2005-04-21 22:05  0%                 ` tony.luck
  2005-04-22 19:46  0%                   ` Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: tony.luck @ 2005-04-21 22:05 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Brad Roberts, Petr Baudis, git

On Wed, 20 Apr 2005, Brad Roberts wrote:
> How about fetching in the inverse order.  Ie, deepest parents up towards
> current.  With that method the repository is always self consistent, even
> if not yet current.

Daniel Barkalow replied:
> You don't know the deepest parents to fetch until you've read everything
> more recent, since the history you'd have to walk is the history you're
> downloading.

You "just" need to defer adding tree/commit objects to the repository until
after you have inserted all objects on which they depend.  That's what my
"wget" based version does ... it's very crude, in that it loads all tree
& commit objects into a temporary repository (.gittmp) ... since you can
only use "cat-file" and "ls-tree" on things if they live in objects/xx/xxx..xxx
The blobs can go directly into the real repo (but to be really safe you'd
have to ensure that the whole blob had been pulled from the network before
inserting it ... it's probably a good move to validate everything that you
pull from the outside world too).
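A toy sketch of that deferral idea (all names hypothetical; commits are
simulated as a dict mapping a sha1 to its list of parent sha1s): walk the
history newest-first, since that is the only direction in which you can
discover it, but stage everything and only insert into the real store
parents-first:

```python
def pull_deferred(start, remote):
    """Fetch newest-first (parents are only discovered by reading each
    commit), but insert into the local store oldest-first, so the
    local repository is always self-consistent."""
    local = {}      # the "real" repository: sha1 -> parent list
    staging = {}    # temporary area, like the .gittmp above
    order = []      # postorder: a commit is listed after all its ancestors

    def fetch(sha1):
        if sha1 in staging or sha1 in local:
            return
        staging[sha1] = remote[sha1]          # "download" the commit
        for parent in staging[sha1]:
            fetch(parent)
        order.append(sha1)                    # ancestors are listed already

    fetch(start)
    for sha1 in order:                        # insert oldest-first
        assert all(p in local for p in staging[sha1])
        local[sha1] = staging[sha1]
    return local, order
```

Interrupting the transfer at any point leaves only fully-connected history
in `local`, which is the self-consistency property being discussed.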

-Tony

^ permalink raw reply	[relevance 0%]

* Re: proposal: delta based git archival
  @ 2005-04-22  9:49  3% ` Jaime Medrano
  0 siblings, 0 replies; 200+ results
From: Jaime Medrano @ 2005-04-22  9:49 UTC (permalink / raw)
  To: Michel Lespinasse; +Cc: git

On 4/22/05, Michel Lespinasse <walken@zoy.org> wrote:
> I noticed people on this mailing list start talking about using blob deltas
> for compression, and the basic issue that the resulting files are too small
> for efficient filesystem storage. I thought about this a little and decided
> I should send out my ideas for discussion.
> 

I've been thinking of another, simpler approach.

The main benefit of using deltas is reducing bandwidth use in
pull/push. My idea is to leave the blob storage as it is for now and
add a new kind of object ("remote") that acts as a link to an object
in another repository.

So when you rsync, you don't have to get all the blobs (which can be a
lot of data), only the sha1s of the newly created objects. Then a
remote object is created for each new object in the local repository,
pointing to its location in the external repository.

Once the rsync is done, when git has to access any of the new objects
they can be fetched from the original location, so that only necessary
objects are transfered.

This way, the cost of a sync in terms of bandwidth is nearly zero.
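A toy sketch of that "remote object" idea (all names hypothetical; the
network is simulated by a callable): the local store keeps a stub recording
where the real object lives, and resolves it only on first access:

```python
class LazyStore:
    """Object store where some entries are 'remote' stubs: a pointer
    to another repository, fetched and cached only when first read."""

    def __init__(self, fetch_from):
        self.objects = {}             # sha1 -> bytes (real local objects)
        self.remotes = {}             # sha1 -> repo URL (remote stubs)
        self.fetch_from = fetch_from  # callable(url, sha1) -> bytes

    def sync(self, sha1s, url):
        # The cheap "rsync": record only the sha1s, not the data.
        for sha1 in sha1s:
            if sha1 not in self.objects:
                self.remotes[sha1] = url

    def read(self, sha1):
        if sha1 not in self.objects:
            url = self.remotes.pop(sha1)   # first access: go fetch it
            self.objects[sha1] = self.fetch_from(url, sha1)
        return self.objects[sha1]
```

Only the objects actually read ever cross the network, which is the
"only necessary objects are transferred" property described above.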

I've been working on this, so if you think it's a good idea, I can
send a patch when I get it fully working.

Regards,
Jaime Medrano.
http://jmedrano.sl-form.com

^ permalink raw reply	[relevance 3%]

* Re: First web interface and service API draft
  @ 2005-04-22 12:10  3% ` Petr Baudis
       [not found]       ` <1114176579.3233.42.camel@localhost>
    1 sibling, 1 reply; 200+ results
From: Petr Baudis @ 2005-04-22 12:10 UTC (permalink / raw)
  To: Christian Meder; +Cc: git

Dear diary, on Fri, Apr 22, 2005 at 12:41:56PM CEST, I got a letter
where Christian Meder <chris@absolutegiganten.org> told me that...
> Hi,

Hi,

> /<project>
> 
> Ok. The URI should start by stating the project name
> e.g. /linux-2.6. This does bloat the URI slightly but I don't think
> that we want to have one root namespace per git archive in the long
> run. Additionally you can always put rewriting or redirecting rules at
> the root level for additional convenience when there's an obvious
> default project.
> 
> Should provide some meta data, stats, etc. if available.

I don't think this makes much sense. I think you should just apply -p1
to all the directories, and define that there should be some / page
which should contain some metadata regarding the repository you are
accessing (probably branches, tags, and such).

> -------
> /<project>/blob/<blob-sha1>
> /<project>/commit/<commit-sha1>
> 
> These are the easy ones: the web interface should be able to spit out
> the plain text data of a blob and a commit at these URIs. Users would
> be probably scripts and other downloads.
> Open questions:
> * Blob data should be probably binary ?

What do you mean by binary?

> * Should it be commit or changeset ? Linus seems to have changed
> nomenclature in the README

We call it commit everywhere but in the README. :-)

The "changeset" name is bad anyway. It is a commit of a complete tree
state, diff against one of its parent commits is the set of changes.

> -------
> /<project>/tree/<tree-sha1>
> 
> Tree objects are served in binary form. Primary audience are scripts,
> etc. Human beings will probably get a heart attack when they
> accidentally visit this URI.

Binary form is unusable for scripts.

Anything wrong with putting ls-tree output there?


We should also have /gitobj/<sha1> for fetching the raw git objects.

> -------
> /<project>/blob/<blob-sha1>.html
> /<project>/commit/<commit-sha1>.html
> /<project>/tree/<tree-sha1>.html
> 
> A HTML version of blob, commit and tree fully linked aimed at human
> beings.

How can I imagine an "HTML version of blob"?


> -------
> /<project>/tree/<tree-sha1>/diff/<ancestor-tree-sha1>/html
> 
> Non recursive HTML view of the objects which are contained in the diff
> fully linked with the individual HTML views.

Why not .html?

> -------
> /<project>/changelog/<time-spec>

I'd personally prefer /log/, but whatever.

For consistency, I'd stay with the plaintext output by default, .html if
requested.

And I think abusing directories for this is bad. Query string seems much
more appropriate, since this is something that changes dynamically a
lot, not a permanent resource identifier.

OTOH, I'd use

	/log/<commit>

to specify what commit to start at. It just does not make sense
otherwise, you would not know where to start.

I think the <commit> should follow the same or similar rules as Cogito
id decoding. E.g. to get latest Linus' changelog, you'd do

	/log/linus

> -------
> /<project>/changelog/<time-spec>/search/<regexp>
> 
> HTML changelog for the given <time-spec> filtered by the <regexp>.
> 
> * again plain version needed ?
> 
> ------
> /<project>/changelog/<time-spec>/search/author/<regexp>
> /<project>/changelog/<time-spec>/search/committer/<regexp>
> /<project>/changelog/<time-spec>/search/signedoffby/<regexp>
> 
> convenience wrappers for generic search restricted to these fields.

Same here, just ?author=...&committer=...&signedoffby=... etc. You can
even combine several criteria.
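
Such a filtered-changelog URL composes and decomposes in a couple of
lines; a sketch (the parameter names and the /log path are just the ones
proposed above, nothing is fixed yet):

```python
from urllib.parse import urlencode, parse_qs

# build the query string on the client side
params = {"author": "torvalds", "signedoffby": "akpm"}
url = "/linux-2.6/log?" + urlencode(params)

# take it apart again on the server side
query = parse_qs(url.split("?", 1)[1])
print(url)
print(query)
```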

> ------
> 
> open questions:
> * how to generate and publish additional merge information ?

I don't understand....

> * how to generate and publish tree and blob history information ? This
> is probably expensive with git.

...this either.

> * how to represent branches ? should we code up the branches in the
> project id like linux-2.6-mm or whatever ?

See above.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [3/5] Add http-pull
  2005-04-21 22:05  0%                 ` tony.luck
@ 2005-04-22 19:46  0%                   ` Daniel Barkalow
  2005-04-22 22:40  0%                     ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-04-22 19:46 UTC (permalink / raw)
  To: tony.luck; +Cc: Brad Roberts, Petr Baudis, git

On Thu, 21 Apr 2005 tony.luck@intel.com wrote:

> On Wed, 20 Apr 2005, Brad Roberts wrote:
> > How about fetching in the inverse order.  Ie, deepest parents up towards
> > current.  With that method the repository is always self consistent, even
> > if not yet current.
> 
> Daniel Barkalow replied:
> > You don't know the deepest parents to fetch until you've read everything
> > more recent, since the history you'd have to walk is the history you're
> > downloading.
> 
> You "just" need to defer adding tree/commit objects to the repository until
> after you have inserted all objects on which they depend.  That's what my
> "wget" based version does ... it's very crude, in that it loads all tree
> & commit objects into a temporary repository (.gittmp) ... since you can
> only use "cat-file" and "ls-tree" on things if they live in objects/xx/xxx..xxx
> The blobs can go directly into the real repo (but to be really safe you'd
> have to ensure that the whole blob had been pulled from the network before
> inserting it ... it's probably a good move to validate everything that you
> pull from the outside world too).

The problem with this general scheme is that it means that you have to
start over if something goes wrong, rather than resuming from where you
left off (and being able to use what you got until then). I think a better
solution is to track what things you mean to have and what things you
expect you could get from where.

As for validation, I now have my programs (which I haven't gotten a chance
to send out recently) checking everything as it is downloaded to make sure
it is complete (zlib likes it) and has the correct hash.

	-Daniel
*This .sig left intentionally blank*



* Re: [3/5] Add http-pull
  2005-04-22 19:46  0%                   ` Daniel Barkalow
@ 2005-04-22 22:40  0%                     ` Petr Baudis
  0 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-04-22 22:40 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: tony.luck, Brad Roberts, git

Dear diary, on Fri, Apr 22, 2005 at 09:46:35PM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> On Thu, 21 Apr 2005 tony.luck@intel.com wrote:
> 
> > On Wed, 20 Apr 2005, Brad Roberts wrote:
> > > How about fetching in the inverse order.  Ie, deepest parents up towards
> > > current.  With that method the repository is always self consistent, even
> > > if not yet current.
> > 
> > Daniel Barkalow replied:
> > > You don't know the deepest parents to fetch until you've read everything
> > > more recent, since the history you'd have to walk is the history you're
> > > downloading.
> > 
> > You "just" need to defer adding tree/commit objects to the repository until
> > after you have inserted all objects on which they depend.  That's what my
> > "wget" based version does ... it's very crude, in that it loads all tree
> > & commit objects into a temporary repository (.gittmp) ... since you can
> > only use "cat-file" and "ls-tree" on things if they live in objects/xx/xxx..xxx
> > The blobs can go directly into the real repo (but to be really safe you'd
> > have to ensure that the whole blob had been pulled from the network before
> > inserting it ... it's probably a good move to validate everything that you
> > pull from the outside world too).
> 
> The problem with this general scheme is that it means that you have to
> start over if something goes wrong, rather than resuming from where you
> left off (and being able to use what you got until then).

Huh. Why? You just walk back through the history until you find a commit
you already have. If you did it the way Tony described, then once you have
that commit, you can be sure that you have everything it depends on too.
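
The walk Petr describes is short to write down; a sketch over a toy
parent map, assuming exactly that invariant (having a commit implies
having all of its ancestors):

```python
def commits_to_fetch(head, parents, have):
    """Walk from head toward the roots, stopping at commits we already
    have -- and therefore, by the invariant, at everything behind them."""
    to_fetch, stack, seen = [], [head], set()
    while stack:
        c = stack.pop()
        if c in seen or c in have:
            continue
        seen.add(c)
        to_fetch.append(c)
        stack.extend(parents.get(c, []))
    return to_fetch

# toy history: a <- b <- c, with 'a' already present locally
parents = {"c": ["b"], "b": ["a"], "a": []}
print(commits_to_fetch("c", parents, have={"a"}))  # ['c', 'b']
```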

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: First web interface and service API draft
       [not found]       ` <1114176579.3233.42.camel@localhost>
@ 2005-04-22 22:57  0%     ` Petr Baudis
  2005-04-24 22:29  3%       ` Christian Meder
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-04-22 22:57 UTC (permalink / raw)
  To: Christian Meder; +Cc: git

Dear diary, on Fri, Apr 22, 2005 at 03:29:39PM CEST, I got a letter
where Christian Meder <chris@absolutegiganten.org> told me that...
> > > /<project>
> > > 
> > > Ok. The URI should start by stating the project name
> > > e.g. /linux-2.6. This does bloat the URI slightly but I don't think
> > > that we want to have one root namespace per git archive in the long
> > > run. Additionally you can always put rewriting or redirecting rules at
> > > the root level for additional convenience when there's an obvious
> > > default project.
> > > 
> > > Should provide some meta data, stats, etc. if available.
> > 
> > I don't think this makes much sense. I think you should just apply -p1
> > to all the directories, and define that there should be some / page
> > which should contain some metadata regarding the repository you are
> > accessing (probably branches, tags, and such).
> 
> Hi,

Hi,

> remember that I want to stay stateless as long as possible so everything
> important has to be encoded in the url. So somewhere in the url the git
> archive to show has to be encoded. If I remove the <project> portion how
> do I know on the server side which repo to show ?

since you are configured appropriately.

You need to be anyway. Someone needs to tell you or your web server
"this lives at http://pasky.or.cz/wit/". So you bind "this" to the
given repository.

No problem with an additional configuration option saying "at this
place, replicate your URL space for the given repositories", but if I want
to have just a single repository at a given URL, it should be possible.

I'm just trying to argue that being _forced_ to have <project> as part
of the URL is useless; this is a matter of configuration.

> > > * Blob data should be probably binary ?
> > 
> > What do you mean by binary?
> 
> content-type: binary/octet-stream

Ah. So just as-is, you mean?

> > Anything wrong with putting ls-tree output there?
> 
> ls-tree output should be in .html (see below)

What if I actually want to process it by a script?

> > > -------
> > > /<project>/tree/<tree-sha1>
> > > 
> > > Tree objects are served in binary form. Primary audience are scripts,
> > > etc. Human beings will probably get a heart attack when they
> > > accidentally visit this URI.
> > 
> > Binary form is unusable for scripts.
> 
> Why should it be unusable for a downloading script? It's just the raw
> git object.
> 
> > We should also have /gitobj/<sha1> for fetching the raw git objects.
> 
> Everything above is supposed to be raw git objects. No special encoding
> whatever.

You have a consistency problem here.

Raw git objects as in database contain the leading object type at the
start, then possibly some more stuff, then '\0' and then compressed
binary stuff. You mean you are exporting _this_ stuff through this?
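
(For reference, the format that eventually stabilized differs slightly
from the description above: the whole object — ASCII header
"<type> <size>\0" plus payload — is deflated as a single zlib stream.
A minimal encode/decode sketch under that assumption:)

```python
import zlib

def encode_object(objtype, payload):
    """Deflate '<type> <size>\\0<payload>' as one zlib stream."""
    header = f"{objtype} {len(payload)}".encode() + b"\0"
    return zlib.compress(header + payload)

def decode_object(raw):
    """Inflate a raw object and split it back into (type, payload)."""
    data = zlib.decompress(raw)
    header, payload = data.split(b"\0", 1)
    objtype, size = header.decode().split()
    assert int(size) == len(payload), "corrupt object"
    return objtype, payload

blob = encode_object("blob", b"hello world\n")
print(decode_object(blob))  # ('blob', b'hello world\n')
```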

That's not very useful except for http-pull, if you ask me. It also does
not blend well with the fact that you say commits are in text or so.

> > > -------
> > > /<project>/tree/<tree-sha1>/diff/<ancestor-tree-sha1>/html
> > > 
> > > Non recursive HTML view of the objects which are contained in the diff
> > > fully linked with the individual HTML views.
> > 
> > Why not .html?
> 
> I think .html isn't very clear because it would
> be ..../<ancestor-tree-sha1>.html which somehow looks like it has
> anything to do with the ancestor-tree. But it's the html version of the
> _diff_ and not the ancestor-tree.

Perhaps /tree/<sha1>.html/diff/<ancestor> ?

I lean toward ?diff=<ancestor> more and more. The path part of a URI is
there to express _hierarchy_; I think you are abusing that when there is
no hierarchy.

> > For consistency, I'd stay with the plaintext output by default, .html if
> > requested.
> 
> Remember that I'm just sitting on top of git and not git-pasky right
> now. So there's no canonical changelog plaintext output for me. But I'm
> not religious about that.

But there is canonical HTML output for you? ;-)

> > OTOH, I'd use
> > 
> > 	/log/<commit>
> > 
> > to specify what commit to start at. It just does not make sense
> > otherwise, you would not know where to start
> 
> Start for the changelog is always head, but I guess that's pretty
> standard. With git log you always start at the head too.

If you are sitting on top of git and not git-pasky, you have no assured
HEAD information at all.

> If you want to start at a specific commit. Why not start
> at /linux-2.6/commit/<sha1>.html ?

And how does that give me the changelog?

> > I think the <commit> should follow the same or similar rules as Cogito
> > id decoding. E.g. to get latest Linus' changelog, you'd do
> > 
> > 	/log/linus
> 
> Like I said above I think the shown head should be encoded in the
> project id.

I thought the project was mapped to a repository? But I might just have
blindly assumed that. ;-) (That does not make me like your approach
more, though.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: First web interface and service API draft
  @ 2005-04-23  6:39  3%     ` Jon Seymour
  0 siblings, 0 replies; 200+ results
From: Jon Seymour @ 2005-04-23  6:39 UTC (permalink / raw)
  To: Christian Meder; +Cc: Jan Harkes, git

On 4/23/05, Christian Meder <chris@absolutegiganten.org> wrote:
> On Fri, 2005-04-22 at 10:23 -0400, Jan Harkes wrote:
> > On Fri, Apr 22, 2005 at 12:41:56PM +0200, Christian Meder wrote:
> > > -------
> > > /<project>/blob/<blob-sha1>
> > > /<project>/commit/<commit-sha1>
> >
> > It is trivial to find an object when given a sha, but to know the object
> > type you'd have to decompress it and check inside. Also the way git
> > stores these things you can't have both a blob and a commit with the
> > same sha anyways.
> >
> > So why not use,
> >     /<project/<hexadecimal sha1 representation>
> >       will give you the raw object.
> 
> Hmm. I'm not sure about throwing away the <objecttype> information in
> the url. I think I'd prefer to retain the blob, tree and commit
> namespaces because I think they help API users to explicitly state what
> kind of object they expect. I can't think of a scenario where I'd want a
> <sha1> of unknown type. Do you have a specific use case in mind ?
> 

I was initially inclined to agree with Jan, but on brief reflection I
think Christian is correct to want to preserve the type info in the
URI. There are numerous reasons why this is a good idea:

- both carbon and silicon users of the URI who don't have direct
access to the repository can infer what the URI refers to without
actually fetching it

- programmatically, the web server can make request routing decisions
based on the URI alone and is not forced to perform a relatively
expensive and unnecessary db hit to derive the type.

That said, I can see some value in providing a web-based
type-resolution service.

So, given a URI of the form

     /<project>/object/<hexadecimal sha1 representation>

the server should resolve the type of the named object and issue an
HTTP re-direct to the typed URI, e.g.

     /<project>/blob/<hexadecimal sha1 representation>

Because browsers tend not to remember redirection sources, external
entities end up recording the typed URIs, but all the benefits of
Jan's suggestion still accrue.
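
The resolve-and-redirect step is tiny in any web framework; a sketch of
just the mapping, where the dictionary stands in for a real type lookup
against the object database (names are illustrative only):

```python
def redirect_for(project, sha1, lookup_type):
    """Resolve an untyped object URI to its typed form via HTTP 302."""
    objtype = lookup_type(sha1)              # 'blob', 'tree' or 'commit'
    location = f"/{project}/{objtype}/{sha1}"
    return 302, {"Location": location}

types = {"deadbeef": "blob"}                 # toy stand-in for the object db
status, headers = redirect_for("linux-2.6", "deadbeef", types.__getitem__)
print(status, headers["Location"])  # 302 /linux-2.6/blob/deadbeef
```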

jon.
-- 
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/


* Re: Git-commits mailing list feed.
  @ 2005-04-23 19:30  3%                 ` Linus Torvalds
  2005-04-23 20:49  4%                   ` Jan Harkes
  0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-04-23 19:30 UTC (permalink / raw)
  To: Jan Harkes
  Cc: David Woodhouse, Jan Dittmer, Greg KH, Kernel Mailing List,
	Git Mailing List



On Sat, 23 Apr 2005, Jan Harkes wrote:
> 
> Why not keep the tags object outside of the tree in the tags/ directory.

Because then you have all those special cases with fetching them and with 
fsck, and with shared object directories. In other words: no. 

You can have symlinks (or even better, just a single file with all the
tags listed, which you can create with "fsck", for example) from the tags/
directory, but the thing is, objects go in the object directory and
nowhere else.

			Linus


* Re: Git-commits mailing list feed.
  2005-04-23 19:30  3%                 ` Linus Torvalds
@ 2005-04-23 20:49  4%                   ` Jan Harkes
  2005-04-23 21:28  0%                     ` Git transfer protocols (was: Re: Git-commits mailing list feed) Mike Taht
  2005-04-23 23:29  3%                     ` Git-commits mailing list feed Linus Torvalds
  0 siblings, 2 replies; 200+ results
From: Jan Harkes @ 2005-04-23 20:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Woodhouse, Jan Dittmer, Greg KH, Kernel Mailing List,
	Git Mailing List

On Sat, Apr 23, 2005 at 12:30:38PM -0700, Linus Torvalds wrote:
> On Sat, 23 Apr 2005, Jan Harkes wrote:
> > 
> > Why not keep the tags object outside of the tree in the tags/ directory.
> 
> Because then you have all those special cases with fetching them and with 
> fsck, and with shared object directories. In other words: no. 

I respectfully disagree,

rsync works fine for now, but people are already looking at implementing
smarter (more efficient) ways to synchronize git repositories by
grabbing missing commits, and from there fetching any missing tree and
file blobs. However there is no such linkage to discover missing tag
objects, only a full rsync would be able to get them and for that it has
to send the name of every object in the repository to the other side to
check for any missing ones.

So fetching tags is already going to be a special case.

And any form of validation of a tag is a special operation. In fact a tag
could be as simple as the sha1 of an object (like pasky's tags) followed by
the detached PGP signature of the tagged object, instead of trying to
sign the tag itself. That also avoids having to strip the signature
part from the tag when we want to validate it.

Jan


* Git transfer protocols (was: Re: Git-commits mailing list feed)
  2005-04-23 20:49  4%                   ` Jan Harkes
@ 2005-04-23 21:28  0%                     ` Mike Taht
  2005-04-23 22:22  0%                       ` Jan Harkes
  2005-04-23 23:29  3%                     ` Git-commits mailing list feed Linus Torvalds
  1 sibling, 1 reply; 200+ results
From: Mike Taht @ 2005-04-23 21:28 UTC (permalink / raw)
  To: Jan Harkes
  Cc: Linus Torvalds, David Woodhouse, Jan Dittmer, Greg KH,
	Git Mailing List

Jan Harkes wrote:

> rsync works fine for now, but people are already looking at implementing
> smarter (more efficient) ways to synchronize git repositories by
> grabbing missing commits, and from there fetching any missing tree and
> file blobs. However there is no such linkage to discover missing tag
> objects, only a full rsync would be able to get them and for that it has
> to send the name of every object in the repository to the other side to
> check for any missing ones.

I think that one reason why rsync is inefficient for git is that it 
appears to need an acknowledgement after every file (at least, that's 
what the rhythm of the packets looked like when I sniffed it 
earlier; I don't know anything else about it). For a series of very small 
files this interacts badly with tcp's flow control mechanisms. Perhaps 
rsync could be modified for a "sliding file acknowledgement window".

Most "swarming protocols" (e.g BitTorrent, eDonkey) work well for one 
big file shared among multiple hosts, but poorly for lots of small files.

*Nothing* out there matches the simplicity of git's sha1 filename 
length... but

Something like robcast or fcast/flute might be of interest:

http://www.inrialpes.fr/planete/people/roca/mcl/mcl_in_short.html

Or one of the multicast netnews experiments:

"mcntp" http://mcntp.sourceforge.net/
"newscaster" http://www.dmn.tzi.org/en/newscaster.html

lastly, Monotone has its own "netsync" protocol
(via http://www.venge.net/monotone/faq.html)

"[netsync] is a bi-directional pipelined protocol for synchronizing 
collections using a tree of hashed indices. It allows any copy of 
monotone to function as either a client or a server, and rapidly 
synchronize or half-synchronize (push / pull) their database with 
another user. It is somewhat similar in flavor to rsync or Unison, in 
that it quickly and idempotently synchronizes information across the 
network without needing to store any local state; however, it is much 
more efficient than these protocols."



-- 

Mike Taht




* Re: Git transfer protocols (was: Re: Git-commits mailing list feed)
  2005-04-23 21:28  0%                     ` Git transfer protocols (was: Re: Git-commits mailing list feed) Mike Taht
@ 2005-04-23 22:22  0%                       ` Jan Harkes
  0 siblings, 0 replies; 200+ results
From: Jan Harkes @ 2005-04-23 22:22 UTC (permalink / raw)
  To: Mike Taht
  Cc: Jan Harkes, Linus Torvalds, David Woodhouse, Jan Dittmer, Greg KH,
	Git Mailing List

On Sat, Apr 23, 2005 at 02:28:59PM -0700, Mike Taht wrote:
> Jan Harkes wrote:
> 
> >rsync works fine for now, but people are already looking at implementing
> >smarter (more efficient) ways to synchronize git repositories by
> >grabbing missing commits, and from there fetching any missing tree and
> >file blobs. However there is no such linkage to discover missing tag
> >objects, only a full rsync would be able to get them and for that it has
> >to send the name of every object in the repository to the other side to
> >check for any missing ones.

Actually I just realized that I personally probably wouldn't care about
most of the tags that people might add to their trees. Maybe once in a
while, but the tag would probably be obtained through email or the web.

> I think that one reason why rsync is inefficient for git is that it 
...
> lastly, Monotone has it's own "netsync" protocol
> (via http://www.venge.net/monotone/faq.html)

Interesting; any of these might end up being useful as a replacement for
rsync when mirroring full git repositories.

I'm actually more selfish than that and am thinking on how I expect to
use git.

See, I don't care about most of the objects in the repository. In
practice I would probably pull only the latest 'head' once in a while and
look for missing commits to give me a quick overview of what has
changed. Then if a diff between the new head and my current tree shows
that anything might have changed in an area I actually do care about,
such as the VFS, I'd want something that does a quick binary search to
identify the commits where the changes occurred. But for that I only
need to look at a limited number of tree objects.
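
The quick binary search described here is essentially bisection: over a
linear run of commits it needs only O(log n) probes. A toy version, where
`changed_since` stands in for "the subtree I care about differs from my
tree at this commit" (both names are purely illustrative):

```python
def bisect_first_change(commits, changed_since):
    """Given commits ordered oldest-to-newest, find the first one for
    which changed_since(commit) is True, probing O(log n) commits."""
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if changed_since(commits[mid]):
            hi = mid          # change already visible here: look earlier
        else:
            lo = mid + 1      # not yet changed: look later
    return commits[lo]

history = list(range(10))     # ten commits, oldest first
culprit = 6                   # hypothetical commit touching the VFS
print(bisect_first_change(history, lambda c: c >= culprit))  # 6
```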

As long as I know that someone, somewhere is archiving the whole
repository I can always come back later and fill in the blanks.

HTTP/1.1 with persistent connections and some request interleaving is
probably the fastest and most server friendly way to grab those objects
I really care about.

Jan



* Re: Git-commits mailing list feed.
  2005-04-23 20:49  4%                   ` Jan Harkes
  2005-04-23 21:28  0%                     ` Git transfer protocols (was: Re: Git-commits mailing list feed) Mike Taht
@ 2005-04-23 23:29  3%                     ` Linus Torvalds
  1 sibling, 0 replies; 200+ results
From: Linus Torvalds @ 2005-04-23 23:29 UTC (permalink / raw)
  To: Jan Harkes
  Cc: David Woodhouse, Jan Dittmer, Greg KH, Kernel Mailing List,
	Git Mailing List



On Sat, 23 Apr 2005, Jan Harkes wrote:
> 
> I respectfully disagree,
> 
> rsync works fine for now, but people are already looking at implementing
> smarter (more efficient) ways to synchronize git repositories by
> grabbing missing commits, and from there fetching any missing tree and
> file blobs.

But this is a _feature_.

Other people normally shouldn't be interested in your tags. I think it's a 
mistake to make everybody care.

So you normally would fetch only tags you _know_ about. For example, one 
of the reasons we've been _avoiding_ personal tags in teh BK trees is that 
it just gets really ugly really quickly because they get percolated up to 
everybody else. That means that in a BK tree, you can't sanely use tags 
for "private" stuff, like telling somebody else "please sync with this 
tag".

So having the tag in the object database means that fsck etc will notice 
these things, and can build up a list of tags you know about. It also 
means that you can have tag-aware synchronization tools, i.e. exactly the 
kind of tools that grab only missing commits can then also be used to 
select missing tags according to some _private_ understanding of what tags 
you might want to find.

		Linus


* [PATCH 0/5] Better merge-base, alternative transport programs
@ 2005-04-24  0:03  3% Daniel Barkalow
  2005-04-24  0:24  8% ` [PATCH 5/5] Various " Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-04-24  0:03 UTC (permalink / raw)
  To: git

This series contains three patches to add functionality to the library
routines necessary for the rest of the series, a patch to change the
merge-base implementation such that it always returns one of its arguments
when possible (by way of using the date-based algorithm), and a patch to
support fetching what is needed from a repository by HTTP, and both
pushing and pulling by ssh.

 1: Add some functions for commit lists
 2: Parse tree objects completely
 3: Add some functions related to files
 4: Replace merge-base
 5: Add push and pull programs

	-Daniel
*This .sig left intentionally blank*



* [PATCH 5/5] Various transport programs
  2005-04-24  0:03  3% [PATCH 0/5] Better merge-base, alternative transport programs Daniel Barkalow
@ 2005-04-24  0:24  8% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-04-24  0:24 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds

This patch adds three similar and related programs. http-pull downloads
objects from an HTTP server; rpull downloads objects by using ssh and
rpush on the other side; and rpush uploads objects by using ssh and rpull
on the other side.

The algorithm should be sufficient to make the network throughput required
depend only on how much content is new, not at all on how much content the
repository contains.

The combination should enable people to have remote repositories by way of
ssh login for authenticated users and HTTP for anonymous access.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Index: Makefile
===================================================================
--- 9b75904eab1300d83264a1840d396160482fee88/Makefile  (mode:100644 sha1:57e70239503466fb3a77f1f2618ee64377e8e04b)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/Makefile  (mode:100644 sha1:b60d8eb691f4edd56d5b310b0dd670e98c852228)
@@ -16,7 +16,7 @@
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
 	check-files ls-tree merge-base merge-cache unpack-file git-export \
-	diff-cache convert-cache
+	diff-cache convert-cache http-pull rpush rpull
 
 all: $(PROG)
 
@@ -51,7 +51,13 @@
 init-db: init-db.o
 
 %: %.o $(LIB_FILE)
-	$(CC) $(CFLAGS) -o $@ $< $(LIBS)
+	$(CC) $(CFLAGS) -o $@ $(filter %.o,$^) $(LIBS)
+
+rpush: rsh.o
+
+rpull: rsh.o
+
+http-pull: LIBS += -lcurl
 
 blob.o: $(LIB_H)
 cat-file.o: $(LIB_H)
@@ -80,6 +86,9 @@
 usage.o: $(LIB_H)
 unpack-file.o: $(LIB_H)
 write-tree.o: $(LIB_H)
+http-pull.o: $(LIB_H)
+rpull.o: $(LIB_H)
+rpush.o: $(LIB_H)
 
 clean:
 	rm -f *.o mozilla-sha1/*.o ppc/*.o $(PROG) $(LIB_FILE)
Index: http-pull.c
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/http-pull.c  (mode:100644 sha1:a17225719c53508a37905618c624ad8c4d0372ec)
@@ -0,0 +1,204 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "commit.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int tree = 0;
+static int commits = 0;
+static int all = 0;
+
+static SHA_CTX c;
+static z_stream stream;
+
+static int local;
+static int zret;
+
+static size_t fwrite_sha1_file(void *ptr, size_t eltsize, size_t nmemb, 
+			       void *data) {
+	char expn[4096];
+	size_t size = eltsize * nmemb;
+	int posn = 0;
+	do {
+		ssize_t retval = write(local, ptr + posn, size - posn);
+		if (retval < 0)
+			return posn;
+		posn += retval;
+	} while (posn < size);
+
+	stream.avail_in = size;
+	stream.next_in = ptr;
+	do {
+		stream.next_out = expn;
+		stream.avail_out = sizeof(expn);
+		zret = inflate(&stream, Z_SYNC_FLUSH);
+		SHA1_Update(&c, expn, sizeof(expn) - stream.avail_out);
+	} while (stream.avail_in && zret == Z_OK);
+	return size;
+}
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+	char real_sha1[20];
+	char *url;
+	char *posn;
+
+	if (has_sha1_file(sha1)) {
+		return 0;
+	}
+
+	local = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
+
+	if (local < 0)
+		return error("Couldn't open %s\n", filename);
+
+	memset(&stream, 0, sizeof(stream));
+
+	inflateInit(&stream);
+
+	SHA1_Init(&c);
+
+	curl_easy_setopt(curl, CURLOPT_FILE, NULL);
+	curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite_sha1_file);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	/*printf("Getting %s\n", hex);*/
+
+	if (curl_easy_perform(curl))
+		return error("Couldn't get %s for %s\n", url, hex);
+
+	close(local);
+	inflateEnd(&stream);
+	SHA1_Final(real_sha1, &c);
+	if (zret != Z_STREAM_END) {
+		unlink(filename);
+		return error("File %s (%s) corrupt\n", hex, url);
+	}
+	if (memcmp(sha1, real_sha1, 20)) {
+		unlink(filename);
+		return error("File %s has bad hash\n", hex);
+	}
+	
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	struct tree *tree = lookup_tree(sha1);
+	struct tree_entry_list *entries;
+
+	if (parse_tree(tree))
+		return -1;
+
+	for (entries = tree->entries; entries; entries = entries->next) {
+		if (fetch(entries->item.tree->object.sha1))
+			return -1;
+		if (entries->directory) {
+			if (process_tree(entries->item.tree->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct commit *obj = lookup_commit(sha1);
+
+	if (fetch(sha1))
+		return -1;
+
+	if (parse_commit(obj))
+		return -1;
+
+	if (tree) {
+		if (fetch(obj->tree->object.sha1))
+			return -1;
+		if (process_tree(obj->tree->object.sha1))
+			return -1;
+		if (!all)
+			tree = 0;
+	}
+	if (commits) {
+		struct commit_list *parents = obj->parents;
+		for (; parents; parents = parents->next) {
+			if (has_sha1_file(parents->item->object.sha1))
+				continue;
+			if (fetch(parents->item->object.sha1)) {
+				/* The server might not have it, and
+				 * we don't mind. 
+				 */
+				continue;
+			}
+			if (process_commit(parents->item->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id;
+	char *url;
+	int arg = 1;
+	unsigned char sha1[20];
+
+	while (arg < argc && argv[arg][0] == '-') {
+		if (argv[arg][1] == 't') {
+			tree = 1;
+		} else if (argv[arg][1] == 'c') {
+			commits = 1;
+		} else if (argv[arg][1] == 'a') {
+			all = 1;
+			tree = 1;
+			commits = 1;
+		}
+		arg++;
+	}
+	if (argc < arg + 2) {
+		usage("http-pull [-c] [-t] [-a] commit-id url");
+		return 1;
+	}
+	commit_id = argv[arg];
+	url = argv[arg + 1];
+
+	get_sha1_hex(commit_id, sha1);
+
+	curl_global_init(CURL_GLOBAL_ALL);
+
+	curl = curl_easy_init();
+
+	base = url;
+
+	if (fetch(sha1))
+		return 1;
+	if (process_commit(sha1))
+		return 1;
+
+	curl_global_cleanup();
+	return 0;
+}
Index: rpull.c
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/rpull.c  (mode:100644 sha1:c27af2c2464de28732b8ad1fff3ed8a0804250d6)
@@ -0,0 +1,128 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "commit.h"
+#include <errno.h>
+#include <stdio.h>
+#include "rsh.h"
+
+static int tree = 0;
+static int commits = 0;
+static int all = 0;
+
+static int fd_in;
+static int fd_out;
+
+static int fetch(unsigned char *sha1)
+{
+	if (has_sha1_file(sha1))
+		return 0;
+	write(fd_out, sha1, 20);
+	return write_sha1_from_fd(sha1, fd_in);
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	struct tree *tree = lookup_tree(sha1);
+	struct tree_entry_list *entries;
+
+	if (parse_tree(tree))
+		return -1;
+
+	for (entries = tree->entries; entries; entries = entries->next) {
+		/*
+		  fprintf(stderr, "Tree %s ", sha1_to_hex(sha1));
+		  fprintf(stderr, "needs %s\n", 
+		  sha1_to_hex(entries->item.tree->object.sha1));
+		*/
+		if (fetch(entries->item.tree->object.sha1)) {
+			return error("Missing item %s",
+				     sha1_to_hex(entries->item.tree->object.sha1));
+		}
+		if (entries->directory) {
+			if (process_tree(entries->item.tree->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct commit *obj = lookup_commit(sha1);
+
+	if (fetch(sha1)) {
+		return error("Fetching %s", sha1_to_hex(sha1));
+	}
+
+	if (parse_commit(obj))
+		return -1;
+
+	if (tree) {
+		if (fetch(obj->tree->object.sha1))
+			return -1;
+		if (process_tree(obj->tree->object.sha1))
+			return -1;
+		if (!all)
+			tree = 0;
+	}
+	if (commits) {
+		struct commit_list *parents = obj->parents;
+		for (; parents; parents = parents->next) {
+			if (has_sha1_file(parents->item->object.sha1))
+				continue;
+			if (fetch(parents->item->object.sha1)) {
+				/* The server might not have it, and
+				 * we don't mind. 
+				 */
+				error("Missing tree %s; continuing", 
+				      sha1_to_hex(parents->item->object.sha1));
+				continue;
+			}
+			if (process_commit(parents->item->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id;
+	char *url;
+	int arg = 1;
+	unsigned char sha1[20];
+
+	while (arg < argc && argv[arg][0] == '-') {
+		if (argv[arg][1] == 't') {
+			tree = 1;
+		} else if (argv[arg][1] == 'c') {
+			commits = 1;
+		} else if (argv[arg][1] == 'a') {
+			all = 1;
+			tree = 1;
+			commits = 1;
+		}
+		arg++;
+	}
+	if (argc < arg + 2) {
+		usage("rpull [-c] [-t] [-a] commit-id url");
+		return 1;
+	}
+	commit_id = argv[arg];
+	url = argv[arg + 1];
+
+	if (setup_connection(&fd_in, &fd_out, "rpush", url, arg, argv + 1))
+		return 1;
+
+	get_sha1_hex(commit_id, sha1);
+
+	if (fetch(sha1))
+		return 1;
+	if (process_commit(sha1))
+		return 1;
+
+	return 0;
+}
Index: rpush.c
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/rpush.c  (mode:100644 sha1:0293a1a46311d7e20b13177143741ab9d6d0d201)
@@ -0,0 +1,69 @@
+#include "cache.h"
+#include "rsh.h"
+#include <sys/socket.h>
+#include <errno.h>
+
+void service(int fd_in, int fd_out) {
+	ssize_t size;
+	int posn;
+	char sha1[20];
+	unsigned long objsize;
+	void *buf;
+	do {
+		posn = 0;
+		do {
+			size = read(fd_in, sha1 + posn, 20 - posn);
+			if (size < 0) {
+				perror("rpush: read ");
+				return;
+			}
+			if (!size)
+				return;
+			posn += size;
+		} while (posn < 20);
+
+		/* fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1)); */
+
+		buf = map_sha1_file(sha1, &objsize);
+		if (!buf) {
+			fprintf(stderr, "rpush: could not find %s\n", 
+				sha1_to_hex(sha1));
+			return;
+		}
+		posn = 0;
+		do {
+			size = write(fd_out, buf + posn, objsize - posn);
+			if (size <= 0) {
+				if (!size) {
+					fprintf(stderr, "rpush: write closed");
+				} else {
+					perror("rpush: write ");
+				}
+				return;
+			}
+			posn += size;
+		} while (posn < objsize);
+	} while (1);
+}
+
+int main(int argc, char **argv)
+{
+	int arg = 1;
+	char *commit_id;
+	char *url;
+	int fd_in, fd_out;
+	while (arg < argc && argv[arg][0] == '-') {
+		arg++;
+	}
+	if (argc < arg + 2) {
+		usage("rpush [-c] [-t] [-a] commit-id url");
+		return 1;
+	}
+	commit_id = argv[arg];
+	url = argv[arg + 1];
+	if (setup_connection(&fd_in, &fd_out, "rpull", url, arg, argv + 1))
+		return 1;
+
+	service(fd_in, fd_out);
+	return 0;
+}
Index: rsh.c
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/rsh.c  (mode:100644 sha1:4d6a90bf6c1b290975fb2ac22f25979be56cb476)
@@ -0,0 +1,63 @@
+#include "rsh.h"
+
+#include <string.h>
+#include <sys/socket.h>
+
+#include "cache.h"
+
+#define COMMAND_SIZE 4096
+
+int setup_connection(int *fd_in, int *fd_out, char *remote_prog, 
+		     char *url, int rmt_argc, char **rmt_argv)
+{
+	char *host;
+	char *path;
+	int sv[2];
+	char command[COMMAND_SIZE];
+	char *posn;
+	int i;
+
+	if (!strcmp(url, "-")) {
+		*fd_in = 0;
+		*fd_out = 1;
+		return 0;
+	}
+
+	host = strstr(url, "//");
+	if (!host) {
+		return error("Bad URL: %s", url);
+	}
+	host += 2;
+	path = strchr(host, '/');
+	if (!path) {
+		return error("Bad URL: %s", url);
+	}
+	*(path++) = '\0';
+	/* ssh <host> 'cd /<path>; stdio-pull <arg...> <commit-id>' */
+	snprintf(command, COMMAND_SIZE, 
+		 "cd /%s; SHA1_FILE_DIRECTORY=objects %s",
+		 path, remote_prog);
+	posn = command + strlen(command);
+	for (i = 0; i < rmt_argc; i++) {
+		*(posn++) = ' ';
+		strncpy(posn, rmt_argv[i], COMMAND_SIZE - (posn - command));
+		posn += strlen(rmt_argv[i]);
+		if (posn - command + 4 >= COMMAND_SIZE) {
+			return error("Command line too long");
+		}
+	}
+	strcpy(posn, " -");
+	if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sv)) {
+		return error("Couldn't create socket");
+	}
+	if (!fork()) {
+		close(sv[1]);
+		dup2(sv[0], 0);
+		dup2(sv[0], 1);
+		execlp("ssh", "ssh", host, command, NULL);
+	}
+	close(sv[0]);
+	*fd_in = sv[1];
+	*fd_out = sv[1];
+	return 0;
+}
Index: rsh.h
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/rsh.h  (mode:100644 sha1:97e4f20b2b80662269827d77f3104025143087e7)
@@ -0,0 +1,7 @@
+#ifndef RSH_H
+#define RSH_H
+
+int setup_connection(int *fd_in, int *fd_out, char *remote_prog, 
+		     char *url, int rmt_argc, char **rmt_argv);
+
+#endif


^ permalink raw reply	[relevance 8%]

* Re: First web interface and service API draft
  2005-04-22 22:57  0%     ` Petr Baudis
@ 2005-04-24 22:29  3%       ` Christian Meder
  0 siblings, 0 replies; 200+ results
From: Christian Meder @ 2005-04-24 22:29 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

On Sat, 2005-04-23 at 00:57 +0200, Petr Baudis wrote:
> Dear diary, on Fri, Apr 22, 2005 at 03:29:39PM CEST, I got a letter
> where Christian Meder <chris@absolutegiganten.org> told me that...
> > > > /<project>
> > > > 
> > > > Ok. The URI should start by stating the project name
> > > > e.g. /linux-2.6. This does bloat the URI slightly but I don't think
> > > > that we want to have one root namespace per git archive in the long
> > > > run. Additionally you can always put rewriting or redirecting rules at
> > > > the root level for additional convenience when there's an obvious
> > > > default project.
> > > > 
> > > > Should provide some meta data, stats, etc. if available.
> > > 
> > > I don't think this makes much sense. I think you should just apply -p1
> > > to all the directories, and define that there should be some / page
> > > which should contain some metadata regarding the repository you are
> > > accessing (probably branches, tags, and such).
> > 
> > Hi,
> 
> Hi,
> 
> > remember that I want to stay stateless as long as possible so everything
> > important has to be encoded in the url. So somewhere in the url the git
> > archive to show has to be encoded. If I remove the <project> portion how
> > do I know on the server side which repo to show ?
> 
> since you are configured appropriately.
> 
> You need to be anyway. Someone needs to tell you or your web server
> "this lives at http://pasky.or.cz/wit/". So you bind "this" to the
> given repository.
> 
> No problem with an additional configuration possibility to say "at that
> place, clone your life place for the given repositories", but if I want
> to have just a single repository at a given URL, it should be possible.
> 
> I'm just trying to argue that having it _forced_ to have <project> as
> the part of the URL is useless; this is matter of configuration.

Ok. Got it. <project> for a multi-repo setup and in the simple case of
just one repo <project> can be dropped from the url. Reasonable.

> > > > * Blob data should be probably binary ?
> > > 
> > > What do you mean by binary?
> > 
> > content-type: binary/octet-stream
> 
> Ah. So just as-is, you mean?

Yes.

> 
> > > Anything wrong with putting ls-tree output there?
> > 
> > ls-tree output should be in .html (see below)
> 
> What if I actually want to process it by a script?

Use the .html variant and parse it. Or we add a .txt and/or .xml for
easier parsing.

> 
> > > > -------
> > > > /<project>/tree/<tree-sha1>
> > > > 
> > > > Tree objects are served in binary form. Primary audience are scripts,
> > > > etc. Human beings will probably get a heart attack when they
> > > > accidentally visit this URI.
> > > 
> > > Binary form is unusable for scripts.
> > 
> > Why should it be unusable for a downloading script? It's just the raw
> > git object.
> > 
> > > We should also have /gitobj/<sha1> for fetching the raw git objects.
> > 
> > Everything above is supposed to be raw git objects. No special encoding
> > whatever.
> 
> You have a consistency problem here.
> 
> Raw git objects as in database contain the leading object type at the
> start, then possibly some more stuff, then '\0' and then compressed
> binary stuff. You mean you are exporting _this_ stuff through this?
> 
> That's not very useful except for http-pull, if you as me. It also does
> not blend well with the fact that you say commits are in text or so.

Ok. We spoke of two different things. With raw objects I meant the
uncompressed raw content while you spoke of the raw compressed git
objects. Ok I'm dumb but now that I've understood what you said I agree
with you: we need one generic url for fetching compressed objects.

> 
> > > > -------
> > > > /<project>/tree/<tree-sha1>/diff/<ancestor-tree-sha1>/html
> > > > 
> > > > Non recursive HTML view of the objects which are contained in the diff
> > > > fully linked with the individual HTML views.
> > > 
> > > Why not .html?
> > 
> > I think .html isn't very clear because it would
> > be ..../<ancestor-tree-sha1>.html which somehow looks like it has
> > anything to do with the ancestor-tree. But it's the html version of the
> > _diff_ and not the ancestor-tree.
> 
> Perhaps /tree/<sha1>.html/diff/<ancestor> ?
> 
> I'd lend to ?diff=<ancestor> more and more. The path part of URI is
> there to express _hierarchy_, I think you are abusing that when there is
> no hierarchy.

But I'd argue that you are abusing queries ;-)
After all any given URI of the above kind is linking a specific diff
resource. It's a completely static resource from a user POV. The fact
that the server is probably dynamically generating it is just an
implementation detail.

> 
> > > For consistency, I'd stay with the plaintext output by default, .html if
> > > requested.
> > 
> > Remember that I'm just sitting on top of git and not git-pasky right
> > now. So there's no canonical changelog plaintext output for me. But I'm
> > not religious about that.
> 
> But there is canonical HTML output for you? ;-)

No. Changelog isn't defined by git so there's no canonical output of any
flavour.

> > > OTOH, I'd use
> > > 
> > > 	/log/<commit>
> > > 
> > > to specify what commit to start at. It just does not make sense
> > > otherwise, you would not know where to start
> > 
> > Start for the changelog is always head, but I guess that's pretty
> > standard. With git log you always start at the head too.
> 
> If you are sitting on top of git and not git-pasky, you have no assured
> HEAD information at all.

I've got HEAD. I'm still watching the discussion of tags.

> > If you want to start at a specific commit. Why not start
> > at /linux-2.6/commit/<sha1>.html ?
> 
> And how does that give me the changelog?

You could click through the commit chain interactively or we could add a
"changelog from here" function.
 
> > > I think the <commit> should follow the same or similar rules as Cogito
> > > id decoding. E.g. to get latest Linus' changelog, you'd do
> > > 
> > > 	/log/linus
> > 
> > Like I said above I think the shown head should be encoded in the
> > project id.
> 
> I thought the project was mapped to repository? But I might just have
> blindly assumed that. ;-) (That does not make me like your approach
> more, though.)

Ok. I think I misunderstood you here. You want to publish the different
heads you are tracking with the same repo, right ?

The proposal didn't account for this scenario yet. I'll think about it.



				Christian

-- 
Christian Meder, email: chris@absolutegiganten.org

The Way-Seeking Mind of a tenzo is actualized 
by rolling up your sleeves.

                (Eihei Dogen Zenji)



* [ANNOUNCE] Cogito-0.8 (former git-pasky, big changes!)
@ 2005-04-26  3:24  3% Petr Baudis
  0 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-04-26  3:24 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

  Hello,

  here goes Cogito-0.8, my SCMish layer over Linus Torvalds' git tree
history tracker. This package was formerly called git-pasky, however
this release brings big changes. The usage is significantly different,
as well as some basic concepts; the history changed again (hopefully the
last time?) because of fixing dates of some old commits. The .git/
directory layout changed too.

  Upgrading through pull is possible, but rather difficult and requires
some intimacy with git, git-pasky, and Cogito. So probably the best
way to go is to just get cogito-0.8 tarball at

	http://www.kernel.org/pub/software/scm/cogito/

or

	ftp://ftp.kernel.org/pub/software/scm/cogito/

build and install it, and do

	cg-clone rsync://rsync.kernel.org/pub/scm/cogito/cogito.git



  Yes, this is a huge change. No, I don't expect any further changes of
similar scale. I think the new interface is significantly simpler _and_
cleaner than the old one.

  First for the concept changes. There is no concept of tracking
anymore; you just do either cg-pull to just fetch the changes, or
cg-update to fetch them as well as merge them to your working tree.
Even more significant change is that Cogito does not directly support
local branches anymore - git fork is gone; you just go to a new directory
and do

	cg-init ~/path/to/your/original/repository

(or cg-clone, which will try to create a new subdirectory for itself).
This now acts as a separate repository, except that it is hardlinked
with the original one; therefore you get no additional disk usage.  To
get new changes to it from the original repository, you have to
cg-update origin. If you decide you want to merge back, go to the
original repository, add your new one as a branch and pull/update from
it.

  As for the interface changes, you will probably find out on your own;
cg-help should be of some help. All the scripts now start with 'cg-',
and you should ignore the 'cg-X*' ones. The non-trivial mapping is:

	git addremote -> cg-branch-add
	git lsremote -> cg-branch-ls
	git patch -> cg-mkpatch
	git apply -> cg-patch
	git lsobj -> cg-admin-lsobj

  Commands that are gone:

	git fork
	git track

  New commands:

	cg-clone
	cg-update



  Of course other changes include various bugfixes, and latest Linus'
stuff (although we do not make use of Linus' tags yet).

  Note that I don't know how much time I will have for hacking Cogito
until next Sunday/Monday. I hope I will get some time to at least
apply bugfixes etc., but I don't know how much more I will be able to do.
You would make me a happy man if you could please port your pending
patches from git-pasky to Cogito; I promise to apply them and I hope
there isn't going to be another change this big in the foreseeable future,
which would cause major conflicts for your patches etc.


  Note that I cc'd LKML since it is going to break stuff for anyone
using git-pasky now (apologies for that; it won't happen again).
Please try not to keep it in the Cc list unless it is really relevant.

  Have fun,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* [PATCH COGITO] Do not make cross device hard links
@ 2005-04-26  7:30 26% Alexey Nezhdanov
  0 siblings, 0 replies; 200+ results
From: Alexey Nezhdanov @ 2005-04-26  7:30 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 650 bytes --]

cg-clone doesn't work across devices:

cd /tmp # my tmp is on separate partition
mkdir cg
cd cg
cg-clone /home/snake/cg/
....
`/home/snake/cg/.git/objects/85/eb3d54aeec1d0f4cf3d2de257b8d7e29816147' -> 
`.git/objects/85/eb3d54aeec1d0f4cf3d2de257b8d7e29816147'
cp: cannot create link 
`.git/objects/85/eb3d54aeec1d0f4cf3d2de257b8d7e29816147': Invalid 
cross-device link
-----------

I decided that the problem should be solved by omitting the -u flag while
fetching in cg-pull:
$fetch -s -d "$uri/objects" ".git/objects" || die "rsync error"

My variant of autodetection is probably very ugly, but it works for me :)

-- 
Respectfully
Alexey Nezhdanov

[-- Attachment #2: do-not-link-cross-device.patch --]
[-- Type: text/x-diff, Size: 1509 bytes --]

Index: cg-pull
===================================================================
--- f262000f302b749e485f5eb971e6aabefbb85680/cg-pull  (mode:100755 sha1:5cd67519fc5399886f22e8758d6d34e0e3014cbb)
+++ uncommitted/cg-pull  (mode:100755)
@@ -69,11 +69,15 @@
 	cp $cp_flags_l "$src" "$dest"
 }
 
+u_flag="-u"
+
 if echo "$uri" | grep -q ":"; then
 	fetch=fetch_rsync
 else
 	[ -d $uri/.git ] && uri=$uri/.git
 	fetch=fetch_local
+	cp -l $uri/branches/origin .cross-device-test 2>/dev/null || u_flag=""
+	[ -r .cross-device-test ] && rm .cross-device-test
 fi
 
 
@@ -95,17 +99,17 @@
 [ "$rsyncerr" ] && die "unable to get the head pointer of branch $rembranch"
 
 [ -d .git/objects ] || mkdir -p .git/objects
-$fetch -s -u -d "$uri/objects" ".git/objects" || die "rsync error"
+$fetch -s $u_flag -d "$uri/objects" ".git/objects" || die "rsync error"
 
 # FIXME: Warn about conflicting tag names?
 # XXX: We now throw stderr to /dev/null since not all repositories
 # may have tags/ and users were confused by the harmless errors.
 [ -d .git/refs/tags ] || mkdir -p .git/refs/tags
 rsyncerr=
-$fetch -s -u -d "$uri/refs/tags" ".git/refs/tags" 2>/dev/null || rsyncerr=1
+$fetch -s $u_flag -d "$uri/refs/tags" ".git/refs/tags" 2>/dev/null || rsyncerr=1
 if [ "$rsyncerr" ]; then
 	rsyncerr=
-	$fetch -s -u -d "$uri/tags" ".git/refs/tags" 2>/dev/null || rsyncerr=1
+	$fetch -s $u_flag -d "$uri/tags" ".git/refs/tags" 2>/dev/null || rsyncerr=1
 fi
 [ "$rsyncerr" ] && echo "unable to get tags list (non-fatal)" >&2
 


* Re: Revised PPC assembly implementation
  @ 2005-04-27  1:47  1% ` linux
  0 siblings, 0 replies; 200+ results
From: linux @ 2005-04-27  1:47 UTC (permalink / raw)
  To: paulus; +Cc: davem, git, linux

Here's a massively revised version, scheduled very close to optimally for
the G4.  (The main remaining limitation is the loading of the k value
in %r5, which could be split up more.)

My hope is that the G5 will do decently on it as well.

The G4 can in theory do 3 integer operations per cycle, but only
if everything is arranged just right.  Every cycle, it tries to
dispatch the 3 instructions at the bottom of the GIQ.  If any
of them stall, that issue slot is lost.

So although it's theoretically out-of-order, if you want it to
sustain 3 instructions per cycle, you have to treat it as in-order.

It required interleaving the STEPDx and UPDATEW macros in a few
complicated ways.  I don't have access to a machine for testing,
so some poor schmuck^W^Wgenerous person is needed to find the bugs.

This should be *much* faster than the previous code on a G4, and I hope
it will do better on a G5 as well.

I'm curious if *reducing* the amount of fetch-ahead to 2 words
instead of 4 would help things or not.

Still to do: improve the comments.  This level of hackery needs a
lot of commenting...

/*
 * SHA-1 implementation for PowerPC.
 *
 * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
 */

/*
 * We roll the registers for A, B, C, D, E around on each
 * iteration; E on iteration t is D on iteration t+1, and so on.
 * We use registers 6 - 10 for this.  (Registers 27 - 31 hold
 * the previous values.)
 */
#define RA(t)	(((t)+4)%5+6)
#define RB(t)	(((t)+3)%5+6)
#define RC(t)	(((t)+2)%5+6)
#define RD(t)	(((t)+1)%5+6)
#define RE(t)	(((t)+0)%5+6)

/* We use registers 11 - 26 for the W values */
#define W(t)	((t)%16+11)

/* Register 5 is used for the constant k */

/*
 * There are three F functions, used in four groups of 20:
 * - 20 rounds of f0(b,c,d) = "bitwise b ? c : d" = (~b & d) + (b & c)
 * - 20 rounds of f1(b,c,d) = b^c^d = (b^d)^c
 * - 20 rounds of f2(b,c,d) = majority(b,c,d) = (b&d) + ((b^d)&c)
 * - 20 more rounds of f1(b,c,d)
 *
 * These are all scheduled for near-optimal performance on a G4.
 * The G4 is a 3-issue out-of-order machine with 3 ALUs, but it can only
 * *consider* starting the oldest 3 instructions per cycle.  So to get
 * maximum performance out of it, you have to treat it as an in-order
 * machine, which means interleaving the computation of round t with the
 * computation of W[t+4].
 *
 * The first 16 rounds use W values loaded directly from memory, while the
 * remaining 64 use values computed from those first 16.  We preload
 * 4 values before starting, so there are three kinds of rounds:
 * - The first 12 (all f0) also load the W values from memory.
 * - The next 64 compute W(i+4) in parallel. 8*f0, 20*f1, 20*f2, 16*f1.
 * - The last 4 (all f1) do not do anything with W.
 *
 * Therefore, we have 6 different round functions:
 * STEPD0_LOAD(t,s) - Perform round t and load W(s).  s < 16
 * STEPD0_UPDATE(t,s) - Perform round t and compute W(s).  s >= 16.
 * STEPD1_UPDATE(t,s)
 * STEPD2_UPDATE(t,s)
 * STEPD1(t) - Perform round t with no load or update.
 * 
 * The G5 is more fully out-of-order, and can find the parallelism
 * by itself.  The big limit is that it has a 2-cycle ALU latency, so
 * even though it's 2-way, the code has to be scheduled as if it's
 * 4-way, which can be a limit.  To help it, we try to schedule the
 * read of RA(t) as late as possible so it doesn't stall waiting for
 * the previous round's RE(t-1), and we try to rotate RB(t) as early
 * as possible while reading RC(t) (= RB(t-1)) as late as possible.
 */


/* the initial loads. */
#define LOADW(s) \
	lwz	W(s),(s)*4(%r4)

/*
 * This is actually 13 instructions, which is an awkward fit,
 * and uses W(s) as a temporary before loading it.
 */
#define STEPD0_LOAD(t,s) \
add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t);  /* spare slot */        \
add RE(t),RE(t),%r0;  and    W(s),RC(t),RB(t); rotlwi %r0,RA(t),5;     \
add RE(t),RE(t),W(s); add    %r0,%r0,%r5;      rotlwi RB(t),RB(t),30;  \
add RE(t),RE(t),%r0;  lwz    W(s),(s)*4(%r4);

/*
 * This can execute starting with 2 out of 3 possible moduli, so it
 * does 2 rounds in 9 cycles, 4.5 cycles/round.
 */
#define STEPD0_UPDATE(t,s) \
add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
add RE(t),RE(t),%r0;  and    %r0,RC(t),RB(t); xor    W(s),W(s),W((s)-8);      \
add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
add RE(t),RE(t),%r5;  rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1;             \
add RE(t),RE(t),%r0;

/* Nicely optimal.  Conveniently, also the most common. */
#define STEPD1_UPDATE(t,s) \
add RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
add RE(t),RE(t),%r5;  xor    %r0,%r0,RC(t);   xor    W(s),W(s),W((s)-8);      \
add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1;

/*
 * The naked version, no UPDATE, for the last 4 rounds.  3 cycles per.
 * We could use W(s) as a temp register, but we don't need it.
 */
#define STEPD1(t) \
/* spare slot */        add   RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); \
rotlwi RB(t),RB(t),30;  add   RE(t),RE(t),%r5;  xor    %r0,%r0,RC(t);   \
add    RE(t),RE(t),%r0; rotlwi %r0,RA(t),5;     /* idle */              \
add    RE(t),RE(t),%r0;

/* 5 cycles per */
#define STEPD2_UPDATE(t,s) \
add RE(t),RE(t),W(t); and    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
add RE(t),RE(t),%r0;  xor    %r0,RD(t),RB(t); xor    W(s),W(s),W((s)-8);      \
add RE(t),RE(t),%r5;  and    %r0,%r0,RC(t);   xor    W(s),W(s),W((s)-14);     \
add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     rotlwi W(s),W(s),1;             \
add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30;

#define STEP0_LOAD4(t,s)		\
	STEPD0_LOAD(t,s);		\
	STEPD0_LOAD((t)+1,(s)+1);	\
	STEPD0_LOAD((t)+2,(s)+2);	\
	STEPD0_LOAD((t)+3,(s)+3);

#define STEPUP4(fn, t, s)		\
	STEP##fn##_UPDATE(t,s);		\
	STEP##fn##_UPDATE((t)+1,(s)+1);	\
	STEP##fn##_UPDATE((t)+2,(s)+2);	\
	STEP##fn##_UPDATE((t)+3,(s)+3);	\

#define STEPUP20(fn, t, s)		\
	STEPUP4(fn, t, s);		\
	STEPUP4(fn, (t)+4, (s)+4);	\
	STEPUP4(fn, (t)+8, (s)+8);	\
	STEPUP4(fn, (t)+12, (s)+12);	\
	STEPUP4(fn, (t)+16, (s)+16)

	.globl	sha1_core
sha1_core:
	stwu	%r1,-80(%r1)
	stmw	%r13,4(%r1)

	/* Load up A - E */
	lmw	%r27,0(%r3)

	mtctr	%r5

1:
	lis	%r5,0x5a82	/* K0-19 */
	mr	RA(0),%r27
	LOADW(0)
	mr	RB(0),%r28
	LOADW(1)
	mr	RC(0),%r29
	LOADW(2)
	ori	%r5,%r5,0x7999
	mr	RD(0),%r30
	LOADW(3)
	mr	RE(0),%r31

	STEP0_LOAD4(0, 4)
	STEP0_LOAD4(4, 8)
	STEP0_LOAD4(8, 12)
	STEPUP4(D0, 12, 16)
	STEPUP4(D0, 16, 20)

	lis	%r5,0x6ed9	/* K20-39 */
	ori	%r5,%r5,0xeba1
	STEPUP20(D1, 20, 24)

	lis	%r5,0x8f1b	/* K40-59 */
	ori	%r5,%r5,0xbcdc
	STEPUP20(D2, 40, 44)

	lis	%r5,0xca62	/* K60-79 */
	ori	%r5,%r5,0xc1d6
	STEPUP4(D1, 60, 64)
	STEPUP4(D1, 64, 68)
	STEPUP4(D1, 68, 72)
	STEPUP4(D1, 72, 76)
	STEPD1(76)
	STEPD1(77)
	STEPD1(78)
	STEPD1(79)

	/* Add results to original values */
	add	%r31,%r31,RE(0)
	add	%r30,%r30,RD(0)
	add	%r29,%r29,RC(0)
	add	%r28,%r28,RB(0)
	add	%r27,%r27,RA(0)

	addi	%r4,%r4,64
	bdnz	1b

	/* Save final hash, restore registers, and return */
	stmw	%r27,0(%r3)
	lmw	%r13,4(%r1)
	addi	%r1,%r1,80
	blr


* [PATCH] cogito: honour SHA1_FILE_DIRECTORY env var
@ 2005-04-29 12:57 11% Rene Scharfe
  0 siblings, 0 replies; 200+ results
From: Rene Scharfe @ 2005-04-29 12:57 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Three scripts have their object store directory hardcoded to
.git/objects/.  This patch makes them honour the environment variable
SHA1_FILE_DIRECTORY like core GIT.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>

Index: cg-admin-lsobj
===================================================================
--- c3aa1e6b53cc59d5fbe261f3f859584904ae3a63/cg-admin-lsobj  (mode:100755 sha1:c68d9176d843700df17b109389102ae84eab3888)
+++ 4e32eadd27409a1352f17cb43703aa33420ac1fd/cg-admin-lsobj  (mode:100755 sha1:8410960a067f93a9457b250fea7b3636fb0eab42)
@@ -21,7 +21,7 @@
 target=$1
 
 
-subdir=.git/objects/
+subdir=${SHA1_FILE_DIRECTORY:-.git/objects}
 
 for high in 0 1 2 3 4 5 6 7 8 9 a b c d e f ; do
 	for low in 0 1 2 3 4 5 6 7 8 9 a b c d e f ; do
Index: cg-pull
===================================================================
--- c3aa1e6b53cc59d5fbe261f3f859584904ae3a63/cg-pull  (mode:100755 sha1:5cd67519fc5399886f22e8758d6d34e0e3014cbb)
+++ 4e32eadd27409a1352f17cb43703aa33420ac1fd/cg-pull  (mode:100755 sha1:52d686091f2d04a0d532a3ea1d622d6da4bad14c)
@@ -94,8 +94,9 @@
 fi
 [ "$rsyncerr" ] && die "unable to get the head pointer of branch $rembranch"
 
-[ -d .git/objects ] || mkdir -p .git/objects
-$fetch -s -u -d "$uri/objects" ".git/objects" || die "rsync error"
+sha1_dir=${SHA1_FILE_DIRECTORY:-.git/objects}
+[ -d "$sha1_dir" ] || mkdir -p "$sha1_dir"
+$fetch -s -u -d "$uri/objects" "$sha1_dir" || die "rsync error"
 
 # FIXME: Warn about conflicting tag names?
 # XXX: We now throw stderr to /dev/null since not all repositories
Index: commit-id
===================================================================
--- c3aa1e6b53cc59d5fbe261f3f859584904ae3a63/commit-id  (mode:100755 sha1:4efcb6bdfdb2b2c5744f5d4d47d92beb7777ed59)
+++ 4e32eadd27409a1352f17cb43703aa33420ac1fd/commit-id  (mode:100755 sha1:7f04b10cf8805cf2eb950a5dc3d451882e9d4929)
@@ -23,8 +23,9 @@
 
 idpref=$(echo "$id" | cut -c -2)
 idpost=$(echo "$id" | cut -c 3-)
-if [ $(find ".git/objects/$idpref" -name "$idpost*" 2>/dev/null | wc -l) -eq 1 ]; then
-	id=$idpref$(basename $(echo .git/objects/$idpref/$idpost*))
+sha1_dir=${SHA1_FILE_DIRECTORY:-.git/objects}
+if [ $(find "$sha1_dir/$idpref" -name "$idpost*" 2>/dev/null | wc -l) -eq 1 ]; then
+	id=$idpref$(basename $(echo "$sha1_dir/$idpref/$idpost"*))
 fi
 
 if echo $id | egrep -vq "$SHA1ONLY"; then


* [PATCH] GIT: Honour SHA1_FILE_DIRECTORY env var in git-pull-script
@ 2005-04-29 18:31  3% Rene Scharfe
  0 siblings, 0 replies; 200+ results
From: Rene Scharfe @ 2005-04-29 18:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

If you set SHA1_FILE_DIRECTORY to something other than .git/objects,
git-pull-script will store the fetched files in a location the rest of
the tools do not expect.

git-prune-script also ignores this setting, but I think this is good,
because pruning a shared tree to fit a single project means throwing
away a lot of useful data. :-)

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>

---
commit 6fef2965444a6509d11a79bd33842125034dcec0
tree 63e9cdf5ff724bf462d9dc408b9c951985d4cecf
parent db413479f1bb0dabfc613b2b0017ca74aeb5a919
author Rene Scharfe <rene.scharfe@lsrfire.ath.cx> 1114799335 +0200
committer Rene Scharfe <rene.scharfe@lsrfire.ath.cx> 1114799335 +0200

Index: git-pull-script
===================================================================
--- 1e2168c7d554a4fbd25a09bb591ae0f82dac6513/git-pull-script  (mode:100755 sha1:5111da98e68f4c3eb44499d20a210966dd445212)
+++ 63e9cdf5ff724bf462d9dc408b9c951985d4cecf/git-pull-script  (mode:100755 sha1:0198c4805db7c2b78cd4424634873b0a86ee4107)
@@ -9,7 +9,7 @@
 cp .git/HEAD .git/ORIG_HEAD
 
 echo "Getting object database"
-rsync -avz --ignore-existing $merge_repo/objects/. .git/objects/.
+rsync -avz --ignore-existing $merge_repo/objects/. ${SHA1_FILE_DIRECTORY:-.git/objects}/.
 
 echo "Getting remote head"
 rsync -L $merge_repo/HEAD .git/MERGE_HEAD || exit 1


* Re: git network protocol
  @ 2005-04-29 21:15  3% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-04-29 21:15 UTC (permalink / raw)
  To: David Lang; +Cc: git

On Fri, 29 Apr 2005, David Lang wrote:

> would it make sense for the network git protocol to be something along the 
> lines of
> 
> client contacts server and sends
> the tag you want to sync with (defaults to head)
> the local index file

Actually, you really want to have a bidirectional interaction, where the
client first fetches the info to determine where to start, and then goes
through the reachable space, asking for anything it doesn't already have.

(In the long run, we want to keep track of some things we already have all
of, or know we're missing, etc., so the receiver side doesn't have to
look over its whole tree.)

git already includes two versions of this protocol; the first runs against
a static HTTP server, and the second uses ssh to get a socket. At some
point, I'm going to enable these programs to read and write
.git/refs/?/? to figure out what they're supposed to get.

	-Daniel
*This .sig left intentionally blank*



* Re: More problems...
  @ 2005-04-29 21:27  3% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-04-29 21:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Ryan Anderson, Petr Baudis, Russell King, git

On Fri, 29 Apr 2005, Junio C Hamano wrote:

> >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
> 
> LT> Absolutely. I use the same "git-pull-script" between two local directories 
> LT> on disk...
> LT> Of course, I don't bother with the linking. But that's the trivial part.
> 
> Would it be useful if somebody wrote local-pull.c similar to
> http-pull.c, which clones one local SHA_FILE_DIRECTORY to
> another, with an option to (1) try hardlink and if it fails
> fail; (2) try hardlink and if it fails try symlink and if it
> fails fail; (3) try hardlink and if it fails try copy and if it
> fails fail?

If someone does this, they should make a pull.c out of http-pull and
rpull; the logic for determining what you need to copy, given what you
have and what the user wants to have, should be shared.

(Note that some usage patterns only require the latest commit, or at least
can deal with fetching other stuff only when needed.)

	-Daniel
*This .sig left intentionally blank*



* [PATCH] Split out "pull" from particular methods
  @ 2005-04-30  5:36 20% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-04-30  5:36 UTC (permalink / raw)
  To: Junio C Hamano, Linus Torvalds
  Cc: Ryan Anderson, Petr Baudis, Russell King, git

The method for deciding what to pull is useful separately from any of the
ways of actually fetching the objects.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>

Split out "pull" functionality from http-pull and rpull
Index: Makefile
===================================================================
--- 8602fe7cb4bf668fd021ab3bfb2082ac7d535e57/Makefile  (mode:100644 sha1:ef9a9fae88a1ac438c22beb50790f0f0e37ffc3c)
+++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/Makefile  (mode:100644 sha1:87fe8fef5ebd315f370af882bd3172632b850c02)
@@ -82,9 +82,9 @@
 git-export: export.c
 git-diff-cache: diff-cache.c
 git-convert-cache: convert-cache.c
-git-http-pull: http-pull.c
+git-http-pull: http-pull.c pull.c
 git-rpush: rsh.c
-git-rpull: rsh.c
+git-rpull: rsh.c pull.c
 git-rev-list: rev-list.c
 git-mktag: mktag.c
 git-diff-tree-helper: diff-tree-helper.c
Index: http-pull.c
===================================================================
--- 8602fe7cb4bf668fd021ab3bfb2082ac7d535e57/http-pull.c  (mode:100644 sha1:192dcc370dee47c52c72915394bb6f2a79f64e12)
+++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/http-pull.c  (mode:100644 sha1:d877c4abe3ff7766d858bfeac5c9a0eaf1385b65)
@@ -7,6 +7,8 @@
 #include <errno.h>
 #include <stdio.h>
 
+#include "pull.h"
+
 #include <curl/curl.h>
 #include <curl/easy.h>
 
@@ -14,10 +16,6 @@
 
 static char *base;
 
-static int tree = 0;
-static int commits = 0;
-static int all = 0;
-
 static SHA_CTX c;
 static z_stream stream;
 
@@ -47,7 +45,7 @@
 	return size;
 }
 
-static int fetch(unsigned char *sha1)
+int fetch(unsigned char *sha1)
 {
 	char *hex = sha1_to_hex(sha1);
 	char *filename = sha1_file_name(sha1);
@@ -105,77 +103,21 @@
 	return 0;
 }
 
-static int process_tree(unsigned char *sha1)
-{
-	struct tree *tree = lookup_tree(sha1);
-	struct tree_entry_list *entries;
-
-	if (parse_tree(tree))
-		return -1;
-
-	for (entries = tree->entries; entries; entries = entries->next) {
-		if (fetch(entries->item.tree->object.sha1))
-			return -1;
-		if (entries->directory) {
-			if (process_tree(entries->item.tree->object.sha1))
-				return -1;
-		}
-	}
-	return 0;
-}
-
-static int process_commit(unsigned char *sha1)
-{
-	struct commit *obj = lookup_commit(sha1);
-
-	if (fetch(sha1))
-		return -1;
-
-	if (parse_commit(obj))
-		return -1;
-
-	if (tree) {
-		if (fetch(obj->tree->object.sha1))
-			return -1;
-		if (process_tree(obj->tree->object.sha1))
-			return -1;
-		if (!all)
-			tree = 0;
-	}
-	if (commits) {
-		struct commit_list *parents = obj->parents;
-		for (; parents; parents = parents->next) {
-			if (has_sha1_file(parents->item->object.sha1))
-				continue;
-			if (fetch(parents->item->object.sha1)) {
-				/* The server might not have it, and
-				 * we don't mind. 
-				 */
-				continue;
-			}
-			if (process_commit(parents->item->object.sha1))
-				return -1;
-		}
-	}
-	return 0;
-}
-
 int main(int argc, char **argv)
 {
 	char *commit_id;
 	char *url;
 	int arg = 1;
-	unsigned char sha1[20];
 
 	while (arg < argc && argv[arg][0] == '-') {
 		if (argv[arg][1] == 't') {
-			tree = 1;
+			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
-			commits = 1;
+			get_history = 1;
 		} else if (argv[arg][1] == 'a') {
-			all = 1;
-			tree = 1;
-			commits = 1;
+			get_all = 1;
+			get_tree = 1;
+			get_history = 1;
 		}
 		arg++;
 	}
@@ -186,17 +128,13 @@
 	commit_id = argv[arg];
 	url = argv[arg + 1];
 
-	get_sha1_hex(commit_id, sha1);
-
 	curl_global_init(CURL_GLOBAL_ALL);
 
 	curl = curl_easy_init();
 
 	base = url;
 
-	if (fetch(sha1))
-		return 1;
-	if (process_commit(sha1))
+	if (pull(commit_id))
 		return 1;
 
 	curl_global_cleanup();
Index: pull.c
===================================================================
--- /dev/null  (tree:8602fe7cb4bf668fd021ab3bfb2082ac7d535e57)
+++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/pull.c  (mode:100644 sha1:86a7b6901fe69a82c12c3470b456982ef52cebd0)
@@ -0,0 +1,77 @@
+#include "pull.h"
+
+#include "cache.h"
+#include "commit.h"
+#include "tree.h"
+
+int get_tree = 0;
+int get_history = 0;
+int get_all = 0;
+
+static int process_tree(unsigned char *sha1)
+{
+	struct tree *tree = lookup_tree(sha1);
+	struct tree_entry_list *entries;
+
+	if (parse_tree(tree))
+		return -1;
+
+	for (entries = tree->entries; entries; entries = entries->next) {
+		if (fetch(entries->item.tree->object.sha1))
+			return -1;
+		if (entries->directory) {
+			if (process_tree(entries->item.tree->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct commit *obj = lookup_commit(sha1);
+
+	if (fetch(sha1))
+		return -1;
+
+	if (parse_commit(obj))
+		return -1;
+
+	if (get_tree) {
+		if (fetch(obj->tree->object.sha1))
+			return -1;
+		if (process_tree(obj->tree->object.sha1))
+			return -1;
+		if (!get_all)
+			get_tree = 0;
+	}
+	if (get_history) {
+		struct commit_list *parents = obj->parents;
+		for (; parents; parents = parents->next) {
+			if (has_sha1_file(parents->item->object.sha1))
+				continue;
+			if (fetch(parents->item->object.sha1)) {
+				/* The server might not have it, and
+				 * we don't mind. 
+				 */
+				continue;
+			}
+			if (process_commit(parents->item->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+int pull(char *target)
+{
+	int retval;
+	unsigned char sha1[20];
+	retval = get_sha1_hex(target, sha1);
+	if (retval)
+		return retval;
+	retval = fetch(sha1);
+	if (retval)
+		return retval;
+	return process_commit(sha1);
+}
Index: pull.h
===================================================================
--- /dev/null  (tree:8602fe7cb4bf668fd021ab3bfb2082ac7d535e57)
+++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/pull.h  (mode:100644 sha1:314bc7e95ab1a73634f6a96a8a3782fda91ea261)
@@ -0,0 +1,18 @@
+#ifndef PULL_H
+#define PULL_H
+
+/** To be provided by the particular implementation. **/
+extern int fetch(unsigned char *sha1);
+
+/** Set to fetch the target tree. */
+extern int get_tree;
+
+/** Set to fetch the commit history. */
+extern int get_history;
+
+/** Set to fetch the trees in the commit history. **/
+extern int get_all;
+
+extern int pull(char *target);
+
+#endif /* PULL_H */
Index: rpull.c
===================================================================
--- 8602fe7cb4bf668fd021ab3bfb2082ac7d535e57/rpull.c  (mode:100644 sha1:c27af2c2464de28732b8ad1fff3ed8a0804250d6)
+++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/rpull.c  (mode:100644 sha1:6624440d5ad24854e1bd1a8dff628427581198e0)
@@ -7,15 +7,12 @@
 #include <errno.h>
 #include <stdio.h>
 #include "rsh.h"
-
-static int tree = 0;
-static int commits = 0;
-static int all = 0;
+#include "pull.h"
 
 static int fd_in;
 static int fd_out;
 
-static int fetch(unsigned char *sha1)
+int fetch(unsigned char *sha1)
 {
 	if (has_sha1_file(sha1))
 		return 0;
@@ -23,87 +20,21 @@
 	return write_sha1_from_fd(sha1, fd_in);
 }
 
-static int process_tree(unsigned char *sha1)
-{
-	struct tree *tree = lookup_tree(sha1);
-	struct tree_entry_list *entries;
-
-	if (parse_tree(tree))
-		return -1;
-
-	for (entries = tree->entries; entries; entries = entries->next) {
-		/*
-		  fprintf(stderr, "Tree %s ", sha1_to_hex(sha1));
-		  fprintf(stderr, "needs %s\n", 
-		  sha1_to_hex(entries->item.tree->object.sha1));
-		*/
-		if (fetch(entries->item.tree->object.sha1)) {
-			return error("Missing item %s",
-				     sha1_to_hex(entries->item.tree->object.sha1));
-		}
-		if (entries->directory) {
-			if (process_tree(entries->item.tree->object.sha1))
-				return -1;
-		}
-	}
-	return 0;
-}
-
-static int process_commit(unsigned char *sha1)
-{
-	struct commit *obj = lookup_commit(sha1);
-
-	if (fetch(sha1)) {
-		return error("Fetching %s", sha1_to_hex(sha1));
-	}
-
-	if (parse_commit(obj))
-		return -1;
-
-	if (tree) {
-		if (fetch(obj->tree->object.sha1))
-			return -1;
-		if (process_tree(obj->tree->object.sha1))
-			return -1;
-		if (!all)
-			tree = 0;
-	}
-	if (commits) {
-		struct commit_list *parents = obj->parents;
-		for (; parents; parents = parents->next) {
-			if (has_sha1_file(parents->item->object.sha1))
-				continue;
-			if (fetch(parents->item->object.sha1)) {
-				/* The server might not have it, and
-				 * we don't mind. 
-				 */
-				error("Missing tree %s; continuing", 
-				      sha1_to_hex(parents->item->object.sha1));
-				continue;
-			}
-			if (process_commit(parents->item->object.sha1))
-				return -1;
-		}
-	}
-	return 0;
-}
-
 int main(int argc, char **argv)
 {
 	char *commit_id;
 	char *url;
 	int arg = 1;
-	unsigned char sha1[20];
 
 	while (arg < argc && argv[arg][0] == '-') {
 		if (argv[arg][1] == 't') {
-			tree = 1;
+			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
-			commits = 1;
+			get_history = 1;
 		} else if (argv[arg][1] == 'a') {
-			all = 1;
-			tree = 1;
-			commits = 1;
+			get_all = 1;
+			get_tree = 1;
+			get_history = 1;
 		}
 		arg++;
 	}
@@ -117,11 +48,7 @@
 	if (setup_connection(&fd_in, &fd_out, "rpush", url, arg, argv + 1))
 		return 1;
 
-	get_sha1_hex(commit_id, sha1);
-
-	if (fetch(sha1))
-		return 1;
-	if (process_commit(sha1))
+	if (pull(commit_id))
 		return 1;
 
 	return 0;


^ permalink raw reply	[relevance 20%]

* Quick command reference
@ 2005-05-01 12:58  3% Paul Mackerras
  0 siblings, 0 replies; 200+ results
From: Paul Mackerras @ 2005-05-01 12:58 UTC (permalink / raw)
  To: git

As an aid to my understanding of the core git commands, I created this
summary of the commands and their options and parameters.  I hope it
will be useful to others.  Corrections welcome of course.

Paul.
---

git-cat-file -t sha1-id
	Prints type of object with given sha1-id to stdout.

git-cat-file type sha1-id
	Copies contents of object with given sha1-id to stdout.
	Complains if type of sha1-id isn't of the type specified.

git-check-files pathname...
	Checks that each pathname given is up-to-date in the w.d.
	(i.e. matches the dircache) or is not present.

git-checkout-cache [-a] [-f] [-q] [-n] [--prefix=path] [--] [files...]
	Copies files from the git object repository to the
	working directory or another directory.  Does not rewrite
	files that already exist and match the dircache.
	-a: check out all files listed in dircache.
	-f: overwrite existing files; without this, checkout-cache
		will not overwrite an existing file even if it
		differs from what is in the dircache.
	-q: quiet; don't print an error message when a file is
		unmerged or not in the dircache, or when a file exists
		but differs from the dircache and -f was not given.
	-n: not new files; don't checkout any file that doesn't
		already exist in the dircache.
	--prefix=path: prepend path to the pathname of each file
		being checked out.  If you want to use this to
		check out files with their normal names but in
		another directory, make sure the path ends in /.
	The order of the flags matters; checkout-cache -a -f
	is different from checkout-cache -f -a.  Flags may be
	interspersed between file names.

git-commit-tree tree-id [-p parent-commit-id]* < changelog
	Generates a commit object referring to the given tree with
	the parent commit-ids given.  (If no parents are given, this
	is an initial commit.)  Prints the sha1 id of the generated
	commit to stdout.

git-diff-cache [-r] [-p] [-z] [--cached] tree-id
	Show differences between the tree identified by tree-id
	and the dircache and/or the working directory.
	-r: ignored (old recursive flag)
	-p: generate patches (full diff listings)
	-z: terminate lines with \0 instead of \n
	--cached: diff against last cached state rather than
	    file in w.d. for new or changed files.  New and changed
	    files are always identified by comparing dircache and
	    tree entries, but without this flag, the files that are
	    identified as new or changed are compared against the
	    working directory rather than the cached version.
	Unmerged (non-stage 0) entries in dircache are shown as:
		U <pathname>
	or if -p is given, as
		* Unmerged path pathname
	Files in tree but not in dircache (or w.d., without --cached):
		-mode<tab>blob<tab>sha1<tab>pathname
	or with -p, as a patch deleting the file.
	Files in dircache but not in tree are shown as:
		+mode<tab>blob<tab>sha1<tab>pathname
	or with -p, as a patch adding the file.
	Files that differ are shown as:
		*mode->mode<tab>blob<tab>sha1->sha1<tab>pathname
	or with -p, as a patch showing the differences

git-diff-files
	Compares working-directory with dircache and prints a listing
	of changed files.
	-p: generate patches (full diff listings)
	-q: Silent; don't show files missing from w.d.
	-r: ignored (old recursive flag)
	-s: ignored (old silent flag)
	-z: terminate lines with \0 instead of \n
	If no pathnames given, compare all files in dircache.
	Checks mode, uid, gid, size, mtime, ctime, and dev/ino.
	Output is as for git-diff-cache ("-" indicates file in
	dircache but not in w.d.).

git-diff-tree [-p] [-r] [-z] tree1-id tree2-id
	Compares two trees identified by their ids.
	-p: generate patches (implies -r)
	-r: recursive
	-z: terminate lines with \0 instead of \n
	Output is as for git-diff-cache (except there are no unmerged
	entries, since they can only exist in the dircache).

git-diff-tree-helper [-R] [-z] pathname...
	Reads the output of git-diff-tree and generates diffs (patches)
	for the files listed on the command line.
	-R: generate reverse diff
	-z: expect input lines to be terminated with \0

git-export top-sha1 [base-sha1]
	top-sha1 and base-sha1 are commit-ids
	Outputs all the changesets to get to top-sha1, with patches.
	If base-sha1 is given, only outputs changesets from base-sha1
	to top-sha1.

git-fsck-cache [--tags] [--root] [--unreachable] head-sha1...
	Checks the consistency of the object repository.
	If given, the head-sha1 parameter(s) is/are the ids of
	one or more heads of the commit graph.

git-http-pull [-c] [-t] [-a] commit-id url
	-t: tree
	-c: commits
	-a: all
	Fetches the commit object with id commit-id.
	With -t or -a, fetches the tree and blobs for that commit-id.
	With -c or -a, fetches the parents, and recursively fetches
	each of their parents, etc.; with -a, fetches the tree and
	blobs for each of the ancestors as well.

git-init-db
	makes .git directory
	if SHA1_FILE_DIRECTORY not set, makes .git/objects/xx dirs

git-ls-files [-c|--cached] [-d|--deleted] [-o|--others] [-i|--ignored]
	[-s|--stage] [-u|--unmerged] [-x pattern|--exclude=pattern]
	[-X excl-file|--exclude-from=excl-file] [-z]
	Lists filenames from dircache.
	-c|--cached: list files in dircache (default)
	-d|--deleted: list files in dircache but not in w.d.
	-o|--others: list files in w.d. but not in dircache
	-i|--ignored: show files that would be excluded;
		requires at least one -x|-X|--exclude|--exclude-from
	-s|--stage: show full information including merge stage
		for each file
	-u|--unmerged: only show files with merge stage > 0 in dircache
	-x pattern|--exclude=pattern: exclude files matching pattern
	-X file|--exclude-from=file: read exclude patterns from
		file, one per line
	-z: terminate lines with \0 instead of \n
	Without -s, just prints pathnames, one per line.
	With -s, prints:
		mode sha1 stage pathname

git-ls-tree [-z] [-r] sha1
	prints contents of tree object in readable form
	4 columns: mode type sha1 name
	-z: terminate lines with \0 instead of \n
	-r: show subdirectories recursively

git-merge-base commit1-id commit2-id
	Finds the nearest common ancestor of commit1 and commit2,
	and prints its sha1 id

git-merge-cache <merge-program> [-a] [--] <filename>*
	-a: merge all files listed in dircache
	For each file to be merged, do nothing if it is at stage
	0 in the dircache.  Otherwise run:
		merge-program stage1-id stage2-id stage3-id \
		    pathname stage1-mode stage2-mode stage3-mode
	The stageX-id and stageX-mode are "" if that stage isn't
	present in the dircache for that file.

git-mktag < signature-file
	Verifies the input is a syntactically valid tag,
	creates an object containing the input, and prints
	its sha1 id.

git-read-tree (-m | stage0-sha1) [stage1-sha1] [stage2-sha1] [stage3-sha1]
	tree-object(s) -> dircache (uids, gids, inos, times, sizes == 0)
	-m: merge, i.e. start in stage 1; requires all objects in
	    dircache to be stage 0 initially; requires 1 or 3 trees.
	    With 1 tree, merges stat info from existing dircache
	    for unchanged files (same name and sha1 as tree).
	    With 3 trees, does a trivial 3-way merge.  Files merged
	    are made stage 0 and old stat info is used if possible.
	    Anything non-trivial is left as stage 1,2,3 entries.
	    Result goes into new index file.
	Without -m, existing dircache contents are discarded.
	Normally only one sha1 id would be given; more than one can be
	given but no merging is done.

git-rev-list commit-id
	prints the commit-ids of the ancestors of commit-id,
	ordered by date.

git-rev-tree [--edges] [^]commit-id [[^]commit-id]*
	--edges: show commits whose reachability differs from one or
	  more of its parents (reachability == which subset of the
	  commit-ids given on the command line it's reachable from).
	^ means don't show commits reachable from this commit-id
	  (ignored with --edges)
	each line of output is formatted as:
	  decimal-date commit-id:flags [parent-commit-id:flags]*
	flags is in decimal and is a reachability bitmap, i.e.
	0x1 is set if reachable from the first commit-id given,
	0x2 if reachable from the second, etc.

git-rpull [-t] [-c] [-a] commit-id url
	Flags are like http-pull.
	Pulls commits, trees and blobs from another machine over
	ssh; execs ssh to run rpush on the remote machine.
	url can be "-" meaning just talk over stdin/stdout
	instead of running ssh.

git-rpush [-t] [-c] [-a] commit-id url
	Flags are like http-pull.
	Pushes commits, tree and blobs from this machine to another
	machine over ssh; execs ssh to run rpull on the remote machine.
	url can be "-" meaning just talk over stdin/stdout
	instead of running ssh.

git-tar-tree sha1-id [basedir]
	Generates a tar-file on stdout for the tree identified by
	sha1-id, which can be a commit id or a tree id.
	If basedir is given, basedir/ is prepended to all pathnames.

git-unpack-file sha1-id
	Generates a temporary file name of the form .merge_file_XXXXXX
	and writes the contents of the blob object identified by sha1-id
	to it; outputs the generated name to stdout.

git-update-cache [--add] [--remove] [--] pathname...
	Update dircache entry for filename(s) from w.d.
	--add: add pathnames that are in w.d. but not dircache to
	    the dircache (without --add, print a message)
	--remove: remove pathnames which are in dircache but not
	    w.d. from the dircache (without --remove, print a message)

git-update-cache --refresh [--ignore-missing]
	Sets uid, gid, times, size on each entry in dircache from w.d.
	Complains if mode or data differs (assumes data matches
	if size and date match).
	Complains if any file not in w.d. unless --ignore-missing is given.

git-update-cache --cacheinfo mode sha1 path
	Adds an entry to the dircache for path with given mode and sha1.

git-write-tree
	Creates a tree object from the contents of the dircache
	(creating tree objects for subdirectories, recursively).
	Prints sha1 of top-level tree-object to stdout.
	Complains if any files are unmerged (merge stage > 0).

N.B.
w.d. = working directory (.)
dircache is in .git/index
object files are in $SHA1_FILE_DIRECTORY or .git/objects
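
As a worked illustration of the git-rev-tree flags described above: bit k of a commit's flags is set iff the commit is reachable from the k-th commit-id given on the command line.  A toy linear-history model (illustrative names, not git code):

```c
#include <assert.h>

/* Reachability bitmaps as printed by git-rev-tree, on a tiny model
 * where commits are small integers and par[i] is i's parent (or -1). */

#define NC 8
static int par[NC];

/* Walk from `start` toward the root, setting `bit` on every ancestor. */
static void mark(int flags[], int start, int bit)
{
	int i;
	for (i = start; i >= 0; i = par[i])
		flags[i] |= 1 << bit;
}
```

With heads 2 and 3 on the chain 0 <- 1 <- 2 <- 3, commit 3 ends up with flags 0x2 (reachable only from the second head) while commits 0 through 2 get 0x3.
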

^ permalink raw reply	[relevance 3%]

* [PATCH] Make pull not assume anything about current objects
@ 2005-05-01 17:33  5% Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-05-01 17:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Previously, pull assumed that if you have a commit, you either have or
don't want everything it references.  Change this to actually check
locally for everything you want, to be completely sure.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Index: pull.c
===================================================================
--- 6f0f1d99169f9d90aa44e47d1bcff7b1dd4d8ea0/pull.c  (mode:100644 sha1:86a7b6901fe69a82c12c3470b456982ef52cebd0)
+++ 661b090ca8652d2cfa299b4cac3ffceebdd2b43c/pull.c  (mode:100644 sha1:90d2d41ed2c56580f72f020bc93c3e1b8a3befa5)
@@ -48,8 +48,6 @@
 	if (get_history) {
 		struct commit_list *parents = obj->parents;
 		for (; parents; parents = parents->next) {
-			if (has_sha1_file(parents->item->object.sha1))
-				continue;
 			if (fetch(parents->item->object.sha1)) {
 				/* The server might not have it, and
 				 * we don't mind. 


^ permalink raw reply	[relevance 5%]

* [0/2] Complete http-pull
@ 2005-05-01 21:49  3% Daniel Barkalow
  2005-05-01 21:56 20% ` [2/2] " Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-05-01 21:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This series enables http-pull to do everything necessary to bring your
local repository up to date with respect to a remote repository by HTTP.

 1: Library support for files in refs
 2: Fetch refs over the network and write them locally

	-Daniel
*This .sig left intentionally blank*



^ permalink raw reply	[relevance 3%]

* [2/2] Complete http-pull
  2005-05-01 21:49  3% [0/2] Complete http-pull Daniel Barkalow
@ 2005-05-01 21:56 20% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-05-01 21:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This adds support for fetching a reference from the remote repository and
for writing to a local reference file (with the -w option). It also makes
rpull aware that it lacks this capability.
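
The -w support reads the remote ref file through a write callback that must never overrun its fixed destination buffer.  The clamping pattern can be sketched on its own (illustrative names, simplified from the patch's fwrite_buffer):

```c
#include <assert.h>
#include <string.h>
#include <stddef.h>

/* A write callback that clamps each chunk to the space remaining in a
 * fixed-size destination, and reports how many bytes it consumed. */

struct buffer {
	size_t posn;   /* bytes written so far */
	size_t size;   /* total capacity */
	char *data;
};

static size_t buffer_write(const void *ptr, size_t eltsize, size_t nmemb,
			   struct buffer *buf)
{
	size_t size = eltsize * nmemb;
	if (size > buf->size - buf->posn)
		size = buf->size - buf->posn;   /* clamp to remaining space */
	memcpy(buf->data + buf->posn, ptr, size);
	buf->posn += size;
	return size;
}
```

A chunk that would overflow is silently truncated to the remaining capacity, which is enough for a 41-byte hex ref but is why the real code sizes the buffer up front.
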

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Index: http-pull.c
===================================================================
--- f0d6a3af54a5ec8dd588fb8e501e38f6252eda19/http-pull.c  (mode:100644 sha1:d877c4abe3ff7766d858bfeac5c9a0eaf1385b65)
+++ 6bae8854157f3f8b29f9afee2c54434334f899e4/http-pull.c  (mode:100644 sha1:9a52c08f51d8e9f96c4704f84f8d0d15637fe397)
@@ -7,6 +7,8 @@
 #include <errno.h>
 #include <stdio.h>
 
+#include "refs.h"
+
 #include "pull.h"
 
 #include <curl/curl.h>
@@ -45,6 +47,23 @@
 	return size;
 }
 
+struct buffer
+{
+	size_t posn;
+	size_t size;
+	void *buffer;
+};
+
+static size_t fwrite_buffer(void *ptr, size_t eltsize, size_t nmemb,
+			    struct buffer *buffer) {
+	size_t size = eltsize * nmemb;
+	if (size > buffer->size - buffer->posn)
+		size = buffer->size - buffer->posn;
+	memcpy(buffer->buffer + buffer->posn, ptr, size);
+	buffer->posn += size;
+	return size;
+}
+
 int fetch(unsigned char *sha1)
 {
 	char *hex = sha1_to_hex(sha1);
@@ -103,6 +122,40 @@
 	return 0;
 }
 
+int fetch_ref(char *dir, char *name, unsigned char *sha1)
+{
+	char *url, *posn;
+	char hex[42];
+	struct buffer buffer;
+	buffer.size = 41;
+	buffer.posn = 0;
+	buffer.buffer = hex;
+	hex[41] = '\0';
+	
+	curl_easy_setopt(curl, CURLOPT_FILE, &buffer);
+	curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite_buffer);
+
+	url = xmalloc(strlen(base) + 7 + strlen(dir) + strlen(name));
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "refs/");
+	posn += 5;
+	strcpy(posn, dir);
+	posn += strlen(dir);
+	*(posn++) = '/';
+	strcpy(posn, name);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	if (curl_easy_perform(curl))
+		return error("Couldn't get %s for %s/%s\n", url,
+			     dir, name);
+
+	hex[40] = '\0';
+	get_sha1_hex(hex, sha1);
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	char *commit_id;
@@ -118,6 +171,10 @@
 			get_all = 1;
 			get_tree = 1;
 			get_history = 1;
+		} else if (argv[arg][1] == 'w') {
+			char *write_ref = argv[arg + 1];
+			split_ref(&write_ref_dir, &write_ref_name, write_ref);
+			arg++;
 		}
 		arg++;
 	}
Index: pull.c
===================================================================
--- f0d6a3af54a5ec8dd588fb8e501e38f6252eda19/pull.c  (mode:100644 sha1:90d2d41ed2c56580f72f020bc93c3e1b8a3befa5)
+++ 6bae8854157f3f8b29f9afee2c54434334f899e4/pull.c  (mode:100644 sha1:89f11906f67ea9b36e1d4d85fa87f0e9b7d08d65)
@@ -3,6 +3,12 @@
 #include "cache.h"
 #include "commit.h"
 #include "tree.h"
+#include "tag.h"
+
+#include "refs.h"
+
+char *write_ref_dir = NULL;
+char *write_ref_name = NULL;
 
 int get_tree = 0;
 int get_history = 0;
@@ -61,15 +67,53 @@
 	return 0;
 }
 
+static int process_tag(unsigned char *sha1)
+{
+	return 0;
+}
+
+static int process_unknown(unsigned char *sha1)
+{
+	struct object *obj;
+	if (fetch(sha1))
+		return -1;
+	obj = parse_object(sha1);
+	if (obj->type == commit_type)
+		return process_commit(sha1);
+	else if (obj->type == tag_type)
+		return process_tag(sha1);
+	return error("Cannot pull a %s object", obj->type);
+}
+
+static int interpret_target(char *target, unsigned char *sha1)
+{
+	char *dir, *name;
+	if (!get_sha1_hex(target, sha1))
+		return 0;
+	if (!split_ref(&dir, &name, target)) {
+		if (!fetch_ref(dir, name, sha1)) {
+			return 0;
+		}
+	}
+	return -1;
+}
+
 int pull(char *target)
 {
 	int retval;
 	unsigned char sha1[20];
-	retval = get_sha1_hex(target, sha1);
+	retval = interpret_target(target, sha1);
+	if (retval) {
+		return error("Could not interpret %s as something to pull",
+			     target);
+	}
+	retval = fetch(sha1);
 	if (retval)
 		return retval;
-	retval = fetch(sha1);
+	retval = process_unknown(sha1);
 	if (retval)
 		return retval;
-	return process_commit(sha1);
+	if (write_ref_dir && write_ref_name)
+		write_split_ref_sha1(write_ref_dir, write_ref_name, sha1);
+	return 0;
 }
Index: pull.h
===================================================================
--- f0d6a3af54a5ec8dd588fb8e501e38f6252eda19/pull.h  (mode:100644 sha1:314bc7e95ab1a73634f6a96a8a3782fda91ea261)
+++ 6bae8854157f3f8b29f9afee2c54434334f899e4/pull.h  (mode:100644 sha1:5df0ff6001ad4129dcd8b2af7c927bade8c413d2)
@@ -4,6 +4,12 @@
 /** To be provided by the particular implementation. **/
 extern int fetch(unsigned char *sha1);
 
+extern int fetch_ref(char *dir, char *name, unsigned char *sha1);
+
+/** Ref filename to write target to. **/
+extern char *write_ref_dir;
+extern char *write_ref_name;
+
 /** Set to fetch the target tree. */
 extern int get_tree;
 
Index: rpull.c
===================================================================
--- f0d6a3af54a5ec8dd588fb8e501e38f6252eda19/rpull.c  (mode:100644 sha1:6624440d5ad24854e1bd1a8dff628427581198e0)
+++ 6bae8854157f3f8b29f9afee2c54434334f899e4/rpull.c  (mode:100644 sha1:a1c1be18195d40a152f86ed35886364dbc806d80)
@@ -20,6 +20,11 @@
 	return write_sha1_from_fd(sha1, fd_in);
 }
 
+int fetch_ref(char *name, char *dir, unsigned char *sha1)
+{
+	return -1;
+}
+
 int main(int argc, char **argv)
 {
 	char *commit_id;


^ permalink raw reply	[relevance 20%]

* [PATCH] Do not call fetch() when we have it.
@ 2005-05-02  0:10 33% Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-05-02  0:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Currently pull() calls fetch() without checking whether we have
the wanted object, but all of the existing fetch()
implementations perform this check and return success
themselves.  This patch moves the check to the caller.

I will be sending a trivial git-local-pull which depends on
this in the next message.
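
The caller-side check can be sketched in miniature — a wrapper that consults the local object store before going to the transport at all.  Everything here is illustrative (integer ids instead of SHA1s, an array instead of the object database):

```c
#include <assert.h>

/* Sketch of make_sure_we_have_it(): only call the transport's fetch()
 * for objects we do not already have locally. */

#define NIDS 128

static int have[NIDS];        /* stand-in for has_sha1_file() */
static int fetch_calls;

static int fetch(int id)      /* stand-in for a transport fetch() */
{
	fetch_calls++;
	have[id] = 1;
	return 0;
}

static int make_sure_we_have_it(int id)
{
	if (have[id])
		return 0;     /* already present: skip the transport */
	return fetch(id);
}
```

With the check hoisted here, the individual transports no longer need their own has_sha1_file() tests.
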

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

http-pull.c |    4 ----
pull.c      |   17 ++++++++++++-----
rpull.c     |    2 --
3 files changed, 12 insertions(+), 11 deletions(-)

# - date handling: handle "AM"/"PM" on time
# + working-tree
--- k/http-pull.c
+++ l/http-pull.c
@@ -53,10 +53,6 @@ int fetch(unsigned char *sha1)
 	char *url;
 	char *posn;
 
-	if (has_sha1_file(sha1)) {
-		return 0;
-	}
-
 	local = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
 
 	if (local < 0)
--- k/pull.c
+++ l/pull.c
@@ -8,6 +8,13 @@ int get_tree = 0;
 int get_history = 0;
 int get_all = 0;
 
+static int make_sure_we_have_it(unsigned char *sha1)
+{
+	if (has_sha1_file(sha1))
+		return 0;
+	return fetch(sha1);	
+}
+
 static int process_tree(unsigned char *sha1)
 {
 	struct tree *tree = lookup_tree(sha1);
@@ -17,7 +24,7 @@ static int process_tree(unsigned char *s
 		return -1;
 
 	for (entries = tree->entries; entries; entries = entries->next) {
-		if (fetch(entries->item.tree->object.sha1))
+		if (make_sure_we_have_it(entries->item.tree->object.sha1))
 			return -1;
 		if (entries->directory) {
 			if (process_tree(entries->item.tree->object.sha1))
@@ -31,14 +38,14 @@ static int process_commit(unsigned char 
 {
 	struct commit *obj = lookup_commit(sha1);
 
-	if (fetch(sha1))
+	if (make_sure_we_have_it(sha1))
 		return -1;
 
 	if (parse_commit(obj))
 		return -1;
 
 	if (get_tree) {
-		if (fetch(obj->tree->object.sha1))
+		if (make_sure_we_have_it(obj->tree->object.sha1))
 			return -1;
 		if (process_tree(obj->tree->object.sha1))
 			return -1;
@@ -50,7 +57,7 @@ static int process_commit(unsigned char 
 		for (; parents; parents = parents->next) {
 			if (has_sha1_file(parents->item->object.sha1))
 				continue;
-			if (fetch(parents->item->object.sha1)) {
+			if (make_sure_we_have_it(parents->item->object.sha1)) {
 				/* The server might not have it, and
 				 * we don't mind. 
 				 */
@@ -70,7 +77,7 @@ int pull(char *target)
 	retval = get_sha1_hex(target, sha1);
 	if (retval)
 		return retval;
-	retval = fetch(sha1);
+	retval = make_sure_we_have_it(sha1);
 	if (retval)
 		return retval;
 	return process_commit(sha1);
--- k/rpull.c
+++ l/rpull.c
@@ -14,8 +14,6 @@ static int fd_out;
 
 int fetch(unsigned char *sha1)
 {
-	if (has_sha1_file(sha1))
-		return 0;
 	write(fd_out, sha1, 20);
 	return write_sha1_from_fd(sha1, fd_in);
 }


^ permalink raw reply	[relevance 33%]

* [PATCH] Add git-local-pull.
@ 2005-05-02  0:11  4% Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-05-02  0:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This adds the git-local-pull command as a smaller brother of
http-pull and rpull.
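
The transport's fallback chain (hardlink, then symlink, then plain copy) can be sketched independently of the object-database details.  This is an illustrative reduction: copy_file uses a simple stdio loop rather than the mmap/write of the actual patch, and the names are not git's:

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Plain-copy fallback used when linking is disabled or fails. */
static int copy_file(const char *src, const char *dst)
{
	FILE *in = fopen(src, "rb"), *out;
	char buf[4096];
	size_t n;
	if (!in)
		return -1;
	out = fopen(dst, "wb");
	if (!out) {
		fclose(in);
		return -1;
	}
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
		fwrite(buf, 1, n, out);
	fclose(in);
	fclose(out);
	return 0;
}

/* Try the cheapest mechanism first, as git-local-pull's fetch() does. */
static int fetch_local(const char *src, const char *dst,
		       int use_link, int use_symlink)
{
	if (use_link && !link(src, dst))
		return 0;
	if (use_symlink && !symlink(src, dst))
		return 0;
	return copy_file(src, dst);
}
```

The ordering matters: a hardlink is free and safe for immutable objects, a symlink still shares storage, and a copy always works but costs space.
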

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Makefile     |    3 +
local-pull.c |  110 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 112 insertions(+), 1 deletion(-)

# - [PATCH] Do not pollute work tree in git-merge-one-file-script
# + [PATCH] Add local pull.
--- k/Makefile
+++ l/Makefile
@@ -21,7 +21,7 @@ PROG=   git-update-cache git-diff-files 
 	git-check-files git-ls-tree git-merge-base git-merge-cache \
 	git-unpack-file git-export git-diff-cache git-convert-cache \
 	git-http-pull git-rpush git-rpull git-rev-list git-mktag \
-	git-diff-tree-helper git-tar-tree git-write-blob
+	git-diff-tree-helper git-tar-tree git-write-blob git-local-pull
 
 all: $(PROG)
 
@@ -87,6 +87,7 @@ git-export: export.c
 git-diff-cache: diff-cache.c
 git-convert-cache: convert-cache.c
 git-http-pull: http-pull.c pull.c
+git-local-pull: local-pull.c pull.c
 git-rpush: rsh.c
 git-rpull: rsh.c pull.c
 git-rev-list: rev-list.c
# - date handling: handle "AM"/"PM" on time
# + [PATCH] Add local pull.
Created: local-pull.c (mode:100644)
--- /dev/null
+++ l/local-pull.c
@@ -0,0 +1,110 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "commit.h"
+#include <errno.h>
+#include <stdio.h>
+#include "pull.h"
+
+static int use_link = 0;
+static int use_symlink = 0;
+static int verbose = 0;
+
+static char *path;
+
+static void say(const char *fmt, const char *hex) {
+	if (verbose)
+		fprintf(stderr, fmt, hex);
+}
+
+int fetch(unsigned char *sha1)
+{
+	static int object_name_start = -1;
+	static char filename[PATH_MAX];
+	char *hex = sha1_to_hex(sha1);
+	const char *dest_filename = sha1_file_name(sha1);
+	int ifd, ofd, status;
+	struct stat st;
+	void *map;
+
+	if (object_name_start < 0) {
+		strcpy(filename, path); /* e.g. git.git */
+		strcat(filename, "/objects/");
+		object_name_start = strlen(filename);
+	}
+	filename[object_name_start+0] = hex[0];
+	filename[object_name_start+1] = hex[1];
+	filename[object_name_start+2] = '/';
+	strcpy(filename + object_name_start + 3, hex + 2);
+	if (use_link && !link(filename, dest_filename)) {
+		say("Hardlinked %s.\n", hex);
+		return 0;
+	}
+	if (use_symlink && !symlink(filename, dest_filename)) {
+		say("Symlinked %s.\n", hex);
+		return 0;
+	}
+	ifd = open(filename, O_RDONLY);
+	if (ifd < 0 || fstat(ifd, &st) < 0) {
+		close(ifd);
+		fprintf(stderr, "Cannot open %s\n", filename);
+		return -1;
+	}
+	map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, ifd, 0);
+	close(ifd);
+	if (-1 == (int)(long)map) {
+		fprintf(stderr, "Cannot mmap %s\n", filename);
+		return -1;
+	}
+	ofd = open(dest_filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
+	status = ((ofd < 0) || (write(ofd, map, st.st_size) != st.st_size));
+	munmap(map, st.st_size);
+	close(ofd);
+	if (status)
+		fprintf(stderr, "Cannot write %s (%ld bytes)\n",
+			dest_filename, st.st_size);
+	else
+		say("Copied %s.\n", hex);
+	return status;
+}
+
+static const char *local_pull_usage = 
+"git-local-pull [-c] [-t] [-a] [-l] [-s] [-v] commit-id path";
+
+int main(int argc, char **argv)
+{
+	char *commit_id;
+	int arg = 1;
+
+	while (arg < argc && argv[arg][0] == '-') {
+		if (argv[arg][1] == 't')
+			get_tree = 1;
+		else if (argv[arg][1] == 'c')
+			get_history = 1;
+		else if (argv[arg][1] == 'a') {
+			get_all = 1;
+			get_tree = 1;
+			get_history = 1;
+		}
+		else if (argv[arg][1] == 'l')
+			use_link = 1;
+		else if (argv[arg][1] == 's')
+			use_symlink = 1;
+		else if (argv[arg][1] == 'v')
+			verbose = 1;
+		else
+			usage(local_pull_usage);
+		arg++;
+	}
+	if (argc < arg + 2)
+		usage(local_pull_usage);
+	commit_id = argv[arg];
+	path = argv[arg + 1];
+
+	if (pull(commit_id))
+		return 1;
+
+	return 0;
+}


^ permalink raw reply	[relevance 4%]

* [PATCH] git-local-pull updates
@ 2005-05-02  3:41  7% Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-05-02  3:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This is to be applied on top of the previous patch that added the
git-local-pull command.  In addition to the '-l' (attempt a
hardlink before anything else) and '-s' (then attempt a symlink)
flags, it adds a '-n' (do not fall back to file copy) flag.  It
also updates the comments.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
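
The fallback chain this patch sets up in fetch() can be sketched in a
self-contained form.  This is an illustration only, not the patch code:
transfer_object() and plain_copy() are invented names, and error
handling is reduced to the bare minimum.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* byte-for-byte copy; dest must not exist yet (O_EXCL, as in the patch) */
static int plain_copy(const char *src, const char *dst)
{
	char buf[4096];
	ssize_t n = 0;
	int ifd = open(src, O_RDONLY);
	int ofd = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0666);

	if (ifd < 0 || ofd < 0) {
		if (ifd >= 0)
			close(ifd);
		if (ofd >= 0)
			close(ofd);
		return -1;
	}
	while ((n = read(ifd, buf, sizeof(buf))) > 0)
		if (write(ofd, buf, n) != n)
			break;
	close(ifd);
	close(ofd);
	return n == 0 ? 0 : -1;
}

/* -l enables the link attempt, -s the symlink attempt;
 * -n disables the copy fallback (use_filecopy = 0) */
int transfer_object(const char *src, const char *dst,
		    int use_link, int use_symlink, int use_filecopy)
{
	if (use_link && !link(src, dst))
		return 0;		/* cheapest: shares the inode */
	if (use_symlink && !symlink(src, dst))
		return 0;		/* no object data copied either */
	if (use_filecopy)
		return plain_copy(src, dst);
	fprintf(stderr, "no copy method was provided for %s\n", src);
	return -1;
}
```

With -l and -s given, all three branches are live; with -n and neither
of the others, no method remains and the function fails.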

local-pull.c |   65 +++++++++++++++++++++++++++++++++++++----------------------
1 files changed, 41 insertions(+), 24 deletions(-)

# jit-diff -1:
# - Roll-up patches missing from Linus tree.
# + working-tree
--- k/local-pull.c
+++ l/local-pull.c
@@ -1,3 +1,6 @@
+/*
+ * Copyright (C) 2005 Junio C Hamano
+ */
 #include <fcntl.h>
 #include <unistd.h>
 #include <string.h>
@@ -10,6 +13,7 @@
 
 static int use_link = 0;
 static int use_symlink = 0;
+static int use_filecopy = 1;
 static int verbose = 0;
 
 static char *path;
@@ -25,9 +29,6 @@ int fetch(unsigned char *sha1)
 	static char filename[PATH_MAX];
 	char *hex = sha1_to_hex(sha1);
 	const char *dest_filename = sha1_file_name(sha1);
-	int ifd, ofd, status;
-	struct stat st;
-	void *map;
 
 	if (object_name_start < 0) {
 		strcpy(filename, path); /* e.g. git.git */
@@ -46,33 +47,47 @@ int fetch(unsigned char *sha1)
 		say("Symlinked %s.\n", hex);
 		return 0;
 	}
-	ifd = open(filename, O_RDONLY);
-	if (ifd < 0 || fstat(ifd, &st) < 0) {
+	if (use_filecopy) {
+		int ifd, ofd, status;
+		struct stat st;
+		void *map;
+		ifd = open(filename, O_RDONLY);
+		if (ifd < 0 || fstat(ifd, &st) < 0) {
+			close(ifd);
+			fprintf(stderr, "Cannot open %s\n", filename);
+			return -1;
+		}
+		map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, ifd, 0);
 		close(ifd);
-		fprintf(stderr, "Cannot open %s\n", filename);
-		return -1;
-	}
-	map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, ifd, 0);
-	close(ifd);
-	if (-1 == (int)(long)map) {
-		fprintf(stderr, "Cannot mmap %s\n", filename);
-		return -1;
+		if (-1 == (int)(long)map) {
+			fprintf(stderr, "Cannot mmap %s\n", filename);
+			return -1;
+		}
+		ofd = open(dest_filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
+		status = ((ofd < 0) ||
+			  (write(ofd, map, st.st_size) != st.st_size));
+		munmap(map, st.st_size);
+		close(ofd);
+		if (status)
+			fprintf(stderr, "Cannot write %s (%ld bytes)\n",
+				dest_filename, st.st_size);
+		else
+			say("Copied %s.\n", hex);
+		return status;
 	}
-	ofd = open(dest_filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
-	status = ((ofd < 0) || (write(ofd, map, st.st_size) != st.st_size));
-	munmap(map, st.st_size);
-	close(ofd);
-	if (status)
-		fprintf(stderr, "Cannot write %s (%ld bytes)\n",
-			dest_filename, st.st_size);
-	else
-		say("Copied %s.\n", hex);
-	return status;
+	fprintf(stderr, "No copy method was provided to copy %s.\n", hex);
+	return -1;
 }
 
 static const char *local_pull_usage = 
-"git-local-pull [-c] [-t] [-a] [-l] [-s] [-v] commit-id path";
+"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path";
 
+/* 
+ * By default we only use file copy.
+ * If -l is specified, a hard link is attempted.
+ * If -s is specified, then a symlink is attempted.
+ * If -n is _not_ specified, then a regular file-to-file copy is done.
+ */
 int main(int argc, char **argv)
 {
 	char *commit_id;
@@ -92,6 +107,8 @@ int main(int argc, char **argv)
 			use_link = 1;
 		else if (argv[arg][1] == 's')
 			use_symlink = 1;
+		else if (argv[arg][1] == 'n')
+			use_filecopy = 0;
 		else if (argv[arg][1] == 'v')
 			verbose = 1;
 		else


^ permalink raw reply	[relevance 7%]

* semi-useful git perl file
@ 2005-05-02  5:33  1% Joshua T. Corbin
  0 siblings, 0 replies; 200+ results
From: Joshua T. Corbin @ 2005-05-02  5:33 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 607 bytes --]

I've been playing around with driving git from perl land. The attached
allows you to easily access git objects from perl as:

	tie %git, 'GIT::ObjectDB';
	print Dumper( $git{ $commit_id } );

Looks like:
{
	type    => 'commit',
	sha     => $commit_id,
	tree    => '0000000000000000000000000000000000000000',
	parents => [ ... ],
	mess    => "\nbla bla bla\n"
}

And correspondingly for trees, tags, and blobs.

If you want to see use of this in action, you can pull my (incomplete)
translation of cogito into perl from:

	http://node1.wunjo.org/~jcorbin/yagf.git

	or

	rsync://node1.wunjo.org/yagf.git

Josh

[-- Attachment #2: GIT.pm --]
[-- Type: text/x-perl, Size: 9467 bytes --]

package GIT;
use strict;
our %cmd;

sub cmdpath {
	my $cmd = shift;
	unless ( defined $cmd{$cmd} ) {
		local $/ = "\n";
		chomp( $cmd{$cmd} = `which $cmd` );
		return undef if $cmd{$cmd} eq '';
	}
	return $cmd{$cmd};
}

sub cmd {
	my $cmd = shift;
	cmdpath( $cmd ) || die "command '$cmd' not found\n";
	my $r = system( $cmd{$cmd}, @_ );
	die "$cmd failed: " . cmderrmsg( $cmd ) . "\n"
		if $r != 0;
	return 1;
}

sub cmdinout {
	my $infh = shift;
	my $cmd = shift;
	cmdpath( $cmd ) || die "command '$cmd' not found\n";
	my ( $r, $w );
	pipe( $r, $w ) || die "Failed to pipe: $!";
	my $pid = fork();
	die "Failed to fork: $!" unless defined $pid;
	if ( $pid ) {
		close $w;
		my $kid = waitpid( $pid, 0 );
		die "Hmm, auto reaping in place?" if $kid == -1;
		die "$cmd failed: " . cmderrmsg( $cmd ) . "\n"
			if $? & 127 || $? >> 8 != 0;
		local $/;
		local $_ = <$r>;
		close $r;
		if ( wantarray ) {
			return split( "\n", $_ );
		} else {
			return $_;
		}
	} else {
		close $r;
		close STDOUT;
		close STDIN;
		open( STDIN, '<&', $infh ) || die "Failed to redirect STDIN";
		open( STDOUT, '>&', $w ) || die "Failed to redirect STDOUT";
		exec( $cmd{$cmd}, @_ );
	}
}

sub cmdout {
	my $callback;
	$callback = shift if ref $_[0] eq 'CODE';
	my $cmd = shift;
	cmdpath( $cmd ) || die "command '$cmd' not found\n";
	my ( $r, $w );
	pipe( $r, $w ) || die "Failed to pipe: $!";
	my $pid = fork();
	die "Failed to fork: $!" unless defined $pid;
	if ( $pid ) {
		close $w;
		if ( defined $callback ) {
			local $_;
			while ( <$r> ) {
				if ( &$callback() ) {
					kill 15, $pid;
					last;
				}
			}
			close $r;
			waitpid( $pid, 0 );
			return 1;
		} else {
			my $kid = waitpid( $pid, 0 );
			die "Hmm, auto reaping in place?" if $kid == -1;
			die "$cmd failed: " . cmderrmsg( $cmd ) . "\n"
				if $? & 127 || $? >> 8 != 0;
			if ( wantarray ) {
				my @r = <$r>;
				close $r;
				chomp @r;
				return @r;
			} else {
				local $/;
				my $r = <$r>;
				close $r;
				return $r;
			}
		}
	} else {
		close $r;
		close STDOUT;
		open( STDOUT, '>&', $w ) || die "Failed to redirect STDOUT";
		exec( $cmd{$cmd}, @_ );
	}
}

sub cmderrmsg {
	my $cmd = shift;
	my $e;
	if ( $? == -1 ) {
		$e = "failed to execute $cmd{$cmd}: $!";
	} elsif ( $? & 127 ) {
		$e = sprintf( 'child died from signal %d', ( $? & 127 ) );
		$e .= ' (with coredump)' if $? & 128;
	} else {
		$e = sprintf( 'child exit value: %d', $? >> 8 );
	}
	return $e;
}

package GIT::ObjectDB;
use strict;
# Cache this many commit/tree/tag objects, blobs are not cached because they are (possibly) huge.
our $CacheMax = 20;
our $MissingFatal;
use Carp qw( croak );

sub TIEHASH {
	my ( $class, $dir ) = @_;
	$dir ||= $ENV{SHA1_FILE_DIRECTORY} || '.git/objects';
	$ENV{SHA1_FILE_DIRECTORY} = $dir
		if $dir ne '.git/objects';

	( -d $dir ) || croak "No such directory $dir";
	bless my $self = {
		dir    => $dir,
		types  => {},
		cache  => {}, # What we're caching
		cachea => []  # The order we cached it in so
	} => $class;
	return $self;
}

sub FETCH {
	my ( $self, $key ) = @_;
	croak "Invalid sha1 key '$key'" unless $key =~ /^[A-Za-z0-9]{40}$/;
	return $self->{ cache }->{ $key }
		if defined $self->{ cache }->{ $key };
	my $type = $self->objectType( $key );
	unless ( defined $type ) {
		die "no such object $key" if $MissingFatal;
		return undef;
	}

	if ( $type eq 'blob' ) {
		return new GIT::ObjectDB::Blob( $key );
	} else {
		if ( $type eq 'tree' ) {
			$self->{ cache }->{ $key } =
				GIT::ObjectDB::Tree->new_fromkey( $key );
		} elsif ( $type eq 'commit' ) {
			$self->{ cache }->{ $key } =
				GIT::ObjectDB::Commit->new_fromkey( $key );
		} elsif ( $type eq 'tag' ) {
			$self->{ cache }->{ $key } =
				GIT::ObjectDB::Tag->new_fromkey( $key );
		} else {
			croak "Unrecognized object($key) type '$type'";
		}
		push @{ $self->{ cachea } }, $key;
		while ( scalar @{ $self->{ cachea } } > $CacheMax ) {
			my $k = shift @{ $self->{ cachea } };
			delete $self->{ cache }->{ $k };
		}
		return $self->{ cache }->{ $key };
	}
}

sub STORE {
	my ( $self, $key, $value ) = @_;
	croak "Will not overwrite an object"
		if defined $self->objectType( $key );
	if ( UNIVERSAL::isa( $value, 'GIT::ObjectDB::Commit' ) ) {
		my $mess = $value->{ mess };
		$mess =~ /^\s*$/s && croak "Won't commit an empty message";
		my $fh;
		open $fh, '<', \$mess;
		chomp( $value->{ sha } = GIT::cmdinout( $fh,
			'git-commit-tree', $value->{ tree },
			map { ( '-p', $_ ) } @{ $value->{ parents } }
		) );
		close $fh;
	} elsif ( UNIVERSAL::isa( $value, 'GIT::ObjectDB::Tag' ) ) {
		my $type = $self->objectType( $value->{ object } ) ||
			croak "No such object $value->{object}";
		croak "Tagging a tag?" if $type eq 'tag';
		my $tag =
			"object $value->{object}\n" .
			"type $type\n" .
			"tag $value-{tag}\n" .
			$value->{ sig };
		my $fh;
		open $fh, '<', \$tag;
		chomp( $value->{ sha } = GIT::cmdinout( $fh, 'git-mktag' ) );
		close $fh;
	} else {
		croak "Only support storing commits and tags";
	}
	push @{ $self->{ cachea } },
		$self->{ cache }->{ $value->{ sha } } = $value;
	while ( scalar @{ $self->{ cachea } } > $CacheMax ) {
		my $k = shift @{ $self->{ cachea } };
		delete $self->{ cache }->{ $k };
	}
}

sub EXISTS {
	my ( $self, $key ) = @_;
	return defined( $self->objectType( $key ) ) ? 1 : 0;
}

sub FIRSTKEY {
	my ( $self ) = @_;
	if ( defined $self->{ dh } ) {
		closedir $self->{ dh };
		delete $self->{ dh };
	}
	$self->{ i } = -1;
	return $self->NEXTKEY;
}

sub NEXTKEY {
	my ( $self ) = @_;
	my $r;
	until ( defined $r ) {
		if ( defined $self->{ dh } ) {
			$r = readdir $self->{ dh };
			unless ( defined $r ) {
				closedir $self->{ dh };
				delete $self->{ dh };
				next;
			}
			$r = undef if $r !~ /^[A-Za-z0-9]{38}$/;
			$r = sprintf( '%02x%s', $self->{ i }, $r ) if defined $r;
		} else {
			$self->{ i }++;
			last if $self->{ i } > 0xff;
			my $dh;
			my $dir = sprintf( '%s/%02x', $self->{ dir }, $self->{ i } );
			opendir( $dh, $dir ) ||
				die "Failed to opendir $dir: $!";
			$self->{ dh } = $dh;
			next;
		}
	}
	return $r;
}

sub SCALAR {
	my ( $self ) = @_;
	return $self->{ dir };
}

sub UNTIE {
	my ( $self ) = @_;
	closedir $self->{ dh } if defined $self->{ dh };
}

sub objectType {
	my ( $self, $key ) = @_;
	eval {
		chomp(
			( $self->{ types }->{ $key } ) =
				GIT::cmdout( 'git-cat-file', '-t', $key )
		) unless defined $self->{ types }->{ $key };
	};
	return undef if $@;
	return $self->{ types }->{ $key };
}

package GIT::ObjectDB::Blob;
use strict;

sub new {
	my ( $class, $key ) = @_;
	bless {
		type => 'blob',
		sha => $key
	} => $class;
}

sub contents {
	my ( $self ) = @_;
	return GIT::cmdout( 'git-cat-file', 'blob', $self->{ sha } );
}

sub write_to_filehandle {
	my ( $self, $fh ) = @_;
	GIT::cmdout( sub {
		print $fh $_;
		return 0;
	}, 'git-cat-file', 'blob', $self->{ sha } );
	return 1;
}

package GIT::ObjectDB::Commit;
use strict;
use Carp qw( croak );

sub new {
	my $class   = shift;
	my $mess    = shift          || croak "Missing message";
	my $tree    = shift          || croak "Missing tree";
	$tree =~ /^[A-Za-z0-9]{40}$/ || croak "Invalid tree id";
	my @parents = @_             or croak "Missing parent(s)";
	for my $parent ( @parents ) {
		$parent =~ /^[A-Za-z0-9]{40}$/ || croak "Invalid parent id '$parent'";
	}

	return bless {
		type    => 'commit',
		parents => \@parents,
		tree    => $tree,
		mess    => $mess
	} => $class;
}

sub new_fromkey {
	my ( $class, $key ) = @_;
	bless my $self = {
		type    => 'commit',
		sha     => $key,
		parents => [],
		mess    => ''
	} => $class;

	local $/ = "\n";
	my $no_more_parents;
	GIT::cmdout( sub {
		chomp;
		if ( ! defined $self->{ tree } && /^tree ([A-Za-z0-9]{40})$/ ) {
			$self->{ tree } = $1;
		} elsif ( ! $no_more_parents && /^parent ([A-Za-z0-9]{40})$/ ) {
			push @{ $self->{ parents } }, $1;
		} else {
			$no_more_parents = 1;
			if ( ! defined $self->{ author } && /^author (.+) (\d+ [-+]\d{4})$/ ) {
				$self->{ author } = [ $1, $2 ];
			} elsif ( ! defined $self->{ committer } && /^committer (.+) (\d+ [-+]\d{4})$/ ) {
				$self->{ committer } = [ $1, $2 ];
			} else {
				$self->{ mess } .= "$_\n";
			}
		}
		return 0;
	}, 'git-cat-file', 'commit', $key );

	return $self;
}

package GIT::ObjectDB::Tree;
use strict;

sub new_fromkey {
	my ( $class, $key ) = @_;

	bless my $self = {
		type => 'tree',
		sha  => $key,
		ent  => []
	} => $class;

	my $raw = GIT::cmdout( 'git-cat-file', 'tree', $key );

	my @raw = unpack( '(Z*H40)*', $raw ); 
	$raw = undef;
	while ( @raw ) {
		push @{ $self->{ ent } },
			[ split( ' ', shift @raw, 2 ), shift @raw ];
	}

	return $self;
}

package GIT::ObjectDB::Tag;
use strict;
use Carp qw( croak );

sub new {
	my $class  = shift;
	my $object = shift             || croak "Missing object";
	$object =~ /^[A-Za-z0-9]{40}$/ || croak "Invalid object id";
	my $tag    = shift             || croak "Missing tag";
	my $sig    = shift             || croak "Missing signature";

	return bless {
		type   => 'tag',
		object => $object,
		tag    => $tag,
		sig    => $sig
	} => $class;
}

sub new_fromkey {
	my ( $class, $key ) = @_;
	bless my $self = {
		type => 'tag',
		sha  => $key,
		sig  => ''
	} => $class;
  
	local $/ = "\n";
	GIT::cmdout( sub {
		if ( /^object ([A-Za-z0-9]{40})$/ ) {
			$self->{ object } = $1;
		} elsif ( /^type (.+)$/ ) {
			$self->{ object_type } = $1;
		} elsif ( /^tag (.+)$/ ) {
			$self->{ tag } = $1;
		} else {
			$self->{ sig } .= $_;
		}
		return 0;
	}, 'git-cat-file', 'tag', $key );

	return $self;
}


* Re: questions about cg-update, cg-pull, and cg-clone.
  @ 2005-05-02 19:58  3% ` Petr Baudis
  2005-05-03 15:22  3%   ` Zack Brown
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-05-02 19:58 UTC (permalink / raw)
  To: Zack Brown; +Cc: Git Mailing List

Dear diary, on Sat, Apr 30, 2005 at 02:53:22AM CEST, I got a letter
where Zack Brown <zbrown@tumblerings.org> told me that...
> 'cg-update branch-name' grabs any new changes from the upstream repository and
> merges them into my local repository. If I've been editing files in my local
> repository, the update attempts to merge the changes cleanly.

Yes.

> Now, if the update is clean, a cg-commit is invoked automatically, and if the
> update is not clean, I then have to resolve any conflicts and give the cg-commit
> command by hand. But: what is the significance of either of these cg-commit
> commands? Why should I have to write a changelog entry recording this merge? All

You might want to write some special notes regarding the merge, e.g.
when you want to describe some non-trivial conflict resolution, or even
give a short blurb of the changes you are merging.

If you don't know what to say, just press Ctrl-D. The first line of the
commit always says "Merge with what_you_are_merging_with".

> I'm doing is updating my tree to be current. Why should I have to 'commit' that
> update?

If you are only updating your tree to be current, you don't have to
commit, and in fact you don't commit (you do so-called "fast-forward
merge", which will just update your HEAD pointer to point at the newer
commit). You commit only when you were merging stuff (so-called "tree
merge"; well, that's at least how I call it to differentiate it from the
fast-forward merge). That means you have some local commits over there -
I can't just update your tree to be current, sorry. That would lose your
commit. I have to merge the changes into your tree through a merge
commit.

> Now I look at 'cg-pull'. What does this do? The readme says something about
> printing two ids, and being useful for diffs. But can't I do a diff after a
> cg-update and get the same result? I'm very confused about cg-pull right now.

cg-pull does the first part of cg-update. It is concerned by fetching
the stuff from the remote repository to the local one. cg-merge then
does the second part, merging the stuff to your local tree (doing either
fast-forward or tree merge).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* [PATCH] Make git-*-pull say who wants it for missing objects.
@ 2005-05-03  0:13 10% Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-05-03  0:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This patch updates pull.c, the engine that decides what object is needed
given a commit to traverse from, to report which commit was calling for
the object that cannot be retrieved from the remote side.  This complements
git-fsck-cache in that it checks the consistency of the remote repository
for reachability.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
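
The idea of remembering which commit is being processed, so that a
failed fetch deeper in the walk can be attributed to it, can be shown
against a toy object store.  Everything below is illustrative:
have[], make_sure_we_have_it() and process_commit() mimic the shape of
pull.c but operate on hard-coded string ids, not real git objects.

```c
#include <stdio.h>
#include <string.h>

/* toy object store: ids we "have" locally or can fetch */
static const char *have[] = { "c1", "t1", "c2", NULL };

static char current_commit[64];	/* the commit being processed */
static char last_error[128];

static int have_object(const char *id)
{
	int i;
	for (i = 0; have[i]; i++)
		if (strcmp(have[i], id) == 0)
			return 1;
	return 0;
}

/* on failure, name the commit that demanded the object */
static int make_sure_we_have_it(const char *what, const char *id)
{
	if (have_object(id))
		return 0;
	snprintf(last_error, sizeof(last_error),
		 "Cannot obtain needed %s %s while processing commit %s.",
		 what, id, current_commit);
	return -1;
}

/* walk one commit and the tree it references */
int process_commit(const char *commit, const char *tree)
{
	if (make_sure_we_have_it("commit", commit))
		return -1;
	strcpy(current_commit, commit);
	return make_sure_we_have_it("tree", tree);
}
```

A missing tree under "c2" thus produces an error message that names
"c2", which is exactly the reachability information git-fsck-cache
cannot give you for a remote repository.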

pull.c |   37 ++++++++++++++++++++++++++++++-------
1 files changed, 30 insertions(+), 7 deletions(-)

--- a/pull.c
+++ b/pull.c
@@ -7,12 +7,31 @@
 int get_tree = 0;
 int get_history = 0;
 int get_all = 0;
+static unsigned char current_commit_sha1[20];
 
-static int make_sure_we_have_it(unsigned char *sha1)
+static const char commitS[] = "commit";
+static const char treeS[] = "tree";
+static const char blobS[] = "blob";
+
+static void report_missing(const char *what, const unsigned char *missing)
+{
+	char missing_hex[41];
+
+	strcpy(missing_hex, sha1_to_hex(missing));;
+	fprintf(stderr,
+		"Cannot obtain needed %s %s\nwhile processing commit %s.\n",
+		what, missing_hex, sha1_to_hex(current_commit_sha1));
+}
+
+static int make_sure_we_have_it(const char *what, unsigned char *sha1)
 {
+	int status;
 	if (has_sha1_file(sha1))
 		return 0;
-	return fetch(sha1);	
+	status = fetch(sha1);
+	if (status && what)
+		report_missing(what, sha1);
+	return status;
 }
 
 static int process_tree(unsigned char *sha1)
@@ -24,7 +43,8 @@ static int process_tree(unsigned char *s
 		return -1;
 
 	for (entries = tree->entries; entries; entries = entries->next) {
-		if (make_sure_we_have_it(entries->item.tree->object.sha1))
+		const char *what = entries->directory ? treeS : blobS;
+		if (make_sure_we_have_it(what, entries->item.tree->object.sha1))
 			return -1;
 		if (entries->directory) {
 			if (process_tree(entries->item.tree->object.sha1))
@@ -38,14 +58,14 @@ static int process_commit(unsigned char 
 {
 	struct commit *obj = lookup_commit(sha1);
 
-	if (make_sure_we_have_it(sha1))
+	if (make_sure_we_have_it(commitS, sha1))
 		return -1;
 
 	if (parse_commit(obj))
 		return -1;
 
 	if (get_tree) {
-		if (make_sure_we_have_it(obj->tree->object.sha1))
+		if (make_sure_we_have_it(treeS, obj->tree->object.sha1))
 			return -1;
 		if (process_tree(obj->tree->object.sha1))
 			return -1;
@@ -57,7 +77,8 @@ static int process_commit(unsigned char 
 		for (; parents; parents = parents->next) {
 			if (has_sha1_file(parents->item->object.sha1))
 				continue;
-			if (make_sure_we_have_it(parents->item->object.sha1)) {
+			if (make_sure_we_have_it(NULL,
+						 parents->item->object.sha1)) {
 				/* The server might not have it, and
 				 * we don't mind. 
 				 */
@@ -65,6 +86,7 @@ static int process_commit(unsigned char 
 			}
 			if (process_commit(parents->item->object.sha1))
 				return -1;
+			memcpy(current_commit_sha1, sha1, 20);
 		}
 	}
 	return 0;
@@ -77,8 +99,9 @@ int pull(char *target)
 	retval = get_sha1_hex(target, sha1);
 	if (retval)
 		return retval;
-	retval = make_sure_we_have_it(sha1);
+	retval = make_sure_we_have_it(commitS, sha1);
 	if (retval)
 		return retval;
+	memcpy(current_commit_sha1, sha1, 20);
 	return process_commit(sha1);
 }



* [PATCH] Short-cut error return path in git-local-pull.
@ 2005-05-03  0:26  9% Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-05-03  0:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

When git-local-pull with the -l option gets ENOENT attempting to create
a hard link, there is no point in falling back to other copy methods.
This patch implements a short-cut to detect that case.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
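
The errno-based short-cut can be isolated into a tiny sketch.
try_hardlink() is an invented name; the point is only the three-way
outcome: success, hard failure on ENOENT (the source object itself is
missing, so no other method can work), or "try the next method" for
errors like EEXIST or EXDEV.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* 0 = linked, -1 = source missing (give up), 1 = try the next method */
int try_hardlink(const char *src, const char *dst)
{
	if (!link(src, dst))
		return 0;
	if (errno == ENOENT) {
		/* the source object is absent; symlink/copy would fail too */
		fprintf(stderr, "does not exist %s\n", src);
		return -1;
	}
	return 1;	/* e.g. EEXIST or EXDEV: fall back to symlink/copy */
}
```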

local-pull.c |   25 ++++++++++++++++---------
1 files changed, 16 insertions(+), 9 deletions(-)

--- a/local-pull.c
+++ b/local-pull.c
@@ -39,12 +39,19 @@ int fetch(unsigned char *sha1)
 	filename[object_name_start+1] = hex[1];
 	filename[object_name_start+2] = '/';
 	strcpy(filename + object_name_start + 3, hex + 2);
-	if (use_link && !link(filename, dest_filename)) {
-		say("Hardlinked %s.\n", hex);
-		return 0;
+	if (use_link) {
+		if (!link(filename, dest_filename)) {
+			say("link %s\n", hex);
+			return 0;
+		}
+		/* If we got ENOENT there is no point continuing. */
+		if (errno == ENOENT) {
+			fprintf(stderr, "does not exist %s\n", filename);
+			return -1;
+		}
 	}
 	if (use_symlink && !symlink(filename, dest_filename)) {
-		say("Symlinked %s.\n", hex);
+		say("symlink %s\n", hex);
 		return 0;
 	}
 	if (use_filecopy) {
@@ -54,13 +61,13 @@ int fetch(unsigned char *sha1)
 		ifd = open(filename, O_RDONLY);
 		if (ifd < 0 || fstat(ifd, &st) < 0) {
 			close(ifd);
-			fprintf(stderr, "Cannot open %s\n", filename);
+			fprintf(stderr, "cannot open %s\n", filename);
 			return -1;
 		}
 		map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, ifd, 0);
 		close(ifd);
 		if (-1 == (int)(long)map) {
-			fprintf(stderr, "Cannot mmap %s\n", filename);
+			fprintf(stderr, "cannot mmap %s\n", filename);
 			return -1;
 		}
 		ofd = open(dest_filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
@@ -69,13 +76,13 @@ int fetch(unsigned char *sha1)
 		munmap(map, st.st_size);
 		close(ofd);
 		if (status)
-			fprintf(stderr, "Cannot write %s (%ld bytes)\n",
+			fprintf(stderr, "cannot write %s (%ld bytes)\n",
 				dest_filename, st.st_size);
 		else
-			say("Copied %s.\n", hex);
+			say("copy %s\n", hex);
 		return status;
 	}
-	fprintf(stderr, "No copy method was provided to copy %s.\n", hex);
+	fprintf(stderr, "failed to copy %s with given copy methods.\n", hex);
 	return -1;
 }
 



* Re: cogito "origin" vs. HEAD
  @ 2005-05-03  6:49  3% ` Petr Baudis
  2005-05-03  7:13  0%   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-05-03  6:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Git Mailing List

Dear diary, on Tue, May 03, 2005 at 05:24:19AM CEST, I got a letter
where Benjamin Herrenschmidt <benh@kernel.crashing.org> told me that...
> Hi !

Hi,

> So when I later do cg-pull or cg-update origin to update, my "origin"
> pointer is updated I suppose to the new head of the remote repository,
> does it also update my local "refs/heads/master" ? Or not ? What happens
> to it ? does anything will use my local HEAD -> refs/heads/master/
> ever ? If I want to publish my tree, what will remote cogito's try to
> rsync down ? HEAD ? origin ?

when accessing the remote repository, Cogito always looks for remote
refs/heads/master first - if that one isn't there, it takes HEAD, but
there is no correlation between the local and remote branch name. If you
want to fetch a different branch from the remote repository, use the
fragment identifier (see cg-help cg-branch-add).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: cogito "origin" vs. HEAD
  2005-05-03  6:49  3% ` Petr Baudis
@ 2005-05-03  7:13  0%   ` Benjamin Herrenschmidt
  2005-05-03  9:06  0%     ` Alexey Nezhdanov
  2005-05-03  9:47  0%     ` Petr Baudis
  0 siblings, 2 replies; 200+ results
From: Benjamin Herrenschmidt @ 2005-05-03  7:13 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Git Mailing List


> when accessing the remote repository, Cogito always looks for remote
> refs/heads/master first - if that one isn't there, it takes HEAD, but
> there is no correlation between the local and remote branch name. If you
> want to fetch a different branch from the remote repository, use the
> fragment identifier (see cg-help cg-branch-add).

Ok, that I'm getting. So then, what happens to my local
refs/heads/<branchname> and refs/heads/master ? I'm still a bit
confused by the whole branch mechanism... It's my understanding that when
I cg-init, it creates both "master" (a head without a matching branch)
and "origin" (a branch + a head), both having the same sha1. It also
checks out the tree.

Now, when I cg-update origin, what happens exactly ? I mean, I know it
pulls all objects, then gets the master from the remote pointed to by the
origin branch, but then, I suppose it updates both my local "origin" and
my local "master" pointers, right ? I mean, they are always in sync ? Or
is this related to what branch my current checkout is tracking ?

Ben.




* [PATCH] add the ability to create and retrieve delta objects
  @ 2005-05-03  8:06  1%   ` Nicolas Pitre
    0 siblings, 1 reply; 200+ results
From: Nicolas Pitre @ 2005-05-03  8:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Alon Ziv, git

On Mon, 2 May 2005, Linus Torvalds wrote:

> If you do something like this, you want such a delta-blob to be named by 
> the sha1 of the result, so that things that refer to it can transparently 
> see either the original blob _or_ the "deltified" one, and will never 
> care.

Yep, that's what I've done last weekend (and just made it actually 
work since people are getting interested).

==========

This patch adds the necessary functionalities to perform delta 
compression on objects.  It adds a git-mkdelta command which can replace 
any object with its deltafied version given a reference object.

Access to a delta object will transparently fetch the reference object 
and apply the transformation.  Scripts can be used to perform any sort 
of compression policy on top of it.

The delta generator has been extracted from libxdiff and optimized for 
git usage in order to avoid as much data copy as possible, and the delta 
storage format modified to be even more compact.  Therefore no need to 
rely on any external library.  The test-delta program can be used to 
test it.

The fsck tool doesn't know about delta objects and their relation to
other objects yet.  But if one doesn't use git-mkdelta it should not be
a problem.  Many refinements are possible, but it is better to merge them
separately.  Loop detection and a recursion limit are two examples.

Signed-off-by: Nicolas Pitre <nico@cam.org>
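
The copy/insert principle behind delta storage can be demonstrated with
a toy applier.  The real encoding in this patch is more compact and
byte-packed; the format below (a 1-byte opcode with 1-byte operands,
no bounds checking) is deliberately simplified and only shows what
patch_delta() conceptually does against a reference buffer.

```c
#include <stdlib.h>
#include <string.h>

enum { OP_COPY = 1, OP_INSERT = 2 };

/* apply a toy delta to src; caller frees the result.
 * No bounds checking: sketch only, result must fit in 256 bytes. */
unsigned char *toy_patch(const unsigned char *src,
			 const unsigned char *delta, size_t delta_len,
			 size_t *out_len)
{
	unsigned char *out = malloc(256);
	size_t o = 0, i = 0;

	while (i < delta_len) {
		unsigned char op = delta[i++];
		if (op == OP_COPY) {
			/* copy <len> bytes from src at offset <off> */
			unsigned char off = delta[i++];
			unsigned char len = delta[i++];
			memcpy(out + o, src + off, len);
			o += len;
		} else if (op == OP_INSERT) {
			/* emit <len> literal bytes from the delta itself */
			unsigned char len = delta[i++];
			memcpy(out + o, delta + i, len);
			i += len;
			o += len;
		}
	}
	*out_len = o;
	return out;
}
```

Regions shared with the reference are encoded as cheap copy opcodes;
only genuinely new bytes are stored literally, which is where the
space saving comes from.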

--- k/delta.h
+++ l/delta.h
@@ -0,0 +1,6 @@
+extern void *diff_delta(void *from_buf, unsigned long from_size,
+			void *to_buf, unsigned long to_size,
+		        unsigned long *delta_size);
+extern void *patch_delta(void *src_buf, unsigned long src_size,
+			 void *delta_buf, unsigned long delta_size,
+			 unsigned long *dst_size);
--- k/diff-delta.c
+++ l/diff-delta.c
@@ -0,0 +1,315 @@
+/*
+ * diff-delta.c: generate a delta between two buffers
+ *
+ *  Many parts of this file have been lifted from LibXDiff version 0.10.
+ *  http://www.xmailserver.org/xdiff-lib.html
+ *
+ *  LibXDiff was written by Davide Libenzi <davidel@xmailserver.org>
+ *  Copyright (C) 2003	Davide Libenzi
+ *
+ *  Many mods for GIT usage by Nicolas Pitre <nico@cam.org>, (C) 2005.
+ *
+ *  This file is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ */
+
+#include <stdlib.h>
+#include "delta.h"
+
+
+/* block size: min = 16, max = 64k, power of 2 */
+#define BLK_SIZE 16
+
+#define MIN(a, b) ((a) < (b) ? (a) : (b))
+
+#define GR_PRIME 0x9e370001
+#define HASH(v, b) (((unsigned int)(v) * GR_PRIME) >> (32 - (b)))
+	
+/* largest prime smaller than 65536 */
+#define BASE 65521
+
+/* NMAX is the largest n such that 255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1 */
+#define NMAX 5552
+
+#define DO1(buf, i)  { s1 += buf[i]; s2 += s1; }
+#define DO2(buf, i)  DO1(buf, i); DO1(buf, i + 1);
+#define DO4(buf, i)  DO2(buf, i); DO2(buf, i + 2);
+#define DO8(buf, i)  DO4(buf, i); DO4(buf, i + 4);
+#define DO16(buf)    DO8(buf, 0); DO8(buf, 8);
+
+static unsigned int adler32(unsigned int adler, const unsigned char *buf, int len)
+{
+	int k;
+	unsigned int s1 = adler & 0xffff;
+	unsigned int s2 = adler >> 16;
+
+	while (len > 0) {
+		k = MIN(len, NMAX);
+		len -= k;
+		while (k >= 16) {
+			DO16(buf);
+			buf += 16;
+			k -= 16;
+		}
+		if (k != 0)
+			do {
+				s1 += *buf++;
+				s2 += s1;
+			} while (--k);
+		s1 %= BASE;
+		s2 %= BASE;
+	}
+
+	return (s2 << 16) | s1;
+}
+
+static unsigned int hashbits(unsigned int size)
+{
+	unsigned int val = 1, bits = 0;
+	while (val < size && bits < 32) {
+		val <<= 1;
+	       	bits++;
+	}
+	return bits ? bits: 1;
+}
+
+typedef struct s_chanode {
+	struct s_chanode *next;
+	int icurr;
+} chanode_t;
+
+typedef struct s_chastore {
+	chanode_t *head, *tail;
+	int isize, nsize;
+	chanode_t *ancur;
+	chanode_t *sncur;
+	int scurr;
+} chastore_t;
+
+static void cha_init(chastore_t *cha, int isize, int icount)
+{
+	cha->head = cha->tail = NULL;
+	cha->isize = isize;
+	cha->nsize = icount * isize;
+	cha->ancur = cha->sncur = NULL;
+	cha->scurr = 0;
+}
+
+static void *cha_alloc(chastore_t *cha)
+{
+	chanode_t *ancur;
+	void *data;
+
+	ancur = cha->ancur;
+	if (!ancur || ancur->icurr == cha->nsize) {
+		ancur = malloc(sizeof(chanode_t) + cha->nsize);
+		if (!ancur)
+			return NULL;
+		ancur->icurr = 0;
+		ancur->next = NULL;
+		if (cha->tail)
+			cha->tail->next = ancur;
+		if (!cha->head)
+			cha->head = ancur;
+		cha->tail = ancur;
+		cha->ancur = ancur;
+	}
+
+	data = (void *)ancur + sizeof(chanode_t) + ancur->icurr;
+	ancur->icurr += cha->isize;
+	return data;
+}
+
+static void cha_free(chastore_t *cha)
+{
+	chanode_t *cur = cha->head;
+	while (cur) {
+		chanode_t *tmp = cur;
+		cur = cur->next;
+		free(tmp);
+	}
+}
+
+typedef struct s_bdrecord {
+	struct s_bdrecord *next;
+	unsigned int fp;
+	const unsigned char *ptr;
+} bdrecord_t;
+
+typedef struct s_bdfile {
+	const unsigned char *data, *top;
+	chastore_t cha;
+	unsigned int fphbits;
+	bdrecord_t **fphash;
+} bdfile_t;
+
+static int delta_prepare(const unsigned char *buf, int bufsize, bdfile_t *bdf)
+{
+	unsigned int fphbits;
+	int i, hsize;
+	const unsigned char *base, *data, *top;
+	bdrecord_t *brec;
+	bdrecord_t **fphash;
+
+	fphbits = hashbits(bufsize / BLK_SIZE + 1);
+	hsize = 1 << fphbits;
+	fphash = malloc(hsize * sizeof(bdrecord_t *));
+	if (!fphash)
+		return -1;
+	for (i = 0; i < hsize; i++)
+		fphash[i] = NULL;
+	cha_init(&bdf->cha, sizeof(bdrecord_t), hsize / 4 + 1);
+
+	bdf->data = data = base = buf;
+	bdf->top = top = buf + bufsize;
+	data += (bufsize / BLK_SIZE) * BLK_SIZE;
+	if (data == top)
+		data -= BLK_SIZE;
+
+	for ( ; data >= base; data -= BLK_SIZE) {
+		brec = cha_alloc(&bdf->cha);
+		if (!brec) {
+			cha_free(&bdf->cha);
+			free(fphash);
+			return -1;
+		}
+		brec->fp = adler32(0, data, MIN(BLK_SIZE, top - data));
+		brec->ptr = data;
+		i = HASH(brec->fp, fphbits);
+		brec->next = fphash[i];
+		fphash[i] = brec;
+	}
+
+	bdf->fphbits = fphbits;
+	bdf->fphash = fphash;
+
+	return 0;
+}
+
+static void delta_cleanup(bdfile_t *bdf)
+{
+	free(bdf->fphash);
+	cha_free(&bdf->cha);
+}
+
+#define COPYOP_SIZE(o, s) \
+    (!!(o & 0xff) + !!(o & 0xff00) + !!(o & 0xff0000) + !!(o & 0xff000000) + \
+     !!(s & 0xff) + !!(s & 0xff00) + 1)
+
+void *diff_delta(void *from_buf, unsigned long from_size,
+		 void *to_buf, unsigned long to_size,
+		 unsigned long *delta_size)
+{
+	int i, outpos, outsize, inscnt, csize, msize, moff;
+	unsigned int fp;
+	const unsigned char *data, *top, *ptr1, *ptr2;
+	unsigned char *out, *orig;
+	bdrecord_t *brec;
+	bdfile_t bdf;
+
+	if (delta_prepare(from_buf, from_size, &bdf))
+		return NULL;
+	
+	outpos = 0;
+	outsize = 4096;
+	out = malloc(outsize);
+	if (!out) {
+		delta_cleanup(&bdf);
+		return NULL;
+	}
+
+	data = to_buf;
+	top = to_buf + to_size;
+
+	out[outpos++] = from_size; from_size >>= 8;
+	out[outpos++] = from_size; from_size >>= 8;
+	out[outpos++] = from_size; from_size >>= 8;
+	out[outpos++] = from_size;
+	out[outpos++] = to_size; to_size >>= 8;
+	out[outpos++] = to_size; to_size >>= 8;
+	out[outpos++] = to_size; to_size >>= 8;
+	out[outpos++] = to_size;
+
+	inscnt = 0;
+	moff = 0;
+	while (data < top) {
+		msize = 0;
+		fp = adler32(0, data, MIN(top - data, BLK_SIZE));
+		i = HASH(fp, bdf.fphbits);
+		for (brec = bdf.fphash[i]; brec; brec = brec->next) {
+			if (brec->fp == fp) {
+				csize = bdf.top - brec->ptr;
+				if (csize > top - data)
+					csize = top - data;
+				for (ptr1 = brec->ptr, ptr2 = data; 
+				     csize && *ptr1 == *ptr2;
+				     csize--, ptr1++, ptr2++);
+
+				csize = ptr1 - brec->ptr;
+				if (csize > msize) {
+					moff = brec->ptr - bdf.data;
+					msize = csize;
+					if (msize >= 0x10000) {
+						msize = 0x10000;
+						break;
+					}
+				}
+			}
+		}
+
+		if (!msize || msize < COPYOP_SIZE(moff, msize)) {
+			if (!inscnt)
+				outpos++;
+			out[outpos++] = *data++;
+			inscnt++;
+			if (inscnt == 0x7f) {
+				out[outpos - inscnt - 1] = inscnt;
+				inscnt = 0;
+			}
+		} else {
+			if (inscnt) {
+				out[outpos - inscnt - 1] = inscnt;
+				inscnt = 0;
+			}
+
+			data += msize;
+			orig = out + outpos++;
+			i = 0x80;
+
+			if (moff & 0xff) { out[outpos++] = moff; i |= 0x01; }
+			moff >>= 8;
+			if (moff & 0xff) { out[outpos++] = moff; i |= 0x02; }
+			moff >>= 8;
+			if (moff & 0xff) { out[outpos++] = moff; i |= 0x04; }
+			moff >>= 8;
+			if (moff & 0xff) { out[outpos++] = moff; i |= 0x08; }
+
+			if (msize & 0xff) { out[outpos++] = msize; i |= 0x10; }
+			msize >>= 8;
+			if (msize & 0xff) { out[outpos++] = msize; i |= 0x20; }
+
+			*orig = i;
+		}
+
+		/* next time around the largest possible output is 1 + 4 + 3 */
+		if (outpos > outsize - 8) {
+			void *tmp = out;
+			outsize = outsize * 3 / 2;
+			out = realloc(out, outsize);
+			if (!out) {
+				free(tmp);
+				delta_cleanup(&bdf);
+				return NULL;
+			}
+		}
+	}
+
+	if (inscnt)
+		out[outpos - inscnt - 1] = inscnt;
+
+	delta_cleanup(&bdf);
+	*delta_size = outpos;
+	return out;
+}
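
As an aside, the adler32() at the top of diff-delta.c is the standard zlib Adler-32 checksum, only with the unrolled DO16 loops and deferred modulo for speed. A direct restatement of the definition (adler32_ref is an illustrative name, not part of the patch) computes the same values:

```c
#include <assert.h>

/* Direct restatement of Adler-32: s1 is the running byte sum, s2 the
 * running sum of s1, both mod 65521 (the largest prime below 2^16),
 * packed as (s2 << 16) | s1.  The unrolled adler32() above defers the
 * modulo for up to NMAX bytes at a time but yields identical results. */
static unsigned int adler32_ref(unsigned int adler,
				const unsigned char *buf, int len)
{
	unsigned int s1 = adler & 0xffff;
	unsigned int s2 = adler >> 16;
	int i;

	for (i = 0; i < len; i++) {
		s1 = (s1 + buf[i]) % 65521;
		s2 = (s2 + s1) % 65521;
	}
	return (s2 << 16) | s1;
}
```

Seeded with 1 this matches zlib's adler32(); the delta code seeds it with 0, which merely offsets the running sums and is harmless for its use as a block fingerprint.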
--- k/mkdelta.c
+++ l/mkdelta.c
@@ -0,0 +1,95 @@
+#include "cache.h"
+#include "delta.h"
+
+static int write_delta_file(char *buf, unsigned long len, unsigned char *sha1_ref, unsigned char *path)
+{
+	int size;
+	char *compressed;
+	z_stream stream;
+	char hdr[50];
+	int fd, hdrlen;
+
+	/* Generate the header */
+	hdrlen = sprintf(hdr, "delta %lu", len+20)+1;
+	memcpy(hdr + hdrlen, sha1_ref, 20);
+	hdrlen += 20;
+
+	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
+	if (fd < 0)
+		return -1;
+
+	/* Set it up */
+	memset(&stream, 0, sizeof(stream));
+	deflateInit(&stream, Z_BEST_COMPRESSION);
+	size = deflateBound(&stream, len+hdrlen);
+	compressed = xmalloc(size);
+
+	/* Compress it */
+	stream.next_out = compressed;
+	stream.avail_out = size;
+
+	/* First header.. */
+	stream.next_in = hdr;
+	stream.avail_in = hdrlen;
+	while (deflate(&stream, 0) == Z_OK)
+		/* nothing */;
+
+	/* Then the data itself.. */
+	stream.next_in = buf;
+	stream.avail_in = len;
+	while (deflate(&stream, Z_FINISH) == Z_OK)
+		/* nothing */;
+	deflateEnd(&stream);
+	size = stream.total_out;
+
+	if (write(fd, compressed, size) != size)
+		die("unable to write file");
+	close(fd);
+		
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	unsigned char sha1_ref[20], sha1_trg[20];
+	char type_ref[20], type_trg[20];
+	void *buf_ref, *buf_trg, *buf_delta;
+	unsigned long size_ref, size_trg, size_delta;
+	char *filename, tmpname[100];
+
+	if (argc != 3 || get_sha1(argv[1], sha1_ref) || get_sha1(argv[2], sha1_trg))
+		usage("git-mkdelta <reference_sha1> <target_sha1>");
+
+	buf_ref = read_sha1_file(sha1_ref, type_ref, &size_ref);
+	if (!buf_ref) {
+		fprintf(stderr, "%s: unable to read reference object\n", argv[0]);
+		exit(1);
+	}
+	buf_trg = read_sha1_file(sha1_trg, type_trg, &size_trg);
+	if (!buf_trg) {
+		fprintf(stderr, "%s: unable to read target object\n", argv[0]);
+		exit(1);
+	}
+	if (strcmp(type_ref, type_trg)) {
+		fprintf(stderr, "%s: reference and target are of different type\n", argv[0]);
+		exit(2);
+	}
+	buf_delta = diff_delta(buf_ref, size_ref, buf_trg, size_trg, &size_delta);
+	if (!buf_delta) {
+		fprintf(stderr, "%s: unable to create delta\n", argv[0]);
+		exit(3);
+	}
+
+	filename = sha1_file_name(sha1_trg);
+	sprintf(tmpname, "%s.delta.tmp", filename);
+	if (write_delta_file(buf_delta, size_delta, sha1_ref, tmpname)) {
+		perror(tmpname);
+		exit(1);
+	}
+	if (rename(tmpname, filename)) {
+		perror("rename");
+		exit(1);
+	}
+
+	return 0;
+}
--- k/patch-delta.c
+++ l/patch-delta.c
@@ -0,0 +1,73 @@
+/*
+ * patch-delta.c:
+ * recreate a buffer from a source and the delta produced by diff-delta.c
+ *
+ * (C) 2005 Nicolas Pitre <nico@cam.org>
+ *
+ * This code is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include "delta.h"
+
+void *patch_delta(void *src_buf, unsigned long src_size,
+		  void *delta_buf, unsigned long delta_size,
+		  unsigned long *dst_size)
+{
+	const unsigned char *data, *top;
+	unsigned char *dst, *out;
+	int size;
+
+	/* the smallest delta size possible is 10 bytes */
+	if (delta_size < 10)
+		return NULL;
+
+	data = delta_buf;
+	top = delta_buf + delta_size;
+
+	/* make sure the orig file size matches what we expect */
+	size = data[0] | (data[1] << 8) | (data[2] << 16) | (data[3] << 24);
+	data += 4;
+	if (size != src_size)
+		return NULL;
+
+	/* now the result size */
+	size = data[0] | (data[1] << 8) | (data[2] << 16) | (data[3] << 24);
+	data += 4;
+	dst = malloc(size);
+	if (!dst)
+		return NULL;
+
+	out = dst;
+	while (data < top) {
+		unsigned char cmd = *data++;
+		if (cmd & 0x80) {
+			unsigned int cp_off = 0, cp_size = 0;
+			if (cmd & 0x01) cp_off = *data++;
+			if (cmd & 0x02) cp_off |= (*data++ << 8);
+			if (cmd & 0x04) cp_off |= (*data++ << 16);
+			if (cmd & 0x08) cp_off |= (*data++ << 24);
+			if (cmd & 0x10) cp_size = *data++;
+			if (cmd & 0x20) cp_size |= (*data++ << 8);
+			if (cp_size == 0) cp_size = 0x10000;
+			memcpy(out, src_buf + cp_off, cp_size);
+			out += cp_size;
+		} else {
+			memcpy(out, data, cmd);
+			out += cmd;
+			data += cmd;
+		}
+	}
+
+	/* sanity check */
+	if (data != top || out - dst != size) {
+		free(dst);
+		return NULL;
+	}
+
+	*dst_size = size;
+	return dst;
+}
Binary files k/test-delta and l/test-delta differ
--- k/test-delta.c
+++ l/test-delta.c
@@ -0,0 +1,79 @@
+/*
+ * test-delta.c: test code to exercise diff-delta.c and patch-delta.c
+ *
+ * (C) 2005 Nicolas Pitre <nico@cam.org>
+ *
+ * This code is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include "delta.h"
+
+static const char *usage =
+	"test-delta (-d|-p) <from_file> <data_file> <out_file>";
+
+int main(int argc, char *argv[])
+{
+	int fd;
+	struct stat st;
+	void *from_buf, *data_buf, *out_buf;
+	unsigned long from_size, data_size, out_size;
+
+	if (argc != 5 || (strcmp(argv[1], "-d") && strcmp(argv[1], "-p"))) {
+		fprintf(stderr, "Usage: %s\n", usage);
+		return 1;
+	}
+
+	fd = open(argv[2], O_RDONLY);
+	if (fd < 0 || fstat(fd, &st)) {
+		perror(argv[2]);
+		return 1;
+	}
+	from_size = st.st_size;
+	from_buf = mmap(NULL, from_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	if (from_buf == MAP_FAILED) {
+		perror(argv[2]);
+		return 1;
+	}
+	close(fd);
+
+	fd = open(argv[3], O_RDONLY);
+	if (fd < 0 || fstat(fd, &st)) {
+		perror(argv[3]);
+		return 1;
+	}
+	data_size = st.st_size;
+	data_buf = mmap(NULL, data_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	if (data_buf == MAP_FAILED) {
+		perror(argv[3]);
+		return 1;
+	}
+	close(fd);
+
+	if (argv[1][1] == 'd')
+		out_buf = diff_delta(from_buf, from_size,
+				     data_buf, data_size, &out_size);
+	else
+		out_buf = patch_delta(from_buf, from_size,
+				      data_buf, data_size, &out_size);
+	if (!out_buf) {
+		fprintf(stderr, "delta operation failed (returned NULL)\n");
+		return 1;
+	}
+
+	fd = open (argv[4], O_WRONLY|O_CREAT|O_TRUNC, 0666);
+	if (fd < 0 || write(fd, out_buf, out_size) != out_size) {
+		perror(argv[4]);
+		return 1;
+	}
+
+	return 0;
+}
--- k/Makefile
+++ l/Makefile
@@ -21,7 +21,7 @@ PROG=   git-update-cache git-diff-files 
 	git-check-files git-ls-tree git-merge-base git-merge-cache \
 	git-unpack-file git-export git-diff-cache git-convert-cache \
 	git-http-pull git-rpush git-rpull git-rev-list git-mktag \
-	git-diff-tree-helper git-tar-tree git-local-pull
+	git-diff-tree-helper git-tar-tree git-local-pull git-mkdelta
 
 all: $(PROG)
 
@@ -29,7 +29,7 @@ install: $(PROG) $(SCRIPTS)
 	install $(PROG) $(SCRIPTS) $(HOME)/bin/
 
 LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o \
-	 tag.o date.o
+	 tag.o date.o diff-delta.o patch-delta.o
 LIB_FILE=libgit.a
 LIB_H=cache.h object.h blob.h tree.h commit.h tag.h
 
@@ -63,6 +63,9 @@ $(LIB_FILE): $(LIB_OBJS)
 test-date: test-date.c date.o
 	$(CC) $(CFLAGS) -o $@ test-date.c date.o
 
+test-delta: test-delta.c diff-delta.o patch-delta.o
+	$(CC) $(CFLAGS) -o $@ $^
+
 git-%: %.c $(LIB_FILE)
 	$(CC) $(CFLAGS) -o $@ $(filter %.c,$^) $(LIBS)
 
@@ -92,6 +95,7 @@ git-rpush: rsh.c
 git-rpull: rsh.c pull.c
 git-rev-list: rev-list.c
 git-mktag: mktag.c
+git-mkdelta: mkdelta.c
 git-diff-tree-helper: diff-tree-helper.c
 git-tar-tree: tar-tree.c
 
--- k/sha1_file.c
+++ l/sha1_file.c
@@ -8,6 +8,7 @@
  */
 #include <stdarg.h>
 #include "cache.h"
+#include "delta.h"
 
 const char *sha1_file_directory = NULL;
 
@@ -186,7 +187,8 @@ void * unpack_sha1_file(void *map, unsig
 	int ret, bytes;
 	z_stream stream;
 	char buffer[8192];
-	char *buf;
+	char *buf, *delta_ref;
+	unsigned long delta_ref_sz;
 
 	/* Get the data stream */
 	memset(&stream, 0, sizeof(stream));
@@ -201,8 +203,15 @@ void * unpack_sha1_file(void *map, unsig
 		return NULL;
 	if (sscanf(buffer, "%10s %lu", type, size) != 2)
 		return NULL;
-
 	bytes = strlen(buffer) + 1;
+
+	if (!strcmp(type, "delta")) {
+		delta_ref = read_sha1_file(buffer + bytes, type, &delta_ref_sz);
+		if (!delta_ref)
+			return NULL;
+	} else
+		delta_ref = NULL;
+
 	buf = xmalloc(*size);
 
 	memcpy(buf, buffer + bytes, stream.total_out - bytes);
@@ -214,6 +223,17 @@ void * unpack_sha1_file(void *map, unsig
 			/* nothing */;
 	}
 	inflateEnd(&stream);
+
+	if (delta_ref) {
+		char *newbuf;
+		unsigned long newsize;
+		newbuf = patch_delta(delta_ref, delta_ref_sz, buf+20, *size-20, &newsize);
+		free(delta_ref);
+		free(buf);
+		buf = newbuf;
+		*size = newsize;
+	}
+
 	return buf;
 }
 

^ permalink raw reply	[relevance 1%]
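
For readers following the format: after the two 4-byte little-endian size fields emitted at the top of diff_delta(), the delta body is a stream of commands, and the copy/insert scheme can be exercised with a stand-alone sketch. sketch_apply is an illustrative name; unlike the real patch_delta() it skips the size header and performs no bounds checking:

```c
#include <string.h>

/* Toy decoder for the delta command stream: a command byte with the
 * high bit set is a copy op whose low bits say which offset/size bytes
 * follow; a command byte below 0x80 is a count of literal bytes to
 * insert.  Returns the number of output bytes produced. */
static unsigned long sketch_apply(const unsigned char *src,
				  const unsigned char *delta,
				  unsigned long delta_len,
				  unsigned char *out)
{
	const unsigned char *data = delta, *top = delta + delta_len;
	unsigned char *dst = out;

	while (data < top) {
		unsigned char cmd = *data++;
		if (cmd & 0x80) {
			unsigned int off = 0, size = 0;
			if (cmd & 0x01) off = *data++;
			if (cmd & 0x02) off |= *data++ << 8;
			if (cmd & 0x04) off |= *data++ << 16;
			if (cmd & 0x08) off |= (unsigned int)*data++ << 24;
			if (cmd & 0x10) size = *data++;
			if (cmd & 0x20) size |= *data++ << 8;
			if (size == 0) size = 0x10000; /* 0 encodes 64K */
			memcpy(dst, src + off, size);
			dst += size;
		} else {
			memcpy(dst, data, cmd); /* literal insert */
			dst += cmd;
			data += cmd;
		}
	}
	return dst - out;
}
```

For example, the command byte 0x91 (0x80 | 0x10 | 0x01) followed by an offset byte 7 and a size byte 5 copies "world" out of the source "hello, world"; a following 0x01 inserts one literal byte.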

* Re: cogito "origin" vs. HEAD
  2005-05-03  7:13  0%   ` Benjamin Herrenschmidt
@ 2005-05-03  9:06  0%     ` Alexey Nezhdanov
  2005-05-03  9:47  0%     ` Petr Baudis
  1 sibling, 0 replies; 200+ results
From: Alexey Nezhdanov @ 2005-05-03  9:06 UTC (permalink / raw)
  To: git; +Cc: Benjamin Herrenschmidt

At Tuesday, 03 May 2005 11:13 Benjamin Herrenschmidt wrote:
> > when accessing the remote repository, Cogito always looks for remote
> > refs/heads/master first - if that one isn't there, it takes HEAD, but
> > there is no correlation between the local and remote branch name. If you
> > want to fetch a different branch from the remote repository, use the
> > fragment identifier (see cg-help cg-branch-add).
>
> Ok, that I'm getting. So then, what happens to my local
> refs/heads/<branchname> and refs/heads/master/ ? I'm still a bit
> confused by the whole branch mechanism... It's my understanding that when
> I cg-init, it creates both "master" (a head without matching branch)
> and "origin" (a branch  + a head) both having the same sha1. It also
> checks out the tree.
>
> Now, when I cg-update origin, what happens exactly ? I mean, I know it
> pulls all objects, then gets the master from the remote pointed to by the
> origin branch, but then, I suppose it updates both my local "origin" and
> my local "master" pointer, right ? I mean, they are always in sync ? Or
> is this related to what branch my current checkout is tracking ?
If I understand these mechanics correctly, then the "master" head always tracks 
your local tree (i.e. with all remote and local patches applied) and the 
"origin" head always tracks the head of the remote branch from which you are 
getting objects.

I.e. it is really a tree, not a source of objects. The tree can be stored on 
many different hosts, but it is the same across them. But the master tree has 
no source to sync from - you are creating it yourself locally, so there is no 
"master branch" - only a head.

So if you are just tracking some other tree and do not do any merges/patches 
yourself then your master head will always match your remote source head 
("origin" in most cases).

-- 
Respectfully
Alexey Nezhdanov



* Re: cogito "origin" vs. HEAD
  2005-05-03  7:13  0%   ` Benjamin Herrenschmidt
  2005-05-03  9:06  0%     ` Alexey Nezhdanov
@ 2005-05-03  9:47  0%     ` Petr Baudis
  2005-05-03 23:49  0%       ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 200+ results
From: Petr Baudis @ 2005-05-03  9:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Git Mailing List

Dear diary, on Tue, May 03, 2005 at 09:13:28AM CEST, I got a letter
where Benjamin Herrenschmidt <benh@kernel.crashing.org> told me that...
> > when accessing the remote repository, Cogito always looks for remote
> > refs/heads/master first - if that one isn't there, it takes HEAD, but
> > there is no correlation between the local and remote branch name. If you
> > want to fetch a different branch from the remote repository, use the
> > fragment identifier (see cg-help cg-branch-add).
> 
> Ok, that I'm getting. So then, what happens to my local
> refs/heads/<branchname> and refs/heads/master/ ? I'm still a bit
> confused by the whole branch mechanism... It's my understanding that when
> I cg-init, it creates both "master" (a head without matching branch)
> and "origin" (a branch  + a head) both having the same sha1. It also
> checks out the tree.
> 
> Now, when I cg-update origin, what happens exactly ? I mean, I know it
> pulls all objects, then gets the master from the remote pointed to by the
> origin branch, but then, I suppose it updates both my local "origin" and
> my local "master" pointer, right ? I mean, they are always in sync ? Or
> is this related to what branch my current checkout is tracking ?

They are in sync as long as you update only from that given branch.
At the moment you do a local commit, they get out of sync, at least
until your master branch is merged to the origin branch on the other
side. Every cg-update will then generate a merging commit, so it will
look like this:

     [origin]    [master]
            commit1
              |
            commit2               Both heads are in sync so far...
              |
            commit3
             /    \
            /     commit4         Now heads/master is commit4, but
           /        |             heads/origin is still commit3
          /         |
      commit5-.     |             heads/master:commit4, heads/origin:commit5
          |    \    |
          |     `-commit6         commit6 merges origin to master
          |       /
          |     /
          |   /
      commit6                     origin merged your master; since it
                                  contained all the commits on the origin
               |                  branch, it just took over the commit6
             commit6              commit pointer as its new head; so both
                                  heads are again in sync now


This is the reason why there are always at least two branches, origin
and master. The checked out tree is always of the master branch (unless
you do cg-seek, which is somewhat special anyway). [*] "Normally", when
you do no local changes and just always cg-update the origin branch, the
two branches are always in sync. At the point you start to "mix" several
remote branches besides origin in your tree, or at the point you do a
local commit, the master branch gets standalone - until the origin
merges your changes as drawn in the diagram.
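
The rule behind the fast-forward case in this diagram comes down to an ancestry test: if the head being merged already has your head among its ancestors, "merging" degenerates to moving your head pointer. A toy model of that test follows (single-parent commits only, and struct commit / ff_update are illustrative names, not git's or Cogito's API):

```c
#include <stddef.h>

/* Toy commit graph: one parent per commit (real git commits may have
 * several parents, which this sketch ignores). */
struct commit {
	const struct commit *parent;
};

/* Is 'anc' reachable from 'c' by walking parent pointers? */
static int is_ancestor(const struct commit *anc, const struct commit *c)
{
	for (; c; c = c->parent)
		if (c == anc)
			return 1;
	return 0;
}

/* Returns the new head on a fast-forward, or NULL when the branches
 * have diverged and a real merge commit would be needed. */
static const struct commit *ff_update(const struct commit *local,
				      const struct commit *remote)
{
	if (is_ancestor(local, remote))
		return remote;	/* just move the pointer forward */
	return NULL;		/* tree merge required */
}
```

In the diagram above, commit4 is not an ancestor of commit5, so merging them needs the merge commit commit6; had master stayed at commit3, the update would have been a pure pointer move.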

There is one other situation when the head pointers may not be in sync -
when you do cg-pull instead of cg-update. When you want to see what the
changes in the origin branch are, but are not sure whether you want them to
appear in your master branch, you do cg-pull origin. Your origin head
pointer is updated, but your master pointer stays where it is. If you
decide it's ok to bring the changes in, you do either cg-update, or only
cg-merge to avoid re-pulling.


[*] Technically, you can have multiple local branches and your tree can
be based on any of them, not only "master". Cogito supports that
internally, but (deliberately) provides no UI to set that up, at least
until we devise a way to do it without confusing people even more.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: questions about cg-update, cg-pull, and cg-clone.
  2005-05-02 19:58  3% ` Petr Baudis
@ 2005-05-03 15:22  3%   ` Zack Brown
  2005-05-03 16:30  0%     ` Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Zack Brown @ 2005-05-03 15:22 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Git Mailing List

On Mon, May 02, 2005 at 09:58:46PM +0200, Petr Baudis wrote:
> Dear diary, on Sat, Apr 30, 2005 at 02:53:22AM CEST, I got a letter
> where Zack Brown <zbrown@tumblerings.org> told me that...
> > 'cg-update branch-name' grabs any new changes from the upstream repository and
> > merges them into my local repository. If I've been editing files in my local
> > repository, the update attempts to merge the changes cleanly.
> 
> Yes.
> 
> > Now, if the update is clean, a cg-commit is invoked automatically, and if the
> > update is not clean, I then have to resolve any conflicts and give the cg-commit
> > command by hand. But: what is the significance of either of these cg-commit
> > commands? Why should I have to write a changelog entry recording this merge? All
> 
> You might want to write some special notes regarding the merge, e.g.
> when you want to describe some non-trivial conflict resolution, or even
> give a short blurb of the changes you are merging.
> 
> If you don't know what to say, just press Ctrl-D. The first line of the
> commit always says "Merge with what_you_are_merging_with".
> 
> > I'm doing is updating my tree to be current. Why should I have to 'commit' that
> > update?
> 
> If you are only updating your tree to be current, you don't have to
> commit, and in fact you don't commit (you do so-called "fast-forward
> merge", which will just update your HEAD pointer to point at the newer
> commit). You commit only when you were merging stuff (so-called "tree
> merge"; well, that's at least how I call it to differentiate it from the
> fast-forward merge). That means you have some local commits over there -
> I can't just update your tree to be current, sorry. That would lose your
> commit. I have to merge the changes into your tree through a merge
> commit.

Hm.

So, suppose I'm working on your Cogito HEAD. I make some changes to my local
tree and commit them to my tree, and then before I go forward, I want to grab
whatever you've done recently, to make sure we're not in conflict before I add
new changes. If I understand you right, this situation would be a 'fast forward
merge'. So what is the command I give to just 'merge' your HEAD with mine,
without requiring a changelog entry?

Alternatively, suppose I'm you, the project lead, and Zackdude has some
changes for me, based on my HEAD. I want to 'merge' his tree into mine. If
I'm still understanding you, this is a 'tree merge'. Now I give a cg-update,
and now I *want* to give a changelog entry to record the merge.  Correct?

No, I still don't see it. I don't see why I would want to add an additional
changelog entry on top of whatever changelog entries Zackdude has made himself.
It just seems to pollute the changelog with entries that are essentially
meaningless. When I read back over the logs, I'm not going to be interested in
the bookkeeping of when I merged with various developers, I'm going to be
interested in what those developers actually did to the code, and what *I*
actually did to the code.

> 
> > Now I look at 'cg-pull'. What does this do? The readme says something about
> > printing two ids, and being useful for diffs. But can't I do a diff after a
> > cg-update and get the same result? I'm very confused about cg-pull right now.
> 
> cg-pull does the first part of cg-update. It is concerned with fetching
> the stuff from the remote repository to the local one. cg-merge then
> does the second part, merging the stuff to your local tree (doing either
> fast-forward or tree merge).

OK, I don't understand this either. What is the difference between fetching the
stuff and merging the stuff? Suppose I am working on a local repo of Cogito
HEAD. I make some changes, commit them, and then I do a cg-pull. What happens?
Are my changes overwritten? Do they show up at all? Do they exist in some
nebulous ether that I will never see until I do a merge?

Be well,
Zack

> 
> -- 
> 				Petr "Pasky" Baudis
> Stuff: http://pasky.or.cz/
> C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
> -
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Zack Brown


* [PATCH] add the ability to create and retrieve delta objects
  @ 2005-05-03 15:52  1%       ` Nicolas Pitre
  0 siblings, 0 replies; 200+ results
From: Nicolas Pitre @ 2005-05-03 15:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

On Tue, 3 May 2005, Linus Torvalds wrote:

> On Tue, 3 May 2005, Nicolas Pitre wrote:
> > 
> > Yep, that's what I've done last weekend (and just made it actually 
> > work since people are getting interested).
> 
> I have to say that it looks uncommonly simple. Also, afaik, this should
> still work with the current fsck, it's just that because fsck doesn't
> understand the linkages, the error reporting won't be as good as it could
> be (I'd _much_ rather see "delta failed in object xxxxx" than "unable to
> read xxxxxx").

Yep.  Let's do it in a separate patch if you please.

> Now, one thing I like about this approach is that the actual delta 
> _generation_ can be done off-line, and independently of anything else. 
> Which means that the performance paths I care about (commit etc) are 
> largely unaffected, and you can "deltify" a git archive overnight or 
> something. 

Yes.  And actually you can use any kind of delta reference topology as 
you wish.  It may start from the first object revision, with the next 
revision stored as a delta against the first, the third as a delta against 
the second, etc.  But it is much more interesting to do it the other way 
around, such that the second revision is stored as is and the first 
revision is made a delta against the second revision.  Then on the next 
commit the third revision is stored as is and the second rev made a 
delta against the third, and so on.  You therefore get delta compression 
at commit time with little overhead if you wish to do that.  And this 
approach has the advantage of keeping the latest object revisions quickly 
accessible, while the delta overhead is relegated to the old historic 
objects.

And suppose the delta chain gets too deep for some objects and accessing 
them incurs too much overhead.  No problem: just pick an object in 
the middle of the delta chain and swap it with its original undeltafied 
version, and the delta chain is now cut in two.

Etc.  It's flexible and open to any arrangement.
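
The chain-cutting idea can be made concrete with a toy model in which an object either stores its payload whole or as a delta against a base; resolution walks toward the undeltafied end and reports the chain depth, which is what a policy layer would inspect before deciding to undeltify a mid-chain object. struct obj and resolve are illustrative names, not part of the patch, and integers stand in for real payloads and deltas:

```c
#include <stddef.h>

/* Toy object-store entry: 'base' is NULL when the payload is stored
 * whole; otherwise 'delta' stands in for a delta against 'base'. */
struct obj {
	const struct obj *base;
	int payload;	/* valid when base == NULL */
	int delta;	/* valid when base != NULL */
};

/* Walk the chain toward its undeltafied end, accumulating deltas and
 * counting the depth seen along the way. */
static int resolve(const struct obj *o, int *depth)
{
	int v = 0;

	*depth = 0;
	while (o->base) {
		v += o->delta;
		o = o->base;
		(*depth)++;
	}
	return v + o->payload;
}
```

Cutting the chain then amounts to replacing some mid-chain entry's (base, delta) pair with its resolved payload and a NULL base; everything deeper resolves through a shorter walk afterwards.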

OK, here's a revised patch correcting the little bug found by
Chris Mason.

==========

This patch adds the necessary functionalities to perform delta
compression on objects.  It adds a git-mkdelta command which can replace
any object with its deltafied version given a reference object.

Access to a delta object will transparently fetch the reference object
and apply the transformation.  Scripts can be used to perform any sort
of compression policy on top of it.

The delta generator has been extracted from libxdiff and optimized for
git usage in order to avoid as much data copy as possible, and the delta
storage format modified to be even more compact.  Therefore no need to
rely on any external library.  The test-delta program can be used to
test it.

Many refinements are needed, but it is better to merge them separately.  Loop 
detection and a recursion threshold are a few examples.

Signed-off-by: Nicolas Pitre <nico@cam.org>

--- a/Makefile
+++ b/Makefile
@@ -29,7 +29,7 @@ install: $(PROG) $(SCRIPTS)
 	install $(PROG) $(SCRIPTS) $(HOME)/bin/
 
 LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o \
-	 tag.o date.o
+	 tag.o date.o diff-delta.o patch-delta.o
 LIB_FILE=libgit.a
 LIB_H=cache.h object.h blob.h tree.h commit.h tag.h
 
@@ -63,6 +63,9 @@ $(LIB_FILE): $(LIB_OBJS)
 test-date: test-date.c date.o
 	$(CC) $(CFLAGS) -o $@ test-date.c date.o
 
+test-delta: test-delta.c diff-delta.o patch-delta.o
+	$(CC) $(CFLAGS) -o $@ $^
+
 git-%: %.c $(LIB_FILE)
 	$(CC) $(CFLAGS) -o $@ $(filter %.c,$^) $(LIBS)
 
@@ -92,6 +95,7 @@ git-rpush: rsh.c
 git-rpull: rsh.c pull.c
 git-rev-list: rev-list.c
 git-mktag: mktag.c
+git-mkdelta: mkdelta.c
 git-diff-tree-helper: diff-tree-helper.c
 git-tar-tree: tar-tree.c
 git-write-blob: write-blob.c
Created: delta.h (mode:100644)
--- /dev/null
+++ b/delta.h
@@ -0,0 +1,6 @@
+extern void *diff_delta(void *from_buf, unsigned long from_size,
+			void *to_buf, unsigned long to_size,
+		        unsigned long *delta_size);
+extern void *patch_delta(void *src_buf, unsigned long src_size,
+			 void *delta_buf, unsigned long delta_size,
+			 unsigned long *dst_size);
Created: diff-delta.c (mode:100644)
--- /dev/null
+++ b/diff-delta.c
@@ -0,0 +1,315 @@
+/*
+ * diff-delta.c: generate a delta between two buffers
+ *
+ *  Many parts of this file have been lifted from LibXDiff version 0.10.
+ *  http://www.xmailserver.org/xdiff-lib.html
+ *
+ *  LibXDiff was written by Davide Libenzi <davidel@xmailserver.org>
+ *  Copyright (C) 2003	Davide Libenzi
+ *
+ *  Many mods for GIT usage by Nicolas Pitre <nico@cam.org>, (C) 2005.
+ *
+ *  This file is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ */
+
+#include <stdlib.h>
+#include "delta.h"
+
+
+/* block size: min = 16, max = 64k, power of 2 */
+#define BLK_SIZE 16
+
+#define MIN(a, b) ((a) < (b) ? (a) : (b))
+
+#define GR_PRIME 0x9e370001
+#define HASH(v, b) (((unsigned int)(v) * GR_PRIME) >> (32 - (b)))
+	
+/* largest prime smaller than 65536 */
+#define BASE 65521
+
+/* NMAX is the largest n such that 255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1 */
+#define NMAX 5552
+
+#define DO1(buf, i)  { s1 += buf[i]; s2 += s1; }
+#define DO2(buf, i)  DO1(buf, i); DO1(buf, i + 1);
+#define DO4(buf, i)  DO2(buf, i); DO2(buf, i + 2);
+#define DO8(buf, i)  DO4(buf, i); DO4(buf, i + 4);
+#define DO16(buf)    DO8(buf, 0); DO8(buf, 8);
+
+static unsigned int adler32(unsigned int adler, const unsigned char *buf, int len)
+{
+	int k;
+	unsigned int s1 = adler & 0xffff;
+	unsigned int s2 = adler >> 16;
+
+	while (len > 0) {
+		k = MIN(len, NMAX);
+		len -= k;
+		while (k >= 16) {
+			DO16(buf);
+			buf += 16;
+			k -= 16;
+		}
+		if (k != 0)
+			do {
+				s1 += *buf++;
+				s2 += s1;
+			} while (--k);
+		s1 %= BASE;
+		s2 %= BASE;
+	}
+
+	return (s2 << 16) | s1;
+}
+
+static unsigned int hashbits(unsigned int size)
+{
+	unsigned int val = 1, bits = 0;
+	while (val < size && bits < 32) {
+		val <<= 1;
+	       	bits++;
+	}
+	return bits ? bits: 1;
+}
+
+typedef struct s_chanode {
+	struct s_chanode *next;
+	int icurr;
+} chanode_t;
+
+typedef struct s_chastore {
+	chanode_t *head, *tail;
+	int isize, nsize;
+	chanode_t *ancur;
+	chanode_t *sncur;
+	int scurr;
+} chastore_t;
+
+static void cha_init(chastore_t *cha, int isize, int icount)
+{
+	cha->head = cha->tail = NULL;
+	cha->isize = isize;
+	cha->nsize = icount * isize;
+	cha->ancur = cha->sncur = NULL;
+	cha->scurr = 0;
+}
+
+static void *cha_alloc(chastore_t *cha)
+{
+	chanode_t *ancur;
+	void *data;
+
+	ancur = cha->ancur;
+	if (!ancur || ancur->icurr == cha->nsize) {
+		ancur = malloc(sizeof(chanode_t) + cha->nsize);
+		if (!ancur)
+			return NULL;
+		ancur->icurr = 0;
+		ancur->next = NULL;
+		if (cha->tail)
+			cha->tail->next = ancur;
+		if (!cha->head)
+			cha->head = ancur;
+		cha->tail = ancur;
+		cha->ancur = ancur;
+	}
+
+	data = (void *)ancur + sizeof(chanode_t) + ancur->icurr;
+	ancur->icurr += cha->isize;
+	return data;
+}
+
+static void cha_free(chastore_t *cha)
+{
+	chanode_t *cur = cha->head;
+	while (cur) {
+		chanode_t *tmp = cur;
+		cur = cur->next;
+		free(tmp);
+	}
+}
+
+typedef struct s_bdrecord {
+	struct s_bdrecord *next;
+	unsigned int fp;
+	const unsigned char *ptr;
+} bdrecord_t;
+
+typedef struct s_bdfile {
+	const unsigned char *data, *top;
+	chastore_t cha;
+	unsigned int fphbits;
+	bdrecord_t **fphash;
+} bdfile_t;
+
+static int delta_prepare(const unsigned char *buf, int bufsize, bdfile_t *bdf)
+{
+	unsigned int fphbits;
+	int i, hsize;
+	const unsigned char *base, *data, *top;
+	bdrecord_t *brec;
+	bdrecord_t **fphash;
+
+	fphbits = hashbits(bufsize / BLK_SIZE + 1);
+	hsize = 1 << fphbits;
+	fphash = malloc(hsize * sizeof(bdrecord_t *));
+	if (!fphash)
+		return -1;
+	for (i = 0; i < hsize; i++)
+		fphash[i] = NULL;
+	cha_init(&bdf->cha, sizeof(bdrecord_t), hsize / 4 + 1);
+
+	bdf->data = data = base = buf;
+	bdf->top = top = buf + bufsize;
+	data += (bufsize / BLK_SIZE) * BLK_SIZE;
+	if (data == top)
+		data -= BLK_SIZE;
+
+	for ( ; data >= base; data -= BLK_SIZE) {
+		brec = cha_alloc(&bdf->cha);
+		if (!brec) {
+			cha_free(&bdf->cha);
+			free(fphash);
+			return -1;
+		}
+		brec->fp = adler32(0, data, MIN(BLK_SIZE, top - data));
+		brec->ptr = data;
+		i = HASH(brec->fp, fphbits);
+		brec->next = fphash[i];
+		fphash[i] = brec;
+	}
+
+	bdf->fphbits = fphbits;
+	bdf->fphash = fphash;
+
+	return 0;
+}
+
+static void delta_cleanup(bdfile_t *bdf)
+{
+	free(bdf->fphash);
+	cha_free(&bdf->cha);
+}
+
+#define COPYOP_SIZE(o, s) \
+    (!!(o & 0xff) + !!(o & 0xff00) + !!(o & 0xff0000) + !!(o & 0xff000000) + \
+     !!(s & 0xff) + !!(s & 0xff00) + 1)
+
+void *diff_delta(void *from_buf, unsigned long from_size,
+		 void *to_buf, unsigned long to_size,
+		 unsigned long *delta_size)
+{
+	int i, outpos, outsize, inscnt, csize, msize, moff;
+	unsigned int fp;
+	const unsigned char *data, *top, *ptr1, *ptr2;
+	unsigned char *out, *orig;
+	bdrecord_t *brec;
+	bdfile_t bdf;
+
+	if (delta_prepare(from_buf, from_size, &bdf))
+		return NULL;
+	
+	outpos = 0;
+	outsize = 4096;
+	out = malloc(outsize);
+	if (!out) {
+		delta_cleanup(&bdf);
+		return NULL;
+	}
+
+	data = to_buf;
+	top = to_buf + to_size;
+
+	out[outpos++] = from_size; from_size >>= 8;
+	out[outpos++] = from_size; from_size >>= 8;
+	out[outpos++] = from_size; from_size >>= 8;
+	out[outpos++] = from_size;
+	out[outpos++] = to_size; to_size >>= 8;
+	out[outpos++] = to_size; to_size >>= 8;
+	out[outpos++] = to_size; to_size >>= 8;
+	out[outpos++] = to_size;
+
+	inscnt = 0;
+	moff = 0;
+	while (data < top) {
+		msize = 0;
+		fp = adler32(0, data, MIN(top - data, BLK_SIZE));
+		i = HASH(fp, bdf.fphbits);
+		for (brec = bdf.fphash[i]; brec; brec = brec->next) {
+			if (brec->fp == fp) {
+				csize = bdf.top - brec->ptr;
+				if (csize > top - data)
+					csize = top - data;
+				for (ptr1 = brec->ptr, ptr2 = data; 
+				     csize && *ptr1 == *ptr2;
+				     csize--, ptr1++, ptr2++);
+
+				csize = ptr1 - brec->ptr;
+				if (csize > msize) {
+					moff = brec->ptr - bdf.data;
+					msize = csize;
+					if (msize >= 0x10000) {
+						msize = 0x10000;
+						break;
+					}
+				}
+			}
+		}
+
+		if (!msize || msize < COPYOP_SIZE(moff, msize)) {
+			if (!inscnt)
+				outpos++;
+			out[outpos++] = *data++;
+			inscnt++;
+			if (inscnt == 0x7f) {
+				out[outpos - inscnt - 1] = inscnt;
+				inscnt = 0;
+			}
+		} else {
+			if (inscnt) {
+				out[outpos - inscnt - 1] = inscnt;
+				inscnt = 0;
+			}
+
+			data += msize;
+			orig = out + outpos++;
+			i = 0x80;
+
+			if (moff & 0xff) { out[outpos++] = moff; i |= 0x01; }
+			moff >>= 8;
+			if (moff & 0xff) { out[outpos++] = moff; i |= 0x02; }
+			moff >>= 8;
+			if (moff & 0xff) { out[outpos++] = moff; i |= 0x04; }
+			moff >>= 8;
+			if (moff & 0xff) { out[outpos++] = moff; i |= 0x08; }
+
+			if (msize & 0xff) { out[outpos++] = msize; i |= 0x10; }
+			msize >>= 8;
+			if (msize & 0xff) { out[outpos++] = msize; i |= 0x20; }
+
+			*orig = i;
+		}
+
+		/* next time around the largest possible output is 1 + 4 + 3 */
+		if (outpos > outsize - 8) {
+			void *tmp = out;
+			outsize = outsize * 3 / 2;
+			out = realloc(out, outsize);
+			if (!out) {
+				free(tmp);
+				delta_cleanup(&bdf);
+				return NULL;
+			}
+		}
+	}
+
+	if (inscnt)
+		out[outpos - inscnt - 1] = inscnt;
+
+	delta_cleanup(&bdf);
+	*delta_size = outpos;
+	return out;
+}
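For reference, COPYOP_SIZE above counts how many bytes a copy opcode will occupy once emitted: one command byte, plus one byte for each nonzero byte of the 32-bit offset and of the low 16 bits of the size. diff_delta() uses it to reject matches that would encode larger than simply inserting the bytes literally. A standalone cross-check of that accounting (a sketch, not part of the patch):

```c
#include <assert.h>

/* Mirrors the COPYOP_SIZE macro from diff-delta.c: one command byte,
 * plus one byte per nonzero byte of the offset and of the low 16 bits
 * of the size. */
int copyop_size(unsigned int o, unsigned int s)
{
	return !!(o & 0xff) + !!(o & 0xff00) +
	       !!(o & 0xff0000) + !!(o & 0xff000000) +
	       !!(s & 0xff) + !!(s & 0xff00) + 1;
}
```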
Created: patch-delta.c (mode:100644)
--- /dev/null
+++ b/patch-delta.c
@@ -0,0 +1,73 @@
+/*
+ * patch-delta.c:
+ * recreate a buffer from a source and the delta produced by diff-delta.c
+ *
+ * (C) 2005 Nicolas Pitre <nico@cam.org>
+ *
+ * This code is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include "delta.h"
+
+void *patch_delta(void *src_buf, unsigned long src_size,
+		  void *delta_buf, unsigned long delta_size,
+		  unsigned long *dst_size)
+{
+	const unsigned char *data, *top;
+	unsigned char *dst, *out;
+	int size;
+
+	/* the smallest delta size possible is 10 bytes */
+	if (delta_size < 10)
+		return NULL;
+
+	data = delta_buf;
+	top = delta_buf + delta_size;
+
+	/* make sure the orig file size matches what we expect */
+	size = data[0] | (data[1] << 8) | (data[2] << 16) | (data[3] << 24);
+	data += 4;
+	if (size != src_size)
+		return NULL;
+
+	/* now the result size */
+	size = data[0] | (data[1] << 8) | (data[2] << 16) | (data[3] << 24);
+	data += 4;
+	dst = malloc(size);
+	if (!dst)
+		return NULL;
+
+	out = dst;
+	while (data < top) {
+		unsigned char cmd = *data++;
+		if (cmd & 0x80) {
+			unsigned int cp_off = 0, cp_size = 0;
+			if (cmd & 0x01) cp_off = *data++;
+			if (cmd & 0x02) cp_off |= (*data++ << 8);
+			if (cmd & 0x04) cp_off |= (*data++ << 16);
+			if (cmd & 0x08) cp_off |= (*data++ << 24);
+			if (cmd & 0x10) cp_size = *data++;
+			if (cmd & 0x20) cp_size |= (*data++ << 8);
+			if (cp_size == 0) cp_size = 0x10000;
+			memcpy(out, src_buf + cp_off, cp_size);
+			out += cp_size;
+		} else {
+			memcpy(out, data, cmd);
+			out += cmd;
+			data += cmd;
+		}
+	}
+
+	/* sanity check */
+	if (data != top || out - dst != size) {
+		free(dst);
+		return NULL;
+	}
+
+	*dst_size = size;
+	return dst;
+}
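To summarize the wire format patch_delta() consumes: two little-endian 32-bit sizes (source, then result), followed by opcodes. A command byte with the high bit set is a copy: bits 0x01-0x08 select which offset bytes follow, bits 0x10-0x20 which size bytes, and a decoded size of zero means 0x10000. Otherwise the low seven bits give a count of literal bytes that follow (diff_delta caps such runs at 0x7f). The helper below hand-assembles a minimal delta in that format; it is an illustration for building test inputs, not code from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Emit a delta that reproduces the source verbatim: the 8-byte header
 * (source size, then result size, both little-endian), followed by one
 * copy opcode covering src_size bytes starting at offset 0.  Returns
 * the number of delta bytes written.  Assumes src_size < 0x10000 so
 * two size bytes suffice. */
size_t make_whole_copy_delta(unsigned char *out, unsigned int src_size)
{
	size_t n = 0;
	unsigned int v, i;

	for (v = src_size, i = 0; i < 4; i++, v >>= 8)	/* source size */
		out[n++] = v & 0xff;
	for (v = src_size, i = 0; i < 4; i++, v >>= 8)	/* result size */
		out[n++] = v & 0xff;

	out[n++] = 0x80 | 0x10 | 0x20;	/* copy; offset 0, 2 size bytes */
	out[n++] = src_size & 0xff;
	out[n++] = (src_size >> 8) & 0xff;
	return n;
}
```

Feeding the resulting buffer to patch_delta() together with the matching source should reproduce the source buffer unchanged.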
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -8,6 +8,7 @@
  */
 #include <stdarg.h>
 #include "cache.h"
+#include "delta.h"
 
 const char *sha1_file_directory = NULL;
 
@@ -186,7 +187,8 @@ void * unpack_sha1_file(void *map, unsig
 	int ret, bytes;
 	z_stream stream;
 	char buffer[8192];
-	char *buf;
+	char *buf, *delta_ref;
+	unsigned long delta_ref_sz;
 
 	/* Get the data stream */
 	memset(&stream, 0, sizeof(stream));
@@ -201,8 +203,15 @@ void * unpack_sha1_file(void *map, unsig
 		return NULL;
 	if (sscanf(buffer, "%10s %lu", type, size) != 2)
 		return NULL;
-
 	bytes = strlen(buffer) + 1;
+
+	if (!strcmp(type, "delta")) {
+		delta_ref = read_sha1_file(buffer + bytes, type, &delta_ref_sz);
+		if (!delta_ref)
+			return NULL;
+	} else
+		delta_ref = NULL;
+
 	buf = xmalloc(*size);
 
 	memcpy(buf, buffer + bytes, stream.total_out - bytes);
@@ -214,6 +223,17 @@ void * unpack_sha1_file(void *map, unsig
 			/* nothing */;
 	}
 	inflateEnd(&stream);
+
+	if (delta_ref) {
+		char *newbuf;
+		unsigned long newsize;
+		newbuf = patch_delta(delta_ref, delta_ref_sz, buf+20, *size-20, &newsize);
+		free(delta_ref);
+		free(buf);
+		buf = newbuf;
+		*size = newsize;
+	}
+
 	return buf;
 }
 
Created: test-delta.c (mode:100644)
--- /dev/null
+++ b/test-delta.c
@@ -0,0 +1,79 @@
+/*
+ * test-delta.c: test code to exercise diff-delta.c and patch-delta.c
+ *
+ * (C) 2005 Nicolas Pitre <nico@cam.org>
+ *
+ * This code is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include "delta.h"
+
+static const char *usage =
+	"test-delta (-d|-p) <from_file> <data_file> <out_file>";
+
+int main(int argc, char *argv[])
+{
+	int fd;
+	struct stat st;
+	void *from_buf, *data_buf, *out_buf;
+	unsigned long from_size, data_size, out_size;
+
+	if (argc != 5 || (strcmp(argv[1], "-d") && strcmp(argv[1], "-p"))) {
+		fprintf(stderr, "Usage: %s\n", usage);
+		return 1;
+	}
+
+	fd = open(argv[2], O_RDONLY);
+	if (fd < 0 || fstat(fd, &st)) {
+		perror(argv[2]);
+		return 1;
+	}
+	from_size = st.st_size;
+	from_buf = mmap(NULL, from_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	if (from_buf == MAP_FAILED) {
+		perror(argv[2]);
+		return 1;
+	}
+	close(fd);
+
+	fd = open(argv[3], O_RDONLY);
+	if (fd < 0 || fstat(fd, &st)) {
+		perror(argv[3]);
+		return 1;
+	}
+	data_size = st.st_size;
+	data_buf = mmap(NULL, data_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	if (data_buf == MAP_FAILED) {
+		perror(argv[3]);
+		return 1;
+	}
+	close(fd);
+
+	if (argv[1][1] == 'd')
+		out_buf = diff_delta(from_buf, from_size,
+				     data_buf, data_size, &out_size);
+	else
+		out_buf = patch_delta(from_buf, from_size,
+				      data_buf, data_size, &out_size);
+	if (!out_buf) {
+		fprintf(stderr, "delta operation failed (returned NULL)\n");
+		return 1;
+	}
+
+	fd = open (argv[4], O_WRONLY|O_CREAT|O_TRUNC, 0666);
+	if (fd < 0 || write(fd, out_buf, out_size) != out_size) {
+		perror(argv[4]);
+		return 1;
+	}
+
+	return 0;
+}

^ permalink raw reply	[relevance 1%]

* Re: questions about cg-update, cg-pull, and cg-clone.
  2005-05-03 15:22  3%   ` Zack Brown
@ 2005-05-03 16:30  0%     ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-05-03 16:30 UTC (permalink / raw)
  To: Zack Brown; +Cc: Petr Baudis, Git Mailing List

On Tue, 3 May 2005, Zack Brown wrote:

> So, suppose I'm working on your Cogito HEAD. I make some changes to my local
> tree and commit them to my tree, and then before I go forward, I want to grab
> whatever you've done recently, to make sure we're not in conflict before I add
> new changes. If I understand you right, this situation would be a 'fast forward
> merge'. So what is the command I give to just 'merge' your HEAD with mine,
> without requiring a changelog entry?

In this case, you have to do a tree merge, because you have some commits
and he has some commits, and you want to be in a state where you have your
commits and his; this state is new, so you need a new commit with both
lines as parents.

> Alternatively, suppose I'm you, the project lead, and Zackdude has some
> changes for me, based on my HEAD. I want to 'merge' his tree into mine. If
> I'm still understanding you, this is a 'tree merge'. Now I give a cg-update,
> and now I *want* to give a changelog entry to record the merge.  Correct?

In this case, you don't have any commits that the other guy doesn't
have. Zackdude took your tree, made some changes, and that's his
head. Your head is still the same. He's already specified what happens
when you go from your head to his head; that's what he did, so the answer
has to be his head. That's a fast-forward.

Now, if the project lead decided to update from a second contributor who
hadn't rebased their contribution on the new head, then a merge is
required, to resolve the potential conflicts, and this merge needs a
commit.
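In commit-graph terms the two cases look roughly like this (a sketch, not from the original mail):

```
fast-forward (the contributor built on your unchanged head):

    A---B---C  (your head)
             \
              D---E  (his head)

  your head can simply advance to E; no new commit is needed.

tree merge (both sides added commits since the common ancestor C):

    A---B---C---F  (your head)
             \
              D---E  (his head)

  neither head contains the other, so a new commit M with
  parents F and E records the combined state:

    A---B---C---F---M
             \     /
              D---E
```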

> No, I still don't see it. I don't see why I would want to add an additional
> changelog entry on top of whatever changelog entries Zackdude has made himself.
> It just seems to pollute the changelog with entries that are essentially
> meaningless. When I read back over the logs, I'm not going to be interested in
> the bookkeeping of when I merged with various developers, I'm going to be
> interested in what those developers actually did to the code, and what *I*
> actually did to the code.

If developer A's changes work, and developer B's changes work, but they
don't work in your merge of them, you'll want to see that. Furthermore,
without a commit with both of their commits as parents, you can't reach
both of their histories from anywhere.

> OK, I don't understand this either. What is the difference between fetching the
> stuff and merging the stuff? Suppose I am working on a local repo of Cogito
> HEAD. I make some changes, commit them, and then I do a cg-pull. What happens?
> Are my changes overwritten? Do they show up at all? Do they exist in some
> nebulous ether that I will never see until I do a merge?

If you do a "cg-pull pasky", this doesn't change any of your stuff, but it
means that "cg-diff -r pasky" will now compare against his new head,
rather than the head he had when you previously did stuff. "cg-log
pasky" will include the new messages, and so forth. Also, you can then do
the merge without a network connection; you can pull overnight and merge
on the train.

You don't see anything different in your working directory, but your
repository essentially "knows more".

	-Daniel
*This .sig left intentionally blank*



^ permalink raw reply	[relevance 0%]

* Re: Careful object writing..
  @ 2005-05-03 19:47  3%   ` Linus Torvalds
  0 siblings, 0 replies; 200+ results
From: Linus Torvalds @ 2005-05-03 19:47 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Git Mailing List



On Tue, 3 May 2005, Chris Wedgwood wrote:
> 
> how is this better than a single rename?  i take it there is something
> fundamental from clue.101 i slept though here?

A rename will overwrite any old object, which means that you cannot do any 
collision checks. In contrast, a "link()" will return EEXIST if somebody 
else raced with you and created a new object, and you can do collision 
checks instead of overwriting another persons object.

> also, if you are *really* paranoid you want to fsync *before* you do
> the link/unklink or rename --- which is what MTAs do[1]

Me, I refuse to slow down my habits for old filesystems. You can either 
fsck, or use a logging filesystem. 

I don't see anybody not using logging filesystems these days, so..

> also, shouldn't HEAD (and similar)[2] be updated with a temporary and
> a rename too?

Maybe. Much less important, though.

> > NOTE NOTE NOTE! I have _not_ updated all the helper stuff that also
> > write objects.
> 
> i thought this was all common code?  if it's not maybe now is the time
> to change that?

It is all common code, except:
 - things like "fetch from another host" will use rsync/wget/xxx to 
   actually get the files. To those programs, we're not talking about git 
   objects, we're just talking "regular files"
 - rpull.c has a special different routine to write its objects. I don't 
   use it, so..

Anyway, it should be reasonably easily fixable.

		Linus

^ permalink raw reply	[relevance 3%]

* Re: cogito "origin" vs. HEAD
  2005-05-03  9:47  0%     ` Petr Baudis
@ 2005-05-03 23:49  0%       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 200+ results
From: Benjamin Herrenschmidt @ 2005-05-03 23:49 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Git Mailing List

On Tue, 2005-05-03 at 11:47 +0200, Petr Baudis wrote:
> Dear diary, on Tue, May 03, 2005 at 09:13:28AM CEST, I got a letter
> where Benjamin Herrenschmidt <benh@kernel.crashing.org> told me that...
> > > when accessing the remote repository, Cogito always looks for remote
> > > refs/heads/master first - if that one isn't there, it takes HEAD, but
> > > there is no correlation between the local and remote branch name. If you
> > > want to fetch a different branch from the remote repository, use the
> > > fragment identifier (see cg-help cg-branch-add).
> > 
> > Ok, that I'm getting. So then, what happen of my local
> > refs/heads/<branchname> and refs/heads/master/ ? I'm still a bit
> > confused by the whole branch mechanism... It's my understanding that when
> > I cg-init, it creates both "master" (a head without matching branch)
> > and "origin" (a branch  + a head) both having the same sha1. It also
> > checks out the tree.
> > 
> > Now, when I cg-update origin, what happens exactly ? I mean, I know it's
> > pulls all objects, then get the master from the remote pointed by the
> > origin branch, but then, I suppose it updates both my local "origin" and
> > my local "master" pointer, right ? I mean, they are always in sync ? Or
> > is this related to what branch my current checkout is tracking ?
> 
> They are in sync as long as you update only from that given branch.
> At the moment you do a local commit, they get out of sync, at least
> until your master branch is merged to the origin branch on the other
> side. Every cg-update will then generate a merging commit, so it will
> look like this:
> > .../...

Thanks for that detailed explanation !

Ben.



^ permalink raw reply	[relevance 0%]

* Re: [PATCH] Fix memory leaks in read_tree_recursive()
  @ 2005-05-05  0:08  3%   ` Jonas Fonseca
  0 siblings, 0 replies; 200+ results
From: Jonas Fonseca @ 2005-05-05  0:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano <junkio@cox.net> wrote Wed, May 04, 2005:
> >>>>> "JF" == Jonas Fonseca <fonseca@diku.dk> writes:
> 
> JF> This patch fixes memory leaks in the error path of
> JF> read_tree_recursive().
> 
> The leak seems to be real but what is "mem_free"?

That mem_free() was a bad habit from the project I usually work on.

> Has it been compile tested?

No, but the second patch has.

BTW, when compiling git on a FreeBSD box I get these warnings:

date.c: In function `parse_date':
date.c:414: warning: long unsigned int format, time_t arg (arg 4)
date.c:414: warning: long unsigned int format, time_t arg (arg 4)
date.c: In function `datestamp':
date.c:427: warning: long unsigned int format, time_t arg (arg 4)
date.c:427: warning: long unsigned int format, time_t arg (arg 4)
tar-tree.c: In function `write_header':
tar-tree.c:249: warning: long unsigned int format, time_t arg (arg 3)
local-pull.c: In function `fetch':
local-pull.c:73: warning: long int format, different type arg (arg 4)

because time_t is defined as int32_t.  Don't know if they are worth
fixing at this point.

> JF> @@ -39,14 +39,18 @@
> JF>  		if (S_ISDIR(mode)) {
> JF>  			int retval;
>  
> JF> ...
> JF> -			if (!eltbuf || strcmp(elttype, "tree"))
> JF> +			if (!eltbuf || strcmp(elttype, "tree")) {
> JF> +				if (eltbuf) mem_free(eltbuf);
> JF>  				return -1;
> 
> Btw, who is putting this header in your mail?  It does not make
> sense to me unless Jonas is pseudonym for Linus...
> 
>   Mail-Followup-To: Linus Torvalds <torvalds@osdl.org>, git@vger.kernel.org 

Maybe because Mutt knows I am subscribed to this mailing list and assumes
I don't want to have mail addressed directly to my email address?

I just hit 'g'.

-- 
Jonas Fonseca

^ permalink raw reply	[relevance 3%]

* read-only git repositories
  @ 2005-05-05  9:51  3%   ` David Lang
  0 siblings, 0 replies; 200+ results
From: David Lang @ 2005-05-05  9:51 UTC (permalink / raw)
  To: git

given that git already treats everything in the object storage as being 
fixed, it occurred to me that there may be value in making it so that git 
can make use of more than one pool of storage.

possible uses of this would be to have a bunch of data on read-only media 
(say the 3G+ kernel history on a DVD), having a pruned local object store 
with automated fetching from elsewhere if the object isn't found locally, 
or marking the object store that you plan on sharing with the world as 
read-only (with your changed objects going into a secondary store) so that 
you don't pollute it accidentally (this could also cut down on the storage 
requirements)

there are probably other uses; since it seems like a fairly small 
modification (a hook to try if the object isn't found initially), 
I thought I'd mention it to the group.

David Lang

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare

^ permalink raw reply	[relevance 3%]

* Re: Kernel nightly snapshots..
  @ 2005-05-05 14:44  3%     ` Linus Torvalds
  2005-05-05 15:10  0%       ` David Woodhouse
  0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-05-05 14:44 UTC (permalink / raw)
  To: David Woodhouse; +Cc: H. Peter Anvin, Git Mailing List



On Thu, 5 May 2005, David Woodhouse wrote:
> 
> hera /home/dwmw2/git/snapshot-2.6 $ cg-init /pub/scm/linux/kernel/git/torvalds/linux-2.6.git &> ../asd
> hera /home/dwmw2/git/snapshot-2.6 $ cg-tag-ls
> v2.6.11 5dc01c595e6c6ec9ccda4f6f69c131c0dd945f8c
> v2.6.11-tree    5dc01c595e6c6ec9ccda4f6f69c131c0dd945f8c
> v2.6.12-rc2     9e734775f7c22d2f89943ad6c745571f1930105f
> v2.6.12-rc3     0397236d43e48e821cce5bbe6a80a1a56bb7cc3a
> hera /home/dwmw2/git/snapshot-2.6 $ git-cat-file -t 0397236d43e48e821cce5bbe6a80a1a56bb7cc3a
> .git/objects/03/97236d43e48e821cce5bbe6a80a1a56bb7cc3a: No such file or directory

Looks like cg uses git-http-pull instead of rsync, and doesn't download 
anything but the required objects. 

In which case you probably don't have the v2.6.11 tree either, in fact, 
since it's not required to get a working copy of HEAD.

If you fetch the _whole_ object database (with rsync), you should get 
them.

		Linus

^ permalink raw reply	[relevance 3%]

* Re: Kernel nightly snapshots..
  2005-05-05 14:44  3%     ` Linus Torvalds
@ 2005-05-05 15:10  0%       ` David Woodhouse
  0 siblings, 0 replies; 200+ results
From: David Woodhouse @ 2005-05-05 15:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: H. Peter Anvin, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 452 bytes --]

On Thu, 2005-05-05 at 07:44 -0700, Linus Torvalds wrote:
> If you fetch the _whole_ object database (with rsync), you should get 
> them.

OK, I've changed my 'origin' to an rsync URL referring to the same
place, to make sure I get tags correctly in future. 2.6.12-rc3-git1 is
in the process of being built; if the attached script works and
continues working when invoked from cron, we might even see nightly
snapshots again as requested...

-- 
dwmw2

[-- Attachment #2: git-snapshot.sh --]
[-- Type: application/x-shellscript, Size: 1724 bytes --]

^ permalink raw reply	[relevance 0%]

* Make errors
@ 2005-05-08 19:35  3% John Kacur
  0 siblings, 0 replies; 200+ results
From: John Kacur @ 2005-05-08 19:35 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 890 bytes --]

Ok, I'm coming late to the git game, and having some build problems.
My machine is amd64 (athlon64), and I fetched both

git-0.7.tar.bz2
cogito-0.9.tar.bz2

They both fail to build in a similar fashion
For example, I did
bunzip2 -c git-0.7.tar.bz2 | tar xvf -
cd git-0.7
export PATH=$PWD:$PATH
make

and it fails here:
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-convert-cache convert-cache.c libgit.a -lz -lssl
convert-cache.c: In function `write_subdirectory':
convert-cache.c:102: warning: field precision is not type int (arg 4)
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-http-pull http-pull.c libgit.a -lz -lssl -lcurl
http-pull.c:10:23: curl/curl.h: No such file or directory
http-pull.c:11:23: curl/easy.h: No such file or directory
http-pull.c:13: error: parse error before '*' token

Any hints?

Thanks in advance

The entire build output is attached:



[-- Attachment #2: out.txt --]
[-- Type: text/plain, Size: 4381 bytes --]

gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o read-cache.o read-cache.c
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o sha1_file.o sha1_file.c
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o usage.o usage.c
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o object.o object.c
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o commit.o commit.c
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o tree.o tree.c
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o blob.o blob.c
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o tag.o tag.c
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o strbuf.o strbuf.c
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>'   -c -o diff.o diff.c
ar rcs libgit.a read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o tag.o strbuf.o diff.o
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-update-cache update-cache.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-diff-files diff-files.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-init-db init-db.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-write-tree write-tree.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-read-tree read-tree.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-commit-tree commit-tree.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-cat-file cat-file.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-fsck-cache fsck-cache.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-checkout-cache checkout-cache.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-diff-tree diff-tree.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-rev-tree rev-tree.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-show-files show-files.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-check-files check-files.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-ls-tree ls-tree.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-merge-base merge-base.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-merge-cache merge-cache.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-unpack-file unpack-file.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-export export.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-diff-cache diff-cache.c libgit.a -lz -lssl
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-convert-cache convert-cache.c libgit.a -lz -lssl
convert-cache.c: In function `write_subdirectory':
convert-cache.c:102: warning: field precision is not type int (arg 4)
gcc -g -O2 -Wall '-DSHA1_HEADER=<openssl/sha.h>' -o git-http-pull http-pull.c libgit.a -lz -lssl -lcurl
http-pull.c:10:23: curl/curl.h: No such file or directory
http-pull.c:11:23: curl/easy.h: No such file or directory
http-pull.c:13: error: parse error before '*' token
http-pull.c:13: warning: type defaults to `int' in declaration of `curl'
http-pull.c:13: warning: data definition has no type or storage class
http-pull.c: In function `fetch':
http-pull.c:73: warning: implicit declaration of function `curl_easy_setopt'
http-pull.c:73: error: `CURLOPT_FILE' undeclared (first use in this function)
http-pull.c:73: error: (Each undeclared identifier is reported only once
http-pull.c:73: error: for each function it appears in.)
http-pull.c:74: error: `CURLOPT_WRITEFUNCTION' undeclared (first use in this function)
http-pull.c:86: error: `CURLOPT_URL' undeclared (first use in this function)
http-pull.c:90: warning: implicit declaration of function `curl_easy_perform'
http-pull.c: In function `main':
http-pull.c:191: warning: implicit declaration of function `curl_global_init'
http-pull.c:191: error: `CURL_GLOBAL_ALL' undeclared (first use in this function)
http-pull.c:193: warning: implicit declaration of function `curl_easy_init'
http-pull.c:193: warning: assignment makes pointer from integer without a cast
http-pull.c:202: warning: implicit declaration of function `curl_global_cleanup'
make: *** [git-http-pull] Error 1

^ permalink raw reply	[relevance 3%]

* [RFC] Renaming environment variables.
  @ 2005-05-09 20:05  1%                     ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-05-09 20:05 UTC (permalink / raw)
  To: git; +Cc: H. Peter Anvin, Sean, Linus Torvalds

H. Peter Anvin mentioned that using SHA1_whatever as an
environment variable name is not nice and we should instead use
names starting with "GIT_" prefix to avoid conflicts.

Here is a patch, requesting for comments.

 - Renames the following environment variables:

    New name                Old Name

    GIT_AUTHOR_DATE         AUTHOR_DATE
    GIT_AUTHOR_EMAIL        AUTHOR_EMAIL
    GIT_AUTHOR_NAME         AUTHOR_NAME
    GIT_COMMIT_AUTHOR_EMAIL COMMIT_AUTHOR_EMAIL
    GIT_COMMIT_AUTHOR_NAME  COMMIT_AUTHOR_NAME
    GIT_ALTERNATE_OBJECTS   SHA1_FILE_DIRECTORIES
    GIT_OBJECTS             SHA1_FILE_DIRECTORY

 - Changes all users of the environment variable to fetch
   environment variable with the new name.

 - Introduces a compatibility macro, gitenv(), which does a
   getenv() and, if that fails, calls gitenv_bc(), which in turn
   picks up the value from the old name while giving a warning
   about using an old name.

I've also updated the documentation and scripts shipped with
Linus's GIT distribution.

The transition plan is as follows:

 - We will keep the backward compatibility list used by gitenv()
   for now, so the current scripts and user environments
   continue to work as before.  Users will get a warning on
   stderr when they have the old name but not the new name in
   their environment.

 - The Porcelain layers should start using new names.  However,
   just in case it ends up calling an old Plumbing layer
   implementation, they should also export the old names, taking
   values from the corresponding new names, during the
   transition period.

 - After a couple of weeks or so, we would drop the
   compatibility support and drop gitenv().  Revert the callers
   to directly call getenv() but keep using the new names.

   The last part is probably optional and the transition
   duration needs to be set to a reasonable value.
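This excerpt does not include gitenv.c itself, only the gitenv() macro that falls back to gitenv_bc(). A plausible sketch of that fallback, with the name table taken from the rename list above and the warning text invented for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Map a new GIT_* name back to its old name; if the value is only
 * found under the old name, return it but warn on stderr. */
static const char *old_names[][2] = {
	{ "GIT_AUTHOR_DATE",         "AUTHOR_DATE" },
	{ "GIT_AUTHOR_EMAIL",        "AUTHOR_EMAIL" },
	{ "GIT_AUTHOR_NAME",         "AUTHOR_NAME" },
	{ "GIT_COMMIT_AUTHOR_EMAIL", "COMMIT_AUTHOR_EMAIL" },
	{ "GIT_COMMIT_AUTHOR_NAME",  "COMMIT_AUTHOR_NAME" },
	{ "GIT_ALTERNATE_OBJECTS",   "SHA1_FILE_DIRECTORIES" },
	{ "GIT_OBJECTS",             "SHA1_FILE_DIRECTORY" },
};

char *gitenv_bc(const char *e)
{
	int i, n = (int)(sizeof(old_names) / sizeof(old_names[0]));

	for (i = 0; i < n; i++) {
		if (!strcmp(e, old_names[i][0])) {
			char *val = getenv(old_names[i][1]);
			if (val)
				fprintf(stderr,
					"warning: %s is deprecated, use %s\n",
					old_names[i][1], old_names[i][0]);
			return val;
		}
	}
	return NULL;
}
```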

Not-quite-signed-off-yet-by: Junio C Hamano <junkio@cox.net>
------------

Documentation/core-git.txt |   17 +++++-----
Makefile                   |    3 +
README                     |    2 -
cache.h                    |   15 +++++++--
commit-tree.c              |   10 +++---
diff.c                     |   10 +++---
git-prune-script           |    8 +++--
gitenv.c                   |   70 +++++++++++++++++++++++++++++++++++++++++++++
init-db.c                  |    7 ++--
rsh.c                      |    4 +-
sha1_file.c                |    8 ++---
11 files changed, 120 insertions(+), 34 deletions(-)

# - HEAD: Fix git-update-cache --cacheinfo error message.
# + 7: Rename Environment Variables
--- a/Documentation/core-git.txt
+++ b/Documentation/core-git.txt
@@ -210,15 +210,16 @@ Environment Variables
 ---------------------
 Various git commands use the following environment variables:
 
-- 'AUTHOR_NAME'
-- 'AUTHOR_EMAIL'
-- 'AUTHOR_DATE'
-- 'COMMIT_AUTHOR_NAME'
-- 'COMMIT_AUTHOR_EMAIL'
+- 'GIT_AUTHOR_NAME'
+- 'GIT_AUTHOR_EMAIL'
+- 'GIT_AUTHOR_DATE'
+- 'GIT_COMMIT_AUTHOR_NAME'
+- 'GIT_COMMIT_AUTHOR_EMAIL'
 - 'GIT_DIFF_OPTS'
 - 'GIT_EXTERNAL_DIFF'
 - 'GIT_INDEX_FILE'
-- 'SHA1_FILE_DIRECTORY'
+- 'GIT_OBJECTS'
+- 'GIT_ALTERNATE_OBJECTS'
 
 
 NAME
@@ -876,7 +877,7 @@ sha1 mismatch <object>::
 Environment Variables
 ---------------------
 
-SHA1_FILE_DIRECTORY::
+GIT_OBJECTS::
 	used to specify the object database root (usually .git/objects)
 
 GIT_INDEX_FILE::
@@ -918,7 +919,7 @@ DESCRIPTION
 This simply creates an empty git object database - basically a `.git`
 directory and `.git/object/??/` directories.
 
-If the object storage directory is specified via the 'SHA1_FILE_DIRECTORY'
+If the object storage directory is specified via the 'GIT_OBJECTS'
 environment variable then the sha1 directories are created underneath -
 otherwise the default `.git/objects` directory is used.
 
--- a/Makefile
+++ b/Makefile
@@ -46,6 +46,8 @@ LIB_OBJS += strbuf.o
 LIB_H += diff.h
 LIB_OBJS += diff.o
 
+LIB_OBJS += gitenv.o
+
 LIBS = $(LIB_FILE)
 LIBS += -lz
 
@@ -116,6 +118,7 @@ sha1_file.o: $(LIB_H)
 usage.o: $(LIB_H)
 diff.o: $(LIB_H)
 strbuf.o: $(LIB_H)
+gitenv.o: $(LIB_H)
 
 clean:
 	rm -f *.o mozilla-sha1/*.o ppc/*.o $(PROG) $(LIB_FILE)
--- a/README
+++ b/README
@@ -24,7 +24,7 @@ There are two object abstractions: the "
 
 
 
-	The Object Database (SHA1_FILE_DIRECTORY)
+	The Object Database (GIT_OBJECTS)
 
 
 The object database is literally just a content-addressable collection
--- a/cache.h
+++ b/cache.h
@@ -31,6 +31,13 @@
 #endif
 
 /*
+ * Environment variables transition.
+ * We accept older names for now but warn.
+ */
+extern char *gitenv_bc(const char *);
+#define gitenv(e) (getenv(e) ? : gitenv_bc(e))
+
+/*
  * Basic data structures for the directory cache
  *
  * NOTE NOTE NOTE! This is all in the native CPU byte format. It's
@@ -99,16 +106,16 @@ static inline unsigned int create_ce_mod
 struct cache_entry **active_cache;
 unsigned int active_nr, active_alloc, active_cache_changed;
 
-#define DB_ENVIRONMENT "SHA1_FILE_DIRECTORY"
+#define DB_ENVIRONMENT "GIT_OBJECTS"
 #define DEFAULT_DB_ENVIRONMENT ".git/objects"
-#define ALTERNATE_DB_ENVIRONMENT "SHA1_FILE_DIRECTORIES"
+#define ALTERNATE_DB_ENVIRONMENT "GIT_ALTERNATE_OBJECTS"
 
-#define get_object_directory() (getenv(DB_ENVIRONMENT) ? : DEFAULT_DB_ENVIRONMENT)
+#define get_object_directory() (gitenv(DB_ENVIRONMENT) ? : DEFAULT_DB_ENVIRONMENT)
 
 #define INDEX_ENVIRONMENT "GIT_INDEX_FILE"
 #define DEFAULT_INDEX_ENVIRONMENT ".git/index"
 
-#define get_index_file() (getenv(INDEX_ENVIRONMENT) ? : DEFAULT_INDEX_ENVIRONMENT)
+#define get_index_file() (gitenv(INDEX_ENVIRONMENT) ? : DEFAULT_INDEX_ENVIRONMENT)
 
 #define alloc_nr(x) (((x)+16)*3/2)
 
--- a/commit-tree.c
+++ b/commit-tree.c
@@ -146,11 +146,11 @@ int main(int argc, char **argv)
 	datestamp(realdate, sizeof(realdate));
 	strcpy(date, realdate);
 
-	commitgecos = getenv("COMMIT_AUTHOR_NAME") ? : realgecos;
-	commitemail = getenv("COMMIT_AUTHOR_EMAIL") ? : realemail;
-	gecos = getenv("AUTHOR_NAME") ? : realgecos;
-	email = getenv("AUTHOR_EMAIL") ? : realemail;
-	audate = getenv("AUTHOR_DATE");
+	commitgecos = gitenv("GIT_COMMIT_AUTHOR_NAME") ? : realgecos;
+	commitemail = gitenv("GIT_COMMIT_AUTHOR_EMAIL") ? : realemail;
+	gecos = gitenv("GIT_AUTHOR_NAME") ? : realgecos;
+	email = gitenv("GIT_AUTHOR_EMAIL") ? : realemail;
+	audate = gitenv("GIT_AUTHOR_DATE");
 	if (audate)
 		parse_date(audate, date, sizeof(date));
 
--- a/diff.c
+++ b/diff.c
@@ -8,11 +8,11 @@
 #include "cache.h"
 #include "diff.h"
 
-static char *diff_opts = "-pu";
+static const char *diff_opts = "-pu";
 
 static const char *external_diff(void)
 {
-	static char *external_diff_cmd = NULL;
+	static const char *external_diff_cmd = NULL;
 	static int done_preparing = 0;
 
 	if (done_preparing)
@@ -26,11 +26,11 @@ static const char *external_diff(void)
 	 *
 	 * GIT_DIFF_OPTS="-c";
 	 */
-	if (getenv("GIT_EXTERNAL_DIFF"))
-		external_diff_cmd = getenv("GIT_EXTERNAL_DIFF");
+	if (gitenv("GIT_EXTERNAL_DIFF"))
+		external_diff_cmd = gitenv("GIT_EXTERNAL_DIFF");
 
 	/* In case external diff fails... */
-	diff_opts = getenv("GIT_DIFF_OPTS") ? : diff_opts;
+	diff_opts = gitenv("GIT_DIFF_OPTS") ? : diff_opts;
 
 	done_preparing = 1;
 	return external_diff_cmd;
--- a/git-prune-script
+++ b/git-prune-script
@@ -28,9 +28,13 @@ sed -ne '/unreachable /{
     s/unreachable [^ ][^ ]* //
     s|\(..\)|\1/|p
 }' | {
-	case "$SHA1_FILE_DIRECTORY" in
+	for d in "$GIT_OBJECTS" "$SHA1_FILE_DIRECTORY" ''
+	do
+		test "$d" != "" && test -d "$d" && break
+	done
+	case "$d" in
 	'') cd .git/objects/ ;;
-	*) cd "$SHA1_FILE_DIRECTORY" ;;
+	*) cd "$d" ;;
 	esac || exit
 	xargs -r $dryrun rm -f
 }
Created: gitenv.c (mode:100644)
--- /dev/null
+++ b/gitenv.c
@@ -0,0 +1,70 @@
+#include "cache.h"
+
+/*
+ * This array must be sorted by its canonical name, because
+ * we do look-up by binary search.
+ */
+static struct backward_compatible_env {
+	const char *canonical;
+	const char *old;
+} bc_name[] = {
+	{ "GIT_ALTERNATE_OBJECTS", "SHA1_FILE_DIRECTORIES" },
+	{ "GIT_AUTHOR_DATE", "AUTHOR_DATE" },
+	{ "GIT_AUTHOR_EMAIL", "AUTHOR_EMAIL" },
+	{ "GIT_AUTHOR_NAME", "AUTHOR_NAME" }, 
+	{ "GIT_COMMIT_AUTHOR_EMAIL", "COMMIT_AUTHOR_EMAIL" },
+	{ "GIT_COMMIT_AUTHOR_NAME", "COMMIT_AUTHOR_NAME" },
+	{ "GIT_OBJECTS", "SHA1_FILE_DIRECTORY" },
+};
+
+static void warn_old_environment(void)
+{
+	int i;
+	static int warned = 0;
+	if (warned)
+		return;
+
+	warned = 1;
+	fprintf(stderr,
+		"warning: GIT environment variables have been renamed.\n"
+		"warning: Please adjust your scripts and environment.\n");
+	for (i = 0; i < sizeof(bc_name) / sizeof(bc_name[0]); i++) {
+		/* warning is needed only when old name is there and
+		 * new name is not.
+		 */
+		if (!getenv(bc_name[i].canonical) && getenv(bc_name[i].old))
+			fprintf(stderr, "warning: old %s => new %s\n",
+				bc_name[i].old, bc_name[i].canonical);
+	}
+}
+
+char *gitenv_bc(const char *e)
+{
+	int first, last;
+	char *val = getenv(e);
+	if (val)
+		/* inefficient.  caller should use gitenv() not gitenv_bc() */
+		return val;
+
+	first = 0;
+	last = sizeof(bc_name) / sizeof(bc_name[0]);
+	while (last > first) {
+		int next = (last + first) >> 1;
+		int cmp = strcmp(e, bc_name[next].canonical);
+		if (!cmp) {
+			val = getenv(bc_name[next].old);
+			/* If the user has only old name, warn.
+			 * otherwise stay silent.
+			 */
+			if (val)
+				warn_old_environment();
+			return val;
+		}
+		if (cmp < 0) {
+			last = next;
+			continue;
+		}
+		first = next+1;
+	}
+	return NULL;
+}
--- a/init-db.c
+++ b/init-db.c
@@ -5,7 +5,7 @@
  */
 #include "cache.h"
 
-void safe_create_dir(char *dir)
+void safe_create_dir(const char *dir)
 {
 	if (mkdir(dir, 0755) < 0) {
 		if (errno != EEXIST) {
@@ -23,12 +23,13 @@ void safe_create_dir(char *dir)
  */
 int main(int argc, char **argv)
 {
-	char *sha1_dir, *path;
+	const char *sha1_dir;
+	char *path;
 	int len, i;
 
 	safe_create_dir(".git");
 
-	sha1_dir = getenv(DB_ENVIRONMENT);
+	sha1_dir = gitenv(DB_ENVIRONMENT);
 	if (!sha1_dir) {
 		sha1_dir = DEFAULT_DB_ENVIRONMENT;
 		fprintf(stderr, "defaulting to local storage area\n");
--- a/rsh.c
+++ b/rsh.c
@@ -36,8 +36,8 @@ int setup_connection(int *fd_in, int *fd
 	*(path++) = '\0';
 	/* ssh <host> 'cd /<path>; stdio-pull <arg...> <commit-id>' */
 	snprintf(command, COMMAND_SIZE, 
-		 "cd /%s; SHA1_FILE_DIRECTORY=objects %s",
-		 path, remote_prog);
+		 "cd /%s; %s=objects %s",
+		 path, DB_ENVIRONMENT, remote_prog);
 	posn = command + strlen(command);
 	for (i = 0; i < rmt_argc; i++) {
 		*(posn++) = ' ';
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -120,7 +120,7 @@ static void fill_sha1_path(char *pathbuf
  *
  * Also note that this returns the location for creating.  Reading
  * SHA1 file can happen from any alternate directory listed in the
- * SHA1_FILE_DIRECTORIES environment variable if it is not found in
+ * DB_ENVIRONMENT environment variable if it is not found in
  * the primary object database.
  */
 char *sha1_file_name(const unsigned char *sha1)
@@ -128,7 +128,7 @@ char *sha1_file_name(const unsigned char
 	static char *name, *base;
 
 	if (!base) {
-		char *sha1_file_directory = get_object_directory();
+		const char *sha1_file_directory = get_object_directory();
 		int len = strlen(sha1_file_directory);
 		base = xmalloc(len + 60);
 		memcpy(base, sha1_file_directory, len);
@@ -151,7 +151,7 @@ static struct alternate_object_database 
  * alt_odb points at an array of struct alternate_object_database.
  * This array is terminated with an element that has both its base
  * and name set to NULL.  alt_odb[n] comes from n'th non-empty
- * element from colon separated $SHA1_FILE_DIRECTORIES environment
+ * element from colon separated ALTERNATE_DB_ENVIRONMENT environment
  * variable, and its base points at a statically allocated buffer
  * that contains "/the/directory/corresponding/to/.git/objects/...",
  * while its name points just after the slash at the end of
@@ -167,7 +167,7 @@ static void prepare_alt_odb(void)
 	int pass, totlen, i;
 	const char *cp, *last;
 	char *op = 0;
-	const char *alt = getenv(ALTERNATE_DB_ENVIRONMENT) ? : "";
+	const char *alt = gitenv(ALTERNATE_DB_ENVIRONMENT) ? : "";
 
 	/* The first pass counts how large an area to allocate to
 	 * hold the entire alt_odb structure, including array of


^ permalink raw reply	[relevance 1%]

* Re: [PATCH] improved delta support for git
  @ 2005-05-12  4:36  8% ` Junio C Hamano
  2005-05-12 14:27  3%   ` Chris Mason
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2005-05-12  4:36 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

The changes to the sha1_file interface seem to be contained to
read_sha1_file() only, which is a very good sign.  You have
already expressed that you are aware that fsck-cache needs to be
taught about the delta objects, so I trust that is what you will
be tackling next.

I started wondering how the delta chains would affect pull.c,
the engine that decides which files under GIT_OBJECT_DIRECTORY
need to be pulled from the remote side in order to construct the
set of objects needed by the given commit ID, under various
combinations of cut-off criteria given with -c, -t, and -a
options.

It appears to me that changes to the make_sure_we_have_it()
routine along the following lines (completely untested) would
suffice.  Instead of just returning success, we first fetch the
named object from the remote side and read it to see whether it
is really the object we asked for or just a delta; if it is a
delta, we call the routine again on the underlying object that
the delta depends upon.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
# - git-pb: Fixed a leak in read-tree
# + (working tree)
--- a/pull.c
+++ b/pull.c
@@ -32,11 +32,23 @@ static void report_missing(const char *w
 static int make_sure_we_have_it(const char *what, unsigned char *sha1)
 {
 	int status;
+	unsigned long mapsize;
+	void *map, *buf;
+
 	if (has_sha1_file(sha1))
 		return 0;
 	status = fetch(sha1);
 	if (status && what)
 		report_missing(what, sha1);
+
+	map = map_sha1_file(sha1, &mapsize);
+	if (map) {
+		buf = unpack_sha1_file(map, mapsize, type, size);
+		munmap(map, mapsize);
+		if (buf && !strcmp(type, "delta"))
+			status = make_sure_we_have_it(what, buf);
+		free(buf);
+	}
 	return status;
 }
 




* Re: [PATCH] improved delta support for git
  2005-05-12  4:36  8% ` Junio C Hamano
@ 2005-05-12 14:27  3%   ` Chris Mason
       [not found]         ` <2cfc403205051207467755cdf@mail.gmail.com>
  0 siblings, 1 reply; 200+ results
From: Chris Mason @ 2005-05-12 14:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, git

On Thursday 12 May 2005 00:36, Junio C Hamano wrote:
> It appears to me that changes to the make_sure_we_have_it()
> routine along the following lines (completely untested) would
> suffice.  Instead of just returning success, we first fetch the
> named object from the remote side, read it to see if it is
> really the object we have asked, or just a delta, and if it is a
> delta call itself again on the underlying object that delta
> object depends upon.

If we fetch the named object and it is a delta, the delta will either depend 
on an object we already have or an object that we don't have.  If we don't 
have it, the pull should find it while pulling other commits we don't have.

-chris




* Re: [PATCH] improved delta support for git
       [not found]         ` <2cfc403205051207467755cdf@mail.gmail.com>
@ 2005-05-12 14:47  0%       ` Jon Seymour
  2005-05-12 15:18  0%         ` Nicolas Pitre
  0 siblings, 1 reply; 200+ results
From: Jon Seymour @ 2005-05-12 14:47 UTC (permalink / raw)
  To: Git Mailing List

On 5/13/05, Chris Mason <mason@suse.com> wrote:
> On Thursday 12 May 2005 00:36, Junio C Hamano wrote:
> > It appears to me that changes to the make_sure_we_have_it() ...
>
> If we fetch the named object and it is a delta, the delta will either depend
> on an object we already have or an object that we don't have.  If we don't
> have it, the pull should find it while pulling other commits we don't have.
>

Chris,

Doesn't that assume that the object referenced by the delta is
reachable from the commit being pulled? While that may be true in
practice, I don't think it is a logical certainty.

jon.


* Re: [PATCH] improved delta support for git
  2005-05-12 14:47  0%       ` Jon Seymour
@ 2005-05-12 15:18  0%         ` Nicolas Pitre
  2005-05-12 17:16  0%           ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Nicolas Pitre @ 2005-05-12 15:18 UTC (permalink / raw)
  To: jon; +Cc: Git Mailing List

On Fri, 13 May 2005, Jon Seymour wrote:

> On 5/13/05, Chris Mason <mason@suse.com> wrote:
> > On Thursday 12 May 2005 00:36, Junio C Hamano wrote:
> > > It appears to me that changes to the make_sure_we_have_it() ...
> >
> > If we fetch the named object and it is a delta, the delta will either depend
> > on an object we already have or an object that we don't have.  If we don't
> > have it, the pull should find it while pulling other commits we don't have.
> >
> 
> Chris,
> 
> Doesn't that assume that the object referenced by the delta is
> reachable from the commit being pulled? While that may be true in
> practice, I don't think it is a logical certainty.

1) If you happen to already have the referenced object in your local 
   repository then you're done.

2) If not you pull the referenced object from the remote repository, 
   repeat with #1 if it happens to be another delta object.

3) If the remote repository doesn't contain the object referenced by any 
   pulled delta object then that repository is inconsistent just like if 
   a blob object referenced by a tree object was missing.  This 
   therefore should not happen.  git-fsck-cache will flag broken delta 
   links soon.


Nicolas


* Re: [PATCH] improved delta support for git
  2005-05-12 15:18  0%         ` Nicolas Pitre
@ 2005-05-12 17:16  0%           ` Junio C Hamano
  2005-05-13 11:44  0%             ` Chris Mason
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2005-05-12 17:16 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: jon, Git Mailing List

>>>>> "NP" == Nicolas Pitre <nico@cam.org> writes:

>> On 5/13/05, Chris Mason <mason@suse.com> wrote:
>> > On Thursday 12 May 2005 00:36, Junio C Hamano wrote:
>> > > It appears to me that changes to the make_sure_we_have_it() ...
>> >
>> > If we fetch the named object and it is a delta, the delta will either depend
>> > on an object we already have or an object that we don't have.  If we don't
>> > have it, the pull should find it while pulling other commits we don't have.

NP> 1) If you happen to already have the referenced object in your local 
NP>    repository then you're done.

Yes.

NP> 2) If not you pull the referenced object from the remote repository, 
NP>    repeat with #1 if it happens to be another delta object.

Yes, that is the outline of what my (untested) patch does.

Unless I am grossly mistaken, what Chris says is true only when
we are pulling with -a flag to the git-*-pull family.  If we are
pulling "partially near the tip", we do not necessarily pull
"other commits we don't have", hence detecting delta's
requirement at per-object level and pulling the dependent
becomes necessary, which is essentially what you wrote in (2)
above.



* Re: [RFC] Support projects including other projects
  @ 2005-05-12 17:24  2% ` David Lang
  0 siblings, 0 replies; 200+ results
From: David Lang @ 2005-05-12 17:24 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, git, Petr Baudis, Linus Torvalds

I was thinking about this recently while reading an article on how
bittorrent works, and it occurred to me that perhaps the network access
model of git should be reexamined.

git produces a large pool of objects, and there are two ways that people
want to access these objects:

1. pull the current version of a project (either a straight 'checkout' 
type pull or a 'merge' into a local project)

2. pull the objects necessary for past versions of a project (either all 
the way back to the beginning of time or back to some point, that point 
being one of a number of possibilities: a date, a version, things you 
don't have, etc.)

In either case, the key pieces are the indexes related to a particular 
project; the objects themselves could all live in one huge pool for all 
projects that ever existed (this doesn't make sense if you use rsync to 
copy repositories as Linux originally did, but with a more git-aware 
transport it can make sense).
I believe there are going to be quite a number of cases where the same 
object is used by multiple projects (either because one project is a 
fork of another, or because some functions or include files are so 
trivial that they are basically boilerplate and get reused or recreated). 
If you think about a major mirror server distributing a dozen Linux 
distros via git, you will realize that in many cases the source files, 
scripts, and even the binaries are really going to be identical objects 
across all the distros, so an ftp/http server backed by a git 
filesystem could yield pretty significant savings in disk space.

In addition, when you are doing a pull you can accept data from 
non-authoritative sources, since each object (and its index info) 
includes enough information to validate that the object hasn't been 
tampered with (at least until such time as the hashes are sufficiently 
broken, but that's another debate, and we had that one :-). So a 
bittorrent-like peer sharing system that fetches the objects identified 
by the index files would open the potential for saving significant 
bandwidth on the master servers while not compromising the trees at all.

Going back (somewhat) to the subject at hand: with something like this 
you should be able to combine as many projects as you want in one 
repository, and the only issue would be the work necessary to go through 
that repository, and all the index files that point at it, when you want 
to prune old data out of the object pool to save disk space.

Thoughts? Unfortunately I don't have the time to even consider coding 
something like this up, but hopefully it will spark interest in someone 
who does.

David Lang

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare


* Re: Mercurial 0.4e vs git network pull
  @ 2005-05-12 20:14  3%     ` Petr Baudis
  2005-05-12 20:57  0%       ` Matt Mackall
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-05-12 20:14 UTC (permalink / raw)
  To: Matt Mackall; +Cc: linux-kernel, git, mercurial, Linus Torvalds

Dear diary, on Thu, May 12, 2005 at 10:11:16PM CEST, I got a letter
where Matt Mackall <mpm@selenic.com> told me that...
> On Thu, May 12, 2005 at 08:23:41PM +0200, Petr Baudis wrote:
> > Dear diary, on Thu, May 12, 2005 at 11:44:06AM CEST, I got a letter
> > where Matt Mackall <mpm@selenic.com> told me that...
> > > Mercurial is more than 10 times as bandwidth efficient and
> > > considerably more I/O efficient. On the server side, rsync uses about
> > > twice as much CPU time as the Mercurial server and has about 10 times
> > > the I/O and pagecache footprint as well.
> > > 
> > > Mercurial is also much smarter than rsync at determining what
> > > outstanding changesets exist. Here's an empty pull as a demonstration:
> > > 
> > >  $ time hg merge hg://selenic.com/linux-hg/
> > >  retrieving changegroup
> > > 
> > >  real    0m0.363s
> > >  user    0m0.083s
> > >  sys     0m0.007s
> > > 
> > > That's a single http request and a one line response.
> > 
> > So, what about comparing it with something comparable, say git pull over
> > HTTP? :-)
> 
> ..because I get a headache every time I try to figure out how to use git? :-P
> 
> Seriously, have a pointer to how this works?

Either you use cogito and just pass cg-clone an HTTP URL (to the git
repository as in the case of rsync -
http://www.kernel.org/pub/scm/cogito/cogito.git should work), or you
invoke git-http-pull directly (passing it the desired commit ID of the
remote HEAD you want to fetch, and the URL; see
Documentation/git-http-pull.txt).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Mercurial 0.4e vs git network pull
  2005-05-12 20:14  3%     ` Petr Baudis
@ 2005-05-12 20:57  0%       ` Matt Mackall
  0 siblings, 0 replies; 200+ results
From: Matt Mackall @ 2005-05-12 20:57 UTC (permalink / raw)
  To: Petr Baudis; +Cc: linux-kernel, git, mercurial, Linus Torvalds

On Thu, May 12, 2005 at 10:14:06PM +0200, Petr Baudis wrote:
> Dear diary, on Thu, May 12, 2005 at 10:11:16PM CEST, I got a letter
> where Matt Mackall <mpm@selenic.com> told me that...
> > On Thu, May 12, 2005 at 08:23:41PM +0200, Petr Baudis wrote:
> > > Dear diary, on Thu, May 12, 2005 at 11:44:06AM CEST, I got a letter
> > > where Matt Mackall <mpm@selenic.com> told me that...
> > > > Mercurial is more than 10 times as bandwidth efficient and
> > > > considerably more I/O efficient. On the server side, rsync uses about
> > > > twice as much CPU time as the Mercurial server and has about 10 times
> > > > the I/O and pagecache footprint as well.
> > > > 
> > > > Mercurial is also much smarter than rsync at determining what
> > > > outstanding changesets exist. Here's an empty pull as a demonstration:
> > > > 
> > > >  $ time hg merge hg://selenic.com/linux-hg/
> > > >  retrieving changegroup
> > > > 
> > > >  real    0m0.363s
> > > >  user    0m0.083s
> > > >  sys     0m0.007s
> > > > 
> > > > That's a single http request and a one line response.
> > > 
> > > So, what about comparing it with something comparable, say git pull over
> > > HTTP? :-)
> > 
> > ..because I get a headache every time I try to figure out how to use git? :-P
> > 
> > Seriously, have a pointer to how this works?
> 
> Either you use cogito and just pass cg-clone an HTTP URL (to the git
> repository as in the case of rsync -
> http://www.kernel.org/pub/scm/cogito/cogito.git should work), or you
> invoke git-http-pull directly (passing it desired commit ID of the
> remote HEAD you want to fetch, and the URL; see
> Documentation/git-http-pull.txt).

Does this need an HTTP request (and round trip) per object? It appears
to. That's 2200 requests/round trips for my 800 patch benchmark.

How does git find the outstanding changesets?

-- 
Mathematics is the supreme nostalgia of our time.


* [PATCH 0/4] Pulling refs files
@ 2005-05-13  6:49  4% Daniel Barkalow
  2005-05-13  6:56  6% ` [PATCH 2/4] Generic support for pulling refs Daniel Barkalow
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Daniel Barkalow @ 2005-05-13  6:49 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, Linus Torvalds

This series makes the following changes:

 1: Adds support for having the C code know about the general existence of
    .git/refs, and functions for writing these files.
 2: Adds support in the generic pull code for fetching refs (and dummy
    implementations).
 3: Adds support in the HTTP pull code for fetching refs
 4: Adds support in the rsh pull code for fetching refs; this requires
    changes to the protocol. These changes should be sufficient to support
    any future extension, however.

	-Daniel
*This .sig left intentionally blank*



* [PATCH 2/4] Generic support for pulling refs
  2005-05-13  6:49  4% [PATCH 0/4] Pulling refs files Daniel Barkalow
@ 2005-05-13  6:56  6% ` Daniel Barkalow
  2005-05-13  6:57  4% ` [PATCH 3/4] Pull refs by HTTP Daniel Barkalow
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-05-13  6:56 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, Linus Torvalds

This adds a pull method to pull refs, provides dummy implementations for
the existing programs, and uses that method to try to get refs if
requested. It also adds generic support for writing the target to a refs
file.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Index: http-pull.c
===================================================================
--- adc28203a55e7e9d3c0b4f6546ea0c2b99106f24/http-pull.c  (mode:100644 sha1:024457a9895ab10c4ef18aa6e232d12fdaab4da9)
+++ 90e05f81df7b7fd2c39d252b6f9a2374d4dd0cf5/http-pull.c  (mode:100644 sha1:af4e82fdf9c58a15564d40bef85d57e9f6626727)
@@ -98,6 +98,11 @@
 	return 0;
 }
 
+int fetch_ref(char *dir, char *name, unsigned char *sha1)
+{
+	return -1;
+}
+
 int main(int argc, char **argv)
 {
 	char *commit_id;
Index: local-pull.c
===================================================================
--- adc28203a55e7e9d3c0b4f6546ea0c2b99106f24/local-pull.c  (mode:100644 sha1:3a342ab18390d7ce0df1f970a4961b31548a9417)
+++ 90e05f81df7b7fd2c39d252b6f9a2374d4dd0cf5/local-pull.c  (mode:100644 sha1:73aad965d0b190627aa95726e4feaa2623d31d26)
@@ -18,6 +18,11 @@
 
 static char *path;
 
+int fetch_ref(char *dir, char *name, unsigned char *sha1)
+{
+	return -1;
+}
+
 int fetch(unsigned char *sha1)
 {
 	static int object_name_start = -1;
Index: pull.c
===================================================================
--- adc28203a55e7e9d3c0b4f6546ea0c2b99106f24/pull.c  (mode:100644 sha1:0bed44f4cbf6716cfc3152f35626123992766408)
+++ 90e05f81df7b7fd2c39d252b6f9a2374d4dd0cf5/pull.c  (mode:100644 sha1:d4d858cd638e915096a408ba3c37090a1b460c21)
@@ -3,6 +3,12 @@
 #include "cache.h"
 #include "commit.h"
 #include "tree.h"
+#include "tag.h"
+
+#include "refs.h"
+
+char *write_ref_dir = NULL;
+char *write_ref_name = NULL;
 
 int get_tree = 0;
 int get_history = 0;
@@ -98,16 +104,52 @@
 	return 0;
 }
 
+static int process_tag(unsigned char *sha1)
+{
+	return 0;
+}
+
+static int process_unknown(unsigned char *sha1)
+{
+	struct object *obj;
+	if (make_sure_we_have_it(NULL, sha1))
+		return -1;
+	obj = parse_object(sha1);
+	if (obj->type == commit_type) {
+		memcpy(current_commit_sha1, sha1, 20);
+		return process_commit(sha1);
+	} else if (obj->type == tag_type)
+		return process_tag(sha1);
+	return error("Cannot pull a %s object", obj->type);
+}
+
+static int interpret_target(char *target, unsigned char *sha1)
+{
+	char *dir, *name;
+	if (!get_sha1_hex(target, sha1))
+		return 0;
+	if (!split_ref(&dir, &name, target)) {
+		if (!fetch_ref(dir, name, sha1)) {
+			return 0;
+		}
+	}
+	return -1;
+}
+
 int pull(char *target)
 {
 	int retval;
 	unsigned char sha1[20];
-	retval = get_sha1_hex(target, sha1);
-	if (retval)
-		return retval;
-	retval = make_sure_we_have_it(commitS, sha1);
+	retval = interpret_target(target, sha1);
+	if (retval) {
+		return error("Could not interpret %s as something to pull",
+			     target);
+	}
+	retval = process_unknown(sha1);
 	if (retval)
 		return retval;
-	memcpy(current_commit_sha1, sha1, 20);
-	return process_commit(sha1);
+
+	if (write_ref_dir && write_ref_name)
+		write_split_ref_sha1(write_ref_dir, write_ref_name, sha1);
+	return 0;
 }
Index: pull.h
===================================================================
--- adc28203a55e7e9d3c0b4f6546ea0c2b99106f24/pull.h  (mode:100644 sha1:d2dca02de7c23426e84e9f63762df9428933e8d8)
+++ 90e05f81df7b7fd2c39d252b6f9a2374d4dd0cf5/pull.h  (mode:100644 sha1:de0e9245b68856bcf84c033650b6b3eb151641e2)
@@ -4,6 +4,12 @@
 /** To be provided by the particular implementation. **/
 extern int fetch(unsigned char *sha1);
 
+extern int fetch_ref(char *dir, char *name, unsigned char *sha1);
+
+/** Ref filename to write target to. **/
+extern char *write_ref_dir;
+extern char *write_ref_name;
+
 /** Set to fetch the target tree. */
 extern int get_tree;
 
Index: rpull.c
===================================================================
--- adc28203a55e7e9d3c0b4f6546ea0c2b99106f24/rpull.c  (mode:100644 sha1:b48e63157c66c160b9751603a92831f77106044c)
+++ 90e05f81df7b7fd2c39d252b6f9a2374d4dd0cf5/rpull.c  (mode:100644 sha1:493fcdae670ebb1d93b8c75d3e28798e060d7537)
@@ -22,6 +22,11 @@
 	return ret;
 }
 
+int fetch_ref(char *dir, char *name, unsigned char *sha1)
+{
+	return -1;
+}
+
 int main(int argc, char **argv)
 {
 	char *commit_id;



* [PATCH 3/4] Pull refs by HTTP
  2005-05-13  6:49  4% [PATCH 0/4] Pulling refs files Daniel Barkalow
  2005-05-13  6:56  6% ` [PATCH 2/4] Generic support for pulling refs Daniel Barkalow
@ 2005-05-13  6:57  4% ` Daniel Barkalow
  2005-05-13  7:01  3% ` [PATCH 4/4] Pulling refs by ssh Daniel Barkalow
  2005-05-13 22:19  0% ` [PATCH 0/4] Pulling refs files Petr Baudis
  3 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-05-13  6:57 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, Linus Torvalds

Adds support for pulling refs by HTTP, and an option for writing the
pulled ref to a file.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Index: http-pull.c
===================================================================
--- 90e05f81df7b7fd2c39d252b6f9a2374d4dd0cf5/http-pull.c  (mode:100644 sha1:af4e82fdf9c58a15564d40bef85d57e9f6626727)
+++ 4931f2d8b9c2ab83718f6446d5ef3af5fa320b3f/http-pull.c  (mode:100644 sha1:6e8dc48ddd0ea1ae89074f6ae0d89c54303895b7)
@@ -7,6 +7,8 @@
 #include <errno.h>
 #include <stdio.h>
 
+#include "refs.h"
+
 #include "pull.h"
 
 #include <curl/curl.h>
@@ -45,6 +47,23 @@
 	return size;
 }
 
+struct buffer
+{
+	size_t posn;
+	size_t size;
+	void *buffer;
+};
+
+static size_t fwrite_buffer(void *ptr, size_t eltsize, size_t nmemb,
+			    struct buffer *buffer) {
+	size_t size = eltsize * nmemb;
+	if (size > buffer->size - buffer->posn)
+		size = buffer->size - buffer->posn;
+	memcpy(buffer->buffer + buffer->posn, ptr, size);
+	buffer->posn += size;
+	return size;
+}
+
 int fetch(unsigned char *sha1)
 {
 	char *hex = sha1_to_hex(sha1);
@@ -93,14 +112,42 @@
 		unlink(filename);
 		return error("File %s has bad hash\n", hex);
 	}
-	
 	pull_say("got %s\n", hex);
 	return 0;
 }
 
 int fetch_ref(char *dir, char *name, unsigned char *sha1)
 {
-	return -1;
+	char *url, *posn;
+	char hex[42];
+	struct buffer buffer;
+	buffer.size = 41;
+	buffer.posn = 0;
+	buffer.buffer = hex;
+	hex[41] = '\0';
+	
+	curl_easy_setopt(curl, CURLOPT_FILE, &buffer);
+	curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite_buffer);
+
+	url = xmalloc(strlen(base) + 7 + strlen(dir) + strlen(name));
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "refs/");
+	posn += 5;
+	strcpy(posn, dir);
+	posn += strlen(dir);
+	*(posn++) = '/';
+	strcpy(posn, name);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	if (curl_easy_perform(curl))
+		return error("Couldn't get %s for %s/%s\n", url,
+			     dir, name);
+
+	hex[40] = '\0';
+	get_sha1_hex(hex, sha1);
+	return 0;
 }
 
 int main(int argc, char **argv)
@@ -120,6 +167,10 @@
 			get_history = 1;
 		} else if (argv[arg][1] == 'v') {
 			get_verbosely = 1;
+		} else if (argv[arg][1] == 'w') {
+			char *write_ref = argv[arg + 1];
+			split_ref(&write_ref_dir, &write_ref_name, write_ref);
+			arg++;
 		}
 		arg++;
 	}



* [PATCH 4/4] Pulling refs by ssh
  2005-05-13  6:49  4% [PATCH 0/4] Pulling refs files Daniel Barkalow
  2005-05-13  6:56  6% ` [PATCH 2/4] Generic support for pulling refs Daniel Barkalow
  2005-05-13  6:57  4% ` [PATCH 3/4] Pull refs by HTTP Daniel Barkalow
@ 2005-05-13  7:01  3% ` Daniel Barkalow
  2005-05-13 22:19  0% ` [PATCH 0/4] Pulling refs files Petr Baudis
  3 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-05-13  7:01 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, Linus Torvalds

Adds support for pulling refs by rsh.

This changes the rsh protocol to allow requests for different things, and
to allow the server to report that it doesn't have something without
breaking the connection.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Index: rpull.c
===================================================================
--- 4931f2d8b9c2ab83718f6446d5ef3af5fa320b3f/rpull.c  (mode:100644 sha1:493fcdae670ebb1d93b8c75d3e28798e060d7537)
+++ a219d8e31f3882aaa32e7dbac7a1f92b35a9dbff/rpull.c  (mode:100644 sha1:cce9e71becc95f728d320ef49e11a647a420b75d)
@@ -8,6 +8,7 @@
 #include <stdio.h>
 #include "rsh.h"
 #include "pull.h"
+#include "refs.h"
 
 static int fd_in;
 static int fd_out;
@@ -15,16 +16,34 @@
 int fetch(unsigned char *sha1)
 {
 	int ret;
+	signed char remote;
+	char type = 'o';
+	if (has_sha1_file(sha1))
+		return 0;
+	write(fd_out, &type, 1);
 	write(fd_out, sha1, 20);
+	if (read(fd_in, &remote, 1) < 1)
+		return -1;
+	if (remote < 0)
+		return remote;
 	ret = write_sha1_from_fd(sha1, fd_in);
 	if (!ret)
 		pull_say("got %s\n", sha1_to_hex(sha1));
 	return ret;
 }
 
-int fetch_ref(char *dir, char *name, unsigned char *sha1)
+int fetch_ref(char *name, char *dir, unsigned char *sha1)
 {
-	return -1;
+	signed char remote;
+	char type = 'r';
+	write(fd_out, &type, 1);
+	write(fd_out, name, strlen(name) + 1);
+	write(fd_out, dir, strlen(dir) + 1);
+	read(fd_in, &remote, 1);
+	if (remote < 0)
+		return remote;
+	read(fd_in, sha1, 20);
+	return 0;
 }
 
 int main(int argc, char **argv)
@@ -44,6 +63,10 @@
 			get_history = 1;
 		} else if (argv[arg][1] == 'v') {
 			get_verbosely = 1;
+		} else if (argv[arg][1] == 'w') {
+			char *write_ref = argv[arg + 1];
+			split_ref(&write_ref_dir, &write_ref_name, write_ref);
+			arg++;
 		}
 		arg++;
 	}
Index: rpush.c
===================================================================
--- 4931f2d8b9c2ab83718f6446d5ef3af5fa320b3f/rpush.c  (mode:100644 sha1:26518846704ecf63ad00390599b251aa1b32713e)
+++ a219d8e31f3882aaa32e7dbac7a1f92b35a9dbff/rpush.c  (mode:100644 sha1:c3cad4eac186307e743eacb913a0b382a455d1f4)
@@ -2,47 +2,98 @@
 #include "rsh.h"
 #include <sys/socket.h>
 #include <errno.h>
+#include "refs.h"
 
-void service(int fd_in, int fd_out) {
+int serve_object(int fd_in, int fd_out) {
 	ssize_t size;
-	int posn;
+	int posn = 0;
 	char sha1[20];
 	unsigned long objsize;
 	void *buf;
+	signed char remote;
 	do {
-		posn = 0;
-		do {
-			size = read(fd_in, sha1 + posn, 20 - posn);
-			if (size < 0) {
-				perror("rpush: read ");
-				return;
+		size = read(fd_in, sha1 + posn, 20 - posn);
+		if (size < 0) {
+			perror("rpush: read ");
+			return -1;
+		}
+		if (!size)
+			return -1;
+		posn += size;
+	} while (posn < 20);
+	
+	/* fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1)); */
+	remote = 0;
+	
+	buf = map_sha1_file(sha1, &objsize);
+	
+	if (!buf) {
+		fprintf(stderr, "rpush: could not find %s\n", 
+			sha1_to_hex(sha1));
+		remote = -1;
+	}
+	
+	write(fd_out, &remote, 1);
+	
+	if (remote < 0)
+		return 0;
+	
+	posn = 0;
+	do {
+		size = write(fd_out, buf + posn, objsize - posn);
+		if (size <= 0) {
+			if (!size) {
+				fprintf(stderr, "rpush: write closed");
+			} else {
+				perror("rpush: write ");
 			}
-			if (!size)
-				return;
-			posn += size;
-		} while (posn < 20);
+			return -1;
+		}
+		posn += size;
+	} while (posn < objsize);
+	return 0;
+}
 
-		/* fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1)); */
+int serve_ref(int fd_in, int fd_out)
+{
+	char dir[PATH_MAX], name[PATH_MAX];
+	unsigned char sha1[20];
+	int posn = 0;
+	signed char remote = 0;
+	do {
+		if (read(fd_in, dir + posn, 1) < 1)
+			return -1;
+		posn++;
+	} while (dir[posn - 1]);
+	posn = 0;
+	do {
+		if (read(fd_in, name + posn, 1) < 1)
+			return -1;
+		posn++;
+	} while (name[posn - 1]);
+	if (get_split_ref_sha1(dir, name, sha1))
+		remote = -1;
+	write(fd_out, &remote, 1);
+	if (remote)
+		return 0;
+	write(fd_out, sha1, 20);
+	return 0;
+}
 
-		buf = map_sha1_file(sha1, &objsize);
-		if (!buf) {
-			fprintf(stderr, "rpush: could not find %s\n", 
-				sha1_to_hex(sha1));
+void service(int fd_in, int fd_out) {
+	char type;
+	int retval;
+	do {
+		retval = read(fd_in, &type, 1);
+		if (retval < 1) {
+			if (retval < 0)
+				perror("rpush: read ");
 			return;
 		}
-		posn = 0;
-		do {
-			size = write(fd_out, buf + posn, objsize - posn);
-			if (size <= 0) {
-				if (!size) {
-					fprintf(stderr, "rpush: write closed");
-				} else {
-					perror("rpush: write ");
-				}
-				return;
-			}
-			posn += size;
-		} while (posn < objsize);
+		if (type == 'o' && serve_object(fd_in, fd_out))
+			return;
+		if (type == 'r' && serve_ref(fd_in, fd_out))
+			return;
 	} while (1);
 }
 
@@ -53,6 +104,8 @@
         char *url;
 	int fd_in, fd_out;
 	while (arg < argc && argv[arg][0] == '-') {
+		if (argv[arg][1] == 'w')
+			arg++;
                 arg++;
         }
         if (argc < arg + 2) {
Index: rsh.c
===================================================================
--- 4931f2d8b9c2ab83718f6446d5ef3af5fa320b3f/rsh.c  (mode:100644 sha1:5d1cb9d578a8e679fc190a9d7d2c842ad811223f)
+++ a219d8e31f3882aaa32e7dbac7a1f92b35a9dbff/rsh.c  (mode:100644 sha1:192d8f67e9a5e2bf7bb9e14c8c037dff49e74d57)
@@ -36,8 +36,8 @@
 	*(path++) = '\0';
 	/* ssh <host> 'cd /<path>; stdio-pull <arg...> <commit-id>' */
 	snprintf(command, COMMAND_SIZE, 
-		 "cd /%s; %s=objects %s",
-		 path, DB_ENVIRONMENT, remote_prog);
+		 "cd /%s; GIT_DIR=. %s",
+		 path, remote_prog);
 	posn = command + strlen(command);
 	for (i = 0; i < rmt_argc; i++) {
 		*(posn++) = ' ';



* Re: [PATCH] improved delta support for git
  2005-05-12 17:16  0%           ` Junio C Hamano
@ 2005-05-13 11:44  0%             ` Chris Mason
  0 siblings, 0 replies; 200+ results
From: Chris Mason @ 2005-05-13 11:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, jon, Git Mailing List

On Thursday 12 May 2005 13:16, Junio C Hamano wrote:
> >>>>> "NP" == Nicolas Pitre <nico@cam.org> writes:
> >>
> >> On 5/13/05, Chris Mason <mason@suse.com> wrote:
> >> > On Thursday 12 May 2005 00:36, Junio C Hamano wrote:
> >> > > It appears to me that changes to the make_sure_we_have_it() ...
> >> >
> >> > If we fetch the named object and it is a delta, the delta will either
> >> > depend on an object we already have or an object that we don't have. 
> >> > If we don't have it, the pull should find it while pulling other
> >> > commits we don't have.
>
> NP> 1) If you happen to already have the referenced object in your local
> NP>    repository then you're done.
>
> Yes.
>
> NP> 2) If not you pull the referenced object from the remote repository,
> NP>    repeat with #1 if it happens to be another delta object.
>
> Yes, that is the outline of what my (untested) patch does.
>
> Unless I am grossly mistaken, what Chris says is true only when
> we are pulling with -a flag to the git-*-pull family.  If we are
> pulling "partially near the tip", we do not necessarily pull
> "other commits we don't have", hence detecting delta's
> requirement at per-object level and pulling the dependent
> becomes necessary, which is essentially what you wrote in (2)
> above.
>
Yes, my post does assume that you're pulling everything and that the repo
you're pulling from is in a sane state.  This should be the common case
though, so I would suggest optimizing for it: build a list of the delta
objects and check it at the end to catch any dependencies we didn't pull.

We want the list of delta objects regardless; that way we can warn the user
that they have pulled in deltas and give them the chance to convert them
into full files.
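That bookkeeping can be sketched like this (purely illustrative, with invented names and integer ids standing in for sha1s; not code from the patch):

```c
/* Hypothetical sketch of "check the delta list at the end": record each
 * delta's base id during the pull, then verify every base arrived. */
#define MAX_OBJS 16

static int pulled[MAX_OBJS], npulled;
static int bases[MAX_OBJS], nbases;

static void note_pulled(int id) { pulled[npulled++] = id; }
static void note_delta_base(int id) { bases[nbases++] = id; }

/* Return the number of delta bases that were never pulled. */
static int missing_bases(void)
{
	int i, j, missing = 0;

	for (i = 0; i < nbases; i++) {
		int found = 0;
		for (j = 0; j < npulled; j++)
			if (pulled[j] == bases[i])
				found = 1;
		if (!found)
			missing++;
	}
	return missing;
}
```

A nonzero result at the end of the pull is exactly the "warn the user" case above.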

-chris


* Re: [PATCH 0/4] Pulling refs files
  2005-05-13  6:49  4% [PATCH 0/4] Pulling refs files Daniel Barkalow
                   ` (2 preceding siblings ...)
  2005-05-13  7:01  3% ` [PATCH 4/4] Pulling refs by ssh Daniel Barkalow
@ 2005-05-13 22:19  0% ` Petr Baudis
  2005-05-13 23:14  3%   ` Daniel Barkalow
  3 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-05-13 22:19 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Linus Torvalds

Dear diary, on Fri, May 13, 2005 at 08:49:25AM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> This series makes the following changes:
> 
>  1: Adds support for having the C code know about the general existence of
>     .git/refs, and functions for writing these files.
>  2: Adds support in the generic pull code for fetching refs (and dummy
>     implementations).
>  3: Adds support in the HTTP pull code for fetching refs
>  4: Adds support in the rsh pull code for fetching refs; this requires
>     changes to the protocol. These changes should be sufficient to support
>     any future extension, however.

Hmm, I honestly expected something different - a generic way to
specify any file in the repository to be pulled along, instead of
introducing refs awareness at this level of git. What would be the
advantages of that approach over just specifying a list of other
files to pull along?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [PATCH 0/4] Pulling refs files
  2005-05-13 22:19  0% ` [PATCH 0/4] Pulling refs files Petr Baudis
@ 2005-05-13 23:14  3%   ` Daniel Barkalow
  2005-05-13 23:37  0%     ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-05-13 23:14 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, Linus Torvalds

On Sat, 14 May 2005, Petr Baudis wrote:

> Hmm, I honestly expected something different - a generic way to
> specify any file in the repository to be pulled along, instead of
> introducing refs awareness at this level of git. What would be the
> advantages of that approach over just specifying a list of other
> files to pull along?

The point is to specify the commit to pull by fetching a file from the
other side, not just to move a file. So you need to be specifying that the
file is a hex encoding of the sha1 hash of the starting point of the pull,
and the refs/ area is where these are expected to be. (Note that it still
doesn't have any knowledge about the meanings of files in refs/; you tell
it which one you want to use, and optionally which one you want to write
to, and it will use the names you provide).

It wouldn't help much to download the head file if you had to know the
contents of that file already in order to do everything as a single
transfer.
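In other words, a ref file holds nothing but the 40-character hex encoding of a commit's sha1, which is why fetching it is enough to start the pull. A minimal sketch of parsing such a file's contents, mimicking what git's get_sha1_hex() does (hypothetical helper names, not the actual implementation):

```c
#include <string.h>

/* Illustrative reconstruction, not git code. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/* Parse 40 hex digits into 20 binary bytes; nonzero on malformed input. */
static int parse_ref_contents(const char *hex, unsigned char *sha1)
{
	int i;

	for (i = 0; i < 20; i++) {
		int hi = hexval(hex[2 * i]);
		int lo = hexval(hex[2 * i + 1]);
		if (hi < 0 || lo < 0)
			return -1;
		sha1[i] = (hi << 4) | lo;
	}
	return 0;
}
```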

	-Daniel
*This .sig left intentionally blank*



* Re: [PATCH 0/4] Pulling refs files
  2005-05-13 23:14  3%   ` Daniel Barkalow
@ 2005-05-13 23:37  0%     ` Petr Baudis
  2005-05-15  3:23  3%       ` Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-05-13 23:37 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Linus Torvalds

Dear diary, on Sat, May 14, 2005 at 01:14:22AM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> On Sat, 14 May 2005, Petr Baudis wrote:
> 
> > Hmm, I honestly expected something different - a generic way to
> > specify any file in the repository to be pulled along, instead of
> > introducing refs awareness at this level of git. What would be the
> > advantages of that approach over just specifying a list of other
> > files to pull along?
> 
> The point is to specify the commit to pull by fetching a file from the
> other side, not just to move a file. So you need to be specifying that the
> file is a hex encoding of the sha1 hash of the starting point of the pull,
> and the refs/ area is where these are expected to be. (Note that it still
> doesn't have any knowledge about the meanings of files in refs/; you tell
> it which one you want to use, and optionally which one you want to write
> to, and it will use the names you provide).
> 
> It wouldn't help much to download the head file if you had to know the
> contents of that file already in order to do everything as a single
> transfer.

So what about just something like

	git-wormhole-pull remote:refs/head/master wormhole://localhost/

That is, you could just specify remote:path_relative_to_url instead of
a SHA1 id as the commit.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [PATCH 0/4] Pulling refs files
  2005-05-13 23:37  0%     ` Petr Baudis
@ 2005-05-15  3:23  3%       ` Daniel Barkalow
  2005-05-17 20:14  0%         ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-05-15  3:23 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, Linus Torvalds

On Sat, 14 May 2005, Petr Baudis wrote:

> So what about just something like
> 
> 	git-wormhole-pull remote:refs/head/master wormhole://localhost/
> 
> That is, you could just specify remote:path_relative_to_url instead of
> SHA1 id as the commit.

Do you have any sensible alternatives to "remote:refs/<something>" in
mind? I suppose that "remote:HEAD" would also work. How are you thinking
of having the value get written locally?

Do you also have some idea for user-invoked rpush? It has to call
something that writes the value on the other side (and I'd ideally like it
to do the update atomically and locked against other clients). This series
uses the same mechanism to write it that it uses to write hashes fetched
from remote machines.

	-Daniel
*This .sig left intentionally blank*



* [PATCH] Add --author and --committer match to rev-list and rev-tree.
  @ 2005-05-15  4:57  2% ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-05-15  4:57 UTC (permalink / raw)
  To: pasky, torvalds; +Cc: git

Zack Brown wondered if handling author match at the core GIT level
would make cg-log -u go faster (JIT also can use this in jit-log
--author).  Later Petr Baudis wanted to have a similar --committer
match.

This version is improved from the one I posted to GIT list
earlier in that:

 (1) I bit the bullet and added author and committer names to
     the commit object, so there is no double unpacking anymore.
     The strings are shared across multiple commits so consider
     them intern'ed and do not free them.

 (2) Determination of if author and committer names are
     "interesting" is done only once per name, not per commit.
     This version uses simple "substring" logic, but it is
     easily extendable for more interesting match such as
     regexps.

This also fixes the documentation of git-rev-list, which did not
describe its existing options.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** The previous round that has not been merged to git-pb should
*** be discarded.  This is a patch against the tip of git-pb as
*** of this writing.

Documentation/git-rev-list.txt    |   19 +++++
Documentation/git-rev-tree.txt    |    8 +-
commit.c                          |  135 +++++++++++++++++++++++++++++++++-----
commit.h                          |   18 ++++-
rev-list.c                        |   18 ++++-
rev-tree.c                        |   27 +++++++
t/t6010-rev-tree-author-commit.sh |   98 +++++++++++++++++++++++++++
t/t6110-rev-list-author-commit.sh |   97 +++++++++++++++++++++++++++
8 files changed, 396 insertions(+), 24 deletions(-)
t/t6010-rev-tree-author-commit.sh (. --> 100755)
t/t6110-rev-list-author-commit.sh (. --> 100755)

--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -9,7 +9,7 @@
 
 SYNOPSIS
 --------
-'git-rev-list' <commit>
+'git-rev-list' [--author=author] [--committer=committer] [--max-count=count] [--max-age=unixtime] [--min-age=unixtime] <commit>
 
 DESCRIPTION
 -----------
@@ -18,6 +18,23 @@
 useful to produce human-readable log output.
 
 
+OPTIONS
+-------
+--author::
+	Limit the final output to commits written by the author.
+
+--committer::
+	Limit the final output to commits committed by the committer.
+
+--max-count::
+	Stop after showing the specified number of commits.
+
+--max-age::
+	Stop before showing the commits older than specified time.
+
+--min-age::
+	Do not show the commits newer than specified time.
+
 Author
 ------
 Written by Linus Torvalds <torvalds@osdl.org>
--- a/Documentation/git-rev-tree.txt
+++ b/Documentation/git-rev-tree.txt
@@ -9,7 +9,7 @@
 
 SYNOPSIS
 --------
-'git-rev-tree' [--edges] [--cache <cache-file>] [^]<commit> [[^]<commit>]
+'git-rev-tree' [--author author] [--committer committer] [--edges] [--cache <cache-file>] [^]<commit> [[^]<commit>]
 
 DESCRIPTION
 -----------
@@ -17,6 +17,12 @@
 
 OPTIONS
 -------
+--author::
+	Limit the final output to commits written by the author.
+
+--committer::
+	Limit the final output to commits committed by the committer.
+
 --edges::
 	Show edges (ie places where the marking changes between parent
 	and child)
--- a/commit.c
+++ b/commit.c
@@ -23,22 +23,111 @@
 	return (struct commit *) obj;
 }
 
-static unsigned long parse_commit_date(const char *buf)
-{
-	unsigned long date;
+static struct person_name **person_name;
+static int person_nr;
+static int person_alloc;
+
+static const char *interesting_author = NULL;
+static const char *interesting_committer = NULL;
+
+void commit_author_committer_match_initialize(const char *author,
+					      const char *committer)
+{
+	interesting_author = author;
+	interesting_committer = committer;
+}
+
+static int person_name_pos(const char *name)
+{
+	int first, last;
+
+	first = 0;
+	last = person_nr;
+	while (last > first) {
+		int next = (last + first) >> 1;
+		struct person_name *pn = person_name[next];
+		int cmp = strcmp(name, pn->name);
+		if (!cmp)
+			return next;
+		if (cmp < 0) {
+			last = next;
+			continue;
+		}
+		first = next+1;
+	}
+	return -first-1;
+}
 
-	if (memcmp(buf, "author", 6))
-		return 0;
-	while (*buf++ != '\n')
-		/* nada */;
-	if (memcmp(buf, "committer", 9))
-		return 0;
-	while (*buf++ != '>')
-		/* nada */;
-	date = strtoul(buf, NULL, 10);
-	if (date == ULONG_MAX)
-		date = 0;
-	return date;
+#define INTERESTING_AUTHOR    01
+#define INTERESTING_COMMITTER 02
+
+static struct person_name *person_name_lookup(const char *name)
+{
+	int pos = person_name_pos(name);
+	struct person_name *pn; 
+	if (0 <= pos)
+		return person_name[pos];
+	pos = -pos-1;
+	pn = xmalloc(sizeof(*pn) + strlen(name) + 1);
+	strcpy(pn->name, name);
+	pn->mark = 0;
+
+	/*
+	 * When we decide we want to go fancier, strstr() below
+	 * can be replaced with something like regexp match to
+	 * pick up more than one author or committer.  For now
+	 * let's try simple substring find and see what happens.
+	 * Note that not specifying anybody means we are interested
+	 * in everybody.
+	 */
+	if (!interesting_author || strstr(name, interesting_author))
+		pn->mark |= INTERESTING_AUTHOR;
+	if (!interesting_committer || strstr(name, interesting_committer))
+		pn->mark |= INTERESTING_COMMITTER;
+
+	if (person_nr == person_alloc) {
+		person_alloc = alloc_nr(person_alloc);
+		person_name = xrealloc(person_name, person_alloc *
+				       sizeof(struct person_name *));
+	}
+	person_nr++;
+	if (pos < person_nr)
+		memmove(person_name + pos + 1, person_name + pos,
+			(person_nr - pos - 1) * sizeof(pn));
+	person_name[pos] = pn;
+	return pn;
+}
+
+static void *parse_commit_nametime(char *ptr, const char *ep,
+				   const char *field,
+				   unsigned long *date,
+				   struct person_name **name)
+{
+	unsigned long d;
+	char *cp;
+	int fldlen = strlen(field);
+	if (ptr == NULL || memcmp(ptr, field, fldlen) || ptr[fldlen] != ' ') {
+		*date = 0;
+		*name = NULL;
+		return NULL; /* malformed commit */
+	}
+	for (cp = ptr + fldlen + 1; cp < ep && *cp != '>'; cp++)
+		; /* skip */
+	if (*cp != '>' || *++cp != ' ') {
+		*date = 0;
+		*name = NULL;
+		return NULL; /* malformed commit */
+	}
+	*cp = 0;
+	*name = person_name_lookup(ptr + fldlen + 1);
+	*cp++ = ' ';
+	d = strtoul(cp, NULL, 10);
+	if (d == ULONG_MAX)
+		d = 0;
+	*date = d;
+	while (cp < ep && *cp++ != '\n')
+		; /* skip */
+	return cp;
 }
 
 int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
@@ -63,7 +152,13 @@
 		}
 		bufptr += 48;
 	}
-	item->date = parse_commit_date(bufptr);
+	
+	bufptr = parse_commit_nametime(bufptr, (char *)buffer + size,
+				       "author", &item->author_date,
+				       &item->author);
+	parse_commit_nametime(bufptr, (char *)buffer + size,
+			      "committer", &item->date,
+			      &item->committer);
 	return 0;
 }
 
@@ -152,3 +247,11 @@
 	}
 	return ret;
 }
+
+int commit_author_committer_match(struct commit *item)
+{
+	return ( ((item->author != NULL) &&
+		  (item->author->mark & INTERESTING_AUTHOR)) &&
+		 ((item->committer != NULL) &&
+		  (item->committer->mark & INTERESTING_COMMITTER)) );
+}
--- a/commit.h
+++ b/commit.h
@@ -11,8 +11,9 @@
 
 struct commit {
 	struct object object;
-	unsigned long date;
+	unsigned long author_date, date;
 	struct commit_list *parents;
+	struct person_name *author, *committer; 
 	struct tree *tree;
 };
 
@@ -36,4 +37,19 @@
 struct commit *pop_most_recent_commit(struct commit_list **list, 
 				      unsigned int mark);
 
+struct person_name {
+	char mark;
+	char name[0];
+};
+
+/* This function must be called before fetching any commit object
+ * if commit-author-committer-match function is to be used to filter
+ * commits for output.  Passing NULL is permitted, which makes all
+ * authors (or committers) "interesting".
+ */
+void commit_author_committer_match_initialize(const char *, const char *);
+
+/* Returns true only if author and committer are "interesting". */
+int commit_author_committer_match(struct commit *);
+
 #endif /* COMMIT_H */
--- a/rev-list.c
+++ b/rev-list.c
@@ -11,6 +11,8 @@
 	unsigned long max_age = -1;
 	unsigned long min_age = -1;
 	int max_count = -1;
+	const char *author = NULL;
+	const char *committer = NULL;
 
 	for (i = 1 ; i < argc; i++) {
 		char *arg = argv[i];
@@ -21,6 +23,10 @@
 			max_age = atoi(arg + 10);
 		} else if (!strncmp(arg, "--min-age=", 10)) {
 			min_age = atoi(arg + 10);
+		} else if (!strncmp(arg, "--author=", 9)) {
+			author = arg + 9;
+		} else if (!strncmp(arg, "--committer=", 12)) {
+			committer = arg + 12;
 		} else {
 			commit_arg = arg;
 		}
@@ -28,9 +34,13 @@
 
 	if (!commit_arg || get_sha1(commit_arg, sha1))
 		usage("usage: rev-list [OPTION] commit-id\n"
-		      "  --max-count=nr\n"
-		      "  --max-age=epoch\n"
-		      "  --min-age=epoch\n");
+		      "  --author=author\n"
+		      "  --committer=committer\n"
+		      "  --max-count=number\n"
+		      "  --max-age=unixtime\n"
+		      "  --min-age=unixtime\n");
+
+	commit_author_committer_match_initialize(author, committer);
 
 	commit = lookup_commit(sha1);
 	if (!commit || parse_commit(commit) < 0)
@@ -44,6 +54,8 @@
 			continue;
 		if (max_age != -1 && (commit->date < max_age))
 			break;
+		if (!commit_author_committer_match(commit))
+			continue;
 		if (max_count != -1 && !max_count--)
 			break;
 		printf("%s\n", sha1_to_hex(commit->object.sha1));
--- a/rev-tree.c
+++ b/rev-tree.c
@@ -64,7 +64,7 @@
 }
 
 /*
- * Usage: rev-tree [--edges] [--cache <cache-file>] <commit-id> [<commit-id2>]
+ * Usage: rev-tree [--edges] [--author <author>] [--committer <committer>] [--cache <cache-file>] <commit-id> [<commit-id2>]
  *
  * The cache-file can be quite important for big trees. This is an
  * expensive operation if you have to walk the whole chain of
@@ -75,6 +75,9 @@
 	int i;
 	int nr = 0;
 	unsigned char sha1[MAX_COMMITS][20];
+	const char *author = NULL; 
+	const char *committer = NULL;
+	int initialized_author_commiter = 0;
 
 	/*
 	 * First - pick up all the revisions we can (both from
@@ -83,6 +86,16 @@
 	for (i = 1; i < argc ; i++) {
 		char *arg = argv[i];
 
+		if (!strcmp(arg, "--author")) {
+			author = argv[++i];
+			continue;
+		}
+
+		if (!strcmp(arg, "--committer")) {
+			committer = argv[++i];
+			continue;
+		}
+
 		if (!strcmp(arg, "--cache")) {
 			read_cache_file(argv[++i]);
 			continue;
@@ -98,7 +111,14 @@
 			basemask |= 1<<nr;
 		}
 		if (nr >= MAX_COMMITS || get_sha1(arg, sha1[nr]))
-			usage("rev-tree [--edges] [--cache <cache-file>] <commit-id> [<commit-id>]");
+			usage("rev-tree [--edges] [--author <author>] [--committer <committer>] [--cache <cache-file>] <commit-id> [<commit-id>]");
+
+		if (!initialized_author_commiter) {
+			commit_author_committer_match_initialize(author,
+								 committer);
+			initialized_author_commiter = 1;
+		}
+
 		process_commit(sha1[nr]);
 		nr++;
 	}
@@ -125,6 +145,9 @@
 		if (!interesting(commit))
 			continue;
 
+		if (!commit_author_committer_match(commit))
+			continue;
+
 		printf("%lu %s:%d", commit->date, sha1_to_hex(obj->sha1), 
 		       obj->flags);
 		p = commit->parents;
--- a/t/t6010-rev-tree-author-commit.sh
+++ b/t/t6010-rev-tree-author-commit.sh
@@ -0,0 +1,98 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='git-rev-tree --author and --committer flags.
+'
+
+. ./test-lib.sh
+
+export_them () {
+export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
+}
+
+set_author_zero () {
+    GIT_AUTHOR_NAME="A U Thor" &&
+    GIT_AUTHOR_EMAIL="<author@example.xz>"
+}
+set_author_one () {
+    GIT_AUTHOR_NAME="R O Htua" &&
+    GIT_AUTHOR_EMAIL="<rohtua@example.xz>"
+}
+set_committer_zero () {
+    GIT_COMMITTER_NAME="C O Mmitter" &&
+    GIT_COMMITTER_EMAIL="<committer@example.xz>"
+}
+set_committer_one () {
+    GIT_COMMITTER_NAME="R E Ttimmoc" &&
+    GIT_COMMITTER_EMAIL="<rettimmoc@example.xz>"
+}
+
+# read rev-tree output, and find the author|committer of commit object
+pick_actor () {
+    sed -e 's/^[^ ]* //;s/:.*//' |
+    xargs -r -n1 git-cat-file commit |
+    sed -ne 's/^\('"$1"'\) \([^>]*>\).*/\2/p'
+}
+
+test_expect_success \
+    'preparation' '
+echo frotz >path0 &&
+git-update-cache --add path0 &&
+tree0=$(git-write-tree) &&
+
+set_author_zero &&
+set_committer_zero &&
+export_them &&
+commit0=$(echo frotz | git-commit-tree $tree0) &&
+
+set_author_zero &&
+set_committer_one &&
+export_them &&
+commit1=$(echo frotz | git-commit-tree $tree0 -p $commit0) &&
+
+set_author_one &&
+set_committer_zero &&
+export_them &&
+commit2=$(echo frotz | git-commit-tree $tree0 -p $commit1) &&
+
+set_author_one &&
+set_committer_one &&
+export_them &&
+commit3=$(echo frotz | git-commit-tree $tree0 -p $commit2)'
+
+test_expect_success \
+    'without restriction git-rev-tree should report all four commits.' \
+    'test $(git-rev-tree $commit3 | wc -l) == 4'
+
+test_expect_success \
+    'limiting to A U Thor (two commits)' '
+    case "$(git-rev-tree --author "A U Thor" $commit3 |
+          pick_actor "author" | sort -u)" in
+    "A U Thor <author@example.xz>") : ;;
+    *) (exit 1) ;;
+    esac'
+
+test_expect_success \
+    'limiting to R E Ttimmoc (two commits)' '
+    case "$(git-rev-tree --committer "R E Ttimmoc" $commit3 |
+          pick_actor "committer" | sort -u)" in
+    "R E Ttimmoc <rettimmoc@example.xz>") : ;;
+    *) (exit 1) ;;
+    esac'
+
+LF='
+'
+
+test_expect_success \
+    'limiting to A U Thor and C O Mmitter (one commit)' '
+    case "$(git-rev-tree --author "A U Thor" --committer "C O Mmitter" \
+            $commit3 | pick_actor "committer\\|author" | sort -u)" in
+    "A U Thor <author@example.xz>${LF}C O Mmitter <committer@example.xz>")
+        : ;;
+    *) (exit 1) ;;
+    esac
+'
+
+test_done
--- a/t/t6110-rev-list-author-commit.sh
+++ b/t/t6110-rev-list-author-commit.sh
@@ -0,0 +1,97 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='git-rev-list --author and --committer flags.
+'
+
+. ./test-lib.sh
+
+export_them () {
+export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
+}
+
+set_author_zero () {
+    GIT_AUTHOR_NAME="A U Thor" &&
+    GIT_AUTHOR_EMAIL="<author@example.xz>"
+}
+set_author_one () {
+    GIT_AUTHOR_NAME="R O Htua" &&
+    GIT_AUTHOR_EMAIL="<rohtua@example.xz>"
+}
+set_committer_zero () {
+    GIT_COMMITTER_NAME="C O Mmitter" &&
+    GIT_COMMITTER_EMAIL="<committer@example.xz>"
+}
+set_committer_one () {
+    GIT_COMMITTER_NAME="R E Ttimmoc" &&
+    GIT_COMMITTER_EMAIL="<rettimmoc@example.xz>"
+}
+
+# read rev-list output, and find the author|committer of commit object
+pick_actor () {
+    xargs -r -n1 git-cat-file commit |
+    sed -ne 's/^\('"$1"'\) \([^>]*>\).*/\2/p'
+}
+
+test_expect_success \
+    'preparation' '
+echo frotz >path0 &&
+git-update-cache --add path0 &&
+tree0=$(git-write-tree) &&
+
+set_author_zero &&
+set_committer_zero &&
+export_them &&
+commit0=$(echo frotz | git-commit-tree $tree0) &&
+
+set_author_zero &&
+set_committer_one &&
+export_them &&
+commit1=$(echo frotz | git-commit-tree $tree0 -p $commit0) &&
+
+set_author_one &&
+set_committer_zero &&
+export_them &&
+commit2=$(echo frotz | git-commit-tree $tree0 -p $commit1) &&
+
+set_author_one &&
+set_committer_one &&
+export_them &&
+commit3=$(echo frotz | git-commit-tree $tree0 -p $commit2)'
+
+test_expect_success \
+    'without restriction git-rev-list should report all four commits.' \
+    'test $(git-rev-list $commit3 | wc -l) == 4'
+
+test_expect_success \
+    'limiting to A U Thor (two commits)' '
+    case "$(git-rev-list --author="A U Thor" $commit3 |
+          pick_actor "author" | sort -u)" in
+    "A U Thor <author@example.xz>") : ;;
+    *) (exit 1) ;;
+    esac'
+
+test_expect_success \
+    'limiting to R E Ttimmoc (two commits)' '
+    case "$(git-rev-list --committer="R E Ttimmoc" $commit3 |
+          pick_actor "committer" | sort -u)" in
+    "R E Ttimmoc <rettimmoc@example.xz>") : ;;
+    *) (exit 1) ;;
+    esac'
+
+LF='
+'
+
+test_expect_success \
+    'limiting to A U Thor and C O Mmitter (one commit)' '
+    case "$(git-rev-list --author="A U Thor" --committer="C O Mmitter" \
+            $commit3 | pick_actor "committer\\|author" | sort -u)" in
+    "A U Thor <author@example.xz>${LF}C O Mmitter <committer@example.xz>")
+        : ;;
+    *) (exit 1) ;;
+    esac
+'
+
+test_done
------------------------------------------------



* Re: Mercurial 0.4e vs git network pull
  @ 2005-05-15 12:40  3% ` Petr Baudis
  2005-05-16 22:22  0%   ` Tristan Wibberley
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-05-15 12:40 UTC (permalink / raw)
  To: Adam J. Richter; +Cc: mpm, git, jgarzik, linux-kernel, mercurial, torvalds

Dear diary, on Sun, May 15, 2005 at 01:22:19PM CEST, I got a letter
where "Adam J. Richter" <adam@yggdrasil.com> told me that...
> On Sun, 15 May 2005 10:54:05 +0200, Petr Baudis wrote:
> >Dear diary, on Thu, May 12, 2005 at 10:57:35PM CEST, I got a letter
> >where Matt Mackall <mpm@selenic.com> told me that...
> >> Does this need an HTTP request (and round trip) per object? It appears
> >> to. That's 2200 requests/round trips for my 800 patch benchmark.
> 
> >Yes it does. On the other side, it needs no server-side CGI. But I guess
> >it should be pretty easy to write some kind of server-side CGI streamer,
> >and it would then easily take just a single HTTP request (telling the
> >server the commit ID and receiving back all the objects).
> 
> 	I don't understand what was wrong with Jeff Garzik's previous
> suggestion of using http/1.1 pipelining to coalesce the round trips.
> If you're worried about queuing too many http/1.1 requests, the client
> could adopt a policy of not having more than a certain number of
> requests outstanding or perhaps even making a new http connection
> after a certain number of requests to avoid starving other clients
> when the number of clients doing one of these transfers exceeds the
> number of threads that the http server uses.

The problem is that to fetch a revision tree, you have to

	send request for commit A
	receive commit A
	look at commit A for list of its parents
	send request for the parents
	receive the parents
	look inside for list of its parents
	...

(and same for the trees).
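The dependency can be modeled directly: a commit's parent ids only become known once that commit's body arrives, so a linear chain of N commits costs N sequential round trips no matter how deep the HTTP/1.1 pipeline is. A toy model of that walk (hypothetical types and integer ids, not git code):

```c
/* Illustrative model of the round-trip problem: each fetch reveals
 * only that commit's parent, so the walk cannot be issued in advance. */
struct fake_commit {
	int parent;	/* index of the parent commit, -1 at the root */
};

/* Walk from tip to root, counting one round trip per commit fetched. */
static int round_trips(const struct fake_commit *repo, int tip)
{
	int trips = 0;

	while (tip != -1) {
		trips++;		/* one request/response for this commit */
		tip = repo[tip].parent;	/* parent id only known after receipt */
	}
	return trips;
}
```

Pipelining helps with breadth (many known objects at once) but cannot shorten this depth-wise chain, which is what the server-side streamer avoids.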

> 	Being able to do without a server side CGI script might
> encourage deployment a bit more, both for security reasons and
> effort of deployment.

You could still use it without the server side CGI script as it is now,
just without the speedups.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Mercurial 0.4e vs git network pull
@ 2005-05-15 11:52  0% Adam J. Richter
  2005-05-15 14:23  0% ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: Adam J. Richter @ 2005-05-15 11:52 UTC (permalink / raw)
  To: pasky; +Cc: git, jgarzik, linux-kernel, mercurial, mpm, torvalds

On Sun, 15 May 2005 14:40:42 +0200, Petr Baudis wrote:
>Dear diary, on Sun, May 15, 2005 at 01:22:19PM CEST, I got a letter
>where "Adam J. Richter" <adam@yggdrasil.com> told me that...
[...]
>> 	I don't understand what was wrong with Jeff Garzik's previous
>> suggestion of using http/1.1 pipelining to coalesce the round trips.
>> If you're worried about queuing too many http/1.1 requests, the client
>> could adopt a policy of not having more than a certain number of
>> requests outstanding or perhaps even making a new http connection
>> after a certain number of requests to avoid starving other clients
>> when the number of clients doing one of these transfers exceeds the
>> number of threads that the http server uses.

>The problem is that to fetch a revision tree, you have to

>	send request for commit A
>	receive commit A
>	look at commit A for list of its parents
>	send request for the parents
>	receive the parents
>	look inside for list of its parents
>	...

>(and same for the trees).

	Don't you usually have a list of many files for which you
want to retrieve this information?  I'd imagine that would usually
suffice to fill the pipeline.

                    __     ______________
Adam J. Richter        \ /
adam@yggdrasil.com      | g g d r a s i l


* Re: Mercurial 0.4e vs git network pull
  2005-05-15 11:52  0% Adam J. Richter
@ 2005-05-15 14:23  0% ` Petr Baudis
  0 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-05-15 14:23 UTC (permalink / raw)
  To: Adam J. Richter; +Cc: git, jgarzik, linux-kernel, mercurial, mpm, torvalds

Dear diary, on Sun, May 15, 2005 at 01:52:50PM CEST, I got a letter
where "Adam J. Richter" <adam@yggdrasil.com> told me that...
> On Sun, 15 May 2005 14:40:42 +0200, Petr Baudis wrote:
> >Dear diary, on Sun, May 15, 2005 at 01:22:19PM CEST, I got a letter
> >where "Adam J. Richter" <adam@yggdrasil.com> told me that...
> [...]
> >> 	I don't understand what was wrong with Jeff Garzik's previous
> >> suggestion of using http/1.1 pipelining to coalesce the round trips.
> >> If you're worried about queuing too many http/1.1 requests, the client
> >> could adopt a policy of not having more than a certain number of
> >> requests outstanding or perhaps even making a new http connection
> >> after a certain number of requests to avoid starving other clients
> >> when the number of clients doing one of these transfers exceeds the
> >> number of threads that the http server uses.
> 
> >The problem is that to fetch a revision tree, you have to
> 
> >	send request for commit A
> >	receive commit A
> >	look at commit A for list of its parents
> >	send request for the parents
> >	receive the parents
> >	look inside for list of its parents
> >	...
> 
> >(and same for the trees).
> 
> 	Don't you usually have a list of many files for which you
> want to retrieve this information?  I'd imagine that would usually
> suffice to fill the pipeline.

That might be true for the trees, but not for the commit lists. Most
commits have a single parent, except merges, and even merges rarely
have more than two parents.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* [PATCH 1/4] Add --author and --committer match to git-rev-list and git-rev-tree.
@ 2005-05-15 21:18  2% Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-05-15 21:18 UTC (permalink / raw)
  To: pasky; +Cc: git, torvalds

Zack Brown wondered if handling author match at core GIT level
would make cg-log -u go faster (JIT also can use this in jit-log
--author).  Later Petr Baudis wanted to have --committer match
similar to it.

This version is improved from the one I posted to GIT list
earlier in that:

 (1) I bit the bullet and added author and committer names to
     the commit object, so there is no double unpacking anymore.
     The strings are shared across multiple commits so consider
     them intern'ed and do not free them.

 (2) Determining whether author and committer names are
     "interesting" is done only once per name, not per commit.
     This version uses simple "substring" logic, but it is
     easily extendable for more interesting match such as
     regexps.

This also fixes documentation of git-rev-list which did not
describe its already existing options.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Documentation/git-rev-list.txt    |   19 +++++
Documentation/git-rev-tree.txt    |    8 +-
commit.c                          |  135 +++++++++++++++++++++++++++++++++-----
commit.h                          |   18 ++++-
rev-list.c                        |   18 ++++-
rev-tree.c                        |   27 +++++++
t/t6010-rev-tree-author-commit.sh |   98 +++++++++++++++++++++++++++
t/t6110-rev-list-author-commit.sh |   97 +++++++++++++++++++++++++++
8 files changed, 396 insertions(+), 24 deletions(-)
t/t6010-rev-tree-author-commit.sh (. --> 100755)
t/t6110-rev-list-author-commit.sh (. --> 100755)

--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -9,7 +9,7 @@
 
 SYNOPSIS
 --------
-'git-rev-list' <commit>
+'git-rev-list' [--author=author] [--committer=committer] [--max-count=count] [--max-age=unixtime] [--min-age=unixtime] <commit>
 
 DESCRIPTION
 -----------
@@ -18,6 +18,23 @@
 useful to produce human-readable log output.
 
 
+OPTIONS
+-------
+--author::
+	Limit the final output to commits written by the author.
+
+--committer::
+	Limit the final output to commits committed by the author.
+
+--max-count::
+	Stop after showing the specified number of commits.
+
+--max-age::
+	Stop before showing the commits older than specified time.
+
+--min-age::
+	Do not show the commits newer than specified time.
+
 Author
 ------
 Written by Linus Torvalds <torvalds@osdl.org>
--- a/Documentation/git-rev-tree.txt
+++ b/Documentation/git-rev-tree.txt
@@ -9,7 +9,7 @@
 
 SYNOPSIS
 --------
-'git-rev-tree' [--edges] [--cache <cache-file>] [^]<commit> [[^]<commit>]
+'git-rev-tree' [--author author] [--committer committer] [--edges] [--cache <cache-file>] [^]<commit> [[^]<commit>]
 
 DESCRIPTION
 -----------
@@ -17,6 +17,12 @@
 
 OPTIONS
 -------
+--author::
+	Limit the final output to commits written by the author.
+
+--committer::
+	Limit the final output to commits committed by the author.
+
 --edges::
 	Show edges (ie places where the marking changes between parent
 	and child)
--- a/commit.c
+++ b/commit.c
@@ -23,22 +23,111 @@
 	return (struct commit *) obj;
 }
 
-static unsigned long parse_commit_date(const char *buf)
-{
-	unsigned long date;
+static struct person_name **person_name;
+static int person_nr;
+static int person_alloc;
+
+static const char *interesting_author = NULL;
+static const char *interesting_committer = NULL;
+
+void commit_author_committer_match_initialize(const char *author,
+					      const char *committer)
+{
+	interesting_author = author;
+	interesting_committer = committer;
+}
+
+static int person_name_pos(const char *name)
+{
+	int first, last;
+
+	first = 0;
+	last = person_nr;
+	while (last > first) {
+		int next = (last + first) >> 1;
+		struct person_name *pn = person_name[next];
+		int cmp = strcmp(name, pn->name);
+		if (!cmp)
+			return next;
+		if (cmp < 0) {
+			last = next;
+			continue;
+		}
+		first = next+1;
+	}
+	return -first-1;
+}
 
-	if (memcmp(buf, "author", 6))
-		return 0;
-	while (*buf++ != '\n')
-		/* nada */;
-	if (memcmp(buf, "committer", 9))
-		return 0;
-	while (*buf++ != '>')
-		/* nada */;
-	date = strtoul(buf, NULL, 10);
-	if (date == ULONG_MAX)
-		date = 0;
-	return date;
+#define INTERESTING_AUTHOR    01
+#define INTERESTING_COMMITTER 02
+
+static struct person_name *person_name_lookup(const char *name)
+{
+	int pos = person_name_pos(name);
+	struct person_name *pn; 
+	if (0 <= pos)
+		return person_name[pos];
+	pos = -pos-1;
+	pn = xmalloc(sizeof(*pn) + strlen(name) + 1);
+	strcpy(pn->name, name);
+	pn->mark = 0;
+
+	/*
+	 * When we decide we want to go fancier, strstr() below
+	 * can be replaced with something like regexp match to
+	 * pick up more than one author or committer.  For now
+	 * let's try simple substring find and see what happens.
+	 * Note that not specifying anybody means we are interested
+	 * in everybody.
+	 */
+	if (!interesting_author || strstr(name, interesting_author))
+		pn->mark |= INTERESTING_AUTHOR;
+	if (!interesting_committer || strstr(name, interesting_committer))
+		pn->mark |= INTERESTING_COMMITTER;
+
+	if (person_nr == person_alloc) {
+		person_alloc = alloc_nr(person_alloc);
+		person_name = xrealloc(person_name, person_alloc *
+				       sizeof(struct person_name *));
+	}
+	person_nr++;
+	if (pos < person_nr)
+		memmove(person_name + pos + 1, person_name + pos,
+			(person_nr - pos - 1) * sizeof(pn));
+	person_name[pos] = pn;
+	return pn;
+}
+
+static void *parse_commit_nametime(char *ptr, const char *ep,
+				   const char *field,
+				   unsigned long *date,
+				   struct person_name **name)
+{
+	unsigned long d;
+	char *cp;
+	int fldlen = strlen(field);
+	if (ptr == NULL || memcmp(ptr, field, fldlen) || ptr[fldlen] != ' ') {
+		*date = 0;
+		*name = NULL;
+		return NULL; /* malformed commit */
+	}
+	for (cp = ptr + fldlen + 1; cp < ep && *cp != '>'; cp++)
+		; /* skip */
+	if (*cp != '>' || *++cp != ' ') {
+		*date = 0;
+		*name = NULL;
+		return NULL; /* malformed commit */
+	}
+	*cp = 0;
+	*name = person_name_lookup(ptr + fldlen + 1);
+	*cp++ = ' ';
+	d = strtoul(cp, NULL, 10);
+	if (d == ULONG_MAX)
+		d = 0;
+	*date = d;
+	while (cp < ep && *cp++ != '\n')
+		; /* skip */
+	return cp;
 }
 
 int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
@@ -63,7 +152,13 @@
 		}
 		bufptr += 48;
 	}
-	item->date = parse_commit_date(bufptr);
+	
+	bufptr = parse_commit_nametime(bufptr, (char *)buffer + size,
+				       "author", &item->author_date,
+				       &item->author);
+	parse_commit_nametime(bufptr, (char *)buffer + size,
+			      "committer", &item->date,
+			      &item->committer);
 	return 0;
 }
 
@@ -152,3 +247,11 @@
 	}
 	return ret;
 }
+
+int commit_author_committer_match(struct commit *item)
+{
+	return ( ((item->author != NULL) &&
+		  (item->author->mark & INTERESTING_AUTHOR)) &&
+		 ((item->committer != NULL) &&
+		  (item->committer->mark & INTERESTING_COMMITTER)) );
+}
--- a/commit.h
+++ b/commit.h
@@ -11,8 +11,9 @@
 
 struct commit {
 	struct object object;
-	unsigned long date;
+	unsigned long author_date, date;
 	struct commit_list *parents;
+	struct person_name *author, *committer; 
 	struct tree *tree;
 };
 
@@ -36,4 +37,19 @@
 struct commit *pop_most_recent_commit(struct commit_list **list, 
 				      unsigned int mark);
 
+struct person_name {
+	char mark;
+	char name[0];
+};
+
+/* This function must be called before fetching any commit object
+ * if commit-author-committer-match function is to be used to filter
+ * commits for output.  Passing NULL is permitted, which makes all
+ * authors (or committers) "interesting".
+ */
+void commit_author_committer_match_initialize(const char *, const char *);
+
+/* Returns true only if author and committer are "interesting". */
+int commit_author_committer_match(struct commit *);
+
 #endif /* COMMIT_H */
--- a/rev-list.c
+++ b/rev-list.c
@@ -11,6 +11,8 @@
 	unsigned long max_age = -1;
 	unsigned long min_age = -1;
 	int max_count = -1;
+	const char *author = NULL;
+	const char *committer = NULL;
 
 	for (i = 1 ; i < argc; i++) {
 		char *arg = argv[i];
@@ -21,6 +23,10 @@
 			max_age = atoi(arg + 10);
 		} else if (!strncmp(arg, "--min-age=", 10)) {
 			min_age = atoi(arg + 10);
+		} else if (!strncmp(arg, "--author=", 9)) {
+			author = arg + 9;
+		} else if (!strncmp(arg, "--committer=", 12)) {
+			committer = arg + 12;
 		} else {
 			commit_arg = arg;
 		}
@@ -28,9 +34,13 @@
 
 	if (!commit_arg || get_sha1(commit_arg, sha1))
 		usage("usage: rev-list [OPTION] commit-id\n"
-		      "  --max-count=nr\n"
-		      "  --max-age=epoch\n"
-		      "  --min-age=epoch\n");
+		      "  --author=author\n"
+		      "  --committer=committer\n"
+		      "  --max-count=number\n"
+		      "  --max-age=unixtime\n"
+		      "  --min-age=unixtime\n");
+
+	commit_author_committer_match_initialize(author, committer);
 
 	commit = lookup_commit(sha1);
 	if (!commit || parse_commit(commit) < 0)
@@ -44,6 +54,8 @@
 			continue;
 		if (max_age != -1 && (commit->date < max_age))
 			break;
+		if (!commit_author_committer_match(commit))
+			continue;
 		if (max_count != -1 && !max_count--)
 			break;
 		printf("%s\n", sha1_to_hex(commit->object.sha1));
--- a/rev-tree.c
+++ b/rev-tree.c
@@ -64,7 +64,7 @@
 }
 
 /*
- * Usage: rev-tree [--edges] [--cache <cache-file>] <commit-id> [<commit-id2>]
+ * Usage: rev-tree [--edges] [--author <author>] [--committer <committer>] [--cache <cache-file>] <commit-id> [<commit-id2>]
  *
  * The cache-file can be quite important for big trees. This is an
  * expensive operation if you have to walk the whole chain of
@@ -75,6 +75,9 @@
 	int i;
 	int nr = 0;
 	unsigned char sha1[MAX_COMMITS][20];
+	const char *author = NULL; 
+	const char *committer = NULL;
+	int initialized_author_commiter = 0;
 
 	/*
 	 * First - pick up all the revisions we can (both from
@@ -83,6 +86,16 @@
 	for (i = 1; i < argc ; i++) {
 		char *arg = argv[i];
 
+		if (!strcmp(arg, "--author")) {
+			author = argv[++i];
+			continue;
+		}
+
+		if (!strcmp(arg, "--committer")) {
+			committer = argv[++i];
+			continue;
+		}
+
 		if (!strcmp(arg, "--cache")) {
 			read_cache_file(argv[++i]);
 			continue;
@@ -98,7 +111,14 @@
 			basemask |= 1<<nr;
 		}
 		if (nr >= MAX_COMMITS || get_sha1(arg, sha1[nr]))
-			usage("rev-tree [--edges] [--cache <cache-file>] <commit-id> [<commit-id>]");
+			usage("rev-tree [--edges] [--author <author>] [--committer <committer>] [--cache <cache-file>] <commit-id> [<commit-id>]");
+
+		if (!initialized_author_commiter) {
+			commit_author_committer_match_initialize(author,
+								 committer);
+			initialized_author_commiter = 1;
+		}
+
 		process_commit(sha1[nr]);
 		nr++;
 	}
@@ -125,6 +145,9 @@
 		if (!interesting(commit))
 			continue;
 
+		if (!commit_author_committer_match(commit))
+			continue;
+
 		printf("%lu %s:%d", commit->date, sha1_to_hex(obj->sha1), 
 		       obj->flags);
 		p = commit->parents;
--- a/t/t6010-rev-tree-author-commit.sh
+++ b/t/t6010-rev-tree-author-commit.sh
@@ -0,0 +1,98 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='git-rev-tree --author and --committer flags.
+'
+
+. ./test-lib.sh
+
+export_them () {
+export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
+}
+
+set_author_zero () {
+    GIT_AUTHOR_NAME="A U Thor" &&
+    GIT_AUTHOR_EMAIL="<author@example.xz>"
+}
+set_author_one () {
+    GIT_AUTHOR_NAME="R O Htua" &&
+    GIT_AUTHOR_EMAIL="<rohtua@example.xz>"
+}
+set_committer_zero () {
+    GIT_COMMITTER_NAME="C O Mmitter" &&
+    GIT_COMMITTER_EMAIL="<committer@example.xz>"
+}
+set_committer_one () {
+    GIT_COMMITTER_NAME="R E Ttimmoc" &&
+    GIT_COMMITTER_EMAIL="<rettimmoc@example.xz>"
+}
+
+# read rev-tree output, and find the author|committer of commit object
+pick_actor () {
+    sed -e 's/^[^ ]* //;s/:.*//' |
+    xargs -r -n1 git-cat-file commit |
+    sed -ne 's/^\('"$1"'\) \([^>]*>\).*/\2/p'
+}
+
+test_expect_success \
+    'preparation' '
+echo frotz >path0 &&
+git-update-cache --add path0 &&
+tree0=$(git-write-tree) &&
+
+set_author_zero &&
+set_committer_zero &&
+export_them &&
+commit0=$(echo frotz | git-commit-tree $tree0) &&
+
+set_author_zero &&
+set_committer_one &&
+export_them &&
+commit1=$(echo frotz | git-commit-tree $tree0 -p $commit0) &&
+
+set_author_one &&
+set_committer_zero &&
+export_them &&
+commit2=$(echo frotz | git-commit-tree $tree0 -p $commit1) &&
+
+set_author_one &&
+set_committer_one &&
+export_them &&
+commit3=$(echo frotz | git-commit-tree $tree0 -p $commit2)'
+
+test_expect_success \
+    'without restriction git-rev-tree should report all four commits.' \
+    'test $(git-rev-tree $commit3 | wc -l) == 4'
+
+test_expect_success \
+    'limiting to A U Thor (two commits)' '
+    case "$(git-rev-tree --author "A U Thor" $commit3 |
+          pick_actor "author" | sort -u)" in
+    "A U Thor <author@example.xz>") : ;;
+    *) (exit 1) ;;
+    esac'
+
+test_expect_success \
+    'limiting to R E Ttimmoc (two commits)' '
+    case "$(git-rev-tree --committer "R E Ttimmoc" $commit3 |
+          pick_actor "committer" | sort -u)" in
+    "R E Ttimmoc <rettimmoc@example.xz>") : ;;
+    *) (exit 1) ;;
+    esac'
+
+LF='
+'
+
+test_expect_success \
+    'limiting to A U Thor and C O Mmitter (one commit)' '
+    case "$(git-rev-tree --author "A U Thor" --committer "C O Mmitter" \
+            $commit3 | pick_actor "committer\\|author" | sort -u)" in
+    "A U Thor <author@example.xz>${LF}C O Mmitter <committer@example.xz>")
+        : ;;
+    *) (exit 1) ;;
+    esac
+'
+
+test_done
--- a/t/t6110-rev-list-author-commit.sh
+++ b/t/t6110-rev-list-author-commit.sh
@@ -0,0 +1,97 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='git-rev-list --author and --committer flags.
+'
+
+. ./test-lib.sh
+
+export_them () {
+export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
+}
+
+set_author_zero () {
+    GIT_AUTHOR_NAME="A U Thor" &&
+    GIT_AUTHOR_EMAIL="<author@example.xz>"
+}
+set_author_one () {
+    GIT_AUTHOR_NAME="R O Htua" &&
+    GIT_AUTHOR_EMAIL="<rohtua@example.xz>"
+}
+set_committer_zero () {
+    GIT_COMMITTER_NAME="C O Mmitter" &&
+    GIT_COMMITTER_EMAIL="<committer@example.xz>"
+}
+set_committer_one () {
+    GIT_COMMITTER_NAME="R E Ttimmoc" &&
+    GIT_COMMITTER_EMAIL="<rettimmoc@example.xz>"
+}
+
+# read rev-list output, and find the author|committer of commit object
+pick_actor () {
+    xargs -r -n1 git-cat-file commit |
+    sed -ne 's/^\('"$1"'\) \([^>]*>\).*/\2/p'
+}
+
+test_expect_success \
+    'preparation' '
+echo frotz >path0 &&
+git-update-cache --add path0 &&
+tree0=$(git-write-tree) &&
+
+set_author_zero &&
+set_committer_zero &&
+export_them &&
+commit0=$(echo frotz | git-commit-tree $tree0) &&
+
+set_author_zero &&
+set_committer_one &&
+export_them &&
+commit1=$(echo frotz | git-commit-tree $tree0 -p $commit0) &&
+
+set_author_one &&
+set_committer_zero &&
+export_them &&
+commit2=$(echo frotz | git-commit-tree $tree0 -p $commit1) &&
+
+set_author_one &&
+set_committer_one &&
+export_them &&
+commit3=$(echo frotz | git-commit-tree $tree0 -p $commit2)'
+
+test_expect_success \
+    'without restriction git-rev-list should report all four commits.' \
+    'test $(git-rev-list $commit3 | wc -l) == 4'
+
+test_expect_success \
+    'limiting to A U Thor (two commits)' '
+    case "$(git-rev-list --author="A U Thor" $commit3 |
+          pick_actor "author" | sort -u)" in
+    "A U Thor <author@example.xz>") : ;;
+    *) (exit 1) ;;
+    esac'
+
+test_expect_success \
+    'limiting to R E Ttimmoc (two commits)' '
+    case "$(git-rev-list --committer="R E Ttimmoc" $commit3 |
+          pick_actor "committer" | sort -u)" in
+    "R E Ttimmoc <rettimmoc@example.xz>") : ;;
+    *) (exit 1) ;;
+    esac'
+
+LF='
+'
+
+test_expect_success \
+    'limiting to A U Thor and C O Mmitter (one commit)' '
+    case "$(git-rev-list --author="A U Thor" --committer="C O Mmitter" \
+            $commit3 | pick_actor "committer\\|author" | sort -u)" in
+    "A U Thor <author@example.xz>${LF}C O Mmitter <committer@example.xz>")
+        : ;;
+    *) (exit 1) ;;
+    esac
+'
+
+test_done
------------------------------------------------



* Re: Mercurial 0.4e vs git network pull
  2005-05-15 12:40  3% ` Petr Baudis
@ 2005-05-16 22:22  0%   ` Tristan Wibberley
  0 siblings, 0 replies; 200+ results
From: Tristan Wibberley @ 2005-05-16 22:22 UTC (permalink / raw)
  To: git

On Sun, 2005-05-15 at 14:40 +0200, Petr Baudis wrote:
> Dear diary, on Sun, May 15, 2005 at 01:22:19PM CEST, I got a letter
> where "Adam J. Richter" <adam@yggdrasil.com> told me that...
> > 
> > 	I don't understand what was wrong with Jeff Garzik's previous
> > suggestion of using http/1.1 pipelining to coalesce the round trips.
> > If you're worried about queuing too many http/1.1 requests, the client
> > could adopt a policy of not having more than a certain number of
> > requests outstanding or perhaps even making a new http connection
> > after a certain number of requests to avoid starving other clients
> > when the number of clients doing one of these transfers exceeds the
> > number of threads that the http server uses.
> 
> The problem is that to fetch a revision tree, you have to
> 
> 	send request for commit A
> 	receive commit A
> 	look at commit A for list of its parents
> 	send request for the parents
> 	receive the parents
> 	look inside for list of its parents
> 	...

What about IMAP? You could ask for just the parents for several messages
(via a message header), then start asking for message bodies (with the
juicy stuff in). You could also ask for a list of the new commits then
ask for each of the bodies (several at a time). Not as good as a "Just
give me all new data", but an *awful* lot more efficient than HTTP. And
very flexible. You just need to map changesets to IMAP messages (if such
a mapping can actually make sense :)

Prolly a bit more work though.

--
Tristan Wibberley

The opinions expressed in this message are my own opinions and not those
of my employer.




* Re: [PATCH 0/4] Pulling refs files
  2005-05-15  3:23  3%       ` Daniel Barkalow
@ 2005-05-17 20:14  0%         ` Petr Baudis
  0 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-05-17 20:14 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Linus Torvalds

Dear diary, on Sun, May 15, 2005 at 05:23:18AM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> On Sat, 14 May 2005, Petr Baudis wrote:
> 
> > So what about just something like
> > 
> > 	git-wormhole-pull remote:refs/head/master wormhole://localhost/
> > 
> > That is, you could just specify remote:path_relative_to_url instead of
> > SHA1 id as the commit.
> 
> Do you have any sensible alternatives to "remote:refs/<something>" in
> mind? I suppose that "remote:HEAD" would also work. How are you thinking
> of having the value get written locally?

Anything that gets eventually wound up in the info/ directory. (The name
of the ignore file saved in info/ignore is the current hit.)

> Do you also have some idea for user-invoked rpush? It has to call
> something that writes the value on the other side (and I'd ideally like it
> to do the update atomically and locked against other clients). This series
> uses the same mechanism to write it that it uses to write hashes fetched
> from remote machines.

Well, it'd be again nice to have some generic mechanism for this so that
the user could theoretically push over rsync too or something (although
that'll be even more racy, it is fine for a single-user repository).

I think the remote file to write the value inside should be porcelain
business. What you should always check though is that before the pull
(and after the locking) the value in that file is the same as the "push
base". This way you make sure that you are still following a single
branch and in case of multiuser repositories that you were fully merged
before pushing.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [PATCH 0/4] Pulling refs files
  @ 2005-05-19  3:19  4% ` Daniel Barkalow
  2005-05-19  6:52  0%   ` Petr Baudis
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-05-19  3:19 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, Linus Torvalds

On Tue, 17 May 2005, Daniel Barkalow wrote:

> I think I'll get to implementing it Wednesday night. I might be able to
> get the first step done tonight (my previous patch, except with the
> transfer applying to arbitrary files).

Upon further consideration, I think there are three things that a pull
implementation needs to handle:

 1) fetching object files by hash, validating them, and writing them to
    the local objects directory
 2) fetching reference files by name, and making them available to the
    local program without writing them to disk at all.
 3) fetching other files by name and writing them to either the
    corresponding filename or a provided replacement.

I had thought that (2) could be done as a special case of (3), but I think
that it has to be separate, because (2) just returns the value, while
(3) can't just return the contents, but has to write it somewhere, since
it isn't constrained to be exactly 20 bytes.

So I think I'd like to do essentially the original series, slightly
rearranged and with a few edits, and then add (3) afterwards; this should
be easy once the rpush/rpull changes to make the protocol extensible are
in place.

I'll also do additional (independent) patches to provide an expected
starting point and lock things appropriately.

	-Daniel
*This .sig left intentionally blank*



* [PATCH] Make rsh protocol extensible
@ 2005-05-19  5:11  4% Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-05-19  5:11 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

This changes the rsh protocol to allow reporting failure in getting an
object without breaking the connection, and to allow other types of
request than for objects to be made. It is a preliminary to any more
extensive pull operation.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Index: rpull.c
===================================================================
--- 75b95bec390d6728b9b1b4572056af8cee34ea7d/rpull.c  (mode:100644 sha1:b48e63157c66c160b9751603a92831f77106044c)
+++ c70baff05489356575384857c7cc4d4c641c39e3/rpull.c  (mode:100644 sha1:7f5b390d8ed653bfb67363db2c61c576a4d3134d)
@@ -15,7 +15,16 @@
 int fetch(unsigned char *sha1)
 {
 	int ret;
-	write(fd_out, sha1, 20);
+	signed char remote;
+	char type = 'o';
+	if (has_sha1_file(sha1))
+		return 0;
+	write(fd_out, &type, 1);
+        write(fd_out, sha1, 20);
+	if (read(fd_in, &remote, 1) < 1)
+		return -1;
+	if (remote < 0)
+		return remote;
 	ret = write_sha1_from_fd(sha1, fd_in);
 	if (!ret)
 		pull_say("got %s\n", sha1_to_hex(sha1));
Index: rpush.c
===================================================================
--- 75b95bec390d6728b9b1b4572056af8cee34ea7d/rpush.c  (mode:100644 sha1:3f2c898c8f5cf5ba62d689a13c646936b8372ee7)
+++ c70baff05489356575384857c7cc4d4c641c39e3/rpush.c  (mode:100644 sha1:07a8461878ad7b1d8cea27e44003b2ed44632834)
@@ -3,46 +3,68 @@
 #include <sys/socket.h>
 #include <errno.h>
 
-void service(int fd_in, int fd_out) {
+int serve_object(int fd_in, int fd_out) {
 	ssize_t size;
-	int posn;
+	int posn = 0;
 	char unsigned sha1[20];
 	unsigned long objsize;
 	void *buf;
+	signed char remote;
 	do {
-		posn = 0;
-		do {
-			size = read(fd_in, sha1 + posn, 20 - posn);
-			if (size < 0) {
-				perror("rpush: read ");
-				return;
+		size = read(fd_in, sha1 + posn, 20 - posn);
+		if (size < 0) {
+			perror("rpush: read ");
+			return -1;
+		}
+		if (!size)
+			return -1;
+		posn += size;
+	} while (posn < 20);
+	
+	/* fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1)); */
+	remote = 0;
+	
+	buf = map_sha1_file(sha1, &objsize);
+	
+	if (!buf) {
+		fprintf(stderr, "rpush: could not find %s\n", 
+			sha1_to_hex(sha1));
+		remote = -1;
+	}
+	
+	write(fd_out, &remote, 1);
+	
+	if (remote < 0)
+		return 0;
+	
+	posn = 0;
+	do {
+		size = write(fd_out, buf + posn, objsize - posn);
+		if (size <= 0) {
+			if (!size) {
+				fprintf(stderr, "rpush: write closed");
+			} else {
+				perror("rpush: write ");
 			}
-			if (!size)
-				return;
-			posn += size;
-		} while (posn < 20);
-
-		/* fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1)); */
+			return -1;
+		}
+		posn += size;
+	} while (posn < objsize);
+	return 0;
+}
 
-		buf = map_sha1_file(sha1, &objsize);
-		if (!buf) {
-			fprintf(stderr, "rpush: could not find %s\n", 
-				sha1_to_hex(sha1));
+void service(int fd_in, int fd_out) {
+	char type;
+	int retval;
+	do {
+		retval = read(fd_in, &type, 1);
+		if (retval < 1) {
+			if (retval < 0)
+				perror("rpush: read ");
 			return;
 		}
-		posn = 0;
-		do {
-			size = write(fd_out, buf + posn, objsize - posn);
-			if (size <= 0) {
-				if (!size) {
-					fprintf(stderr, "rpush: write closed");
-				} else {
-					perror("rpush: write ");
-				}
-				return;
-			}
-			posn += size;
-		} while (posn < objsize);
+		if (type == 'o' && serve_object(fd_in, fd_out))
+			return;
 	} while (1);
 }
 
@@ -53,6 +75,8 @@
         char *url;
 	int fd_in, fd_out;
 	while (arg < argc && argv[arg][0] == '-') {
+		if (argv[arg][1] == 'w')
+			arg++;
                 arg++;
         }
         if (argc < arg + 2) {
Index: rsh.c
===================================================================
--- 75b95bec390d6728b9b1b4572056af8cee34ea7d/rsh.c  (mode:100644 sha1:5d1cb9d578a8e679fc190a9d7d2c842ad811223f)
+++ c70baff05489356575384857c7cc4d4c641c39e3/rsh.c  (mode:100644 sha1:71afc1aa5c9fb3cfe8d49e73471c30e92df9e327)
@@ -36,8 +36,8 @@
 	*(path++) = '\0';
 	/* ssh <host> 'cd /<path>; stdio-pull <arg...> <commit-id>' */
 	snprintf(command, COMMAND_SIZE, 
-		 "cd /%s; %s=objects %s",
-		 path, DB_ENVIRONMENT, remote_prog);
+		 "GIT_DIR='%s' %s",
+		 path, remote_prog);
 	posn = command + strlen(command);
 	for (i = 0; i < rmt_argc; i++) {
 		*(posn++) = ' ';



* Re: [PATCH 0/4] Pulling refs files
  2005-05-19  3:19  4% ` Daniel Barkalow
@ 2005-05-19  6:52  0%   ` Petr Baudis
  2005-05-19 16:00  0%     ` Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-05-19  6:52 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Linus Torvalds

Dear diary, on Thu, May 19, 2005 at 05:19:01AM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
>  2) fetching reference files by name, and making them available to the
>     local program without writing them to disk at all.
>  3) fetching other files by name and writing them to either the
>     corresponding filename or a provided replacement.
> 
> I had thought that (2) could be done as a special case of (3), but I think
> that it has to be separate, because (2) just returns the value, while
> (3) can't just return the contents, but has to write it somewhere, since
> it isn't constrained to be exactly 20 bytes.

Huh. How would (2) be useful and why can't you just still write it e.g.
to some user-supplied temporary file? I think that'd be still actually
much less trouble for the scripts to handle.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [PATCH 0/4] Pulling refs files
  2005-05-19  6:52  0%   ` Petr Baudis
@ 2005-05-19 16:00  0%     ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-05-19 16:00 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, Linus Torvalds

On Thu, 19 May 2005, Petr Baudis wrote:

> Dear diary, on Thu, May 19, 2005 at 05:19:01AM CEST, I got a letter
> where Daniel Barkalow <barkalow@iabervon.org> told me that...
> >  2) fetching reference files by name, and making them available to the
> >     local program without writing them to disk at all.
> >  3) fetching other files by name and writing them to either the
> >     corresponding filename or a provided replacement.
> > 
> > I had thought that (2) could be done as a special case of (3), but I think
> > that it has to be separate, because (2) just returns the value, while
> > (3) can't just return the contents, but has to write it somewhere, since
> > it isn't constrained to be exactly 20 bytes.
> 
> Huh. How would (2) be useful and why can't you just still write it e.g.
> to some user-supplied temporary file? I think that'd be still actually
> much less trouble for the scripts to handle.

(2) is what is needed if the user just requests downloading objects
starting with a reference stored remotely, and doesn't request that the
reference be written anywhere. It is also useful because the system wants
to verify that it has actually downloaded the objects successfully before
writing the reference.

Note that the scripts see a higher-level interface; these are the
operations that (e.g.) http-pull.c has to provide for pull.c, which builds
a larger operation (determine the target hash, download the objects, write
the specified ref file) out of them. It would be inconvenient for pull.c 
to download to a temporary file and then read the temporary file, which
shouldn't normally be visible yet, to figure out what it's doing. It wants
to have a function that takes a string and returns a hash, getting the
value from the remote host, and it's inconvenient to deal with the disk in
the middle.
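
A minimal sketch of that shape, with a local directory standing in for the remote transport (names here are invented for illustration, not the actual pull.c interface):

```shell
# Hypothetical sketch of option (2): return a remote ref's value directly
# to the caller instead of writing it to a user-visible file.  The
# "remote" is just a local directory here; a real transport would fetch
# over HTTP or ssh.
remote=$(mktemp -d)
mkdir -p "$remote/refs/heads"
echo 1234567890123456789012345678901234567890 >"$remote/refs/heads/master"

get_remote_ref() {
	# takes a string (a ref name), returns a hash on stdout --
	# no temporary file for the caller to manage
	cat "$1/refs/heads/$2"
}

hash=$(get_remote_ref "$remote" master)
echo "$hash"
rm -rf "$remote"
```

The point is exactly the one above: the caller gets a value back, and nothing touches the visible parts of the repository until the objects are verified.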

	-Daniel
*This .sig left intentionally blank*


^ permalink raw reply	[relevance 0%]

* Re: cogito - how do I ???
  @ 2005-05-22  7:14  5%     ` Sam Ravnborg
  2005-05-22 16:23  5%       ` Linus Torvalds
  0 siblings, 1 reply; 200+ results
From: Sam Ravnborg @ 2005-05-22  7:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sean, git

> > > 1) Something similar to "bk changes -R". I use this to see what has
> > > happened upstream - to see if I really want to merge stuff.
> > 
> > Not sure what bk did here, but you can do something like:
> > 
> > cg-pull origin
> > cg-log -c -r origin
> 
> In the raw git interfaces, you'd basically have to do the same thing that
> "git-pull-script" does, except that _instead_ of calling the
> git-resolve-script thing, you'd do
> 
> 	git-rev-tree MERGE_HEAD ^HEAD | git-diff-tree -v -m -s --stdin
That looks ... long.
I can teach my fingers to use cg-log, but the above is just too much 
to type/remember for a daily operation.

In bk the usage pattern was to check what was in mainline _before_
fetching and merging. So it seems that with git/cogito one
has to fetch the changes, inspect them, and then decide to apply or not.

When the fetched changes stay in MERGE_HEAD I assume my work, when
committed, will be based on top of HEAD - so I do not have to know
if I have fetched some (unmerged) updates.

> to show what is in the downloaded MERGE_HEAD but not in HEAD.
> 
> > > 2) Export of individual patches. "bk export -tpatch -r1.2345"
> > > I have no public git repository yet so I have to feed changes as
> > > plain patches. Browsing cg-* I did not find the command to do this.
> > 
> > cg-diff -p -r SHA1
> 
> And again, without the porcelain this is:
> 
> 	git-diff-tree -v -p <name>

The key here is the SHA1. I hoped to avoid specifying SHA1's with
cogito; I so often miss one character when doing copy'n'paste.


Thanks all for the replies. Now I feel a bit more confident in this.


	Sam

^ permalink raw reply	[relevance 5%]

* Re: cogito - how do I ???
  2005-05-22  7:14  5%     ` Sam Ravnborg
@ 2005-05-22 16:23  5%       ` Linus Torvalds
  0 siblings, 0 replies; 200+ results
From: Linus Torvalds @ 2005-05-22 16:23 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: Sean, git



On Sun, 22 May 2005, Sam Ravnborg wrote:
>
> > > > 1) Something similar to "bk changes -R". I use this to see what has
> > > > happened upstream - to see if I really want to merge stuff.
> > > 
> > > Not sure what bk did here, but you can do something like:
> > > 
> > > cg-pull origin
> > > cg-log -c -r origin
> > 
> > In the raw git interfaces, you'd basically have to do the same thing that
> > "git-pull-script" does, except that _instead_ of calling the
> > git-resolve-script thing, you'd do
> > 
> > 	git-rev-tree MERGE_HEAD ^HEAD | git-diff-tree -v -m -s --stdin
> That looks ... long.

That's why people don't generally use git natively. 

I want to teach people what the "raw" interfaces are, not because you're 
supposed to use them, but because I expect that the raw ones are useful 
for scripting.

> In bk the usage pattern was to check what was in mainline _before_
> fetching and merging.

In git (and cogito, although it's less obvious), the "fetching" is totally 
separate from the "merging". In BK, the two were the same - you couldn't 
merge without fetching, and you couldn't fetch without merging.

> So it seems that with git/cogito one has to fetch the changes, inspect
> them, and then decide to apply or not.

Well, that's really what you ended up largely doing in BK too, since even
if it _looks_ like you first inspect them with "bk changes -R", the fact
is, in order to do that, you do have to _fetch_ the data first. It's just
that BK (a) fetched just the changeset changes (I think) and (b) then
threw the data away.

With git, you can also do the "fetch just the changeset changes", since if 
you use "git-http-pull" you can instruct it to first _only_ fetch the 
actual commits, and forget about the actual data. But since git considers 
"fetching" and "merging" totally separate phases, it's up to your scripts 
whether they leave the objects around or not afterwards. The normal 
operation is to leave everything around, since that means that if/when you 
do decide to merge, you already have the data, and you don't need to 
re-fetch.

If you decide to throw it away, you first remove the reference to the
stuff you pulled, and then use "git-prune-script" to get rid of the
objects you used. Right now that is admittedly quite expensive, but it's
considered a "rare" thing to do (since even if you decide not to merge,
the extra objects never hurt - you can prune things once a week if you
care).

> When the fetched changes stay in MERGE_HEAD I assume my work when
> committed will be based on top of HEAD - so I do not have to know
> if I have fetched some (unmerged) updates.

Yes. Note that MERGE_HEAD ends up being just a totally temporary reference
to the top (that you decided not to merge). It has no meaning for git
itself, and the naming and meaning is purely as a git-pull-script (and
git-resolve-script) internal temporary thing.

> > And again, without the porcelain this is:
> > 
> > 	git-diff-tree -v -p <name>
> 
> The key here is the SHA1. I hoped to avoid specifying SHA1's with
> cogito, I so often miss one character when doing copy'n'paste.

You don't have to use the raw SHA1. git understands tags and references, 
so for example, if you take the "fetch" part of git-pull-script (I really 
should split it up into "git-fetch-script"), then you can, for example, do

	git-diff-tree -v -p MERGE_HEAD

to see the top of the (unmerged) thing. Similarly, doing a

	git-rev-list MERGE_HEAD | git-diff-tree --stdin -v -s

will give you the changelog for the unmerged side. No SHA1's necessary.

Of course, if you don't have a reference to the thing you want to look at, 
you do need to figure out the SHA1 some way. But for example, gitk will 
work fine for the unmerged stuff too - ie you can do

	gitk MERGE_HEAD ^HEAD

_before_ you merge, and get all the nice graphical tools to inspect what 
the hell there is that is new there..

Notice how this "fetch is independent of merge" thing thus means that you 
can do a lot _more_ than "bk changes -R" ever did. But yes, it's a bit 
more complex too (so normally you'd probably only use the porcelain 
layer).

		Linus

^ permalink raw reply	[relevance 5%]

* Re: [PATCH] The diff-raw format updates.
  @ 2005-05-22 18:35  2%     ` Linus Torvalds
  0 siblings, 0 replies; 200+ results
From: Linus Torvalds @ 2005-05-22 18:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List



On Sat, 21 May 2005, Junio C Hamano wrote:
>
> Update the diff-raw format as Linus and I discussed, except that
> it does not use sequence of underscore '_' letters to express
> nonexistence.  All '0' mode is used for that purpose instead.
> 
> The new diff-raw format can express rename/copy

Having looked at this, I have to disagree.

It can _almost_ express rename/copy, but you can't tell the two apart. In 
both cases you have two different modes, two different SHA1's, and two 
different filenames.

Also, while you can trivially tell whether a file is deleted or new (look 
at the 000000... SHA1), it is pretty illogical to give a "filename" for 
the non-existent side, as in the line

:000000 100644 0000000000000000000000000000000000000000 25ab9eda939ad92bb746c2419d083b1e52117a56	diffcore-pathspec.c	diffcore-pathspec.c

Finally, having now looked at it some more, I've come to realize that it's 
actually pretty hard to tell the different cases apart visually (is it a 
rename or just a change), because the full pathnames can be so long that 
it's not at all immediately obvious.

Anyway, I think we can trivially tweak the filename output to handle all 
these problems.

I'd suggest:

 - we'd continue to have two "filename" fields, with the existing
   termination, but they aren't pure filenames any more, they are just
   tab-separated (or zero-separated) "source" and "destination" fields.

 - if no filename exists (ie the source side of a new file, or the 
   destination side of a deleted file), output "/dev/null". In other
   words, a nonexistent file is _always_ associated with mode 000000, SHA1
   00000..  and a "name field" of "/dev/null".

 - ONLY IF HUMAN-READABLE: if the destination filename is the same as the
   source, drop it (and the tab) completely. This just makes things so
   much more readable, and it's still parseable, because the 
   line-termination is different from the inter-file termination.

   NOTE! In the zero-terminated format, you can't do this, since you 
   wouldn't know where the line ended. You might drop the name completely, 
   but you'd have to have two NUL bytes. I'd argue that since this format 
   isn't human-readable anyway, you might just want to keep the filename 
   the same.

 - in all other cases: if the file is new, prepend a "+", if the file is 
   old, prepend a "*", and if the file goes away, prepend a "-". In other 
   words, the actual pathname (if it exists) always starts at the second
   character and is always prepended by _something_ (ie there is no 
   ambiguity with pathnames that start in -/+/*).

The above has a few nice properties, notably that you can parse the first
character of the name field, and you always know what's up:

 - '/' is always "/dev/null" aka "no file"
 - '+' is always "added file"
 - '-' is always "removed file"
 - '*' is always "existing file"
 - '\0' (ie empty) is always "same filename as source"
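
Those first-character rules would make the name field trivially machine-parseable. A throwaway shell classifier, with made-up sample fields, might look like:

```shell
# Classify a (proposed) diff-raw "name field" by its first character,
# following the scheme above.  Sample fields are invented for illustration.
classify() {
	case "$1" in
	/*)  echo "no file" ;;
	+*)  echo "added" ;;
	-*)  echo "removed" ;;
	\**) echo "existing" ;;
	"")  echo "same as source" ;;
	esac
}

classify "/dev/null"             # no file
classify "+diffcore-pathspec.c"  # added
classify "*git-pull-script"      # existing
```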

So for the above "create" event, it would look like

	:000000 100644 0000.. 25ab..	/dev/null	+diffcore-pathspec.c

which is visually quite obviously a create. Similarly, deletes are also 
visually pretty obvious:

	:100644 000000 25ab.. 0000..	-diffcore-pathspec.c	/dev/null

while a "copy" would be (git-pull-script stays around, so it gets a "*"):

	:100755 100755 bd89.. 17f2..	*git-pull-script	+git-fetch-script

and a "rename" would be:

	:100644 100644 51bb.. 51bb..	-diff-tree-helper.c	+diff-helper.c

(ie the difference is in the source file having a "-" in front of it 
instead of a "*").

A regular modification would be

	:100644 100644 bcd3.. c05b..	*Documentation/git-fsck-cache.txt

which is also very visually distinct from the other cases.

What do you think?

		Linus

^ permalink raw reply	[relevance 2%]

* [PATCH] Add git-format-patch-script.
@ 2005-05-31 21:50 10% Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-05-31 21:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

The script takes two HEADs and optionally one directory name,
and prepares a patch file for each commit between the named
HEADS in a separate file in the named directory.  The directory
defaults to the $cwd.
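
The per-patch filename convention the script uses (a zero-padded sequence number plus a sanitized title) can be illustrated in isolation:

```shell
# Illustration of the patch filename scheme: sanitize the commit title
# the way the script's sed expression does, then number it.
title="Add git-format-patch-script"
safe=$(echo "$title" | sed -e 's/[^-a-z.A-Z_0-9]/-/g')
file=$(printf '%04d-%s.txt' 1 "$safe")
echo "$file"   # 0001-Add-git-format-patch-script.txt
```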

This is in the same spirit as the recent additions of helper
scripts to make core GIT plumbing more comfortable to use.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 Makefile                |    3 +-
 git-format-patch-script |   66 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -22,7 +22,8 @@ INSTALL=install
 
 SCRIPTS=git-apply-patch-script git-merge-one-file-script git-prune-script \
 	git-pull-script git-tag-script git-resolve-script git-whatchanged \
-	git-deltafy-script git-fetch-script git-status-script git-commit-script
+	git-deltafy-script git-fetch-script git-status-script \
+	git-commit-script git-format-patch-script
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-format-patch-script b/git-format-patch-script
new file mode 100755
--- /dev/null
+++ b/git-format-patch-script
@@ -0,0 +1,66 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+junio="$1"
+linus="$2"
+outdir="${3:-./}"
+
+tmp=.tmp-series$$
+trap 'rm -f $tmp-*' 0 1 2 3 15
+
+series=$tmp-series
+
+titleScript='
+	1,/^$/d
+	: loop
+	/^$/b loop
+	s/[^-a-z.A-Z_0-9]/-/g
+	s/^--*//g
+	s/--*$//g
+	s/---*/-/g
+	s/$/.txt/
+        s/\.\.\.*/\./g
+	q
+'
+
+_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
+_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
+stripCommitHead='/^'"$_x40"' (from '"$_x40"')$/d'
+
+O=
+if test -f .git/patch-order
+then
+    O=-O.git/patch-order
+fi
+git-rev-list "$junio" "$linus" >$series
+total=`wc -l <$series`
+i=$total
+while read commit
+do
+    title=`git-cat-file commit "$commit" | sed -e "$titleScript"`
+    num=`printf "%d/%d" $i $total`
+    file=`printf '%04d-%s' $i "$title"`
+    i=`expr "$i" - 1`
+    echo "$file"
+    {
+	mailScript='
+	1,/^$/d
+	: loop
+	/^$/b loop
+	s|^|[PATCH '"$num"'] |
+	: body
+	p
+	n
+	b body'
+
+	git-cat-file commit "$commit" | sed -ne "$mailScript"
+	echo '---'
+	echo
+	git-diff-tree -p $O "$commit" | git-apply --stat
+	echo
+	git-diff-tree -p $O "$commit" | sed -e "$stripCommitHead"
+	echo '------------'
+    } >"$outdir$file"
+done <$series
+
------------


^ permalink raw reply	[relevance 10%]

* [COGITO PATCH] Heads and tags in subdirectories
@ 2005-05-31 22:00 16% Santi Béjar
    0 siblings, 1 reply; 200+ results
From: Santi Béjar @ 2005-05-31 22:00 UTC (permalink / raw)
  To: Git Mailing List


Keep heads and tags in their respective subdirectories, named after
the branch. This fixes the case where two repositories have tags
with the same name.

Add a "-a" flag to cg-pull to download all of the repository's heads,
so you can now do a "cg-log -r repo#branch" (a cg-Xnormid repo#branch
job).

The transition is automatic when you do the first "cg-pull repo".

Signed-off-by: "Santi Béjar" <sbejar@gmmail.es>

 cg-Xnormid |   14 ++++++++++-
 cg-commit  |    7 +++++
 cg-pull    |   72 +++++++++++++++++++++++++++++++++++++------------------------
 3 files changed, 62 insertions(+), 31 deletions(-)

diff --git a/cg-Xnormid b/cg-Xnormid
--- a/cg-Xnormid
+++ b/cg-Xnormid
@@ -16,15 +16,25 @@
 
 id="$1"
 
+repo=$(echo $id | cut -d '#' -f 1)
+(echo $repo | egrep -qv '[^a-zA-Z0-9_.@!:-]') || \
+	die "name contains invalid characters"
+id=$(echo $id | sed 's@#@/@')
+
 if [ ! "$id" ] || [ "$id" = "this" ] || [ "$id" = "HEAD" ]; then
 	read id < "$_git/HEAD"
 
-elif [ -r "$_git/refs/tags/$id" ]; then
+elif [ -r "$_git/refs/tags/$id" ] && [ ! -d "$_git/refs/tags/$id" ]; then
 	read id < "$_git/refs/tags/$id"
 
-elif [ -r "$_git/refs/heads/$id" ]; then
+elif [ -r "$_git/refs/heads/$id" ] && [ ! -d "$_git/refs/heads/$id" ]; then
 	read id < "$_git/refs/heads/$id"
 
+elif [ -r "$_git/branches/$id" ]; then
+	repobranch=$(cat "$_git/branches/$id" | cut -d '#' -f 2 -s)
+	repobranch=${repobranch:-master}
+	read id < "$_git/refs/heads/$id/$repobranch"
+
 # Short id's must be lower case and at least 4 digits.
 elif [[ "$id" == [0-9a-z][0-9a-z][0-9a-z][0-9a-z]* ]]; then
 	idpref=${id:0:2}
diff --git a/cg-commit b/cg-commit
--- a/cg-commit
+++ b/cg-commit
@@ -141,7 +141,12 @@ if [ "$merging" ]; then
 	[ "$msgs" ] && echo -n 'Merge with '
 	[ -s $_git/merging-sym ] || cp $_git/merging $_git/merging-sym
 	for sym in $(cat $_git/merging-sym); do
-		uri=$(cat $_git/branches/$sym)
+		repo=$(echo $sym | cut -d '#' -f 1)
+		branch=$(echo $sym | cut -d '#' -f 2 -s)
+		uri=$(cat $_git/branches/$repo)
+		uribranch=$(echo $uri | cut -d '#' -f 2 -s)
+		[ -z "$uribranch" ] && [ -n "$branch" ] &&
+		[ "$branch" != master ] && uri=${uri}#$branch
 		[ "$uri" ] || uri="$sym"
 		echo "$uri" >>$LOGMSG
 		[ "$msgs" ] && echo "$uri"
diff --git a/cg-pull b/cg-pull
--- a/cg-pull
+++ b/cg-pull
@@ -6,23 +6,41 @@
 # Takes the branch name as an argument, defaulting to "origin".
 #
 # See `cg-branch-add` for some description.
+#
+# OPTIONS
+# -------
+# -a::
+#       Pull all the heads from the repository.
 
-USAGE="cg-pull [BRANCH_NAME]"
+USAGE="cg-pull [-a] [BRANCH_NAME]"
 
 . ${COGITO_LIB}cg-Xlib
 
-name=$1
-
+[ "$1" == "-a" ] && all=yes && shift
+name=$1 && shift
 
 [ "$name" ] || { [ -s $_git/refs/heads/origin ] && name=origin; }
 [ "$name" ] || die "where to pull from?"
-uri=$(cat "$_git/branches/$name" 2>/dev/null) || die "unknown branch: $name"
 
-rembranch=master
+repo=$(echo $name | cut -d '#' -f 1)
+repobranch=$(echo $name | cut -s -d '#' -f 2)
+
+uri=$(cat "$_git/branches/$name" 2>/dev/null) || die "unknown branch: $name"
 if echo "$uri" | grep -q '#'; then
+	[ -z "$repobranch" ] && repobranch=$(echo $uri | cut -d '#' -f 2)
 	rembranch=$(echo $uri | cut -d '#' -f 2)
 	uri=$(echo $uri | cut -d '#' -f 1)
 fi
+repobranch=${repobranch:-master}
+branch=$repo/$repobranch
+[ "$all" ] && repobranch=
+
+# So far we have:
+# $repo       = name of the repository
+# $uri        = uri of the repository
+# $repobranch = name of the branch in the repository
+#               empty if we want all the branches
+# $branch     = name of the local branch in refs/heads/
 
 pull_progress() {
 	percentage=""
@@ -232,39 +250,37 @@ fi
 
 
 orig_head=
-[ -s "$_git/refs/heads/$name" ] && orig_head=$(cat "$_git/refs/heads/$name")
-
+[ -s "$_git/refs/heads/$branch" ] && orig_head=$(cat "$_git/refs/heads/$branch")
 
-mkdir -p $_git/refs/heads
-rsyncerr=
-$fetch -i "$uri/refs/heads/$rembranch" "$_git/refs/heads/$name" || rsyncerr=1
-if [ "$rsyncerr" ]; then
-	rsyncerr=
-	$fetch -s "$uri/heads/$rembranch" "$_git/refs/heads/$name" || rsyncerr=1
-fi
-if [ "$rsyncerr" ] && [ "$rembranch" = "master" ]; then
-	rsyncerr=
-	$fetch -s "$uri/HEAD" "$_git/refs/heads/$name" || rsyncerr=1
+# 2005/05 Convert old layout
+[ -f $_git/refs/heads/$repo ] && orig_head=$(cat $_git/refs/heads/$repo) &&
+rm -f $_git/refs/heads/$repo
+
+mkdir -p $_git/refs/heads/$repo
+if [ "$repobranch" ] ; then
+    $fetch -i "$uri/refs/heads/$repobranch" "$_git/refs/heads/$branch" ||
+    $fetch -s "$uri/heads/$repobranch" "$_git/refs/heads/$branch" ||
+    { [ "$repobranch" = "master" ] && $fetch -s "$uri/HEAD" "$_git/refs/heads/$branch"; } ||
+    rsyncerr=1
+else
+    $fetch -i -d "$uri/refs/heads" "$_git/refs/heads/$repo" ||
+    $fetch -s -d "$uri/heads" "$_git/refs/heads/$repo" ||
+    rsyncerr=1
 fi
-[ "$rsyncerr" ] && die "unable to get the head pointer of branch $rembranch"
+[ "$rsyncerr" ] && die "unable to get the head pointer of branch $repobranch"
 
 [ -d $_git_objects ] || mkdir -p $_git_objects
-$pull "$name" "$uri" || die "objects pull failed"
+$pull "$branch" "$uri" || die "objects pull failed"
 
-# FIXME: Warn about conflicting tag names?
 # XXX: We now throw stderr to /dev/null since not all repositories
 # may have tags/ and users were confused by the harmless errors.
-[ -d $_git/refs/tags ] || mkdir -p $_git/refs/tags
+[ -d $_git/refs/tags/$repo ] || mkdir -p $_git/refs/tags/$repo
 rsyncerr=
-$fetch -i -s -u -d "$uri/refs/tags" "$_git/refs/tags" || rsyncerr=1
-if [ "$rsyncerr" ]; then
-	rsyncerr=
-	$fetch -i -s -u -d "$uri/tags" "$_git/refs/tags" || rsyncerr=1
-fi
+$fetch -i -s -u -d "$uri/refs/tags" "$_git/refs/tags/$repo" ||
+$fetch -i -s -u -d "$uri/tags" "$_git/refs/tags/$repo" || rsyncerr=1
 [ "$rsyncerr" ] && echo "unable to get tags list (non-fatal)" >&2
 
-
-new_head=$(cat "$_git/refs/heads/$name")
+new_head=$(cat "$_git/refs/heads/$branch")
 
 if [ ! "$orig_head" ]; then
 	echo "New branch: $new_head"


^ permalink raw reply	[relevance 16%]

* [PATCH] cg-clone fails to clone tags
@ 2005-06-01  1:58 20% Miguel Bazdresch
  0 siblings, 0 replies; 200+ results
From: Miguel Bazdresch @ 2005-06-01  1:58 UTC (permalink / raw)
  To: git

Hi,

I noticed that cg-clone fails to clone tags when the source and
destination directories are on different filesystems (this is on a local
clone). The command produces warnings of the type:

cp: cannot create link `.git/refs/tags/git-pasky-0.6.2': Invalid cross-device link
`/home/miguel/bin/cogito/.git/refs/tags/git-pasky-0.6.3' -> `.git/refs/tags/git-pasky-0.6.3'

I think the culprit is the "l" flag passed to cp in fetch_local() in
the cg-pull script. After thinking a bit about how to solve this, I decided on
a "blunt force" approach, where the cp is done twice, once with "l" and
once without it. Only when both fail is a warning issued about the
failure to get the tag list.

This has the unfortunate side effect of also removing the "u" flag to
cp, but I don't think it's a big deal (cg-pull adds the "l" and "u"
flags at the same time).
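
The fix reduces to a hard-link-first copy with a plain-copy fallback; a standalone sketch of the idea (paths invented):

```shell
# Try a hard-link copy first (cheap, but only works within a single
# filesystem); fall back to a plain copy if linking fails.
copy_ref() {
	cp -l "$1" "$2" 2>/dev/null || cp "$1" "$2"
}

src=$(mktemp) && echo deadbeef >"$src"
dst=$(mktemp -d)/tag
copy_ref "$src" "$dst"
cat "$dst"   # deadbeef
```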

Following is the output of cg-mkpatch:

---

Currently, cg-clone fails to clone the tag list if the destination
and source directories are in different filesystems. This is caused by
the "l" flag used by fetch_local.

This patch retries the copy without the "l" flag if the first attempt fails.

Signed-off-by: Miguel Bazdresch <miguelb@ieee.org>

---
commit 5823514635ca67be41914d9294081353b70272a4
tree 19092122a45366f46b7f140d411f875000ff2ba7
parent 20e473c9afd8b5d2d549b0e7881473600beb9c37
author Miguel Bazdresch <miguelb@ieee.org> Tue, 31 May 2005 20:23:44
-0500
committer Miguel Bazdresch <miguelb@ieee.org> Tue, 31 May 2005 20:23:44
-0500

 cg-pull |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/cg-pull b/cg-pull
--- a/cg-pull
+++ b/cg-pull
@@ -259,8 +259,16 @@ rsyncerr=
 $fetch -i -s -u -d "$uri/refs/tags" "$_git/refs/tags" || rsyncerr=1
 if [ "$rsyncerr" ]; then
 rsyncerr=
-	$fetch -i -s -u -d "$uri/tags" "$_git/refs/tags" || rsyncerr=1
+	$fetch -i -s -d "$uri/refs/tags" "$_git/refs/tags" || rsyncerr=1
 fi
+if [ "$rsyncerr" ]; then
+rsyncerr=
+	$fetch -i -s -u -d "$uri/tags" "$_git/refs/tags" || rsyncerr=1
+fi
+if [ "$rsyncerr" ]; then
+rsyncerr=
+	$fetch -i -s -d "$uri/tags" "$_git/refs/tags" || rsyncerr=1
+fi
 [ "$rsyncerr" ] && echo "unable to get tags list (non-fatal)" >&2

^ permalink raw reply	[relevance 20%]

* [PATCH] One Git To Rule Them All - Prep 2
@ 2005-06-01  5:59 25% Jason McMullan
  0 siblings, 0 replies; 200+ results
From: Jason McMullan @ 2005-06-01  5:59 UTC (permalink / raw)
  To: git

one-git prep patch 2/2

Make pull.c usable in one-git by making 'fetch' an overridable function pointer,
instead of an external function.

Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com>

diff --git a/http-pull.c b/http-pull.c
--- a/http-pull.c
+++ b/http-pull.c
@@ -39,7 +39,7 @@ static size_t fwrite_sha1_file(void *ptr
 	return size;
 }
 
-int fetch(unsigned char *sha1)
+static int my_fetch(unsigned char *sha1)
 {
 	char *hex = sha1_to_hex(sha1);
 	char *filename = sha1_file_name(sha1);
@@ -98,6 +98,8 @@ int main(int argc, char **argv)
 	char *url;
 	int arg = 1;
 
+	fetch = my_fetch;
+
 	while (arg < argc && argv[arg][0] == '-') {
 		if (argv[arg][1] == 't') {
 			get_tree = 1;
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -11,7 +11,7 @@ static int use_filecopy = 1;
 
 static char *path;
 
-int fetch(unsigned char *sha1)
+static int my_fetch(unsigned char *sha1)
 {
 	static int object_name_start = -1;
 	static char filename[PATH_MAX];
@@ -87,6 +87,8 @@ int main(int argc, char **argv)
 	char *commit_id;
 	int arg = 1;
 
+	fetch = my_fetch;
+
 	while (arg < argc && argv[arg][0] == '-') {
 		if (argv[arg][1] == 't')
 			get_tree = 1;
diff --git a/pull.c b/pull.c
--- a/pull.c
+++ b/pull.c
@@ -14,6 +14,14 @@ static const char commitS[] = "commit";
 static const char treeS[] = "tree";
 static const char blobS[] = "blob";
 
+int null_fetch(unsigned char *sha1)
+{
+	fprintf(stderr,"fetch() routine not implemented.\n");
+	return -1;
+}
+
+int (*fetch)(unsigned char *sha1) = null_fetch;
+
 void pull_say(const char *fmt, const char *hex) {
 	if (get_verbosely)
 		fprintf(stderr, fmt, hex);
diff --git a/pull.h b/pull.h
--- a/pull.h
+++ b/pull.h
@@ -2,7 +2,7 @@
 #define PULL_H
 
 /** To be provided by the particular implementation. **/
-extern int fetch(unsigned char *sha1);
+extern int (*fetch)(unsigned char *sha1);
 
 /** Set to fetch the target tree. */
 extern int get_tree;
diff --git a/rpull.c b/rpull.c
--- a/rpull.c
+++ b/rpull.c
@@ -6,7 +6,7 @@
 static int fd_in;
 static int fd_out;
 
-int fetch(unsigned char *sha1)
+static int my_fetch(unsigned char *sha1)
 {
 	int ret;
 	write(fd_out, sha1, 20);
@@ -22,6 +22,8 @@ int main(int argc, char **argv)
 	char *url;
 	int arg = 1;
 
+	fetch = my_fetch;
+
 	while (arg < argc && argv[arg][0] == '-') {
 		if (argv[arg][1] == 't') {
 			get_tree = 1;


^ permalink raw reply	[relevance 25%]

* [PATCH] One Git To Rule Them All - Final
@ 2005-06-01  6:00  3% Jason McMullan
  0 siblings, 0 replies; 200+ results
From: Jason McMullan @ 2005-06-01  6:00 UTC (permalink / raw)
  To: git

one-git

One git to rule them all... 

This patch make a 'git' binary that has all of the git-* commands linked in,
instead of a zillion little git-* commands. As an interesting side effect,
documentation for each command is required in Documention/git-<mumble>.txt
for compilation to succeed. ;^)
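
The name-based dispatch that makes this work can be sketched in shell (the real version is the C lookup_basename() in git.c below):

```shell
# Shell sketch of the multiplexer's name lookup: strip any directory
# part and the "git-" prefix, leaving the subcommand to dispatch on.
subcommand() {
	cmd=${1##*/}      # basename
	cmd=${cmd#git-}   # drop the "git-" prefix, if present
	echo "$cmd"
}

subcommand /usr/local/bin/git-cat-file   # cat-file
subcommand git-diff-tree                 # diff-tree
```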

The install still creates git-<mumble> commands in $HOME/bin, so there should
be no compatibility issues.

This patch saves 2.4M on my Athlon 64 system - a 7x size reduction. Whee!

REQUIRES one-git prep patches 1 and 2

Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com>

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -24,19 +24,48 @@ SCRIPTS=git-apply-patch-script git-merge
 	git-pull-script git-tag-script git-resolve-script git-whatchanged \
 	git-deltafy-script git-fetch-script git-status-script git-commit-script
 
-PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
-	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
-	git-checkout-cache git-diff-tree git-rev-tree git-ls-files \
-	git-check-files git-ls-tree git-merge-base git-merge-cache \
-	git-unpack-file git-export git-diff-cache git-convert-cache \
-	git-http-pull git-rpush git-rpull git-rev-list git-mktag \
-	git-diff-helper git-tar-tree git-local-pull git-write-blob \
-	git-get-tar-commit-id git-mkdelta git-apply git-stripspace
-
-all: $(PROG)
-
-install: $(PROG) $(SCRIPTS)
-	$(INSTALL) $(PROG) $(SCRIPTS) $(dest)$(bin)
+# The following order determines 'git help' command order.
+# The general idea is 'pull, push, inspect, commit, db, misc'
+PROG=   \
+	git-local-pull \
+	git-http-pull \
+	git-rpull \
+	git-rpush \
+	git-rev-tree \
+	git-rev-list \
+	git-export \
+	git-read-tree \
+	git-checkout-cache \
+	git-update-cache \
+	git-ls-tree \
+	git-ls-files \
+	git-cat-file \
+	git-unpack-file \
+	git-diff-files \
+	git-diff-cache \
+	git-diff-tree \
+	git-merge-base \
+	git-merge-cache \
+	git-apply \
+	git-check-files \
+	git-write-tree \
+	git-commit-tree \
+	git-mktag \
+	git-init-db \
+	git-fsck-cache \
+	git-convert-cache \
+	git-mkdelta \
+	git-tar-tree \
+	git-get-tar-commit-id \
+	git-write-blob \
+	git-diff-helper \
+	git-stripspace
+
+all: git $(PROG)
+
+install: git $(PROG) $(SCRIPTS)
+	$(INSTALL) git $(SCRIPTS) $(dest)$(bin)
+	for prog in $(PROG); do ln -sf git $(dest)$(bin)/$$prog; done
 
 LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o \
 	 tag.o delta.o date.o index.o diff-delta.o patch-delta.o
@@ -52,9 +81,16 @@ LIB_OBJS += diff.o diffcore-rename.o dif
 
 LIB_OBJS += gitenv.o
 
+LIB_OBJS += pull.o
+
+LIB_OBJS += rsh.o
+
 LIBS = $(LIB_FILE)
 LIBS += -lz
 
+# For git-http-pull
+LIBS += -lcurl
+
 ifdef MOZILLA_SHA1
   SHA1_HEADER="mozilla-sha1/sha1.h"
   LIB_OBJS += mozilla-sha1/sha1.o
@@ -79,42 +115,39 @@ test-date: test-date.c date.o
 test-delta: test-delta.c diff-delta.o patch-delta.o
 	$(CC) $(CFLAGS) -o $@ $^
 
-git-%: %.c $(LIB_FILE)
-	$(CC) $(CFLAGS) -o $@ $(filter %.c,$^) $(LIBS)
+git-%.o: %.c #Makefile
+	$(CC) $(CFLAGS) -c -o $@ $*.c -Dmain=git_$(subst -,_,$*) -Ddesc=git_$(subst -,_,$*)_desc
+
+$(PROG):
+	ln -s git $@
+
+git.o: git-commands.h
 
-git-update-cache: update-cache.c
-git-diff-files: diff-files.c
-git-init-db: init-db.c
-git-write-tree: write-tree.c
-git-read-tree: read-tree.c
-git-commit-tree: commit-tree.c
-git-cat-file: cat-file.c
-git-fsck-cache: fsck-cache.c
-git-checkout-cache: checkout-cache.c
-git-diff-tree: diff-tree.c
-git-rev-tree: rev-tree.c
-git-ls-files: ls-files.c
-git-check-files: check-files.c
-git-ls-tree: ls-tree.c
-git-merge-base: merge-base.c
-git-merge-cache: merge-cache.c
-git-unpack-file: unpack-file.c
-git-export: export.c
-git-diff-cache: diff-cache.c
-git-convert-cache: convert-cache.c
-git-http-pull: http-pull.c pull.c
-git-local-pull: local-pull.c pull.c
-git-rpush: rsh.c
-git-rpull: rsh.c pull.c
-git-rev-list: rev-list.c
-git-mktag: mktag.c
-git-diff-helper: diff-helper.c
-git-tar-tree: tar-tree.c
-git-write-blob: write-blob.c
-git-mkdelta: mkdelta.c
-git-stripspace: stripspace.c
+git: git.o $(patsubst %,%.o,$(PROG)) $(LIBS)
+	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
+
+git-commands.h: Makefile $(patsubst %,Documentation/%.txt,$(PROG))
+	echo -n >git-commands.h
+	for prog in $(subst -,_,$(PROG)); do \
+		echo "extern int $${prog}(int argc, char **argv);" >>git-commands.h ; \
+		echo "extern const char *$${prog}_desc;" >>git-commands.h ; \
+	done
+	echo "struct git_command { ">>git-commands.h
+	echo "	const char *command;" >>git-commands.h
+	echo "	int (*main)(int argc,char **argv);" >>git-commands.h
+	echo "	const char *desc; } commands[]={" >>git-commands.h
+	for cmd in $(patsubst git-%,%,$(PROG)); do \
+		prog=`echo $$cmd | sed -e 's/-/_/g'` ; \
+		desc=`grep "^git-$$cmd - " Documentation/git-$$cmd.txt | cut -d' ' -f3-` ; \
+		desc=`echo "$$desc" | sed -e 's/"/\\\\"/g'` ; \
+		echo -n "	{ " >>git-commands.h ; \
+		echo -n ".command = \"$${cmd}\", " >>git-commands.h ; \
+		echo -n ".main = git_$${prog}, " >>git-commands.h ; \
+	        echo -n ".desc = \"$$desc\", " >>git-commands.h ; \
+		echo "}," >>git-commands.h ; \
+	done
+	echo "};" >>git-commands.h
 
-git-http-pull: LIBS += -lcurl
 
 # Library objects..
 blob.o: $(LIB_H)
@@ -138,7 +171,7 @@ test: all
 	$(MAKE) -C t/ all
 
 clean:
-	rm -f *.o mozilla-sha1/*.o ppc/*.o $(PROG) $(LIB_FILE)
+	rm -f *.o mozilla-sha1/*.o ppc/*.o git $(PROG) $(LIB_FILE)
 	$(MAKE) -C Documentation/ clean
 
 backup: clean
diff --git a/git.c b/git.c
new file mode 100644
--- /dev/null
+++ b/git.c
@@ -0,0 +1,57 @@
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+
+#include "git-commands.h"
+
+#define COMMAND_LEN	(sizeof(commands)/sizeof(commands[0]))
+
+typedef int (*func)(int argc, char **argv);
+
+func lookup_basename(const char *program)
+{
+	const char *cp;
+	int i;
+
+	cp=strrchr(program,'/');
+	if (cp==NULL)
+		cp=program;
+	else
+		cp++;
+
+	if (strncmp(cp,"git-",4)==0)
+		cp+=4;
+
+	for (i = 0; i < COMMAND_LEN; i++)
+		if (!strcmp(cp,commands[i].command))
+			return commands[i].main;
+
+	return NULL;
+}
+
+int main(int argc, char **argv)
+{
+	int (*do_main)(int argc, char **argv);
+	int i;
+
+	do_main = lookup_basename(argv[0]);
+
+	if (!do_main && argc > 1 ) {
+		do_main = lookup_basename(argv[1]);
+		if (do_main) {
+			argc--;
+			argv++;
+		}
+	}
+
+	if (do_main) {
+		return do_main(argc, argv);
+	}
+
+	fprintf(stderr,"GIT Commands:\n\n");
+	for (i = 0; i < COMMAND_LEN; i++)
+		fprintf(stderr,"\t%-24s%s\n",commands[i].command,commands[i].desc);
+	fprintf(stderr,"\n");
+
+	return 1;
+}

^ permalink raw reply	[relevance 3%]

* [PATCH] Add -d flag to git-pull-* family.
  @ 2005-06-01  8:24 15%     ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-01  8:24 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Linus Torvalds, Git Mailing List

When a remote repository is deltified, we need to get the
objects that a deltified object we want to obtain is based upon.
Since checking the representation type of every object we retrieve
from the remote side may be costly, this is made into a separate
option -d; -a implies it for convenience and safety.

Rsync transport does not have this problem since it fetches
everything the remote side has.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Documentation/git-http-pull.txt  |    4 +++-
Documentation/git-local-pull.txt |    4 +++-
Documentation/git-rpull.txt      |    4 +++-
http-pull.c                      |    5 ++++-
local-pull.c                     |    5 ++++-
pull.c                           |   15 +++++++++++++++
pull.h                           |    3 +++
rpull.c                          |    5 ++++-
8 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-http-pull.txt b/Documentation/git-http-pull.txt
--- a/Documentation/git-http-pull.txt
+++ b/Documentation/git-http-pull.txt
@@ -9,7 +9,7 @@ git-http-pull - Downloads a remote GIT r
 
 SYNOPSIS
 --------
-'git-http-pull' [-c] [-t] [-a] [-v] commit-id url
+'git-http-pull' [-c] [-t] [-a] [-v] [-d] commit-id url
 
 DESCRIPTION
 -----------
@@ -17,6 +17,8 @@ Downloads a remote GIT repository via HT
 
 -c::
 	Get the commit objects.
+-d::
+	Get objects that deltified objects are based upon.
 -t::
 	Get trees associated with the commit objects.
 -a::
diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt
--- a/Documentation/git-local-pull.txt
+++ b/Documentation/git-local-pull.txt
@@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT 
 
 SYNOPSIS
 --------
-'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path
+'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path
 
 DESCRIPTION
 -----------
@@ -19,6 +19,8 @@ OPTIONS
 -------
 -c::
 	Get the commit objects.
+-d::
+	Get objects that deltified objects are based upon.
 -t::
 	Get trees associated with the commit objects.
 -a::
diff --git a/Documentation/git-rpull.txt b/Documentation/git-rpull.txt
--- a/Documentation/git-rpull.txt
+++ b/Documentation/git-rpull.txt
@@ -10,7 +10,7 @@ git-rpull - Pulls from a remote reposito
 
 SYNOPSIS
 --------
-'git-rpull' [-c] [-t] [-a] [-v] commit-id url
+'git-rpull' [-c] [-t] [-a] [-v] [-d] commit-id url
 
 DESCRIPTION
 -----------
@@ -21,6 +21,8 @@ OPTIONS
 -------
 -c::
 	Get the commit objects.
+-d::
+	Get objects that deltified objects are based upon.
 -t::
 	Get trees associated with the commit objects.
 -a::
diff --git a/http-pull.c b/http-pull.c
--- a/http-pull.c
+++ b/http-pull.c
@@ -103,17 +103,20 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
+		} else if (argv[arg][1] == 'd') {
+			get_delta = 1;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
 			get_history = 1;
+			get_delta = 1;
 		} else if (argv[arg][1] == 'v') {
 			get_verbosely = 1;
 		}
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-http-pull [-c] [-t] [-a] [-v] commit-id url");
+		usage("git-http-pull [-c] [-t] [-a] [-d] [-v] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -74,7 +74,7 @@ int fetch(unsigned char *sha1)
 }
 
 static const char *local_pull_usage = 
-"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path";
+"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path";
 
 /* 
  * By default we only use file copy.
@@ -92,10 +92,13 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		else if (argv[arg][1] == 'c')
 			get_history = 1;
+		else if (argv[arg][1] == 'd')
+			get_delta = 1;
 		else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
 			get_history = 1;
+			get_delta = 1;
 		}
 		else if (argv[arg][1] == 'l')
 			use_link = 1;
diff --git a/pull.c b/pull.c
--- a/pull.c
+++ b/pull.c
@@ -6,6 +6,7 @@
 
 int get_tree = 0;
 int get_history = 0;
+int get_delta = 0;
 int get_all = 0;
 int get_verbosely = 0;
 static unsigned char current_commit_sha1[20];
@@ -37,6 +38,20 @@ static int make_sure_we_have_it(const ch
 	status = fetch(sha1);
 	if (status && what)
 		report_missing(what, sha1);
+	if (get_delta) {
+		unsigned long mapsize, size;
+		void *map, *buf;
+		char type[20];
+
+		map = map_sha1_file(sha1, &mapsize);
+		if (map) {
+			buf = unpack_sha1_file(map, mapsize, type, &size);
+			munmap(map, mapsize);
+			if (buf && !strcmp(type, "delta"))
+				status = make_sure_we_have_it(what, buf);
+			free(buf);
+		}
+	}
 	return status;
 }
 
diff --git a/pull.h b/pull.h
--- a/pull.h
+++ b/pull.h
@@ -13,6 +13,9 @@ extern int get_history;
 /** Set to fetch the trees in the commit history. **/
 extern int get_all;
 
+/* Set to fetch the base of delta objects.*/
+extern int get_delta;
+
 /* Set to be verbose */
 extern int get_verbosely;
 
diff --git a/rpull.c b/rpull.c
--- a/rpull.c
+++ b/rpull.c
@@ -27,17 +27,20 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
+		} else if (argv[arg][1] == 'd') {
+			get_delta = 1;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
 			get_history = 1;
+			get_delta = 1;
 		} else if (argv[arg][1] == 'v') {
 			get_verbosely = 1;
 		}
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-rpull [-c] [-t] [-a] [-v] commit-id url");
+		usage("git-rpull [-c] [-t] [-a] [-v] [-d] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
------------------------------------------------



* Re: [COGITO PATCH] Heads and tags in subdirectories
  @ 2005-06-01 16:17 16%   ` Santi Béjar
  0 siblings, 0 replies; 200+ results
From: Santi Béjar @ 2005-06-01 16:17 UTC (permalink / raw)
  To: Git Mailing List


Here is an updated version, fixing some bugs.

 cg-Xnormid |   14 +++++++++-
 cg-commit  |    7 ++++-
 cg-init    |    5 ++-
 cg-pull    |   78 ++++++++++++++++++++++++++++++++++++-------------------------
 4 files changed, 68 insertions(+), 36 deletions(-)

diff --git a/cg-Xnormid b/cg-Xnormid
--- a/cg-Xnormid
+++ b/cg-Xnormid
@@ -16,15 +16,25 @@
 
 id="$1"
 
+repo=$(echo $id | cut -d '#' -f 1)
+(echo $repo | egrep -qv '[^a-zA-Z0-9_.@!:-]') || \
+	die "name contains invalid characters"
+id=$(echo $id | sed 's@#@/@')
+
 if [ ! "$id" ] || [ "$id" = "this" ] || [ "$id" = "HEAD" ]; then
 	read id < "$_git/HEAD"
 
-elif [ -r "$_git/refs/tags/$id" ]; then
+elif [ -r "$_git/refs/tags/$id" ] && [ ! -d "$_git/refs/tags/$id" ]; then
 	read id < "$_git/refs/tags/$id"
 
-elif [ -r "$_git/refs/heads/$id" ]; then
+elif [ -r "$_git/refs/heads/$id" ] && [ ! -d "$_git/refs/heads/$id" ]; then
 	read id < "$_git/refs/heads/$id"
 
+elif [ -r "$_git/branches/$id" ]; then
+	repobranch=$(cat "$_git/branches/$id" | cut -d '#' -f 2 -s)
+	repobranch=${repobranch:-master}
+	read id < "$_git/refs/heads/$id/$repobranch"
+
 # Short id's must be lower case and at least 4 digits.
 elif [[ "$id" == [0-9a-z][0-9a-z][0-9a-z][0-9a-z]* ]]; then
 	idpref=${id:0:2}
diff --git a/cg-commit b/cg-commit
--- a/cg-commit
+++ b/cg-commit
@@ -141,7 +141,12 @@ if [ "$merging" ]; then
 	[ "$msgs" ] && echo -n 'Merge with '
 	[ -s $_git/merging-sym ] || cp $_git/merging $_git/merging-sym
 	for sym in $(cat $_git/merging-sym); do
-		uri=$(cat $_git/branches/$sym)
+		repo=$(echo $sym | cut -d '#' -f 1)
+		branch=$(echo $sym | cut -d '#' -f 2 -s)
+		uri=$(cat $_git/branches/$repo)
+		uribranch=$(echo $uri | cut -d '#' -f 2 -s)
+		[ -z "$uribranch" ] && [ -n "$branch" ] &&
+		[ "$branch" != master ] && uri=${uri}#$branch
 		[ "$uri" ] || uri="$sym"
 		echo "$uri" >>$LOGMSG
 		[ "$msgs" ] && echo "$uri"
diff --git a/cg-init b/cg-init
--- a/cg-init
+++ b/cg-init
@@ -29,8 +29,9 @@ ln -s refs/heads/master $_git/HEAD
 if [ "$uri" ]; then
 	echo "$uri" >$_git/branches/origin
 	cg-pull origin || die "pull failed"
-
-	cp $_git/refs/heads/origin $_git/refs/heads/master
+	uribranch=$(echo $uri | cut -d '#' -f 2 -s)
+	uribranch=${uribranch:-master}
+	cp $_git/refs/heads/origin/$uribranch $_git/refs/heads/master
 	git-read-tree HEAD
 	git-checkout-cache -a
 	git-update-cache --refresh
diff --git a/cg-pull b/cg-pull
--- a/cg-pull
+++ b/cg-pull
@@ -6,23 +6,41 @@
 # Takes the branch name as an argument, defaulting to "origin".
 #
 # See `cg-branch-add` for some description.
+#
+# OPTIONS
+# -------
+# -a::
+#       Pull all the heads from the repository.
 
-USAGE="cg-pull [BRANCH_NAME]"
+USAGE="cg-pull [-a] [BRANCH_NAME]"
 
 . ${COGITO_LIB}cg-Xlib
 
-name=$1
-
+[ "$1" == "-a" ] && all=yes && shift
+name=$1 && shift
 
 [ "$name" ] || { [ -s $_git/refs/heads/origin ] && name=origin; }
 [ "$name" ] || die "where to pull from?"
-uri=$(cat "$_git/branches/$name" 2>/dev/null) || die "unknown branch: $name"
 
-rembranch=master
+repo=$(echo $name | cut -d '#' -f 1)
+repobranch=$(echo $name | cut -s -d '#' -f 2)
+
+uri=$(cat "$_git/branches/$name" 2>/dev/null) || die "unknown branch: $name"
 if echo "$uri" | grep -q '#'; then
+	[ -z "$repobranch" ] && repobranch=$(echo $uri | cut -d '#' -f 2)
 	rembranch=$(echo $uri | cut -d '#' -f 2)
 	uri=$(echo $uri | cut -d '#' -f 1)
 fi
+repobranch=${repobranch:-master}
+branch=$repo/$repobranch
+[ "$all" ] && repobranch=
+
+# So far we have:
+# $repo       = name of the repository
+# $uri        = uri of the repository
+# $repobranch = name of the branch in the repository
+#               empty if we want all the branches
+# $branch     = name of the local branch in refs/heads/
 
 pull_progress() {
 	percentage=""
@@ -197,15 +215,15 @@ fetch_local () {
 		shift
 	fi
 
-	cut_last=
+	dirs=
 	if [ "$1" = "-d" ]; then
-		cut_last=1
+		dirs=1
 		shift
 	fi
 
 	src="$1"
 	dest="$2"
-	[ "$cut_last" ] && dest=${dest%/*}
+	[ "$dirs" ] && src="${src%/}/."
 
 	cp $cp_flags_l "$src" "$dest"
 }
@@ -232,39 +250,37 @@ fi
 
 
 orig_head=
-[ -s "$_git/refs/heads/$name" ] && orig_head=$(cat "$_git/refs/heads/$name")
-
+[ -s "$_git/refs/heads/$branch" ] && orig_head=$(cat "$_git/refs/heads/$branch")
 
-mkdir -p $_git/refs/heads
-rsyncerr=
-$fetch -i "$uri/refs/heads/$rembranch" "$_git/refs/heads/$name" || rsyncerr=1
-if [ "$rsyncerr" ]; then
-	rsyncerr=
-	$fetch -s "$uri/heads/$rembranch" "$_git/refs/heads/$name" || rsyncerr=1
-fi
-if [ "$rsyncerr" ] && [ "$rembranch" = "master" ]; then
-	rsyncerr=
-	$fetch -s "$uri/HEAD" "$_git/refs/heads/$name" || rsyncerr=1
+# 2005/05 Convert old layout
+[ -f $_git/refs/heads/$repo ] && orig_head=$(cat $_git/refs/heads/$repo) &&
+rm -f $_git/refs/heads/$repo
+
+mkdir -p $_git/refs/heads/$repo
+if [ "$repobranch" ] ; then
+    $fetch -i "$uri/refs/heads/$repobranch" "$_git/refs/heads/$branch" ||
+    $fetch -s "$uri/heads/$repobranch" "$_git/refs/heads/$branch" ||
+    { [ "$repobranch" = "master" ] && $fetch -s "$uri/HEAD" "$_git/refs/heads/$branch"; } ||
+    rsyncerr=1
+else
+    $fetch -i -d "$uri/refs/heads" "$_git/refs/heads/$repo" ||
+    $fetch -s -d "$uri/heads" "$_git/refs/heads/$repo" ||
+    rsyncerr=1
 fi
-[ "$rsyncerr" ] && die "unable to get the head pointer of branch $rembranch"
+[ "$rsyncerr" ] && die "unable to get the head pointer of branch $repobranch"
 
 [ -d $_git_objects ] || mkdir -p $_git_objects
-$pull "$name" "$uri" || die "objects pull failed"
+$pull "$branch" "$uri" || die "objects pull failed"
 
-# FIXME: Warn about conflicting tag names?
 # XXX: We now throw stderr to /dev/null since not all repositories
 # may have tags/ and users were confused by the harmless errors.
-[ -d $_git/refs/tags ] || mkdir -p $_git/refs/tags
+[ -d $_git/refs/tags/$repo ] || mkdir -p $_git/refs/tags/$repo
 rsyncerr=
-$fetch -i -s -u -d "$uri/refs/tags" "$_git/refs/tags" || rsyncerr=1
-if [ "$rsyncerr" ]; then
-	rsyncerr=
-	$fetch -i -s -u -d "$uri/tags" "$_git/refs/tags" || rsyncerr=1
-fi
+$fetch -i -s -u -d "$uri/refs/tags" "$_git/refs/tags/$repo" ||
+$fetch -i -s -u -d "$uri/tags" "$_git/refs/tags/$repo" || rsyncerr=1
 [ "$rsyncerr" ] && echo "unable to get tags list (non-fatal)" >&2
 
-
-new_head=$(cat "$_git/refs/heads/$name")
+new_head=$(cat "$_git/refs/heads/$branch")
 
 if [ ! "$orig_head" ]; then
 	echo "New branch: $new_head"



* [PATCH] One-Git Part 2 (Patch 1/3)
@ 2005-06-01 18:22 16% Jason McMullan
  0 siblings, 0 replies; 200+ results
From: Jason McMullan @ 2005-06-01 18:22 UTC (permalink / raw)
  To: git

Add: Empty documentation for some scripts

Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com>

diff --git a/Documentation/git-commit-script.txt b/Documentation/git-commit-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-commit-script.txt
@@ -0,0 +1,29 @@
+git-commit-script(1)
+====================
+v0.1, May 2005
+
+NAME
+----
+git-commit-script - Commit working directory
+
+
+SYNOPSIS
+--------
+'git-commit-script' 
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by David Greaves, Junio C Hamano and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-deltafy-script.txt b/Documentation/git-deltafy-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-deltafy-script.txt
@@ -0,0 +1,29 @@
+git-deltafy-script(1)
+=====================
+v0.1, May 2005
+
+NAME
+----
+git-deltafy-script - Convert repository into delta format
+
+
+SYNOPSIS
+--------
+'git-deltafy-script'
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by David Greaves, Junio C Hamano and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-fetch-script.txt b/Documentation/git-fetch-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-fetch-script.txt
@@ -0,0 +1,29 @@
+git-fetch-script(1)
+===================
+v0.1, May 2005
+
+NAME
+----
+git-fetch-script - Fetch an object from a remote repository
+
+
+SYNOPSIS
+--------
+'git-fetch-script' <sha1>
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by David Greaves, Junio C Hamano and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-log-script.txt b/Documentation/git-log-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-log-script.txt
@@ -0,0 +1,29 @@
+git-log-script(1)
+=================
+v0.1, May 2005
+
+NAME
+----
+git-log-script - Prettified version of git-rev-list
+
+
+SYNOPSIS
+--------
+'git-log-script' 
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by David Greaves, Junio C Hamano and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-status-script.txt b/Documentation/git-status-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-status-script.txt
@@ -0,0 +1,29 @@
+git-status-script(1)
+====================
+v0.1, May 2005
+
+NAME
+----
+git-status-script - Show status of working directory files
+
+
+SYNOPSIS
+--------
+'git-status-script' 
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by David Greaves, Junio C Hamano and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
======== end ========



* [PATCH] One-Git Part 2 (Patch 3/3)
@ 2005-06-01 18:23  7% Jason McMullan
  0 siblings, 0 replies; 200+ results
From: Jason McMullan @ 2005-06-01 18:23 UTC (permalink / raw)
  To: git

Add: 'compiled in' scripts, using zlib

one-git now includes everything!

Requires: one-git Part 1 (all patches), and one-git Part 2 (pre 1, pre 2)

Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com>

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -20,18 +20,16 @@ CC=gcc
 AR=ar
 INSTALL=install
 
-SCRIPTS=git git-apply-patch-script git-merge-one-file-script git-prune-script \
-	git-pull-script git-tag-script git-resolve-script git-whatchanged \
-	git-deltafy-script git-fetch-script git-status-script git-commit-script \
-	git-log-script
-
 # The following order determines 'git help' command order.
 # The general idea is 'pull, push, inspect, commit, db, misc'
 PROG=   \
+	git-pull \
 	git-local-pull \
 	git-http-pull \
 	git-rpull \
 	git-rpush \
+	git-whatchanged \
+	git-log \
 	git-rev-tree \
 	git-rev-list \
 	git-export \
@@ -45,20 +43,29 @@ PROG=   \
 	git-diff-files \
 	git-diff-cache \
 	git-diff-tree \
+	git-status \
 	git-merge-base \
 	git-merge-cache \
+	git-merge-one-file \
+	git-resolve \
 	git-apply \
+	git-apply-patch \
 	git-check-files \
 	git-write-tree \
 	git-commit-tree \
+	git-commit \
 	git-mktag \
+	git-tag \
 	git-init-db \
 	git-fsck-cache \
+	git-prune \
 	git-convert-cache \
 	git-mkdelta \
+	git-deltafy \
 	git-tar-tree \
 	git-get-tar-commit-id \
 	git-write-blob \
+	git-fetch \
 	git-diff-helper \
 	git-stripspace
 
@@ -86,6 +93,8 @@ LIB_OBJS += pull.o
 
 LIB_OBJS += rsh.o
 
+LIB_OBJS += zscript.o
+
 LIBS = $(LIB_FILE)
 LIBS += -lz
 
@@ -116,8 +125,14 @@ test-date: test-date.c date.o
 test-delta: test-delta.c diff-delta.o patch-delta.o
 	$(CC) $(CFLAGS) -o $@ $^
 
-git-%.o: %.c #Makefile
-	$(CC) $(CFLAGS) -c -o $@ $*.c -Dmain=git_$(subst -,_,$*) -Ddesc=git_$(subst -,_,$*)_desc
+git-%-script.h: git-%-script
+	./zwrap script <git-$*-script >git-$*-script.h
+
+git-%.o: zwrap git-%-script git-%-script.h git-script.c
+	$(CC) $(CFLAGS) -c -o $@ git-script.c -Dscript=git_$(subst -,_,$*)_script -Dmain=git_$(subst -,_,$*) --include git-$*-script.h
+
+git-%.o: %.c Makefile
+	$(CC) $(CFLAGS) -c -o $@ $*.c -Dmain=git_$(subst -,_,$*)
 
 $(PROG):
 	ln -s git $@
@@ -127,7 +142,10 @@ git.o: git-commands.h
 git: git.o $(patsubst %,%.o,$(PROG)) $(LIBS)
 	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
 
-git-commands.h: Makefile $(patsubst %,Documentation/%.txt,$(PROG))
+zwrap: zwrap.o $(LIBS)
+	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
+
+git-commands.h: Makefile #$(patsubst %,Documentation/%.txt,$(PROG))
 	echo -n >git-commands.h
 	for prog in $(subst -,_,$(PROG)); do \
 		echo "extern int $${prog}(int argc, char **argv);" >>git-commands.h ; \
@@ -139,7 +157,14 @@ git-commands.h: Makefile $(patsubst %,Do
 	echo "	const char *desc; } commands[]={" >>git-commands.h
 	for cmd in $(patsubst git-%,%,$(PROG)); do \
 		prog=`echo $$cmd | sed -e 's/-/_/g'` ; \
-		desc=`grep "^git-$$cmd - " Documentation/git-$$cmd.txt | cut -d' ' -f3-` ; \
+		if [ -f Documentation/git-$${cmd}-script.txt ]; then \
+		  doc=Documentation/git-$${cmd}-script.txt; \
+		else doc=Documentation/git-$${cmd}.txt; fi; \
+		if [ ! -e $${doc} ]; then \
+			echo "MISSING: $$doc" 1>&2; \
+			rm -f git-commands.h; \
+			exit 1; fi; \
+		desc=`grep "^git-$$cmd - " $$doc | cut -d' ' -f3-` ; \
 		desc=`echo "$$desc" | sed -e 's/"/\\\\"/g'` ; \
 		echo -n "	{ " >>git-commands.h ; \
 		echo -n ".command = \"$${cmd}\", " >>git-commands.h ; \
@@ -172,7 +197,8 @@ test: all
 	$(MAKE) -C t/ all
 
 clean:
-	rm -f *.o mozilla-sha1/*.o ppc/*.o git $(PROG) $(LIB_FILE)
+	rm -f *.o mozilla-sha1/*.o ppc/*.o \
+		zwrap git $(PROG) $(LIB_FILE) git-*-script.h
 	$(MAKE) -C Documentation/ clean
 
 backup: clean
diff --git a/git b/git
deleted file mode 100755
--- a/git
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/bin/sh
-cmd="git-$1-script"
-shift
-exec $cmd "$@"
diff --git a/git-script.c b/git-script.c
new file mode 100644
--- /dev/null
+++ b/git-script.c
@@ -0,0 +1,8 @@
+#include <stdio.h>
+
+extern int zscript(const void *s, size_t s_len, int argc, char **argv);
+
+int main(int argc, char **argv)
+{
+	return zscript(script,sizeof(script),argc,argv);
+}
diff --git a/zscript.c b/zscript.c
new file mode 100644
--- /dev/null
+++ b/zscript.c
@@ -0,0 +1,62 @@
+#include <stdio.h>
+#include <sys/wait.h>
+#include <sys/types.h>
+
+#include <zlib.h>
+
+#include "cache.h"
+
+int zscript(void *script, size_t script_len, int argc, char **argv)
+{
+	z_stream stream;
+	char buff[8192];
+	int fd, ret, status;
+	pid_t pid;
+	char template[]="/tmp/git-script.XXXXXX";
+
+	fd = mkstemp(template);
+	if (fd < 0)
+		die("Can't create file %s\n",template);
+
+	memset(&stream, 0, sizeof(stream));
+	stream.next_in = script;
+	stream.avail_in = script_len;
+
+	inflateInit(&stream);
+	do {
+		stream.next_out = buff;
+		stream.avail_out = sizeof(buff);
+		ret = inflate(&stream, Z_SYNC_FLUSH);
+		write(fd,buff,sizeof(buff)-stream.avail_out);
+	} while (stream.avail_in && ret == Z_OK);
+	inflateEnd(&stream);
+
+	pid = fork();
+	if (pid < 0) {
+		unlink(template);
+		die("Can't fork.");
+	}
+
+	if (! pid) {	/* Child */
+		char **args;
+		args=xmalloc(sizeof(char *)*(argc+2));
+		memcpy(&args[2],&argv[1],sizeof(char *)*argc);
+		args[0]="/bin/sh";
+		args[1]=template;
+		execv("/bin/sh",args);
+		exit(1);	/* Hopefully unreachable */
+	}
+
+	if (waitpid(pid, &status, 0) < 0 ||
+		!WIFEXITED(status) || WEXITSTATUS(status))
+		goto error;
+
+	unlink(template);
+	return 0;
+
+
+error:
+	unlink(template);
+	return 1;
+}
+		
diff --git a/zwrap.c b/zwrap.c
new file mode 100644
--- /dev/null
+++ b/zwrap.c
@@ -0,0 +1,88 @@
+/* Takes stdin, and makes a C header file out of it, compressed
+ * with zlib.
+ *
+ * Author: Jason McMullan <jason.mcmullan@timesys.com>
+ */
+
+#include <stdio.h>
+#include <zlib.h>
+
+#include "cache.h"
+
+unsigned char *read_from(int fd, size_t *bsize)
+{
+	unsigned char *buff;
+	int len;
+	size_t size;
+
+	buff=xmalloc(1024);
+	size = 0;
+
+	while ((len = read(fd, buff+size, 1024-(size % 1024))) > 0) {
+		size += len;
+		if ((size % 1024)==0)
+			buff = xrealloc(buff, size+1024);
+	}
+
+	if (len < 0) {
+		size=0;
+		free(buff);
+		buff=NULL;
+	}
+
+	*bsize = size;
+	return buff;
+}
+
+
+int main(int argc, char **argv)
+{
+	const char *name="data";
+	z_stream stream;
+	size_t size, dsize;
+	unsigned char *dbuff,*buff;
+
+	if (argc == 2)
+		name=argv[1];
+	else if (argc != 1) {
+		fprintf(stderr,"Usage:\n%s [name] <somefile\n",argv[0]);
+		return 1;
+	}
+
+	buff=read_from(0, &size);
+	if (buff == NULL)
+		return 1;
+
+	/* Init zlib */
+	memset(&stream, 0, sizeof(stream));
+	deflateInit(&stream, Z_BEST_COMPRESSION);
+	dsize = deflateBound(&stream, size);
+	dbuff = xmalloc(dsize);
+
+	stream.next_out = dbuff;
+	stream.avail_out = dsize;
+
+	stream.next_in  = buff;
+	stream.avail_in  = size;
+
+	/* deflate everything in one go; dbuff was sized with
+	 * deflateBound() so a single Z_FINISH pass suffices. */
+	while (deflate(&stream, Z_FINISH) == Z_OK)
+		/* nothing */;
+	deflateEnd(&stream);
+
+	dsize = stream.total_out;
+
+	printf("const unsigned char %s[] = {\n", name);
+	for ( size = 0; size < dsize; size++) {
+		if ((size % 8) == 0)
+			printf("\t");
+		printf("0x%.2x,",dbuff[size]);
+		if ((size % 8) == 7)
+			printf("\n");
+	}
+	printf("};\n");
+
+	return 0;
+}
+
======== end ========



* Re: I want to release a "git-1.0"
  @ 2005-06-01 22:00  3%     ` Daniel Barkalow
  2005-06-03  9:47  0%       ` Petr Baudis
    1 sibling, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-06-01 22:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Eric W. Biederman, Git Mailing List

On Tue, 31 May 2005, Linus Torvalds wrote:

> On Tue, 31 May 2005, Eric W. Biederman wrote:
> > 
> > I way behind the power curve on learning git at this point but
> > one piece of the puzzle that CVS has that I don't believe git does
> > are multiple people committing to the same repository, especially
> > remotely.  I don't see that as a down side of git but it is a common
> > way people CVS so it is worth documenting.
> 
> It's actually one thing git doesn't do per se.
> 
> You have to do a "git-pull-script" from the common repository side, 
> there's no "git-push-script". Ugly.

It shouldn't be hard to do one, except that locking with rsync is going to
be a pain. I had a patch to make it work with the rpush/rpull pair, but I
didn't get its dependencies in at the time. I can dust those patches off
again if you want that functionality included.

The patches are essentially:

 - make the transport protocol handle things other than objects
 - library procedure for locking atomic update of refs files
 - fetching refs in general
 - rpull/rpush that updates a specified ref file atomically

At least the first would be very nice to get in before 1.0, since it is an
incompatible change to the protocol.

	-Daniel
*This .sig left intentionally blank*



* [PATCH] Handle deltified object correctly in git-*-pull family.
  @ 2005-06-02 16:46 10%   ` Junio C Hamano
      0 siblings, 2 replies; 200+ results
From: Junio C Hamano @ 2005-06-02 16:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

When a remote repository is deltified, we need to get the
objects that a deltified object we want to obtain is based upon.
The initial part of each retrieved SHA1 file is inflated and
inspected to see if it is deltified, and its base object is
requested from the remote side when it is.  Since this partial
inflation and inspection has a small performance hit, it can
optionally be skipped by giving the -d flag to the git-*-pull commands.
This flag should be used only when the remote repository is
known to have no deltified objects.

Rsync transport does not have this problem since it fetches
everything the remote side has.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** Linus, this uses the new helper you wrote.  The interface is
*** much more pleasant to use.

 Documentation/git-http-pull.txt  |    6 +++++-
 Documentation/git-local-pull.txt |    6 +++++-
 Documentation/git-rpull.txt      |    6 +++++-
 cache.h                          |    3 +++
 pull.h                           |    3 +++
 http-pull.c                      |    4 +++-
 local-pull.c                     |    4 +++-
 pull.c                           |    6 ++++++
 rpull.c                          |    4 +++-
 sha1_file.c                      |   40 ++++++++++++++++++++++++++++++++++++++
 10 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-http-pull.txt b/Documentation/git-http-pull.txt
--- a/Documentation/git-http-pull.txt
+++ b/Documentation/git-http-pull.txt
@@ -9,7 +9,7 @@ git-http-pull - Downloads a remote GIT r
 
 SYNOPSIS
 --------
-'git-http-pull' [-c] [-t] [-a] [-v] commit-id url
+'git-http-pull' [-c] [-t] [-a] [-v] [-d] commit-id url
 
 DESCRIPTION
 -----------
@@ -21,6 +21,10 @@ Downloads a remote GIT repository via HT
 	Get trees associated with the commit objects.
 -a::
 	Get all the objects.
+-d::
+	Do not check for delta base objects (use this option
+	only when you know the remote repository is not
+	deltified).
 -v::
 	Report what is downloaded.
 
diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt
--- a/Documentation/git-local-pull.txt
+++ b/Documentation/git-local-pull.txt
@@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT 
 
 SYNOPSIS
 --------
-'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path
+'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path
 
 DESCRIPTION
 -----------
@@ -23,6 +23,10 @@ OPTIONS
 	Get trees associated with the commit objects.
 -a::
 	Get all the objects.
+-d::
+	Do not check for delta base objects (use this option
+	only when you know the remote repository is not
+	deltified).
 -v::
 	Report what is downloaded.
 
diff --git a/Documentation/git-rpull.txt b/Documentation/git-rpull.txt
--- a/Documentation/git-rpull.txt
+++ b/Documentation/git-rpull.txt
@@ -10,7 +10,7 @@ git-rpull - Pulls from a remote reposito
 
 SYNOPSIS
 --------
-'git-rpull' [-c] [-t] [-a] [-v] commit-id url
+'git-rpull' [-c] [-t] [-a] [-d] [-v] commit-id url
 
 DESCRIPTION
 -----------
@@ -25,6 +25,10 @@ OPTIONS
 	Get trees associated with the commit objects.
 -a::
 	Get all the objects.
+-d::
+	Do not check for delta base objects (use this option
+	only when you know the remote repository is not
+	deltified).
 -v::
 	Report what is downloaded.
 
diff --git a/cache.h b/cache.h
--- a/cache.h
+++ b/cache.h
@@ -158,6 +158,9 @@ extern int write_sha1_file(void *buf, un
 
 extern int check_sha1_signature(unsigned char *sha1, void *buf, unsigned long size, const char *type);
 
+extern int sha1_delta_base(const unsigned char *, unsigned char *);
+
+
 /* Read a tree into the cache */
 extern int read_tree(void *buffer, unsigned long size, int stage);
 
diff --git a/pull.h b/pull.h
--- a/pull.h
+++ b/pull.h
@@ -13,6 +13,9 @@ extern int get_history;
 /** Set to fetch the trees in the commit history. **/
 extern int get_all;
 
+/* Set to zero to skip the check for delta object base. */
+extern int get_delta;
+
 /* Set to be verbose */
 extern int get_verbosely;
 
diff --git a/http-pull.c b/http-pull.c
--- a/http-pull.c
+++ b/http-pull.c
@@ -103,6 +103,8 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
+		} else if (argv[arg][1] == 'd') {
+			get_delta = 0;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
@@ -113,7 +115,7 @@ int main(int argc, char **argv)
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-http-pull [-c] [-t] [-a] [-v] commit-id url");
+		usage("git-http-pull [-c] [-t] [-a] [-d] [-v] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -74,7 +74,7 @@ int fetch(unsigned char *sha1)
 }
 
 static const char *local_pull_usage = 
-"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path";
+"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path";
 
 /* 
  * By default we only use file copy.
@@ -92,6 +92,8 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		else if (argv[arg][1] == 'c')
 			get_history = 1;
+		else if (argv[arg][1] == 'd')
+			get_delta = 0;
 		else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
diff --git a/pull.c b/pull.c
--- a/pull.c
+++ b/pull.c
@@ -6,6 +6,7 @@
 
 int get_tree = 0;
 int get_history = 0;
+int get_delta = 1;
 int get_all = 0;
 int get_verbosely = 0;
 static unsigned char current_commit_sha1[20];
@@ -37,6 +38,11 @@ static int make_sure_we_have_it(const ch
 	status = fetch(sha1);
 	if (status && what)
 		report_missing(what, sha1);
+	if (get_delta) {
+		char delta_sha1[20];
+		if (sha1_delta_base(sha1, delta_sha1))
+			status = make_sure_we_have_it(what, delta_sha1);
+	}
 	return status;
 }
 
diff --git a/rpull.c b/rpull.c
--- a/rpull.c
+++ b/rpull.c
@@ -27,6 +27,8 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
+		} else if (argv[arg][1] == 'd') {
+			get_delta = 0;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
@@ -37,7 +39,7 @@ int main(int argc, char **argv)
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-rpull [-c] [-t] [-a] [-v] commit-id url");
+		usage("git-rpull [-c] [-t] [-a] [-v] [-d] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
diff --git a/sha1_file.c b/sha1_file.c
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -347,6 +347,46 @@ void * unpack_sha1_file(void *map, unsig
 	return buf;
 }
 
+int sha1_delta_base(const unsigned char *sha1, unsigned char *delta_sha1)
+{
+	unsigned long mapsize, size;
+	void *map;
+	char type[20];
+	char buffer[200];
+	z_stream stream;
+	int ret, bytes, status;
+
+	map = map_sha1_file(sha1, &mapsize);
+	if (!map)
+		return 0;
+	ret = unpack_sha1_header(&stream, map, mapsize, buffer,
+				 sizeof(buffer));
+	status = 0;
+
+	if (ret < Z_OK ||
+	    sscanf(buffer, "%10s %lu", type, &size) != 2 ||
+	    strcmp(type, "delta"))
+		goto out;
+	bytes = strlen(buffer) + 1;
+	if (size - bytes < 20)
+		goto out;
+
+	memmove(buffer, buffer + bytes, stream.total_out - bytes);
+	bytes = stream.total_out - bytes;
+	if (bytes < 20 && ret == Z_OK) {
+		stream.next_out = buffer + bytes;
+		stream.avail_out = sizeof(buffer) - bytes;
+		while (inflate(&stream, Z_FINISH) == Z_OK)
+			; /* nothing */
+	}
+	status = 1;
+	memcpy(delta_sha1, buffer, 20);
+ out:
+	inflateEnd(&stream);
+	munmap(map, mapsize);
+	return status;
+}
+
 void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size)
 {
 	unsigned long mapsize;
------------
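The check sha1_delta_base() performs above inflates just the head of a loose object, parses the "<type> <size>" header, and treats the first 20 body bytes as the base object's SHA1 when the type is "delta". Here is the same decision sketched on an already-inflated buffer; the zlib inflation step is elided and the function name is illustrative, not the patch's code:

```c
#include <stdio.h>
#include <string.h>

/* Given the already-inflated head of a loose object (the header
 * plus at least the first 20 body bytes), report whether it is a
 * delta and extract the base SHA1.  Returns 1 and fills base[20]
 * for a delta, 0 for any other object type, -1 on a malformed or
 * truncated buffer.  Illustrative sketch only. */
int delta_base_of(const unsigned char *buf, unsigned long len,
		  unsigned char *base)
{
	char type[20];
	unsigned long size, hdrlen;

	if (sscanf((const char *)buf, "%19s %lu", type, &size) != 2)
		return -1;
	hdrlen = strlen((const char *)buf) + 1;  /* past the NUL */
	if (strcmp(type, "delta"))
		return 0;		/* not deltified */
	if (len < hdrlen + 20)
		return -1;		/* truncated */
	memcpy(base, buf + hdrlen, 20);
	return 1;
}
```

This mirrors the split between unpack_sha1_header()/parse_sha1_header() (which produce the inflated head) and the 20-byte copy the patch does afterwards.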


^ permalink raw reply	[relevance 10%]

* Re: [PATCH] Handle deltified object correctly in git-*-pull family.
  @ 2005-06-02 18:02  3%       ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-02 18:02 UTC (permalink / raw)
  To: McMullan, Jason; +Cc: Linus Torvalds, GIT Mailling list

>>>>> "JM" == McMullan, Jason <jason.mcmullan@timesys.com> writes:

JM> Eww. Don't you want to attempt to get the referenced sha1 *before*
JM> you stick the delta blob into the repository?

That issue crossed my mind, and I admit I haven't looked at the
issues closely enough, but I suspect that it may not be worth it
with the current pull.c structure.

The current pull code fetches and stores a commit object before
it retrieves the tree object associated with it, and similarly a
tree object before its subtrees and blobs, which has the same
issue.  If my suspicion turns out to be correct, it would be
necessary to add a -r (recover) option to the pull family that
checks not for the existence of the required object itself but of
its dependents, and delta dependency should be handled the same
way commit and tree dependencies are handled there.



* [PATCH] Handle deltified object correctly in git-*-pull family.
  @ 2005-06-02 18:55 10%       ` Junio C Hamano
    0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2005-06-02 18:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

When a remote repository is deltified, we need to get the
objects that a deltified object we want to obtain is based upon.
The initial part of each retrieved SHA1 file is inflated and
inspected to see if it is deltified, and its base object is
asked from the remote side when it is.  Since this partial
inflation and inspection has a small performance hit, it can
optionally be skipped by giving -d flag to git-*-pull commands.
This flag should be used only when the remote repository is
known to have no deltified objects.

Rsync transport does not have this problem since it fetches
everything the remote side has.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** Now uses parse_sha1_header() and unpack_sha1_rest().  I
*** decided not to make it the caller's responsibility to check
*** what we have already got and fixed unpack_sha1_rest() to
*** avoid copying more than size bytes.

 Documentation/git-http-pull.txt  |    6 ++++-
 Documentation/git-local-pull.txt |    6 ++++-
 Documentation/git-rpull.txt      |    6 ++++-
 cache.h                          |    1 +
 pull.h                           |    3 +++
 http-pull.c                      |    4 +++-
 local-pull.c                     |    4 +++-
 pull.c                           |    7 ++++++
 rpull.c                          |    4 +++-
 sha1_file.c                      |   43 +++++++++++++++++++++++++++++++++++++-
 10 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-http-pull.txt b/Documentation/git-http-pull.txt
--- a/Documentation/git-http-pull.txt
+++ b/Documentation/git-http-pull.txt
@@ -9,7 +9,7 @@ git-http-pull - Downloads a remote GIT r
 
 SYNOPSIS
 --------
-'git-http-pull' [-c] [-t] [-a] [-v] commit-id url
+'git-http-pull' [-c] [-t] [-a] [-v] [-d] commit-id url
 
 DESCRIPTION
 -----------
@@ -21,6 +21,10 @@ Downloads a remote GIT repository via HT
 	Get trees associated with the commit objects.
 -a::
 	Get all the objects.
+-d::
+	Do not check for delta base objects (use this option
+	only when you know the remote repository is not
+	deltified).
 -v::
 	Report what is downloaded.
 
diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt
--- a/Documentation/git-local-pull.txt
+++ b/Documentation/git-local-pull.txt
@@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT 
 
 SYNOPSIS
 --------
-'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path
+'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path
 
 DESCRIPTION
 -----------
@@ -23,6 +23,10 @@ OPTIONS
 	Get trees associated with the commit objects.
 -a::
 	Get all the objects.
+-d::
+	Do not check for delta base objects (use this option
+	only when you know the remote repository is not
+	deltified).
 -v::
 	Report what is downloaded.
 
diff --git a/Documentation/git-rpull.txt b/Documentation/git-rpull.txt
--- a/Documentation/git-rpull.txt
+++ b/Documentation/git-rpull.txt
@@ -10,7 +10,7 @@ git-rpull - Pulls from a remote reposito
 
 SYNOPSIS
 --------
-'git-rpull' [-c] [-t] [-a] [-v] commit-id url
+'git-rpull' [-c] [-t] [-a] [-d] [-v] commit-id url
 
 DESCRIPTION
 -----------
@@ -25,6 +25,10 @@ OPTIONS
 	Get trees associated with the commit objects.
 -a::
 	Get all the objects.
+-d::
+	Do not check for delta base objects (use this option
+	only when you know the remote repository is not
+	deltified).
 -v::
 	Report what is downloaded.
 
diff --git a/cache.h b/cache.h
--- a/cache.h
+++ b/cache.h
@@ -153,6 +153,7 @@ extern char *sha1_file_name(const unsign
 extern void * map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(z_stream *stream, void *map, unsigned long mapsize, void *buffer, unsigned long size);
 extern int parse_sha1_header(char *hdr, char *type, unsigned long *sizep);
+extern int sha1_delta_base(const unsigned char *, unsigned char *);
 extern void * unpack_sha1_file(void *map, unsigned long mapsize, char *type, unsigned long *size);
 extern void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size);
 extern int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *return_sha1);
diff --git a/pull.h b/pull.h
--- a/pull.h
+++ b/pull.h
@@ -13,6 +13,9 @@ extern int get_history;
 /** Set to fetch the trees in the commit history. **/
 extern int get_all;
 
+/* Set to zero to skip the check for delta object base. */
+extern int get_delta;
+
 /* Set to be verbose */
 extern int get_verbosely;
 
diff --git a/http-pull.c b/http-pull.c
--- a/http-pull.c
+++ b/http-pull.c
@@ -103,6 +103,8 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
+		} else if (argv[arg][1] == 'd') {
+			get_delta = 0;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
@@ -113,7 +115,7 @@ int main(int argc, char **argv)
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-http-pull [-c] [-t] [-a] [-v] commit-id url");
+		usage("git-http-pull [-c] [-t] [-a] [-d] [-v] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -74,7 +74,7 @@ int fetch(unsigned char *sha1)
 }
 
 static const char *local_pull_usage = 
-"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path";
+"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path";
 
 /* 
  * By default we only use file copy.
@@ -92,6 +92,8 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		else if (argv[arg][1] == 'c')
 			get_history = 1;
+		else if (argv[arg][1] == 'd')
+			get_delta = 0;
 		else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
diff --git a/pull.c b/pull.c
--- a/pull.c
+++ b/pull.c
@@ -6,6 +6,7 @@
 
 int get_tree = 0;
 int get_history = 0;
+int get_delta = 1;
 int get_all = 0;
 int get_verbosely = 0;
 static unsigned char current_commit_sha1[20];
@@ -37,6 +38,12 @@ static int make_sure_we_have_it(const ch
 	status = fetch(sha1);
 	if (status && what)
 		report_missing(what, sha1);
+	if (get_delta) {
+		char delta_sha1[20];
+		status = sha1_delta_base(sha1, delta_sha1);
+		if (0 < status)
+			status = make_sure_we_have_it(what, delta_sha1);
+	}
 	return status;
 }
 
diff --git a/rpull.c b/rpull.c
--- a/rpull.c
+++ b/rpull.c
@@ -27,6 +27,8 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
+		} else if (argv[arg][1] == 'd') {
+			get_delta = 0;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
@@ -37,7 +39,7 @@ int main(int argc, char **argv)
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-rpull [-c] [-t] [-a] [-v] commit-id url");
+		usage("git-rpull [-c] [-t] [-a] [-v] [-d] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
diff --git a/sha1_file.c b/sha1_file.c
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -325,7 +325,13 @@ void *unpack_sha1_rest(z_stream *stream,
 	int bytes = strlen(buffer) + 1;
 	char *buf = xmalloc(1+size);
 
-	memcpy(buf, buffer + bytes, stream->total_out - bytes);
+	/* (stream->total_out - bytes) is what we already have.  The
+	 * caller could be asking for something smaller than that.
+	 */
+	if (size < stream->total_out - bytes)
+		memcpy(buf, buffer + bytes, size);
+	else
+		memcpy(buf, buffer + bytes, stream->total_out - bytes);
 	bytes = stream->total_out - bytes;
 	if (bytes < size) {
 		stream->next_out = buf + bytes;
@@ -401,6 +407,41 @@ void * unpack_sha1_file(void *map, unsig
 	return unpack_sha1_rest(&stream, hdr, *size);
 }
 
+int sha1_delta_base(const unsigned char *sha1, unsigned char *base_sha1)
+{
+	int ret;
+	unsigned long mapsize, size;
+	void *map;
+	z_stream stream;
+	char hdr[1024], type[20];
+	void *delta_data_head;
+
+	map = map_sha1_file(sha1, &mapsize);
+	if (!map)
+		return -1;
+	ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr));
+	if (ret < Z_OK || parse_sha1_header(hdr, type, &size) < 0) {
+		ret = -1;
+		goto out;
+	}
+	if (strcmp(type, "delta")) {
+		ret = 0;
+		goto out;
+	}
+	delta_data_head = unpack_sha1_rest(&stream, hdr, 20);
+	if (!delta_data_head) {
+		ret = -1;
+		goto out;
+	}
+	ret = 1;
+	memcpy(base_sha1, delta_data_head, 20);
+	free(delta_data_head);
+ out:
+	inflateEnd(&stream);
+	munmap(map, mapsize);
+	return ret;
+}
+
 void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size)
 {
 	unsigned long mapsize;
------------
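The delta handling that make_sure_we_have_it() gains in pull.c is a chain walk: fetch the object, and if it turns out to be deltified, make sure its base (and the base's base, and so on) is fetched as well. A stand-alone sketch of that loop, ignoring the already-have check and error reporting, with hypothetical callback names standing in for the patch's fetch() and sha1_delta_base():

```c
#include <string.h>

/* Walk the delta-base chain starting at sha1, fetching every
 * object on it.  base_fn returns 1 and fills base[20] when the
 * object is deltified, 0 otherwise.  Illustrative sketch only. */
int fetch_with_bases(const unsigned char sha1[20],
		     int (*fetch_fn)(const unsigned char *),
		     int (*base_fn)(const unsigned char *, unsigned char *))
{
	unsigned char cur[20], base[20];

	memcpy(cur, sha1, 20);
	for (;;) {
		if (fetch_fn(cur))
			return -1;	/* fetch failed */
		if (base_fn(cur, base) != 1)
			return 0;	/* not a delta: chain ends */
		memcpy(cur, base, 20);
	}
}

/* Demo stand-ins: objects are distinguished by their first byte;
 * object 1 is a delta against 2, which is a delta against 3. */
static int demo_fetched[256];

static int demo_fetch(const unsigned char *sha1)
{
	demo_fetched[sha1[0]]++;
	return 0;
}

static int demo_base(const unsigned char *sha1, unsigned char *base)
{
	if (sha1[0] != 1 && sha1[0] != 2)
		return 0;
	memset(base, 0, 20);
	base[0] = sha1[0] + 1;
	return 1;
}
```

The patch expresses the same walk recursively, since make_sure_we_have_it() simply calls itself on the base.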



* Re: [SCRIPT] cg-rpush & locking
  @ 2005-06-02 19:15  1% ` Dan Holmsand
  0 siblings, 0 replies; 200+ results
From: Dan Holmsand @ 2005-06-02 19:15 UTC (permalink / raw)
  To: git; +Cc: Matthias Urlichs

[-- Attachment #1: Type: text/plain, Size: 1658 bytes --]

Tony Lindgren wrote:
> Anybody have any better ideas for locking that also works with
> rsync?

Since this seems to be Push Day on the git list, here's my feeble 
attempt at the same thing (attached cg-push script).

I do stuff in this order:

     1. read remote HEAD
     2. check that a merge from local HEAD to remote would be
        fast-forward, otherwise tell people to pull and merge.
     3. push objects using --ignore-existing (or equivalent)
     4. write lock file with --ignore-existing. The lock file
        contains, in particular, the HEAD to be written.
     5. read remote HEAD (again) and the lock file. Bail if HEAD
        changed since the first read, or if the lock file isn't
        the one we attempted to write.
     6. write remote HEAD and delete lock file in using rsync's
        --delete-after

This should always be safe, since rsync (as I understand the man page) 
always writes to temp files, and then renames into place. Checking for 
fast-forward mergeability ensures that other people's changes don't get lost.
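Steps 4 through 6 above boil down to optimistic locking: write-if-absent, verify nothing moved, then atomically replace. For the local-path transport the same sequence can be sketched with O_EXCL standing in for rsync's --ignore-existing as the write-if-absent primitive (a hypothetical helper, not cg-push itself):

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Steps 4-6 of the push protocol against a local head file:
 * take the lock with O_EXCL, re-read the head to be sure it did
 * not move while objects were copied, then rename the new head
 * into place and drop the lock.  Returns 0 on success, -1 if the
 * lock is held or the head changed.  Illustrative sketch only. */
int push_head(const char *head_path, const char *old_head,
	      const char *new_head, const char *lock_msg)
{
	char lock[4096], tmp[4096], cur[256];
	int fd, n, ret = -1;
	FILE *f;

	snprintf(lock, sizeof(lock), "%s.lock", head_path);
	snprintf(tmp, sizeof(tmp), "%s.new", head_path);

	fd = open(lock, O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (fd < 0)
		return -1;		/* someone else holds the lock */
	if (write(fd, lock_msg, strlen(lock_msg)) < 0) {
		close(fd);
		goto out;
	}
	close(fd);

	f = fopen(head_path, "r");	/* step 5: did the head move? */
	if (!f)
		goto out;
	n = fread(cur, 1, sizeof(cur) - 1, f);
	fclose(f);
	while (n > 0 && cur[n - 1] == '\n')
		n--;
	cur[n] = '\0';
	if (strcmp(cur, old_head))
		goto out;		/* head changed: pull and merge */

	f = fopen(tmp, "w");		/* step 6: atomic replace */
	if (!f)
		goto out;
	fprintf(f, "%s\n", new_head);
	fclose(f);
	if (rename(tmp, head_path))
		goto out;
	ret = 0;
out:
	unlink(lock);
	return ret;
}
```

POSIX guarantees O_CREAT|O_EXCL is atomic on a local filesystem, which is what makes the lock creation a safe test-and-set; rsync's write-to-temp-then-rename gives the remote case the same property.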

cg-push determines uri and foreign branch name using the same rules as 
cg-pull, which is real nice as it allows you to do:

$ cg-clone me@myserver.example.com:git-repos/myrepo.git mystuff
$ cd mystuff
# write some stuff
$ cg-commit
$ cg-push

and later

$ cg-update
# write more stuff
$ cg-commit
$ cg-push

and even later

$ cg-branch-add pub me@myserver.example.com:public-repos/myrepo.git
$ cg-push pub

So, you can work pretty much exactly as you would do in CVS or svn, if 
you're so inclined, safely sharing a common repository among many users.

Which is kinda neat, if I may say so myself...

/dan

[-- Attachment #2: cg-push --]
[-- Type: text/plain, Size: 6219 bytes --]

#! /usr/bin/env bash
#
# Push changes to a remote git repository
#
# Copyright (c) Dan Holmsand, 2005.
#
# Based on cg-pull,
# Copyright (c) Petr Baudis, 2005.
#
# Takes the branch name as an argument, defaulting to "origin" (see
# `cg-branch-add` for some description of branch names).
#
# Takes one optional option: --force, that makes cg-push write objects
# regardless of lock-files and remote state. Use with care...
#
# cg-push supports two types of location specifiers:
#
# 1. local paths - simple directory names of git repositories
# 2. rsync - using the "[user@]machine:some/path" syntax
#
# Typical use would look something like this:
#
#	# clone a remote branch:
#	cg-clone me@myserver.example.com:repo.git myrepo
#	cd myrepo
#
#	# make some changes, and then do:
#	cg-commit
#	cg-push
#	
# cg-push is safe to use even if multiple users concurrently push
# to the same repository. Here's how this works:
#
# First, cg-push checks that the local repository is fully merged with
# the remote one (this will always be the case if there is only one
# user). Otherwise, you need to cg-update from the remote repository
# before you can cg-push to it.
#
# Then, cg-push writes all the object files that are missing in the
# remote repository.
#
# To finish, cg-push writes a lock file to the remote site (ignoring
# any preexisting lock file), checks that our lock file actually got 
# written, checks that the remote head is still the same (i.e. that 
# the remote site hasn't been updated while we were copying objects), 
# writes the new remote head and removes the lock.
#
# The head of the local repository is also updated in the process (as if
# the remote branch had been cg-pull'ed).
#
# cg-push requires that there already is a repository in place at the
# remote location, but it actually only checks that it has a 
# "refs/heads" subdirectory. So, creating a remote repo ready for
# cg-pushing is as easy as "mkdir -p repo.git/refs/heads" at the remote
# location.

# TODO: Write tags as well.

. ${COGITO_LIB}cg-Xlib

force=
if [ "$1" = --force ]; then
	force=1; shift
fi

name=$1
[ "$name" ] || { [ -s $_git/refs/heads/origin ] && name=origin; }
[ "$name" ] || die "what to push to?"
uri=$(cat "$_git/branches/$name" 2>/dev/null) || die "unknown branch: $name"

rembranch=master
if echo "$uri" | grep -q '#'; then
	rembranch=$(echo $uri | cut -d '#' -f 2)
	uri=$(echo $uri | cut -d '#' -f 1)
fi

case $uri in
	*:*)
	readhead=rsync_readhead
	writeobjects=rsync_writeobjects
	writehead=rsync_writehead
	lock=rsync_lock
	;;
	*)
	if [ -d "$uri" ]; then
		[ -d "$uri/.git" ] && uri=$uri/.git
		readhead=local_readhead
		writeobjects=local_writeobjects
		writehead=local_writehead
		lock=local_lock
	else
		die "Don't know how to push to $uri"
	fi
	;;
esac

tmpd=$(mktemp -d -t cgpush.XXXXXX) || exit 1
trap "rm -rf $tmpd" SIGTERM EXIT

cid=$(commit-id) || exit 1
lock_msg="locked by $USER@$HOSTNAME on $(date) for writing $cid"
unset locked remhead

rsync_readhead() {
	rm -f "$tmpd"/* || return 1
	rsync $RSYNC_FLAGS --include="$rembranch" --include="$rembranch.lock" \
		--exclude='*' -r "$uri/refs/heads/" "$tmpd/" >&2 || 
		die "Fetching heads from $uri failed. Aborting."
	if [ "$locked" ]; then
		[ -s "$tmpd/$rembranch.lock" ] ||
		die "Couldn't acquire lock. Aborting."

		local rem_lock_msg=$(cat "$tmpd/$rembranch.lock")
		[ "$lock_msg" = "$rem_lock_msg" ] ||
		die "Remote is locked ($rem_lock_msg)."
	fi
	[ ! -e "$tmpd/$rembranch" ] || cat "$tmpd/$rembranch"
}

rsync_writeobjects() {
	[ -d "$_git/objects/" ] || die "no objects to copy"
	rsync $RSYNC_FLAGS -vr --ignore-existing --whole-file \
		"$_git/objects/" "$uri/objects/" 
}

rsync_lock() {
	echo "$lock_msg" > $tmpd/new_head_lock_file || return 1
	rsync $RSYNC_FLAGS --ignore-existing --whole-file \
		$tmpd/new_head_lock_file "$uri/refs/heads/$rembranch.lock"
}

rsync_writehead() {
	local heads=$tmpd/newhead
	mkdir $heads && echo "$1" > "$heads/$rembranch" || return 1
	rsync $RSYNC_FLAGS --include="$rembranch" --include="$rembranch.lock" \
		--exclude='*' --delete-after -r $heads/ "$uri/refs/heads/" 
}

local_readhead() {
	local lheads=$uri/refs/heads
	[ -d "$lheads" ] || die "no remote heads found at $uri"
	[ ! -e "$lheads/$rembranch" ] || cat "$lheads/$rembranch" 
}

local_writeobjects() {
	[ -d "$_git/objects/" ] || die "no objects to copy"
	[ -d "$uri/objects" ] || 
		GIT_DIR=$uri GIT_OBJECT_DIRECTORY=$uri/objects git-init-db ||
		die "git-init-db failed"
        # Note: We could use git-local-pull here, but this is safer
	# (git-*-pull don't react well to failures or kills), and
	# has the same semantics as rsync pushing.
	local dest=$(cd "$uri/objects" && pwd) || exit 1
	( cd "$_git/objects" && find -type f | while read f; do
		[ -f "$dest/$f" ] && continue
		ln "$f" "$dest/$f" 2>/dev/null || 
		cp "$f" "$dest/$f" || exit 1
	done ) 
}

local_lock() {
	([ "$force" ] || set -C 
	echo "$lock_msg" > "$uri/refs/heads/$rembranch.lock") 2>/dev/null
}

local_writehead() {
	local head=$uri/refs/heads/$rembranch
	echo "$1" > "$head.new" && mv "$head.new" "$head" &&
	rm "$head.lock"
}

echo "Checking remote repository"
remhead=$($readhead) || exit 1
[ "$remhead" ] || echo "Creating new branch"

if [ "$remhead" -a -z "$force" ]; then
	if [ "$remhead" = "$cid" ]; then
		echo "Remote branch \`$name' is already pushed" 
		exit 0
	fi
	git-cat-file commit "$remhead" &> /dev/null ||
	die "You need to pull from $name first. Aborting."

	base=$(git-merge-base "$remhead" "$cid") && [ "$base" ] ||
	die "You need to merge $name. Aborting."

	if [ "$base" = "$cid" ]; then
		echo "No changes to push"; exit 0
	fi

	[ "$base" = "$remhead" ] || 
	die "You need to merge $name first. Aborting." 
fi

echo "Writing objects"
$writeobjects || die "Failed to write objects. Aborting."

echo
echo "Writing new head"
$lock || die "Couldn't acquire lock on remote. Aborting."

if [ ! "$force" ]; then
	locked=1
	remhead2=$($readhead) || die "Aborting."
	[ "$remhead" = "$remhead2" ] || 
		die "Remote head changed during copy. Aborting."
fi

$writehead "$cid" || die "WARNING: Error writing remote head. Aborting."
echo "Push to $name succeeded"

echo "$cid" > "$_git/refs/heads/$name"
echo "Updated local head for $name to $cid"



* [PATCH 1/2] Handle deltified object correctly in git-*-pull family.
  @ 2005-06-02 22:19 10%             ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-02 22:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, git

>>>>> "NP" == Nicolas Pitre <nico@cam.org> writes:

>> Here you don't need to call unpack_sha1_rest() at all which would call 
>> xmalloc and another memcpy needlessly.  Instead,...

Like this...

------------
When a remote repository is deltified, we need to get the
objects that a deltified object we want to obtain is based upon.
The initial part of each retrieved SHA1 file is inflated and
inspected to see if it is deltified, and its base object is
asked from the remote side when it is.  Since this partial
inflation and inspection has a small performance hit, it can
optionally be skipped by giving -d flag to git-*-pull commands.
This flag should be used only when the remote repository is
known to have no deltified objects.

Rsync transport does not have this problem since it fetches
everything the remote side has.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** Thanks and credits goes to Nico for suggesting not to
*** use unpack_sha1_rest().

 Documentation/git-http-pull.txt  |    6 +++++-
 Documentation/git-local-pull.txt |    6 +++++-
 Documentation/git-rpull.txt      |    6 +++++-
 cache.h                          |    1 +
 pull.h                           |    3 +++
 http-pull.c                      |    4 +++-
 local-pull.c                     |    4 +++-
 pull.c                           |    7 +++++++
 rpull.c                          |    4 +++-
 sha1_file.c                      |   31 +++++++++++++++++++++++++++++++
 10 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-http-pull.txt b/Documentation/git-http-pull.txt
--- a/Documentation/git-http-pull.txt
+++ b/Documentation/git-http-pull.txt
@@ -9,7 +9,7 @@ git-http-pull - Downloads a remote GIT r
 
 SYNOPSIS
 --------
-'git-http-pull' [-c] [-t] [-a] [-v] commit-id url
+'git-http-pull' [-c] [-t] [-a] [-v] [-d] commit-id url
 
 DESCRIPTION
 -----------
@@ -21,6 +21,10 @@ Downloads a remote GIT repository via HT
 	Get trees associated with the commit objects.
 -a::
 	Get all the objects.
+-d::
+	Do not check for delta base objects (use this option
+	only when you know the remote repository is not
+	deltified).
 -v::
 	Report what is downloaded.
 
diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt
--- a/Documentation/git-local-pull.txt
+++ b/Documentation/git-local-pull.txt
@@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT 
 
 SYNOPSIS
 --------
-'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path
+'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path
 
 DESCRIPTION
 -----------
@@ -23,6 +23,10 @@ OPTIONS
 	Get trees associated with the commit objects.
 -a::
 	Get all the objects.
+-d::
+	Do not check for delta base objects (use this option
+	only when you know the remote repository is not
+	deltified).
 -v::
 	Report what is downloaded.
 
diff --git a/Documentation/git-rpull.txt b/Documentation/git-rpull.txt
--- a/Documentation/git-rpull.txt
+++ b/Documentation/git-rpull.txt
@@ -10,7 +10,7 @@ git-rpull - Pulls from a remote reposito
 
 SYNOPSIS
 --------
-'git-rpull' [-c] [-t] [-a] [-v] commit-id url
+'git-rpull' [-c] [-t] [-a] [-d] [-v] commit-id url
 
 DESCRIPTION
 -----------
@@ -25,6 +25,10 @@ OPTIONS
 	Get trees associated with the commit objects.
 -a::
 	Get all the objects.
+-d::
+	Do not check for delta base objects (use this option
+	only when you know the remote repository is not
+	deltified).
 -v::
 	Report what is downloaded.
 
diff --git a/cache.h b/cache.h
--- a/cache.h
+++ b/cache.h
@@ -153,6 +153,7 @@ extern char *sha1_file_name(const unsign
 extern void * map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(z_stream *stream, void *map, unsigned long mapsize, void *buffer, unsigned long size);
 extern int parse_sha1_header(char *hdr, char *type, unsigned long *sizep);
+extern int sha1_delta_base(const unsigned char *, unsigned char *);
 extern void * unpack_sha1_file(void *map, unsigned long mapsize, char *type, unsigned long *size);
 extern void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size);
 extern int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *return_sha1);
diff --git a/pull.h b/pull.h
--- a/pull.h
+++ b/pull.h
@@ -13,6 +13,9 @@ extern int get_history;
 /** Set to fetch the trees in the commit history. **/
 extern int get_all;
 
+/* Set to zero to skip the check for delta object base. */
+extern int get_delta;
+
 /* Set to be verbose */
 extern int get_verbosely;
 
diff --git a/http-pull.c b/http-pull.c
--- a/http-pull.c
+++ b/http-pull.c
@@ -103,6 +103,8 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
+		} else if (argv[arg][1] == 'd') {
+			get_delta = 0;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
@@ -113,7 +115,7 @@ int main(int argc, char **argv)
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-http-pull [-c] [-t] [-a] [-v] commit-id url");
+		usage("git-http-pull [-c] [-t] [-a] [-d] [-v] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -74,7 +74,7 @@ int fetch(unsigned char *sha1)
 }
 
 static const char *local_pull_usage = 
-"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path";
+"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path";
 
 /* 
  * By default we only use file copy.
@@ -92,6 +92,8 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		else if (argv[arg][1] == 'c')
 			get_history = 1;
+		else if (argv[arg][1] == 'd')
+			get_delta = 0;
 		else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
diff --git a/pull.c b/pull.c
--- a/pull.c
+++ b/pull.c
@@ -6,6 +6,7 @@
 
 int get_tree = 0;
 int get_history = 0;
+int get_delta = 1;
 int get_all = 0;
 int get_verbosely = 0;
 static unsigned char current_commit_sha1[20];
@@ -37,6 +38,12 @@ static int make_sure_we_have_it(const ch
 	status = fetch(sha1);
 	if (status && what)
 		report_missing(what, sha1);
+	if (get_delta) {
+		char delta_sha1[20];
+		status = sha1_delta_base(sha1, delta_sha1);
+		if (0 < status)
+			status = make_sure_we_have_it(what, delta_sha1);
+	}
 	return status;
 }
 
diff --git a/rpull.c b/rpull.c
--- a/rpull.c
+++ b/rpull.c
@@ -27,6 +27,8 @@ int main(int argc, char **argv)
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
+		} else if (argv[arg][1] == 'd') {
+			get_delta = 0;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
@@ -37,7 +39,7 @@ int main(int argc, char **argv)
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-rpull [-c] [-t] [-a] [-v] commit-id url");
+		usage("git-rpull [-c] [-t] [-a] [-v] [-d] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
diff --git a/sha1_file.c b/sha1_file.c
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -401,6 +401,37 @@ void * unpack_sha1_file(void *map, unsig
 	return unpack_sha1_rest(&stream, hdr, *size);
 }
 
+int sha1_delta_base(const unsigned char *sha1, unsigned char *base_sha1)
+{
+	int ret;
+	unsigned long mapsize, size;
+	void *map;
+	z_stream stream;
+	char hdr[64], type[20];
+	void *delta_data_head;
+
+	map = map_sha1_file(sha1, &mapsize);
+	if (!map)
+		return -1;
+	ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr));
+	if (ret < Z_OK || parse_sha1_header(hdr, type, &size) < 0) {
+		ret = -1;
+		goto out;
+	}
+	if (strcmp(type, "delta")) {
+		ret = 0;
+		goto out;
+	}
+
+	delta_data_head = hdr + strlen(hdr) + 1;
+	ret = 1;
+	memcpy(base_sha1, delta_data_head, 20);
+ out:
+	inflateEnd(&stream);
+	munmap(map, mapsize);
+	return ret;
+}
+
 void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size)
 {
 	unsigned long mapsize;
------------



* [ANNOUNCE] cogito-0.11
@ 2005-06-02 22:23  3% Petr Baudis
  0 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-06-02 22:23 UTC (permalink / raw)
  To: git

  Hello,

  so I'm happy to finally announce cogito-0.11, a SCMish interface to
Linus' git storage system. (It's actually 0.11.1 because I forgot to do
some stuff before the release.) Get it at

	kernel.org/pub/software/scm/cogito/

or just pull it if you have your Cogito tree Cogito-tracked.

  There's probably too many things which have changed since cg-0.10.
There were plenty of bugfixes, the diff format changed, the git side of
stuff made a giant leap forward, etc. It's just better. :-) (Hopefully.)

  Note that I tried to take all the seemingly-important bugfixes I've
noticed, but my feature patches queue is just huge so if your patch is
not inside (it probably isn't unless I notified you over email), don't
panic. If you think it's an important bugfix which should go in now,
please resend it. And if it's a feature patch but older than a week or
two, resend it too. I will try to process my queue as fast as I am
able to.

  This release wasn't extensively tested and there were some last-minute
changes. So handle it with a little bit of care and expect cogito-0.11.2
possibly following soon with some more bugfixes (if there are any
problems found).

  Another thing to note if you are pulling - there might be some
obsolete or (in case of the git-pb tree) completely nonsensical tags
in your tree, and they will slow down cg-pull/cg-update a lot now - it
will complain about them and if they don't get fetched ("different
tree" message is provided instead of "retrieved"), just rm the tags for
good.

  Have fun,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: I want to release a "git-1.0"
  @ 2005-06-03  1:34  3%           ` Adam Kropelin
  0 siblings, 0 replies; 200+ results
From: Adam Kropelin @ 2005-06-03  1:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

Linus Torvalds wrote:
> On Thu, 2 Jun 2005, Linus Torvalds wrote:
>>
>> Yeah, I'll try to clarify.
>
> Adam, do you find the current version a bit more clear on this?

Absolutely. I especially like the new digression explaining that 
the --cached flag controls where file _content_ is fetched from and 
reinforcing that the index file always governs which files are involved 
in the diff.

Thanks!

--Adam



* Re: I want to release a "git-1.0"
  2005-06-01 22:00  3%     ` Daniel Barkalow
@ 2005-06-03  9:47  0%       ` Petr Baudis
  2005-06-03 15:09  0%         ` Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Petr Baudis @ 2005-06-03  9:47 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Linus Torvalds, Eric W. Biederman, Git Mailing List

Dear diary, on Thu, Jun 02, 2005 at 12:00:55AM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> It shouldn't be hard to do one, except that locking with rsync is going to
> be a pain. I had a patch to make it work with the rpush/rpull pair, but I
> didn't get its dependencies in at the time.

Was that the patch I was replying to recently? It didn't seem to have
any dependencies.

> I can dust those patches off again if you want that functionality included.
> 
> The patches are essentially:
> 
>  - make the transport protocol handle things other than objects
>  - library procedure for locking atomic update of refs files
>  - fetching refs in general
>  - rpull/rpush that updates a specified ref file atomically
> 
> At least the first would be very nice to get in before 1.0, since it is an
> incompatible change to the protocol.

I would like to have this a lot too. Pulling tags now is a PITA, and I
definitely want to go this way. So it will land at least in git-pb.
:-) (But that's a little troublesome if you say it's incompatible
change.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: I want to release a "git-1.0"
  2005-06-03  9:47  0%       ` Petr Baudis
@ 2005-06-03 15:09  0%         ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-06-03 15:09 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Linus Torvalds, Eric W. Biederman, Git Mailing List

On Fri, 3 Jun 2005, Petr Baudis wrote:

> Dear diary, on Thu, Jun 02, 2005 at 12:00:55AM CEST, I got a letter
> where Daniel Barkalow <barkalow@iabervon.org> told me that...
> > It shouldn't be hard to do one, except that locking with rsync is going to
> > be a pain. I had a patch to make it work with the rpush/rpull pair, but I
> > didn't get its dependencies in at the time.
> 
> Was that the patch I was replying to recently? It didn't seem to have
> any dependencies.

The rpush/rpull changes were at the end of a series that you were replying
to the beginning of.

> > I can dust those patches off again if you want that functionality included.
> > 
> > The patches are essentially:
> > 
> >  - make the transport protocol handle things other than objects
> >  - library procedure for locking atomic update of refs files
> >  - fetching refs in general
> >  - rpull/rpush that updates a specified ref file atomically
> > 
> > At least the first would be very nice to get in before 1.0, since it is an
> > incompatible change to the protocol.
> 
> I would like to have this a lot too. Pulling tags now is a PITA, and I
> definitely want to go this way. So it will land at least in git-pb.
> :-) (But that's a little troublesome if you say it's an incompatible
> change.)

The ssh-based protocol has to change, because the current version doesn't
have any way of being extended. The first patch in the new set makes the
incompatible change without adding anything new (so as to be as
uncontroversial as possible), and now also adds a version number so that
future additions should be less of a big deal. The rest of the series will
add the transfer of refs to the transfer mechanism and the protocol.
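
The handshake this describes amounts to a one-byte version exchange; a
minimal sketch (illustrative names and framing, not the actual
rpull/rpush code) might look like:

```c
#include <unistd.h>

/* Sketch of the one-byte version exchange described above. */
static int negotiate_version(int fd_in, int fd_out,
                             unsigned char local, unsigned char *remote)
{
	unsigned char type = 'v';

	/* announce our version under a typed request... */
	if (write(fd_out, &type, 1) != 1 || write(fd_out, &local, 1) != 1)
		return -1;
	/* ...and read the peer's; a peer that never answers (or
	 * closes the stream) is detected as incompatible here */
	if (read(fd_in, remote, 1) < 1)
		return -1;
	return 0;
}
```

Future additions can then be gated on the lower of the two advertised
versions instead of requiring another flag-day protocol change.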

	-Daniel
*This .sig left intentionally blank*



* [PATCH] ssh-protocol version, command types, response code
@ 2005-06-03 21:43  4% Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-06-03 21:43 UTC (permalink / raw)
  To: Petr Baudis, Linus Torvalds; +Cc: git

This patch makes an incompatible change to the protocol used by
rpull/rpush which will let it be extended in the future without
incompatible changes. 

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Index: rpull.c
===================================================================
--- 33942306faa3093107cc7105dff046de5c981d2e/rpull.c  (mode:100644 sha1:36e49f799a6ac300a00f8d09d9dc9e6636b3d8e0)
+++ 1f3a64b193531d17cafb7db8aae4a08d92be3132/rpull.c  (mode:100644 sha1:a3d13595d3db6c26d4d55fe7ef516efd5c8f6a0c)
@@ -6,16 +6,39 @@
 static int fd_in;
 static int fd_out;
 
+static unsigned char remote_version = 0;
+static unsigned char local_version = 1;
+
 int fetch(unsigned char *sha1)
 {
 	int ret;
+	signed char remote;
+	char type = 'o';
+	if (has_sha1_file(sha1))
+		return 0;
+	write(fd_out, &type, 1);
 	write(fd_out, sha1, 20);
+	if (read(fd_in, &remote, 1) < 1)
+		return -1;
+	if (remote < 0)
+		return remote;
 	ret = write_sha1_from_fd(sha1, fd_in);
 	if (!ret)
 		pull_say("got %s\n", sha1_to_hex(sha1));
 	return ret;
 }
 
+int get_version(void)
+{
+	char type = 'v';
+	write(fd_out, &type, 1);
+	write(fd_out, &local_version, 1);
+	if (read(fd_in, &remote_version, 1) < 1) {
+		return error("Couldn't read version from remote end");
+	}
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	char *commit_id;
@@ -46,6 +69,9 @@
 	if (setup_connection(&fd_in, &fd_out, "git-rpush", url, arg, argv + 1))
 		return 1;
 
+	if (get_version())
+		return 1;
+
 	if (pull(commit_id))
 		return 1;
 
Index: rpush.c
===================================================================
--- 33942306faa3093107cc7105dff046de5c981d2e/rpush.c  (mode:100644 sha1:17d5ab8a60ab2ec7fa3a7dc927351e8a34de3a89)
+++ 1f3a64b193531d17cafb7db8aae4a08d92be3132/rpush.c  (mode:100644 sha1:bd381ac9d1787dc979b1eba5bd72c1fd644a094b)
@@ -3,46 +3,81 @@
 #include <sys/socket.h>
 #include <errno.h>
 
-static void service(int fd_in, int fd_out) {
+unsigned char local_version = 1;
+unsigned char remote_version = 0;
+
+int serve_object(int fd_in, int fd_out) {
 	ssize_t size;
-	int posn;
-	char unsigned sha1[20];
+	int posn = 0;
+	char sha1[20];
 	unsigned long objsize;
 	void *buf;
+	signed char remote;
+	do {
+		size = read(fd_in, sha1 + posn, 20 - posn);
+		if (size < 0) {
+			perror("git-rpush: read ");
+			return -1;
+		}
+		if (!size)
+			return -1;
+		posn += size;
+	} while (posn < 20);
+	
+	/* fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1)); */
+	remote = 0;
+	
+	buf = map_sha1_file(sha1, &objsize);
+	
+	if (!buf) {
+		fprintf(stderr, "git-rpush: could not find %s\n", 
+			sha1_to_hex(sha1));
+		remote = -1;
+	}
+	
+	write(fd_out, &remote, 1);
+	
+	if (remote < 0)
+		return 0;
+	
+	posn = 0;
 	do {
-		posn = 0;
-		do {
-			size = read(fd_in, sha1 + posn, 20 - posn);
-			if (size < 0) {
-				perror("git-rpush: read ");
-				return;
+		size = write(fd_out, buf + posn, objsize - posn);
+		if (size <= 0) {
+			if (!size) {
+				fprintf(stderr, "git-rpush: write closed");
+			} else {
+				perror("git-rpush: write ");
 			}
-			if (!size)
-				return;
-			posn += size;
-		} while (posn < 20);
-
-		/* fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1)); */
-
-		buf = map_sha1_file(sha1, &objsize);
-		if (!buf) {
-			fprintf(stderr, "git-rpush: could not find %s\n", 
-				sha1_to_hex(sha1));
+			return -1;
+		}
+		posn += size;
+	} while (posn < objsize);
+	return 0;
+}
+
+int serve_version(int fd_in, int fd_out)
+{
+	if (read(fd_in, &remote_version, 1) < 1)
+		return -1;
+	write(fd_out, &local_version, 1);
+	return 0;
+}
+
+void service(int fd_in, int fd_out) {
+	char type;
+	int retval;
+	do {
+		retval = read(fd_in, &type, 1);
+		if (retval < 1) {
+			if (retval < 0)
+				perror("rpush: read ");
 			return;
 		}
-		posn = 0;
-		do {
-			size = write(fd_out, buf + posn, objsize - posn);
-			if (size <= 0) {
-				if (!size) {
-					fprintf(stderr, "git-rpush: write closed");
-				} else {
-					perror("git-rpush: write ");
-				}
-				return;
-			}
-			posn += size;
-		} while (posn < objsize);
+		if (type == 'v' && serve_version(fd_in, fd_out))
+			return;
+		if (type == 'o' && serve_object(fd_in, fd_out))
+			return;
 	} while (1);
 }
 
@@ -56,7 +91,7 @@
                 arg++;
         }
         if (argc < arg + 2) {
-                usage("git-rpush [-c] [-t] [-a] commit-id url");
+		usage("git-rpush [-c] [-t] [-a] commit-id url");
                 return 1;
         }
 	commit_id = argv[arg];



* [PATCH] pull: gracefully recover from delta retrieval failure.
@ 2005-06-05  6:11 15% Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-05  6:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This addresses a concern raised by Jason McMullan in the mailing
list discussion.  After retrieving and storing a potentially
deltified object, pull logic tries to check and fulfil its delta
dependency.  If the pull procedure is killed at this point,
however, there is no easy way to recover by re-running pull,
since the next run would find that we already have that
deltified object and happily report success, without really
checking that its delta dependency is satisfied.

This patch introduces --recover option to git-*-pull family
which causes them to re-validate dependency of deltified objects
we are fetching.  A new test t5100-delta-pull.sh covers such a
failure mode.
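
Condensed, the decision this patch teaches make_sure_we_have_it() to
make can be sketched as follows (a hypothetical helper for
illustration; pull.c inlines the logic rather than using a function
like this):

```c
/* Sketch of the three-valued get_delta policy described above:
 * 0 skips the base check (-d), 1 checks only freshly fetched
 * objects, 2 (--recover) re-checks objects we already have. */
static int need_delta_check(int have_already, int get_delta)
{
	if (!get_delta)
		return 0;	/* -d: remote known to be undeltified */
	if (have_already && get_delta < 2)
		return 0;	/* normal run: presence implies done */
	return 1;		/* fresh fetch, or --recover re-validation */
}
```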

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** Linus, from now on I will go into "calming down" mode and
*** refrain from sending you too much "new" stuff, until
*** you tell me otherwise.  I will concentrate on fixes like
*** this one and the "diff-* -B fix" patches I sent you earlier.
*** Perhaps I would also work on CVS migration documents if you
*** would like me to help you in that area as well.

*** Definitely things like the idea of diff-tree switching its
*** pathspec according to rename detection results would not be
*** something I'll be bugging you about until 1.0 happens;
*** unless you tell me otherwise, that is.

 Documentation/git-http-pull.txt  |    5 ++
 Documentation/git-local-pull.txt |    5 ++
 Documentation/git-rpull.txt      |    5 ++
 pull.h                           |    4 +-
 http-pull.c                      |    4 +-
 local-pull.c                     |    4 +-
 pull.c                           |   15 +++++--
 rpull.c                          |    4 +-
 t/t5100-delta-pull.sh            |   79 ++++++++++++++++++++++++++++++++++++++
 9 files changed, 113 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-http-pull.txt b/Documentation/git-http-pull.txt
--- a/Documentation/git-http-pull.txt
+++ b/Documentation/git-http-pull.txt
@@ -9,7 +9,7 @@ git-http-pull - Downloads a remote GIT r
 
 SYNOPSIS
 --------
-'git-http-pull' [-c] [-t] [-a] [-v] [-d] commit-id url
+'git-http-pull' [-c] [-t] [-a] [-v] [-d] [--recover] commit-id url
 
 DESCRIPTION
 -----------
@@ -25,6 +25,9 @@ Downloads a remote GIT repository via HT
 	Do not check for delta base objects (use this option
 	only when you know the remote repository is not
 	deltified).
+--recover::
+	Check dependency of deltified object more carefully than
+	usual, to recover after earlier pull that was interrupted.
 -v::
 	Report what is downloaded.
 
diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt
--- a/Documentation/git-local-pull.txt
+++ b/Documentation/git-local-pull.txt
@@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT 
 
 SYNOPSIS
 --------
-'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path
+'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] [--recover] commit-id path
 
 DESCRIPTION
 -----------
@@ -27,6 +27,9 @@ OPTIONS
 	Do not check for delta base objects (use this option
 	only when you know the remote repository is not
 	deltified).
+--recover::
+	Check dependency of deltified object more carefully than
+	usual, to recover after earlier pull that was interrupted.
 -v::
 	Report what is downloaded.
 
diff --git a/Documentation/git-rpull.txt b/Documentation/git-rpull.txt
--- a/Documentation/git-rpull.txt
+++ b/Documentation/git-rpull.txt
@@ -10,7 +10,7 @@ git-rpull - Pulls from a remote reposito
 
 SYNOPSIS
 --------
-'git-rpull' [-c] [-t] [-a] [-d] [-v] commit-id url
+'git-rpull' [-c] [-t] [-a] [-d] [-v] [--recover] commit-id url
 
 DESCRIPTION
 -----------
@@ -29,6 +29,9 @@ OPTIONS
 	Do not check for delta base objects (use this option
 	only when you know the remote repository is not
 	deltified).
+--recover::
+	Check dependency of deltified object more carefully than
+	usual, to recover after earlier pull that was interrupted.
 -v::
 	Report what is downloaded.
 
diff --git a/pull.h b/pull.h
--- a/pull.h
+++ b/pull.h
@@ -13,7 +13,9 @@ extern int get_history;
 /** Set to fetch the trees in the commit history. **/
 extern int get_all;
 
-/* Set to zero to skip the check for delta object base. */
+/* Set to zero to skip the check for delta object base;
+ * set to two to check delta dependency even for objects we already have.
+ */
 extern int get_delta;
 
 /* Set to be verbose */
diff --git a/http-pull.c b/http-pull.c
--- a/http-pull.c
+++ b/http-pull.c
@@ -105,6 +105,8 @@ int main(int argc, char **argv)
 			get_history = 1;
 		} else if (argv[arg][1] == 'd') {
 			get_delta = 0;
+		} else if (!strcmp(argv[arg], "--recover")) {
+			get_delta = 2;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
@@ -115,7 +117,7 @@ int main(int argc, char **argv)
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-http-pull [-c] [-t] [-a] [-d] [-v] commit-id url");
+		usage("git-http-pull [-c] [-t] [-a] [-d] [-v] [--recover] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -74,7 +74,7 @@ int fetch(unsigned char *sha1)
 }
 
 static const char *local_pull_usage = 
-"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path";
+"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] [--recover] commit-id path";
 
 /* 
  * By default we only use file copy.
@@ -94,6 +94,8 @@ int main(int argc, char **argv)
 			get_history = 1;
 		else if (argv[arg][1] == 'd')
 			get_delta = 0;
+		else if (!strcmp(argv[arg], "--recover"))
+			get_delta = 2;
 		else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
diff --git a/pull.c b/pull.c
--- a/pull.c
+++ b/pull.c
@@ -6,6 +6,7 @@
 
 int get_tree = 0;
 int get_history = 0;
+/* 1 means "get delta", 2 means "really check delta harder */
 int get_delta = 1;
 int get_all = 0;
 int get_verbosely = 0;
@@ -32,12 +33,16 @@ static void report_missing(const char *w
 
 static int make_sure_we_have_it(const char *what, unsigned char *sha1)
 {
-	int status;
-	if (has_sha1_file(sha1))
+	int status = 0;
+
+	if (!has_sha1_file(sha1)) {
+		status = fetch(sha1);
+		if (status && what)
+			report_missing(what, sha1);
+	}
+	else if (get_delta < 2)
 		return 0;
-	status = fetch(sha1);
-	if (status && what)
-		report_missing(what, sha1);
+
 	if (get_delta) {
 		char delta_sha1[20];
 		status = sha1_delta_base(sha1, delta_sha1);
diff --git a/rpull.c b/rpull.c
--- a/rpull.c
+++ b/rpull.c
@@ -52,6 +52,8 @@ int main(int argc, char **argv)
 			get_history = 1;
 		} else if (argv[arg][1] == 'd') {
 			get_delta = 0;
+		} else if (!strcmp(argv[arg], "--recover")) {
+			get_delta = 2;
 		} else if (argv[arg][1] == 'a') {
 			get_all = 1;
 			get_tree = 1;
@@ -62,7 +64,7 @@ int main(int argc, char **argv)
 		arg++;
 	}
 	if (argc < arg + 2) {
-		usage("git-rpull [-c] [-t] [-a] [-v] [-d] commit-id url");
+		usage("git-rpull [-c] [-t] [-a] [-v] [-d] [--recover] commit-id url");
 		return 1;
 	}
 	commit_id = argv[arg];
diff --git a/t/t5100-delta-pull.sh b/t/t5100-delta-pull.sh
new file mode 100644
--- /dev/null
+++ b/t/t5100-delta-pull.sh
@@ -0,0 +1,79 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='Test pulling deltified objects
+
+'
+. ./test-lib.sh
+
+locate_obj='s|\(..\)|.git/objects/\1/|'
+
+test_expect_success \
+    setup \
+    'cat ../README >a &&
+    git-update-cache --add a &&
+    a0=`git-ls-files --stage |
+        sed -e '\''s/^[0-7]* \([0-9a-f]*\) .*/\1/'\''` &&
+
+    sed -e 's/test/TEST/g' ../README >a &&
+    git-update-cache a &&
+    a1=`git-ls-files --stage |
+        sed -e '\''s/^[0-7]* \([0-9a-f]*\) .*/\1/'\''` &&
+    tree=`git-write-tree` &&
+    commit=`git-commit-tree $tree </dev/null` &&
+    a0f=`echo "$a0" | sed -e "$locate_obj"` &&
+    a1f=`echo "$a1" | sed -e "$locate_obj"` &&
+    echo commit $commit &&
+    echo a0 $a0 &&
+    echo a1 $a1 &&
+    ls -l $a0f $a1f &&
+    echo $commit >.git/HEAD &&
+    git-mkdelta -v $a0 $a1 &&
+    ls -l $a0f $a1f'
+
+# Now commit has a tree that records deltified "a" whose SHA1 is a1.
+# Create a new repo and pull this commit into it.
+
+test_expect_success \
+    'setup and cd into new repo' \
+    'mkdir dest && cd dest && rm -fr .git && git-init-db'
+     
+test_expect_success \
+    'pull from deltified repo into a new repo without -d' \
+    'rm -fr .git a && git-init-db &&
+     git-local-pull -v -a $commit ../.git/ &&
+     git-cat-file blob $a1 >a &&
+     diff -u a ../a'
+
+test_expect_failure \
+    'pull from deltified repo into a new repo with -d' \
+    'rm -fr .git a && git-init-db &&
+     git-local-pull -v -a -d $commit ../.git/ &&
+     git-cat-file blob $a1 >a &&
+     diff -u a ../a'
+
+test_expect_failure \
+    'pull from deltified repo after delta failure without --recover' \
+    'rm -f a &&
+     git-local-pull -v -a $commit ../.git/ &&
+     git-cat-file blob $a1 >a &&
+     diff -u a ../a'
+
+test_expect_success \
+    'pull from deltified repo after delta failure with --recover' \
+    'rm -f a &&
+     git-local-pull -v -a --recover $commit ../.git/ &&
+     git-cat-file blob $a1 >a &&
+     diff -u a ../a'
+
+test_expect_success \
+    'missing-tree or missing-blob should be re-fetched without --recover' \
+    'rm -f a $a0f $a1f &&
+     git-local-pull -v -a $commit ../.git/ &&
+     git-cat-file blob $a1 >a &&
+     diff -u a ../a'
+
+test_done
+
------------



* Re: [PATCH] pull: gracefully recover from delta retrieval failure.
  @ 2005-06-05 20:02  2% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-06-05 20:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jason McMullan, Linus Torvalds, git

On Sun, 5 Jun 2005, Junio C Hamano wrote:

> >>>>> "JM" == Jason McMullan <jason.mcmullan@timesys.com> writes:
> 
> JM> Sorry about being a pest, but this worries me. Please assuage my fears.
> 
> Earlier I said I suspected that the original code mishandled
> recovery from a botched tree/commit dependent transfer, but that
> was not the case.  The last test in the new test script I added
> in the patch you are responding to covers that case.

It does the O(history) method for correctness at the expense of
efficiency; my hope is that a bit of caching can fix the efficiency issue
as well. So the question is not so much "not safe" as "slow". Of course, it
takes a while for this to become an issue, given the relationship of
remote access latency to local access bandwidth. That is, you need a
really big history and to be getting very little new data before you'll
complain.

> JM> (Or, if you'd like, I can rework pull.c to use the
> JM>  verification-before-store technique I used in my git-daemon patch, so
> JM>  all the *-pull mechanisms will be 'safe')
> 
> I would appreciate the offer.  I, however, would have to warn
> you that the "problem" lies in the way the current pull
> structure divides responsibility between the pull.c and transfer
> backends.  The pull.c implements the dependency logic, and
> transfer backends are to populate the database while being
> oblivious of that logic.  From the purist point of view (I am
> sympathetic to your "place only the verified objects in the
> database" principle), I am not entirely happy with that
> division, but at the same time I understand why it is done that
> way and even like it from practical standpoint.

At one point I'd written a patch that split out the tmpfile usage of
write_sha1_file(), made the filenames predictable, and used it for
everything that writes those files. It had an "open" part and a
"close" part (where the close also moved the file into place). This would
give the code better atomicity and protect against races between reading
and validation. On the other hand, there's no reason to use an anonymous
temp file; just <filename>.partial or similar (with the proper open
flags) would be sufficient and easier to clean or commit. Note that we
want to support /tmp and the object directory being on different
filesystems, also. (And all the open and place logic is nicely wrapped up
in sha1_file.c)
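
That scheme could be sketched roughly like this (the helper names are
made up for the example; the real open-and-place logic would stay
wrapped in sha1_file.c):

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Illustrative sketch of the "<filename>.partial, then rename" idea.
 * Not the actual sha1_file.c interface. */
static int open_partial(const char *final_name, char *tmp, size_t len)
{
	snprintf(tmp, len, "%s.partial", final_name);
	/* O_EXCL doubles as a crude lock against concurrent fetchers */
	return open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0444);
}

static int commit_partial(const char *tmp, const char *final_name)
{
	/* rename() is atomic within one filesystem, so readers only
	 * ever see fully written, validated objects at the final name */
	return rename(tmp, final_name);
}
```

Keeping the partial file beside its final name, rather than in /tmp,
also sidesteps the cross-filesystem rename problem mentioned above.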

Aside from the question of whether we want to insist that the object
database only includes objects such that everything reachable is also
present, we certainly want to only include objects which we have
completely fetched, which are generically well-formed, and which have the
advertized hash, and having there never be an unvalidated file at the
filename would be good.

By this reasoning, a file should only be renamed after all of the delta
requirements are satisfied, but before tree and commit requirements are
satisfied. We certainly aren't going to have much use for files whose
contents we cannot get. This means that we'd like to have multiple
unplaced files, but we don't need to read the contents of an unplaced
file.

	-Daniel
*This .sig left intentionally blank*



* [PATCH 3/4] Generic support for pulling refs
  @ 2005-06-06 20:38  6% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-06-06 20:38 UTC (permalink / raw)
  To: Linus Torvalds, Petr Baudis, Junio C Hamano; +Cc: git

This adds support to pull.c for requesting a reference and writing it to a
file. All of the git-*-pull programs get stubs for now.

Index: http-pull.c
===================================================================
--- 9138b84eb683fc23a285445f7d7fc5a836ba01cb/http-pull.c  (mode:100644 sha1:551663e49234dc9b719ee4abb9f8dc8609d759aa)
+++ 8deba080337c75a41cb456cc8b59000654278e59/http-pull.c  (mode:100644 sha1:4f097e0d0bbd5ae28babf8b685dc3b02747f9f15)
@@ -92,6 +92,11 @@
 	return 0;
 }
 
+int fetch_ref(char *ref, unsigned char *sha1)
+{
+	return -1;
+}
+
 int main(int argc, char **argv)
 {
 	char *commit_id;
Index: local-pull.c
===================================================================
--- 9138b84eb683fc23a285445f7d7fc5a836ba01cb/local-pull.c  (mode:100644 sha1:e5d834ff2f7d6949ca2c7dd2424c65f6431a839b)
+++ 8deba080337c75a41cb456cc8b59000654278e59/local-pull.c  (mode:100644 sha1:867e78dbdd2fc5dacdad3c3e3ab5ef1bfde6ba51)
@@ -73,6 +73,11 @@
 	return -1;
 }
 
+int fetch_ref(char *ref, unsigned char *sha1)
+{
+	return -1;
+}
+
 static const char *local_pull_usage = 
 "git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path";
 
Index: pull.c
===================================================================
--- 9138b84eb683fc23a285445f7d7fc5a836ba01cb/pull.c  (mode:100644 sha1:cd77738ac62be17e7382bc3b368e686f11f7098d)
+++ 8deba080337c75a41cb456cc8b59000654278e59/pull.c  (mode:100644 sha1:a60f1e49bda1bf12e11e0abccfaf4201130f4303)
@@ -3,6 +3,11 @@
 #include "cache.h"
 #include "commit.h"
 #include "tree.h"
+#include "refs.h"
+
+const char *write_ref = NULL;
+
+const unsigned char *current_ref = NULL;
 
 int get_tree = 0;
 int get_history = 0;
@@ -105,16 +110,42 @@
 	return 0;
 }
 
+static int interpret_target(char *target, unsigned char *sha1)
+{
+	if (!get_sha1_hex(target, sha1))
+		return 0;
+	if (!check_ref_format(target)) {
+		if (!fetch_ref(target, sha1)) {
+			return 0;
+		}
+	}
+	return -1;
+}
+
+
 int pull(char *target)
 {
-	int retval;
 	unsigned char sha1[20];
-	retval = get_sha1_hex(target, sha1);
-	if (retval)
-		return retval;
-	retval = make_sure_we_have_it(commitS, sha1);
-	if (retval)
-		return retval;
-	memcpy(current_commit_sha1, sha1, 20);
-	return process_commit(sha1);
+	int fd = -1;
+
+	if (write_ref && current_ref) {
+		fd = lock_ref_sha1(write_ref, current_ref);
+		if (fd < 0)
+			return -1;
+	}
+
+	if (interpret_target(target, sha1))
+		return error("Could not interpret %s as something to pull",
+			     target);
+	if (process_commit(sha1))
+		return -1;
+	
+	if (write_ref) {
+		if (current_ref) {
+			write_ref_sha1(write_ref, fd, sha1);
+		} else {
+			write_ref_sha1_unlocked(write_ref, sha1);
+		}
+	}
+	return 0;
 }
Index: pull.h
===================================================================
--- 9138b84eb683fc23a285445f7d7fc5a836ba01cb/pull.h  (mode:100644 sha1:3cd14cfb811a755a8770a0d01e8e2f96ba604058)
+++ 8deba080337c75a41cb456cc8b59000654278e59/pull.h  (mode:100644 sha1:83295892d1e401e4719ae26f16de07d6eb61a8d2)
@@ -4,6 +4,14 @@
 /** To be provided by the particular implementation. **/
 extern int fetch(unsigned char *sha1);
 
+extern int fetch_ref(char *ref, unsigned char *sha1);
+
+/** If set, the ref filename to write the target value to. **/
+extern const char *write_ref;
+
+/** If set, the hash that the current value of write_ref must be. **/
+extern const unsigned char *current_ref;
+
 /** Set to fetch the target tree. */
 extern int get_tree;
 
Index: ssh-pull.c
===================================================================
--- 9138b84eb683fc23a285445f7d7fc5a836ba01cb/ssh-pull.c  (mode:100644 sha1:f4ab89836455a40aaab3ff4114396185f6d5655a)
+++ 8deba080337c75a41cb456cc8b59000654278e59/ssh-pull.c  (mode:100644 sha1:c0cee73facbbb3ced2e566789ba1dda57b245f47)
@@ -39,6 +39,11 @@
 	return 0;
 }
 
+int fetch_ref(char *ref, unsigned char *sha1)
+{
+	return -1;
+}
+
 int main(int argc, char **argv)
 {
 	char *commit_id;



* [PATCH] Add missing Documentation/*
@ 2005-06-07 14:17 12% Jason McMullan
  0 siblings, 0 replies; 200+ results
From: Jason McMullan @ 2005-06-07 14:17 UTC (permalink / raw)
  To: torvalds, git

Id: e774aa5641ca2267e7aba7338da3f7e355b7fb78
tree e42cf7a8ae0e05eb383ad41bc43d6e7d1245441f
parent 63aff4fed94355889be98ad44371e29942ff70e4
author Jason McMullan <jason.mcmullan@gmail.com> 1118153523 -0400
committer Jason McMullan <jason.mcmullan@gmail.com> 1118153523 -0400

Add: Additional missing documentation


======== diff against 63aff4fed94355889be98ad44371e29942ff70e4 ========
diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-apply.txt
@@ -0,0 +1,36 @@
+git-apply(1)
+============
+v0.1, May 2005
+
+NAME
+----
+git-apply - Apply a patch against the current index cache/working directory
+
+
+SYNOPSIS
+--------
+'git-apply' [--check] [--stat] [--show-file] <patch>
+
+DESCRIPTION
+-----------
+This applies patches on top of some (arbitrary) version of the SCM.
+
+NOTE! It does all its work in the index file, and only cares about
+the files in the working directory if you tell it to "merge" the
+patch apply.
+
+Even when merging it always takes the source from the index, and
+uses the working tree as a "branch" for a 3-way merge.
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-commit-script.txt b/Documentation/git-commit-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-commit-script.txt
@@ -0,0 +1,29 @@
+git-commit-script(1)
+====================
+v0.1, May 2005
+
+NAME
+----
+git-commit-script - Commit working directory
+
+
+SYNOPSIS
+--------
+'git-commit-script' 
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-deltafy-script.txt b/Documentation/git-deltafy-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-deltafy-script.txt
@@ -0,0 +1,51 @@
+git-deltafy-script(1)
+=====================
+v0.1, May 2005
+
+NAME
+----
+git-deltafy-script - Convert repository into delta format
+
+
+SYNOPSIS
+--------
+'git-deltafy-script'
+
+DESCRIPTION
+-----------
+Script to deltafy an entire GIT repository based on the commit list.
+The most recent version of a file is the reference and previous versions
+are made delta against the best earlier version available. And so on for
+successive versions going back in time.  This way the increasing delta
+overhead is pushed towards older versions of any given file.
+
+The -d argument allows providing a limit on the delta chain depth.
+If 0 is passed then everything is undeltafied.  Limiting the delta
+depth is meaningful for subsequent access performance to old revisions.
+A value of 16 might be a good compromise between performance and good
+space saving.  Current default is unbounded.
+
+The --max-behind=30 argument is passed to git-mkdelta so as to keep
+combinations and memory usage bounded a bit.  If you have lots of memory
+and CPU power you may remove it (or set to 0) to let git-mkdelta find the
+best delta match regardless of the number of revisions for a given file.
+You can also make the value smaller to make it faster and less
+memory hungry.  A value of 5 ought to still give pretty good results.
+When set to 0 or omitted, the look-behind is unbounded.  Note that
+git-mkdelta might die with a segmentation fault in that case if it
+runs out of memory.  Note that the GIT repository will still be consistent
+even if git-mkdelta dies unexpectedly.
+
+
+Author
+------
+Written by Nicolas Pitre <nico@cam.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-fetch-script.txt b/Documentation/git-fetch-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-fetch-script.txt
@@ -0,0 +1,29 @@
+git-fetch-script(1)
+===================
+v0.1, May 2005
+
+NAME
+----
+git-fetch-script - Fetch an object from a remote repository
+
+
+SYNOPSIS
+--------
+'git-fetch-script' <sha1>
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-get-tar-commit-id.txt b/Documentation/git-get-tar-commit-id.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-get-tar-commit-id.txt
@@ -0,0 +1,29 @@
+git-get-tar-commit-id(1)
+========================
+v0.1, May 2005
+
+NAME
+----
+git-get-tar-commit-id - Show the commit ID embedded in a git-tar-tree file.
+
+
+SYNOPSIS
+--------
+'git-get-tar-commit-id' <tar-file.tar>
+
+DESCRIPTION
+-----------
+This shows the commit ID embedded in a git-tar-tree generated file.
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-log-script.txt b/Documentation/git-log-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-log-script.txt
@@ -0,0 +1,29 @@
+git-log-script(1)
+=================
+v0.1, May 2005
+
+NAME
+----
+git-log-script - Prettified version of git-rev-list
+
+
+SYNOPSIS
+--------
+'git-log-script' 
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-shortlog.txt b/Documentation/git-shortlog.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-shortlog.txt
@@ -0,0 +1,29 @@
+git-shortlog(1)
+===============
+v0.1, May 2005
+
+NAME
+----
+git-shortlog - Summarize 'git-log-script' output
+
+
+SYNOPSIS
+--------
+'git-log-script' | 'git-shortlog' 
+
+DESCRIPTION
+-----------
+Summarize the output of 'git-log-script'
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-status-script.txt b/Documentation/git-status-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-status-script.txt
@@ -0,0 +1,29 @@
+git-status-script(1)
+====================
+v0.1, May 2005
+
+NAME
+----
+git-status-script - Show status of working directory files
+
+
+SYNOPSIS
+--------
+'git-status-script' 
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-stripspace.txt b/Documentation/git-stripspace.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-stripspace.txt
@@ -0,0 +1,31 @@
+git-stripspace(1)
+=================
+v0.1, May 2005
+
+NAME
+----
+git-stripspace - Strip space from stdin
+
+
+SYNOPSIS
+--------
+'git-stripspace' <stream
+
+DESCRIPTION
+-----------
+Remove empty lines from the beginning and end.
+
+Turn multiple consecutive empty lines into just one empty line.
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Documentation/git-whatchanged.txt b/Documentation/git-whatchanged.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-whatchanged.txt
@@ -0,0 +1,29 @@
+git-whatchanged(1)
+==================
+v0.1, May 2005
+
+NAME
+----
+git-whatchanged - Find out what changed
+
+
+SYNOPSIS
+--------
+'git-whatchanged' 
+
+DESCRIPTION
+-----------
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
======== end ========



* [PATCH] Add git-help-script
@ 2005-06-07 14:19 Jason McMullan
  2005-06-07 14:25 ` McMullan, Jason
From: Jason McMullan @ 2005-06-07 14:19 UTC (permalink / raw)
  To: torvalds, git

Id: 13d680c11f5403f3b1b48e71416e36770cd0aecf
tree 91e05412806336f2c9d989b8d9a2eccb21281efe
parent e774aa5641ca2267e7aba7338da3f7e355b7fb78
author Jason McMullan <jason.mcmullan@gmail.com> 1118153648 -0400
committer Jason McMullan <jason.mcmullan@gmail.com> 1118153648 -0400

Add: 'git help' aka 'git-help-script', built from Documentation/* information


======== diff against e774aa5641ca2267e7aba7338da3f7e355b7fb78 ========
diff --git a/Documentation/git-help-script.txt b/Documentation/git-help-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-help-script.txt
@@ -0,0 +1,29 @@
+git-help-script(1)
+==================
+v0.1, May 2005
+
+NAME
+----
+git-help-script - Short help of all the git commands and scripts
+
+
+SYNOPSIS
+--------
+'git-help-script' 
+
+DESCRIPTION
+-----------
+Shows a brief summary of all the git-* commands.
+
+Author
+------
+Written by Jason McMullan <jason.mcmullan@timesys.com>
+
+Documentation
+--------------
+Documentation by Jason McMullan and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -23,7 +23,7 @@ INSTALL=install
 SCRIPTS=git git-apply-patch-script git-merge-one-file-script git-prune-script \
 	git-pull-script git-tag-script git-resolve-script git-whatchanged \
 	git-deltafy-script git-fetch-script git-status-script git-commit-script \
-	git-log-script git-shortlog
+	git-log-script git-shortlog git-help-script
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
@@ -84,6 +84,26 @@ test-delta: test-delta.c diff-delta.o pa
 git-%: %.c $(LIB_FILE)
 	$(CC) $(CFLAGS) -o $@ $(filter %.c,$^) $(LIBS)
 
+git-help-script: Makefile $(patsubst %,Documentation/%.txt,$(SCRIPTS) $(PROG))
+	echo "#!/bin/sh" >git-help-script
+	echo "cat <<EOF" >>git-help-script
+	echo "Commands:" >>git-help-script
+	echo >>git-help-script
+	for cmd in $(sort $(SCRIPTS)) $(sort $(PROG)); do \
+	  doc="Documentation/$$cmd.txt"; \
+	  if [ ! -e "$$doc" ]; then \
+	    echo "MISSING: $$doc" 1>&2; \
+	    rm -f git-help-script; \
+	    exit 1; \
+	  fi; \
+	  desc=`grep "^$$cmd - " $$doc | cut -d' ' -f3-` ; \
+	  desc=`echo "$$desc" | sed -e 's/\(.\{40\}\) /\1\n                        /g'` ; \
+	  cmd=`echo $$cmd | sed -e 's/^git-\(.*\)-script$$/git \1/'`; \
+	  printf "    %-20s%s\n" "$$cmd" "$$desc" >>git-help-script; \
+	done
+	echo "EOF" >>git-help-script
+	echo "exit 1" >>git-help-script
+
 git-update-cache: update-cache.c
 git-diff-files: diff-files.c
 git-init-db: init-db.c
@@ -143,7 +163,7 @@ test: all
 	$(MAKE) -C t/ all
 
 clean:
-	rm -f *.o mozilla-sha1/*.o ppc/*.o $(PROG) $(LIB_FILE)
+	rm -f *.o mozilla-sha1/*.o ppc/*.o $(PROG) $(LIB_FILE) git-help-script
 	$(MAKE) -C Documentation/ clean
 
 backup: clean
diff --git a/git b/git
--- a/git
+++ b/git
@@ -1,4 +1,8 @@
 #!/bin/sh
-cmd="git-$1-script"
+
+cmd="$1"
+
+[ -z "$cmd" -o "$cmd" = "-h" -o "$cmd" = "--help" ] && cmd="help"
+cmd="git-$cmd-script"
 shift
 exec $cmd "$@"
======== end ========
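The core of the Makefile rule above is a per-command transformation: take
the one-line description grep finds in each command's Documentation NAME
section, and rewrite "git-foo-script" to "git foo" for display. A
stand-alone sketch of one iteration, with the doc line as hypothetical
input standing in for what grep would return:

```shell
#!/bin/sh
# One iteration of the help-line generation, outside the Makefile.
cmd=git-commit-script
# Sample of what `grep "^$cmd - "` would find in Documentation/$cmd.txt:
doc_line='git-commit-script - Commit working directory'
desc=$(echo "$doc_line" | cut -d' ' -f3-)
# Strip the -script suffix for display, as the rule's sed expression does.
name=$(echo "$cmd" | sed -e 's/^git-\(.*\)-script$/git \1/')
printf '    %-20s%s\n' "$name" "$desc"
```

The printf field width of 20 is what produces the aligned two-column
listing shown in the follow-up message.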



* Re: [PATCH] Add git-help-script
  2005-06-07 14:19 [PATCH] Add git-help-script Jason McMullan
@ 2005-06-07 14:25 ` McMullan, Jason
From: McMullan, Jason @ 2005-06-07 14:25 UTC (permalink / raw)
  To: Linus Torvalds, GIT Mailling list

[-- Attachment #1: Type: text/plain, Size: 4041 bytes --]

NOTE: This patch requires '[PATCH] Add missing Documentation/*'


Here's the example output of 'git-help-script', which is called
when the user doesn't supply any options to the 'git' script wrapper,
or -h, or --help


$ git 
Commands:

    git                 the stupid content tracker
    git apply-patch     Sample script to apply the diffs from git-diff-*
    git commit          Commit working directory

    git deltafy         Convert repository into delta format
    git fetch           Fetch an object from a remote repository
    git help            Short help of all the git commands and scripts
    git log             
    git merge-one-file  The standard helper program to use with
                        "git-merge-cache"
    git prune           Prunes all unreachable objects from the object
                        database
    git pull            Script used by Linus to pull and merge a
                        remote repository
    git resolve         Script used to merge two trees
    git-shortlog        
    git status          
    git tag             An example script to create a tag object
                        signed with GPG
    git-whatchanged     Find out what changed
    git-apply           Apply a patch against the current index
                        cache/working directory
    git-cat-file        Provide content or type information for
                        repository objects
    git-check-files     Verify a list of files are up-to-date
    git-checkout-cache  Copy files from the cache to the working
                        directory
    git-commit-tree     Creates a new commit object
    git-convert-cache   Converts old-style GIT repository
    git-diff-cache      Compares content and mode of blobs between
                        the cache and repository
    git-diff-files      Compares files in the working tree and the
                        cache
    git-diff-helper     Generates patch format output for git-diff-*
    git-diff-tree       Compares the content and mode of blobs found
                        via two tree objects
    git-export          Exports each commit and a diff against each
                        of its parents
    git-fsck-cache      Verifies the connectivity and validity of
                        the objects in the database
    git-get-tar-commit-idShow the commit ID embedded in a git-tar-tree
                        file.
    git-http-pull       Downloads a remote GIT repository via HTTP
    git-init-db         Creates an empty git object database
    git-local-pull      Duplicates another GIT repository on a local
                        system
    git-ls-files        Information about files in the cache/working
                        directory
    git-ls-tree         Lists the contents of a tree object.
    git-merge-base      Finds as good a common ancestor as possible
                        for a merge
    git-merge-cache     Runs a merge for files needing merging
    git-mkdelta         Creates a delta object
    git-mktag           Creates a tag object
    git-read-tree       Reads tree information into the directory
                        cache
    git-rev-list        Lists commit objects in reverse chronological
                        order
    git-rev-tree        Provides the revision tree for one or more
                        commits
    git-ssh-pull        Pulls from a remote repository over ssh
                        connection
    git-ssh-push        Pushes to a remote repository over ssh
                        connection
    git-stripspace      Strip space from stdin
    git-tar-tree        Creates a tar archive of the files in the
                        named tree
    git-unpack-file     Creates a temporary file with a blob's contents
    git-update-cache    Modifies the index or directory cache
    git-write-blob      Creates a blob from a file
    git-write-tree      Creates a tree from the current cache

-- 
Jason McMullan <jason.mcmullan@timesys.com>
TimeSys Corporation


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* [ANNOUNCE] Cogito-0.11.2
@ 2005-06-08 23:07 Petr Baudis
From: Petr Baudis @ 2005-06-08 23:07 UTC (permalink / raw)
  To: git

  Hello,

  I'm happy to announce Cogito-0.11.2, next version of my SCMish layer
over Linus' GIT tree history storage tool. You can get it as usual on

	kernel.org/pub/software/scm/cogito/

or by doing cg-update in your cogito.git repository.

  The changes include especially some bugfixes and portability and
performance enhancements, as well as all the sweet stuff from Linus.

  Note that I discovered a bug a few minutes after releasing (as usual).
cg-log won't work correctly if run with some files specified (the
"cg-log file" usage). I don't think it actually gets used like this very
often, so it's not worth another release by itself. But expect a new
release as soon as some non-trivial amount of bugfixes piles up
(including core git bugfixes), quite soon hopefully.


  Here's git-rev-list --pretty HEAD ^cogito-0.11.1 | git-shortlog
(BTW, Dan, what about another cg-log option for git-shortlog output? ;-):


C. Cooke:
  Check whether the git repository is present before executing the command

Catalin Marinas:
  cg-commit: Fix the log file readin from stdin
  [PATCH Cogito] Add -f parameter also to cg-update

Chris Wedgwood:
  cogito: sh != bash

Christian Meder:
  Miniscule correction of diff-format.txt

Dan Holmsand:
  Make cg-add use xargs -0

Daniel Barkalow:
  Document git-ssh-pull and git-ssh-push
  -w support for git-ssh-pull/push
  Generic support for pulling refs
  rsh.c environment variable
  Operations on refs
  ssh-protocol version, command types, response code

Eugene Surovegin:
  fix cg-commit new file handling

Jason McMullan:
  Anal retentive 'const unsigned char *sha1'
  Modify git-rev-list to linearise the commit history in merge order.

Jon Seymour:
  three --merge-order bug fixes

Jonas Fonseca:
  cg-commit: prefix pathspec argument with --
  git-diff-cache: handle pathspec beginning with a dash
  git-diff-cache: handle pathspec beginning with a dash
  cg-log: cleanup line wrapping by using bash internals
  Documentation improvements
  Misc cg-log documentation fixes
  Cleanup commit messages with git-stripspace
  [PATCH 10/10] Add -s option to show log summary
  [PATCH 9/10] Move file matching inside the loop.
  [PATCH 8/10] Move the username matching inside the loop
  [PATCH 7/10] Move log printing to separate function
  [PATCH 6/10] Remove the catch all rule
  [PATCH 5/10] Move printing of the commit info line inside the loop
  [PATCH 4/10] First parse all commit header entries then print them
  [PATCH 3/10] Separate handling of author and committer in commit headers
  [PATCH 2/10] Separate handling of tree and parent in commit headers
  [PATCH 1/10] Cleanup conversion to human readable date
  cg-Xnormid: support revision ids specified by date

Junio C Hamano:
  Tests: read-tree -m test updates.
  Documentation: describe diff tweaking (fix).
  Start cvs-migration documentation
  read-tree: update documentation for 3-way merge.
  read-tree: save more user hassles during fast-forward.
  index locking like everybody else
  3-way merge tests for new "git-read-tree -m"?
  rename git-rpush and git-rpull to git-ssh-push and git-ssh-pull
  Documentation: describe git extended diff headers.
  Documentation: describe diff tweaking.
  pull: gracefully recover from delta retrieval failure.
  diffcore-break.c: various fixes.
  diff.c: -B argument passing fix.
  diff.c: locate_size_cache() fix.
  diff: Update -B heuristics.
  diff: Clean up diff_scoreopt_parse().
  diff: Fix docs and add -O to diff-helper.
  Tweak count-delta interface
  Find size of SHA1 object without inflating everything.
  Handle deltified object correctly in git-*-pull family.

Linus Torvalds:
  Remove MERGE_HEAD after committing merge
  Make "git commit" work correctly in the presence of a manual merge
  cvs-migration: add more of a header to the "annotate" discussion
  Leave merge failures in the filesystem
  Fix SIGSEGV on unmerged files in git-diff-files -p
  Make default merge messages denser.
  git-apply: creating empty files is nonfatal
  Talk about "git cvsimport" in the cvs migration docs
  git-read-tree: -u without -m is meaningless. Don't allow it.
  git-read-tree: make one-way merge also honor the "update" flag
  Add CVS import scripts and programs
  git-ssh-push/pull: usability improvements
  git-resolve-script: stop when the automated merge fails
  Make fetch/pull scripts terminate cleanly on errors
  git-resolve-script: don't wait for three seconds any more
  git-read-tree: some "final" cleanups
  git-read-tree: simplify merge loops enormously
  Add "__noreturn__" attribute to die() and usage()
  git-rev-list: make sure to link with ssl libraries
  Fix off-by-one in new three-way-merge updates
  Three-way merge: fix silly bug that made trivial merges not work
  Fix entry.c dependency and compile problem
  git-read-tree: fix up two-way merge
  More work on merging with git-read-tree..
  Make file checkout function available to the git library
  git-read-tree: fix up three-way merge tests
  git-read-tree: be a lot more careful about merging dirty trees
  diff 'rename' format change.
  git-apply: consider it an error to apply no changes
  git-apply: fix rename header parsing
  git-apply: actually apply patches and update the index
  git-apply: fix apply of a new file
  git-apply: find offset fragments, and really apply them
  git-apply: first cut at actually checking fragment data
  git-fsck-cache: complain if no default references found
  pretty_print_commit: add different formats
  git-shortlog: add name translations for 'sparse' repo
  Add git-shortlog perl script
  git-rev-list: allow arbitrary head selections, use git-rev-tree syntax
  Clarify git-diff-cache semantics in the tutorial.

Mark Allen:
  Modify cg-Xlib for non-GNU date.

Michal Rokos:
  [cogito] Sync objects only when needed
  [cogito] paged output for cg-diff
  Abstracted out $PAGER invocation to a pager() function

Petr Baudis:
  Fix cg-log called on specified files
  cogito-0.11.2
  Added trivial cg wrapper
  Use portable sed stuff in cg-log Signed-off-by highlighting
  showdate() now uses $(()) instead of $(expr)
  Fixed cg-log -u
  cg-merge now sometimes allows tree merge + local changes
  Add the t6001 testcase which got missed out at the last merge.
  Move commit line processing to process_commit_line
  Improved cg-Xmergefile
  Fix git-merge-one-file permissions auto-merging
  Fix cg-patch reverting file removal
  Reindent print_commit_log() body
  cg-log is now pure git-rev-list --pretty=raw frontend
  Fix cg-commit doing shell expansion on -m arguments
  Fix mismerged git-r* -> git-ssh-* rename in Makefile
  Move print_commit_log() in cg-log
  Fix an erroneous cg-clone example in the README
  Make git-update-cache --force-remove regular
  Portability sed fix in cg-commit
  Improve git-rev-list --header output
  Implement cg-rm -n for untracking files
  Fixed cg-Xnormid " " call
  cg-commit now updates cache separately for different change types
  Pass revisions to commit-id, parent-id, tree-id and cg-Xnormid quoted
  Do rm -f in make uninstall
  make dist will now produce tarball with sensible name

Rene Scharfe:
  git-tar-tree: do only basic tests in t/t5000-git-tar-tree.sh
  git-tar-tree: fix write_trailer
  git-tar-tree: add a test case
  git-tar-tree: small doc update
  git-tar-tree: cleanup write_trailer()

Sven Verdoolaege:
  git-cvs2git: create tags

Timo Hirvonen:
  Use ntohs instead of htons to convert ce_flags to host byte order


  Have fun,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
<Espy> be careful, some twit might quote you out of context..


* do people use the 'git' command?
@ 2005-06-10 18:53 Sebastian Kuzminsky
From: Sebastian Kuzminsky @ 2005-06-10 18:53 UTC (permalink / raw)
  To: git

What good is the 'git' command?  It's a shortcut to run the
"git-$FUNCTION-script" programs, but it doesn't run the "git-$FUNCTION"
programs.  It just doesn't seem worth its inode, to me.  And it doesn't
seem worth the pain to distribution maintainers (like me) to avoid the
naming conflict with GNU Interactive Tools' /usr/bin/git.


Can we drop the "git" program?


diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -42,7 +42,7 @@ CC?=gcc
 AR?=ar
 INSTALL?=install
 
-SCRIPTS=git git-apply-patch-script git-merge-one-file-script git-prune-script \
+SCRIPTS=git-apply-patch-script git-merge-one-file-script git-prune-script \
 	git-pull-script git-tag-script git-resolve-script git-whatchanged \
 	git-deltafy-script git-fetch-script git-status-script git-commit-script \
 	git-log-script git-shortlog git-cvsimport-script
diff --git a/git b/git
deleted file mode 100755
--- a/git
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/bin/sh
-cmd="git-$1-script"
-shift
-exec $cmd "$@"
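Sebastian's point follows directly from the wrapper's dispatch rule: the
first argument is always rewritten to git-$1-script, so a plumbing
command such as git-cat-file can never be reached through the wrapper. A
minimal sketch of that rewrite (the function just echoes the name it
would exec; dispatch is a made-up name for illustration):

```shell
#!/bin/sh
# Mimic the wrapper's rewrite without exec'ing anything.
dispatch() {
    cmd="git-$1-script"
    echo "$cmd"
}
dispatch commit    # prints git-commit-script, which exists
dispatch cat-file  # prints git-cat-file-script; the real binary is git-cat-file
```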


-- 
Sebastian Kuzminsky


* [PATCH] Add script for patch submission via e-mail.
@ 2005-06-11  1:32 Junio C Hamano
From: Junio C Hamano @ 2005-06-11  1:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This git-format-patch-script is what I use to prepare patches
for e-mail submission.

Typical usage is:

$ git-format-patch-script -B -C --find-copies-harder HEAD linus

to prepare each commit with its patch since "HEAD" forked from
"linus", one file per patch for e-mail submission.  Each output
file is numbered sequentially from 1, and uses the first line of
the commit message (massaged for pathname safety) as the
filename.

$ git-format-patch-script -B -C --find-copies-harder HEAD linus .patch/

creates output files in .patch/ directory.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
*** Linus I am submitting this one because some patches on
*** read-tree I am going to send you will need this for
*** formatting into a form that is easier to review.  And this
*** in turn can use diff-tree --find-copies-harder, which I
*** indeed used to generate the patches that follow.

 Makefile                |    3 +-
 git-format-patch-script |   93 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 95 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -23,7 +23,8 @@ INSTALL=install
 SCRIPTS=git git-apply-patch-script git-merge-one-file-script git-prune-script \
 	git-pull-script git-tag-script git-resolve-script git-whatchanged \
 	git-deltafy-script git-fetch-script git-status-script git-commit-script \
-	git-log-script git-shortlog git-cvsimport-script
+	git-log-script git-shortlog git-cvsimport-script \
+	git-format-patch-script
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-format-patch-script b/git-format-patch-script
new file mode 100755
--- /dev/null
+++ b/git-format-patch-script
@@ -0,0 +1,93 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+# Typical usage is:
+#
+# $ git-format-patch-script -B -C --find-copies-harder HEAD linus
+#
+# to prepare each commit with its patch since "HEAD" forked from
+# "linus", one file per patch for e-mail submission.  Each output file is
+# numbered sequentially from 1, and uses the first line of the commit
+# message (massaged for pathname safety) as the filename.
+#
+# $ git-format-patch-script -B -C --find-copies-harder HEAD linus .patch/
+#
+# creates output files in .patch/ directory.
+
+diff_opts=
+IFS='
+'
+LF='
+'
+while case "$#" in 0) break;; esac
+do
+    case "$1" in
+    -*)	diff_opts="$diff_opts$LF$1" ;;
+    *) break ;;
+    esac
+    shift
+done
+
+junio="$1"
+linus="$2"
+outdir="${3:-./}"
+
+tmp=.tmp-series$$
+trap 'rm -f $tmp-*' 0 1 2 3 15
+
+series=$tmp-series
+
+titleScript='
+	1,/^$/d
+	: loop
+	/^$/b loop
+	s/[^-a-z.A-Z_0-9]/-/g
+	s/^--*//g
+	s/--*$//g
+	s/---*/-/g
+	s/$/.txt/
+        s/\.\.\.*/\./g
+	q
+'
+
+_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
+_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
+stripCommitHead='/^'"$_x40"' (from '"$_x40"')$/d'
+
+O=
+if test -f .git/patch-order
+then
+    O=-O.git/patch-order
+fi
+git-rev-list "$junio" "^$linus" >$series
+total=`wc -l <$series`
+i=$total
+while read commit
+do
+    title=`git-cat-file commit "$commit" | sed -e "$titleScript"`
+    num=`printf "%d/%d" $i $total`
+    file=`printf '%04d-%s' $i "$title"`
+    i=`expr "$i" - 1`
+    echo "$file"
+    {
+	mailScript='
+	1,/^$/d
+	: loop
+	/^$/b loop
+	s|^|[PATCH '"$num"'] |
+	: body
+	p
+	n
+	b body'
+
+	git-cat-file commit "$commit" | sed -ne "$mailScript"
+	echo '---'
+	echo
+	git-diff-tree -p $diff_opts $O "$commit" | git-apply --stat
+	echo
+	git-diff-tree -p $diff_opts $O "$commit" | sed -e "$stripCommitHead"
+	echo '------------'
+    } >"$outdir$file"
+done <$series
+
------------
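The filename "massaging" can be exercised on its own: the sed program
below is copied from the script's titleScript, and the commit text piped
in is a fabricated stand-in for git-cat-file commit output:

```shell
#!/bin/sh
# Skip the commit headers, keep the title line, replace unsafe
# characters with '-', tidy dash/dot runs, and append .txt.
printf 'tree deadbeef\nauthor A U Thor <au@thor>\n\nAdd script for patch submission via e-mail.\n' |
sed -e '
	1,/^$/d
	: loop
	/^$/b loop
	s/[^-a-z.A-Z_0-9]/-/g
	s/^--*//g
	s/--*$//g
	s/---*/-/g
	s/$/.txt/
	s/\.\.\.*/\./g
	q'
```

For the sample title this prints
Add-script-for-patch-submission-via-e-mail.txt, which becomes the output
filename once the sequence number is prefixed.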



* Re: do people use the 'git' command?
  @ 2005-06-11  5:26 ` Sebastian Kuzminsky
From: Sebastian Kuzminsky @ 2005-06-11  5:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano <junkio@cox.net> wrote:
> >>>>> "SK" == Sebastian Kuzminsky <seb@highlab.com> writes:
> 
> SK> Can we drop the "git" program?
> 
> No chance, especially with a patch that is not accompanied with
> necessary changes to Documentation/tutorial.txt that already
> tells the user to type "git commit" and "git log" ;-).

Of course, you're right.  How about this?  Against Cogito but applies
cleanly to Linus' git:


 b/Documentation/cvs-migration.txt |    4 ++--
 b/Documentation/tutorial.txt      |    6 +++---
 b/Makefile                        |    2 +-
 git                               |    4 ----
 4 files changed, 6 insertions(+), 10 deletions(-)


diff --git a/Documentation/cvs-migration.txt b/Documentation/cvs-migration.txt
--- a/Documentation/cvs-migration.txt
+++ b/Documentation/cvs-migration.txt
@@ -63,7 +63,7 @@
 any more familiar with it, but make sure it is in your path. After that,
 the magic command line is
 
-	git cvsimport <cvsroot> <module>
+	git-cvsimport-script <cvsroot> <module>
 
 which will do exactly what you'd think it does: it will create a git
 archive of the named CVS module. The new archive will be created in a
@@ -90,7 +90,7 @@
 
 So, something has gone wrong, and you don't know whom to blame, and
 you're an ex-CVS user and used to do "cvs annotate" to see who caused
-the breakage. You're looking for the "git annotate", and it's just
+the breakage. You're looking for the "git-annotate", and it's just
 claiming not to find such a script. You're annoyed.
 
 Yes, that's right.  Core git doesn't do "annotate", although it's
diff --git a/Documentation/tutorial.txt b/Documentation/tutorial.txt
--- a/Documentation/tutorial.txt
+++ b/Documentation/tutorial.txt
@@ -362,7 +362,7 @@
 for you, and starts up an editor to let you write your commit message
 yourself, so let's just use that:
 
-	git commit
+	git-commit-script
 
 Write whatever message you want, and all the lines that start with '#'
 will be pruned out, and the rest will be used as the commit message for
@@ -417,7 +417,7 @@
 To see the whole history of our pitiful little git-tutorial project, you
 can do
 
-	git log
+	git-log-script
 
 which shows just the log messages, or if we want to see the log together
 with the associated patches use the more complex (and much more
@@ -465,7 +465,7 @@
    history outside of the project you created.
 
  - if you want to move or duplicate a git archive, you can do so. There
-   is no "git clone" command: if you want to create a copy of your
+   is no "git-clone" command: if you want to create a copy of your
    archive (with all the full history that went along with it), you can
    do so with a regular "cp -a git-tutorial new-git-tutorial".
 
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -42,7 +42,7 @@
 AR?=ar
 INSTALL?=install
 
-SCRIPTS=git git-apply-patch-script git-merge-one-file-script git-prune-script \
+SCRIPTS=git-apply-patch-script git-merge-one-file-script git-prune-script \
 	git-pull-script git-tag-script git-resolve-script git-whatchanged \
 	git-deltafy-script git-fetch-script git-status-script git-commit-script \
 	git-log-script git-shortlog git-cvsimport-script
diff --git a/git b/git
deleted file mode 100755
--- a/git
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/bin/sh
-cmd="git-$1-script"
-shift
-exec $cmd "$@"


-- 
Sebastian Kuzminsky


* Re: reducing line crossings in gitk
  @ 2005-06-11 18:26 ` Junio C Hamano
From: Junio C Hamano @ 2005-06-11 18:26 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git

>>>>> "PM" == Paul Mackerras <paulus@samba.org> writes:

PM> I could add a heuristic to look for this case and reverse the order of
PM> the parents, which would reduce the line crossings and make the graph
PM> look neater.  Would this be worth the slight loss of information (in
PM> that the stuff pulled in would no longer always be to the right)?

Personally I find the current "crossing lines" display makes
what happened more visually obvious than "reverse order of the
parents", so I'd be happier if you keep things as they are.

Thanks for a wonderful tool.  May I ask for more?  Some are
minor UI enhancements, some are feature ideas.

 - The first time I tried it from somewhere random without
   having GIT_DIR in the environment, it just gave an error and
   exited, which is correct but could have been nicer.

   Adding a "Browse Repo" in the "File" menu to let the user
   switch which repository to browse (and when chosen start
   afresh, of course) would be nicer, while keeping the default
   of showing the current ${GIT_DIR:=.git} upon start-up.

 - What support from the core GIT side would you need if you
   wanted to let users browse a remote repo?  A way to inspect
   what is under ".git/refs/" hierarchy on the remote side?
   Anything else?

 - Pasting into the SHA1 field to "Go To" was a nuisance when
   the field already had a string in it.  Clearing the SHA1
   field when it gains focus would be one way to solve it, but
   then you would lose the ability to paste out of that field,
   so I do not know what to suggest offhand.

 - How do I "Find" backwards?  Not being able to find a way to
   do this was the most annoying thing for me.

 - After typing something in "Find" and hitting <ENTER>, it
   would be nicer if the focus stayed in the field and let me
   hit <ENTER> again to go to the next match.  Somehow hitting
   <ENTER> again does not do this for me right now.

 - Indicating "Find" wrapping around without annoying the user
   too much (i.e. I do _not_ want you to add "Find reached the
   beginning of time, wrapping around and continuing from the
   top" pop-up window) would be nicer.  Currently I can tell by
   looking at the scrollbar on the history pane jumping back, so
   this is not a big issue, though.

 - Can I have a way to "Find" next commit that touches a given
   pathname?

    $ git-rev-list | git-diff-tree -s -r --stdin '<that pathname>'

   which would give you a sequence of lines that look like:
       "commit-SHA1 (from parent-commit-SHA1)"

   you would pick the commit-SHA1 from the output and jump to it.

 - In addition to "Find" which looks at the commit message, can I
   have one that uses pickaxe to find changes?

   Add a new choice "In Patch" to the list of choices ("All
   fields", etc); sorry, but currently pickaxe can only do exact
   matches.  When you are operating in that mode, run

    $ git-rev-list | git-diff-tree -s -r --stdin -S'<that string>'

   which would give you a sequence of lines that look like:
       "commit-SHA1 (from parent-commit-SHA1)"

   you would pick the commit-SHA1 from the output and jump to it.

 - Can I have an option to use diffcore options to tweak the
   diff that is shown in the lower-left pane?

   Add "Diff" menu next to "File" menu, and have the following
   options: "Find Renames", "Find Copies", "Find Rewrites".

   The first two are mutually exclusive so you can have (1) both
   off, (2) Renames, or (3) Copies.  "Rewrites" is independent,
   so you end up with 6 combinations.  Pass the "-M", "-C", and
   "-B" options, respectively, to the git-diff-tree you run on
   the commit when these "Find foo" options are in effect.

   A good test case in GIT repository itself to try these are:

    418aaf847a8b3ffffb4f777a2dd5262ca5ce0ef7 (for -M)
	This renames rpull.c to ssh-pull.c etc.  Four renames in
	total.

    7ef76925d9c19ef74874e1735e2436e56d0c4897 (for -C)
	This creates git-fetch-script out of git-pull-script
        by copying.

    6af1f0192ff8740fe77db7cf02c739ccfbdf119c (for -B)
	This rewrites ls-tree.c

This list is based on gitk-1.1 (I downloaded this morning) so you
may already have unpublished solutions.



* [PATCH] Adding Correct Usage Notification and -h Help flag
@ 2005-06-14  1:47 James Purser
From: James Purser @ 2005-06-14  1:47 UTC (permalink / raw)
  To: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 355 bytes --]

okay this is my first patch so take it easy on me.

I have added both a correct-usage notification, shown when git is called
on its own without any other parameters, and a -h help flag which lists
the available scripts for the git command.

Signed-off-by: James Purser <purserj@k-sit.com>


-- 
James Purser
Winnet Developer
+61 2 4223 4131
http://www.winnet.com.au

[-- Attachment #2: git_patch --]
[-- Type: text/plain, Size: 1021 bytes --]

Added a couple of lines to the git wrapper. Includes correct usage and available scripts

---
commit bfe72d41d70f9e4c0a6ab0ec6cf49347f980f4de
tree a8eed80ea2ac00ffee0b4f12ed4c931ca7761000
parent 940c1bb0181cb20454bf3573134175f86983a0ce
author James Purser <purserj@k-sit.com> Tue, 14 Jun 2005 11:40:20 +1000
committer James Purser <purserj@k-sit.com> Tue, 14 Jun 2005 11:40:20 +1000

 git |   31 ++++++++++++++++++++++++++++---
 1 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/git b/git
--- a/git
+++ b/git
@@ -1,4 +1,29 @@
 #!/bin/sh
-cmd="git-$1-script"
-shift
-exec $cmd "$@"
+if [ "$1" = "" ]
+then
+	echo "Correct Usage: git [-h] [SCRIPT]";
+else 
+	if [ "$1" = "-h" ]
+	then
+		echo "This is a basic script wrapper for certain git functions. The available commands are:
+
+git apply-patch
+git commit
+git cvsimport
+git deltafy
+git diff
+git fetch
+git log
+git merge-one-file
+git prune
+git pull
+git status
+git tag
+";	
+	else
+		cmd="git-$1-script";
+		shift;
+		exec $cmd "$@";
+	fi
+fi
+


* Re: [PATCH] Adding Correct Usage Notification and -h Help flag
  @ 2005-06-14 21:49 ` James Purser
From: James Purser @ 2005-06-14 21:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 2751 bytes --]

Here's the revised patch along the lines you recommended yesterday. I am
still working on some of the command descriptions.

On the git-update-cache question, I think having a flag [-u] or
something like that in git commit would allow the user to decide to run
git-update-cache on only the files specified in the git commit command.

Signed-off-by: James Purser <purserj@k-sit.com>
On Tue, 2005-06-14 at 12:23, Junio C Hamano wrote:
> >>Added a couple of lines to the git wrapper. Includes Correct
> >>Useage and available scripts
> 
> I like the general direction of making "git" wrapper novice
> friendly, but have some suggestions to the implementation.
> 
>  (0) Do not mention that only a certain subset is accessible.
>      Just saying "Available commands are:" would be enough, and
>      would not leave the end user wondering what he is missing.
> 
>  (1) Instead of explicitly checking for -h, you may want to
>      structure it like this:
> 
>          #!/bin/sh
> 
>          cmd="git-$1-script";
>          shift;
>          exec $cmd "$@";
>          echo "Usage: git <subcommand> [<param>...]"
>          echo 'Available commands are:
>          git bar
> 	 git foo
>          ...
>          '
>          exit 1
> 
>      Alternatively, you could say:
> 
>          #!/bin/sh
> 
>          case "$1" in
>          -h)
>              echo "Usage: git <subcommand> [<param>...]"
>              echo 'Available commands are:
>          git bar
> 	 git foo
>          ...
>          '
>              exit 0 ;;
> 	 esac
> 
>          cmd="git-$1-script";
>          shift;
>          exec $cmd "$@";
> 	 git -h
>          exit 1
> 
>  (2) Maintaining the list of commands by hand in git script
>      itself has an advantage that you _could_ describe the
>      options and parameters they take, but you are not doing
>      that in your patch (hint, hint).
> 
>      If all you give is the list of subcommand names, have
>      git.in as a source file, and create the "list of available
>      commands" from the Makefile, like this:
> 
>      === Makefile ===
>      ...
>      git : git.in
>          ls -1 git-*-script | \
>          sed -e 's/git-\(.*\)-script/git \1/' >.git-cmd-list
>          sed -e '/@@LIST_OF_COMMANDS@@/{s/.*//;r .git-cmd-list;}' <$@.in >$@
>          rm -f .git-cmd-list
> 
>      === git.in ===
>      #!/bin/sh
> 
>      cmd="git-$1-script";
>      shift;
>      exec $cmd "$@";
>      echo "Usage: git <subcommand> [<param>...]"
>      echo 'Available commands are:
>      @@LIST_OF_COMMANDS@@
>      '
> 
> -
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: git_patch --]
[-- Type: text/plain, Size: 1835 bytes --]

Make the git wrapper a little more newbie friendly: when called on its own or with the -h flag, it will return a list of the available commands

---
commit 2ad499f75f0982953e43904085bd1b48dd821c4d
tree f137c4f3f5ee027b8b4d25a0434929e12eb7c51c
parent de4971b50076b5ef901c2ae0bbee9dd2c14f06ea
parent de4971b50076b5ef901c2ae0bbee9dd2c14f06ea
author James Purser <purserj@k-sit.com> Wed, 15 Jun 2005 07:43:20
committer James Purser <purserj@k-sit.com> Wed, 15 Jun 2005 07:43:20

 git |   58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/git b/git
--- a/git
+++ b/git
@@ -1,4 +1,56 @@
 #!/bin/sh
-cmd="git-$1-script"
-shift
-exec $cmd "$@"
+
+case "$1" in
+-h|"")
+	echo "Usage: git <subcommand> [<param 1>, <param 2> ...]"
+	echo "Available commands are:
+
+git apply-patch <patch file>
+
+	Applies patch file to local files
+	
+git commit <file 1> <file 2> ...
+
+	Commits local changes to the cache. Before running this script you will need to run git-update-cache on the files you wish to commit.
+
+	Will accept wildcards, e.g. git commit git-*-script
+
+git cvsimport [-v] [-z fuzz] <cvsroot> <module>
+	
+	CVS to Git import tool. Relies on the cvsps package, version 2.1 or later.
+	
+	-v Verbose output
+	-z 
+	<cvsroot> Path to CVS Repository
+	<module> Module you want to import into git
+
+git deltafy:
+
+git diff [<File 1>, <File 2>, ...]
+
+	Diffs between the git cache and specified files. If no files are specified then creates a Diff between contents of the cache and all current files.
+
+git fetch:
+
+git log:
+
+git merge-one-file:
+
+git prune:
+
+git pull:
+
+git status:
+
+git tag:
+"
+	exit 0 ;;
+esac
+
+cmd="git-$1-script";
+shift;
+exec $cmd "$@";
+git -h
+exit 1
+
+
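
Junio's Makefile pipeline for deriving the command list can be tried on its own; a small sketch with invented script names in a scratch directory:

```shell
# Turn git-*-script filenames into "git <subcommand>" lines, as the
# suggested Makefile rule does. The two script names are made up.
dir=$(mktemp -d)
touch "$dir/git-commit-script" "$dir/git-pull-script"
cmdlist=$(cd "$dir" && ls -1 git-*-script | sed -e 's/git-\(.*\)-script/git \1/')
echo "$cmdlist"
rm -rf "$dir"
```

This keeps the list of commands out of the script source, so git.in never goes stale when a new git-*-script is added.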

^ permalink raw reply	[relevance 4%]

* qgit-0.6
@ 2005-06-18 10:38  3% Marco Costalba
  2005-06-19 13:00  0% ` qgit-0.6 Ingo Molnar
  0 siblings, 1 reply; 200+ results
From: Marco Costalba @ 2005-06-18 10:38 UTC (permalink / raw)
  To: git; +Cc: berkus

Here is qgit-0.6, a git GUI viewer

New in this version:

- added annotate

- added color highlighting to selected diff target

- added color to file list: green new file, red removed one

- fixed locale visualizations

- fixed correct git-rev-list range handling

- fixed center on deleted files in diff viewer

- fixed disappearing files when reloading (nasty one)

- clean up of diff target logic

- added README

You can download from 
http://prdownloads.sourceforge.net/qgit/qgit-0.6.tar.bz2?download

To try qgit:

1) Unpack downloaded file
2) make
3) cd bin
4) copy qgit bin file anywhere in your path

Some (updated) screenshots at:
http://sourceforge.net/project/screenshots.php?group_id=139897

A word on annotate: in the file viewer, after a while :-), the file contents will change to show the
annotations. Annotations are calculated in the background, so it may take some time for them to show
(it depends mostly on fetching history patches with git-diff-tree -p). History is snapshotted to the
currently loaded data, so perhaps you need qgit to have loaded an interesting amount of data before
calling the file viewer.


I think all known (to me) problems should be fixed now. Apart from the new annotate function, which is
a bit experimental, qgit should be quite usable.
So if you find bugs/issues/inconsistencies etc., please drop me a line.


Marco




		

^ permalink raw reply	[relevance 3%]

* Re: qgit-0.6
  2005-06-18 10:38  3% qgit-0.6 Marco Costalba
@ 2005-06-19 13:00  0% ` Ingo Molnar
  0 siblings, 0 replies; 200+ results
From: Ingo Molnar @ 2005-06-19 13:00 UTC (permalink / raw)
  To: Marco Costalba; +Cc: git, berkus


* Marco Costalba <mcostalba@yahoo.it> wrote:

> A word on annotate: In file viewer, after a while :-), the file 
> contents will change to show the annotations. Annotations are 
> calculated in the background so it may take some time to show (it depends 
> mostly on fetching history patches with git-diff-tree -p ). History is 
> snapshotted to the actual loaded data so perhaps you need qgit to have 
> loaded an interesting amount of data before calling file viewer.

works fine here and is nice and fast, but there are a few minor visual 
glitches:

- annotated file contents are not properly aligned over each other. E.g.  
  check commit 7875b50d1a9928e683299b283bfe94778b6c344e in the current 
  git repository, and select read-tree.c and view it annotated - the 
  lines start right after the author field ends, not in any aligned way.

- the tree visualization is hard to follow - gitk's output is much 
  nicer. As an example of nice rendering check out the octopus merge 
  around commit 211232bae64bcc60bbf5d1b5e5b2344c22ed767e. One glance at 
  the gitk output shows what happened - qgit's output is in essence 
  unreadable.

and a few requests for enhancements, if you don't mind:

 - in annotated mode, it would be nice to select a particular line 
   and then double-click would jump to the commit that added that line.  
   This would nicely round up annotation support.

 - plaintext search capability in every window. E.g. in the annotated
   file window i often would like to search for some code, or to jump to
   a given line.

	Ingo

^ permalink raw reply	[relevance 0%]

* Re: git merging
  @ 2005-06-20 21:15  4%                     ` Linus Torvalds
  2005-06-21 14:59  0%                       ` Jens Axboe
  0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-06-20 21:15 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Daniel Barkalow, Jeff Garzik, Git Mailing List



On Mon, 20 Jun 2005, Jens Axboe wrote:
> 
> I pulled with rsync manually from kernel.org, and that did fix things up
> for me. The main tree is rsync'ed, but the development tree gets the
> changes with /opt/kernel/git/linux-2.6/.git/ as the url given to
> git-pull-script.

Ok, that explains it. Since you're using a regular local filename, the
pull will be using "git-local-pull", which will only fetch the objects
directly needed. And doesn't understand the tag-to-tree thing, so doesn't 
fetch the tree (or possibly you just copied the tags by hand totally 
outside of the regular pull?)

		Linus

^ permalink raw reply	[relevance 4%]

* Re: git merging
  2005-06-20 21:15  4%                     ` Linus Torvalds
@ 2005-06-21 14:59  0%                       ` Jens Axboe
  0 siblings, 0 replies; 200+ results
From: Jens Axboe @ 2005-06-21 14:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Daniel Barkalow, Jeff Garzik, Git Mailing List

On Mon, Jun 20 2005, Linus Torvalds wrote:
> 
> 
> On Mon, 20 Jun 2005, Jens Axboe wrote:
> > 
> > I pulled with rsync manually from kernel.org, and that did fix things up
> > for me. The main tree is rsync'ed, but the development tree gets the
> > changes with /opt/kernel/git/linux-2.6/.git/ as the url given to
> > git-pull-script.
> 
> Ok, that explains it. Since you're using a regular local filename, the
> pull will be using "git-local-pull", which will only fetch the objects
> directly needed. And doesn't understand the tag-to-tree thing, so doesn't 
> fetch the tree (or possibly you just copied the tags by hand totally 
> outside of the regular pull?)

Isn't that a little confusing for the end user, from a usability point of
view, that it behaves differently depending on which pull script it ends
up using?

I guess I can just always use rsync even for local trees. And use it
directly, so I always have everything :)

-- 
Jens Axboe


^ permalink raw reply	[relevance 0%]

* Re: ORIG_HEAD
  @ 2005-06-21 21:06  4% ` Linus Torvalds
  0 siblings, 0 replies; 200+ results
From: Linus Torvalds @ 2005-06-21 21:06 UTC (permalink / raw)
  To: David S. Miller; +Cc: git



On Mon, 20 Jun 2005, David S. Miller wrote:
> 
> Is there a really good reason why git-pull-script runs are deleting
> that file now?

No. I've cleaned it up a bit, and codified the stuff we leave around.

I also changed "git fetch" to write FETCH_HEAD instead of MERGE_HEAD, 
because that's obviously what it is (it's perfectly fine to fetch things 
for other reasons, like just checking what somebody else has, and you 
might not ever intend to merge it anyway).

		Linus

^ permalink raw reply	[relevance 4%]

* Re: Patch (apply) vs. Pull
  @ 2005-06-21 22:09  3%   ` Linus Torvalds
    0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-06-21 22:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Darrin Thompson, git



On Mon, 20 Jun 2005, Junio C Hamano wrote:
>
> FYI, here is what I have been doing:
> 
>  (1) Start from Linus HEAD.
> 
>  (2) Repeat develop-and-commit cycle.
> 
>  (3) Run "git format-patch" (not in Linus tree) to generate
>      patches.
> 
>  (4) Send them out and wait to see which one sticks.
> 
>  (5) Pull from Linus.
> 
>  (6) Throw away my HEAD, making Linus HEAD my HEAD, while
>      preserving changes I have made since I forked from him.  I
>      use "jit-rewind" for this.
> 
>  (7) Examine patches that Linus rejected, and apply ones that I
>      still consider good, making one commit per patch.  I use
>      "jit-patch" and "jit-commit -m" for this.
> 
>  (8) Go back to step 2.

Btw, I'd like to help automate the 6-7 stage with a different kind of 
merge logic.

The current "real merge" is the global history merge, and that's the kind
that I personally want to use, since that's what makes sense from a
"project lead" standpoint and for the people around me in the kernel space
that are project leaders of their own.

However, as you point out, it's not necessarily the best kind of merge for
the "individual developer" standpoint. Most individual developers don't
necessarily want to merge their work, rather they want to "bring it
forward" to the current tip. And I think git could help with that too.

It would be somewhat akin to the current git-merge-script, but instead of 
merging it based on the common parent, it would instead try to re-base all 
the local commits from the common parent onwards on top of the new remote 
head. That often makes more sense from the standpoint of a individual 
developer who wants to update his work to the remote head.

Something like this (this assumes FETCH_HEAD is the remote head that we 
just fetched with "git fetch xxx" and that we want to re-base to):

 - get the different HEAD info set up, and save the original head in 
   ORIG_HEAD, the way "git resolve" does for real merges:

	: ${GIT_DIR=.git}

	orig=$(git-rev-parse HEAD)
	new=$(git-rev-parse FETCH_HEAD)
	common=$(git-merge-base $orig $new)

	echo $orig > $GIT_DIR/ORIG_HEAD

 - fast-forward to the new HEAD. We'll want to re-base everything off 
   that. If that fails, exit out - we've got dirty state

	git-read-tree -m -u $orig $new || exit 1

 - for each commit that we had in our old tree but not in the common part, 
   try to re-base it:

	> FAILED_TO_CHERRYPICK
	for i in $(git-rev-list $orig ^$common)
	do
		git-cherry-pick $i ||
			(echo $i >> FAILED_TO_CHERRYPICK)
	done
	if [ -s FAILED_TO_CHERRYPICK ]; then
		echo Some commits could not be cherry-picked, check by hand:
		cat FAILED_TO_CHERRYPICK
	fi

and here the "git-cherry-pick" thing is just a script that basically takes
an old commit ID, and tries to re-apply it as a patch (with author data
and commit messages, of course) on top of the current head. It would 
basically be nothing more than a "git-diff-tree $1" followed by trying to 
figure out whether it had already been applied or whether it can be 
applied now.

What do you think?

		Linus
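
The failure-collection loop sketched above can be followed with stubs in place of the real commands; commit names and the one failing pick below are invented:

```shell
# Stub re-base loop: git_rev_list/git_cherry_pick stand in for the real
# git-rev-list and the proposed git-cherry-pick; c2 is pretended to fail.
git_rev_list() { echo c3; echo c2; echo c1; }
git_cherry_pick() { [ "$1" != "c2" ]; }
failed=""
for i in $(git_rev_list); do
	git_cherry_pick "$i" || failed="$failed$i "
done
echo "failed to cherry-pick: $failed"
```

The point is that the loop keeps going after a failed pick and leaves the user a list to resolve by hand, rather than aborting mid-rebase.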

^ permalink raw reply	[relevance 3%]

* [PATCH 2/2] Pull misc objects
  @ 2005-06-22  0:35  3% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-06-22  0:35 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds

Make pull fetch whatever is specified, parse it to figure out what it is, and
then process it appropriately. This also supports getting tag objects, and
getting whatever they tag.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Index: pull.c
===================================================================
--- d0df139324abdbf701ffcae26e43bcb0350c270e/pull.c  (mode:100644 sha1:e70fc02f3bf5b6c626a138d6d76d819fab76f0c8)
+++ b6a510708036fe29a19c33472f5c0b746e2d26d7/pull.c  (mode:100644 sha1:91d9db6c7b1be84e7a5fe21c5194fbf22dadc8cb)
@@ -3,6 +3,8 @@
 #include "cache.h"
 #include "commit.h"
 #include "tree.h"
+#include "tag.h"
+#include "blob.h"
 #include "refs.h"
 
 const char *write_ref = NULL;
@@ -57,6 +59,8 @@
 	return status;
 }
 
+static int process_unknown(unsigned char *sha1);
+
 static int process_tree(unsigned char *sha1)
 {
 	struct tree *tree = lookup_tree(sha1);
@@ -115,6 +119,35 @@
 	return 0;
 }
 
+static int process_tag(unsigned char *sha1)
+{
+	struct tag *obj = lookup_tag(sha1);
+
+	if (parse_tag(obj))
+		return -1;
+	return process_unknown(obj->tagged->sha1);
+}
+
+static int process_unknown(unsigned char *sha1)
+{
+	struct object *obj;
+	if (make_sure_we_have_it("object", sha1))
+		return -1;
+	obj = parse_object(sha1);
+	if (!obj)
+		return error("Unable to parse object %s", sha1_to_hex(sha1));
+	if (obj->type == commit_type)
+		return process_commit(sha1);
+	if (obj->type == tree_type)
+		return process_tree(sha1);
+	if (obj->type == blob_type)
+		return 0;
+	if (obj->type == tag_type)
+		return process_tag(sha1);
+	return error("Unable to determine requirement of type %s for %s",
+		     obj->type, sha1_to_hex(sha1));
+}
+
 static int interpret_target(char *target, unsigned char *sha1)
 {
 	if (!get_sha1_hex(target, sha1))
@@ -142,7 +175,7 @@
 	if (interpret_target(target, sha1))
 		return error("Could not interpret %s as something to pull",
 			     target);
-	if (process_commit(sha1))
+	if (process_unknown(sha1))
 		return -1;
 	
 	if (write_ref) {


^ permalink raw reply	[relevance 3%]

* [PATCH] local-pull: implement fetch_ref()
  @ 2005-06-22  8:52 10% ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-22  8:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Daniel Barkalow, git

This makes "-w ref" usable for git-local-pull.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 Documentation/git-local-pull.txt |    7 +++++--
 local-pull.c                     |   31 ++++++++++++++++++++++++++++---
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt
--- a/Documentation/git-local-pull.txt
+++ b/Documentation/git-local-pull.txt
@@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT 
 
 SYNOPSIS
 --------
-'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] [--recover] commit-id path
+'git-local-pull' [-c] [-t] [-a] [-d] [-v] [-w filename] [--recover] [-l] [-s] [-n] commit-id path
 
 DESCRIPTION
 -----------
@@ -32,10 +32,13 @@ OPTIONS
 	usual, to recover after earlier pull that was interrupted.
 -v::
 	Report what is downloaded.
+-w::
+        Writes the commit-id into the filename under $GIT_DIR/refs/ on
+        the local end after the transfer is complete.
 
 Author
 ------
-Written by Linus Torvalds <torvalds@osdl.org>
+Written by Junio C Hamano <junkio@cox.net>
 
 Documentation
 --------------
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -9,7 +9,7 @@ static int use_link = 0;
 static int use_symlink = 0;
 static int use_filecopy = 1;
 
-static char *path;
+static char *path; /* "Remote" git repository */
 
 int fetch(unsigned char *sha1)
 {
@@ -75,11 +75,34 @@ int fetch(unsigned char *sha1)
 
 int fetch_ref(char *ref, unsigned char *sha1)
 {
-	return -1;
+	static int ref_name_start = -1;
+	static char filename[PATH_MAX];
+	static char hex[41];
+	int ifd;
+
+	if (ref_name_start < 0) {
+		sprintf(filename, "%s/refs/", path);
+		ref_name_start = strlen(filename);
+	}
+	strcpy(filename + ref_name_start, ref);
+	ifd = open(filename, O_RDONLY);
+	if (ifd < 0) {
+		close(ifd);
+		fprintf(stderr, "cannot open %s\n", filename);
+		return -1;
+	}
+	if (read(ifd, hex, 40) != 40 || get_sha1_hex(hex, sha1)) {
+		close(ifd);
+		fprintf(stderr, "cannot read from %s\n", filename);
+		return -1;
+	}
+	close(ifd);
+	pull_say("ref %s\n", sha1_to_hex(sha1));
+	return 0;
 }
 
 static const char *local_pull_usage = 
-"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] [--recover] commit-id path";
+"git-local-pull [-c] [-t] [-a] [-d] [-v] [-w filename] [--recover] [-l] [-s] [-n] commit-id path";
 
 /* 
  * By default we only use file copy.
@@ -114,6 +137,8 @@ int main(int argc, char **argv)
 			use_filecopy = 0;
 		else if (argv[arg][1] == 'v')
 			get_verbosely = 1;
+		else if (argv[arg][1] == 'w')
+			write_ref = argv[++arg];
 		else
 			usage(local_pull_usage);
 		arg++;
------------------------------------------------


^ permalink raw reply	[relevance 10%]

* The coolest merge EVER!
@ 2005-06-22 21:46  2% Linus Torvalds
    0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-06-22 21:46 UTC (permalink / raw)
  To: Git Mailing List


Ok, Junio had some cool octopus merges, but I just one-upped him.

I just merged the "gitk" repository into git, and I did it as a real git
merge, which means that I actually retained all the original gitk
repository information intact. IOW, it's not a "import the data" thing,
it's literally a merge of the two trees, and the result has two roots.

Now, the advantage of this kind of merge is that Paul's original gitk
repository is totally unaffected by it, yet because I now have his history
(and the exact same objects), the normal kind of git merge should work
fine for me to continue to import Paul's work - we have the common parent
needed to resolve all differences.

Now, I don't know how often this ends up being actually used in practice, 
but at least in theory this is a totally generic thing, where you create a 
"union" of two git trees. I did the union merge manually, but in theory it 
should be easy to automate, with simply something like

	git fetch <project-to-union-merge>
	GIT_INDEX_FILE=.git/tmp-index git-read-tree FETCH_HEAD
	GIT_INDEX_FILE=.git/tmp-index git-checkout-cache -a -u
	git-update-cache --add -- $(GIT_INDEX_FILE=.git/tmp-index git-ls-files)
	cp .git/FETCH_HEAD .git/MERGE_HEAD
	git commit

(this is not exactly how I did it, but that's just because I'd never done
this before so I didn't think it through and I had some stupid extra steps
in between that were unnecessary).

Of course, in order for the union merge to work, the namespaces have to be
fit on top of each other with no clashes, otherwise future merges will be 
quite painful. In the case of gitk, Paul's repository only tracked that 
single file, so that wasn't a problem.
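
The "no clashes" precondition is easy to check mechanically before attempting such a union merge; a sketch with invented file names:

```shell
# List paths present in both trees; an empty result means a union merge
# of the two namespaces is safe. Directory contents here are made up.
a=$(mktemp -d); b=$(mktemp -d)
touch "$a/Makefile" "$a/git.c" "$b/gitk"
clashes=$( { ls -1 "$a"; ls -1 "$b"; } | sort | uniq -d )
[ -z "$clashes" ] && echo "no clashes: union merge is safe"
rm -rf "$a" "$b"
```

(A real check would compare full recursive path lists, not just top-level names; this only illustrates the idea.)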

Anyway, you shouldn't notice anything new, except for the fact that "gitk" 
now gets automatically included with the base git distribution. And the 
git repository has two roots, but hey, git itself doesn't care.

			Linus

^ permalink raw reply	[relevance 2%]

* Re: Updated git HOWTO for kernel hackers
  @ 2005-06-23  2:39 10%               ` Linus Torvalds
  2005-06-23  3:06  0%                 ` Jeff Garzik
  0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-06-23  2:39 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Greg KH, Linux Kernel, Git Mailing List



On Wed, 22 Jun 2005, Jeff Garzik wrote:
> 
> The problem is still that nothing says "oh, btw, I created 'xyz' tag for 
> you" AFAICS?
> 
> IMO the user (GregKH and me, at least) just wants to know their set of 
> tags and heads is up-to-date on local disk.  Wants to know what tags are 
> out there.  It's quite annoying when two data sets are out of sync 
> (.git/objects and .git/refs/tags).

Well, I really think this is the exact same issue as when you write any 
announcement, and say "please pull from branch xyz of repo abc".

What I'm saying is that for a tagged release, that really translates to
"please pull tag xyz from repo abc" and the tools like git-ssh-pull will 
just do the right thing: they'll pull the tag itself _and_ they'll pull 
the objects it points to.

Of course, right now "git fetch" is hardcoded to always write FETCH_HEAD 
(not the tag name), but I'm saying that _literally_ you can do this 
already:

	git fetch repo-name tags/xyz &&
		( cat .git/FETCH_HEAD > .git/refs/tags/xyz )

and it should do exactly what you want. Hmm?

So if we script this (maybe teach "git-fetch-script" to take "tag" as its 
first argument and do this on its own), and people learn to just do

	git fetch tag v2.6.18.5

when Chris or Greg make an announcement about "v2.6.18.5", then you're all
done, no?

The change to "git-fetch-script" would look something like the appended.. 
Totally untested, of course. Give it a try,

			Linus

---
diff --git a/git-fetch-script b/git-fetch-script
--- a/git-fetch-script
+++ b/git-fetch-script
@@ -1,5 +1,12 @@
 #!/bin/sh
 #
+destination=FETCH_HEAD
+
+if [ "$1" = "tag" ]; then
+	shift
+	destination="refs/tags/$2"
+fi
+
 merge_repo=$1
 merge_name=${2:-HEAD}
 
@@ -35,7 +42,7 @@ download_objects () {
 }
 
 echo "Getting remote $merge_name"
-download_one "$merge_repo/$merge_name" "$GIT_DIR"/FETCH_HEAD || exit 1
+download_one "$merge_repo/$merge_name" "$GIT_DIR/$destination" || exit 1
 
 echo "Getting object database"
-download_objects "$merge_repo" "$(cat "$GIT_DIR"/FETCH_HEAD)" || exit 1
+download_objects "$merge_repo" "$(cat "$GIT_DIR/$destination")" || exit 1

^ permalink raw reply	[relevance 10%]

* Re: Updated git HOWTO for kernel hackers
  2005-06-23  2:39 10%               ` Linus Torvalds
@ 2005-06-23  3:06  0%                 ` Jeff Garzik
  2005-06-23  3:24  4%                   ` Linus Torvalds
  0 siblings, 1 reply; 200+ results
From: Jeff Garzik @ 2005-06-23  3:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Greg KH, Linux Kernel, Git Mailing List

Linus Torvalds wrote:
> What I'm saying is that for a tagged release, that really translates to
> "please pull tag xyz from repo abc" and the tools like git-ssh-pull will 
> just do the right thing: they'll pull the tag itself _and_ they'll pull 
> the objects it points to.

Yes, everything does the right there here.


> Of course, right now "git fetch" is hardcoded to always write FETCH_HEAD 
> (not the tag name), but I'm saying that _literally_ you can do this 
> already:
> 
> 	git fetch repo-name tags/xyz &&
> 	( cat .git/FETCH_HEAD > .git/refs/tags/xyz )
> 
> and it should do exactly what you want. Hmm?

No, not at all.  This sub-thread is all about tags/ dir updates.  Users 
should be able to do

	git pull-more rsync://...

and get ALL of .git/refs/tags/* that have appeared since their last update.

Concrete example:  I have a git tree on local disk.  I need to find out 
where, between 2.6.12-rc1 and 2.6.12, a driver broke.  This requires 
that I have -ALL- linux-2.6.git/refs/tags on disk already, so that I can 
bounce quickly and easily between tags.

It is valuable to have a local copy of -all- tags, -before- you need 
them.  That is why people like me and GregKH use rsync directly.  We 
want EVERYTHING in the kernel.org linux-2.6.git tree, not just what we 
know we need right now.

	Jeff



^ permalink raw reply	[relevance 0%]

* Re: Updated git HOWTO for kernel hackers
  2005-06-23  3:06  0%                 ` Jeff Garzik
@ 2005-06-23  3:24  4%                   ` Linus Torvalds
  2005-06-23  5:16  3%                     ` Jeff Garzik
  0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-06-23  3:24 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Greg KH, Linux Kernel, Git Mailing List



On Wed, 22 Jun 2005, Jeff Garzik wrote:
> 
> Concrete example:  I have a git tree on local disk.  I need to find out 
> where, between 2.6.12-rc1 and 2.6.12, a driver broke.  This requires 
> that I have -ALL- linux-2.6.git/refs/tags on disk already, so that I can 
> bounce quickly and easily between tags.

Absolutely not.

I might have my private tags in my kernel, and you might have your private 
tags ("tested") in your kernel, so there is no such thing as "ALL".

The fact that BK had it was a BK deficiency, and just meant that you 
basically couldn't use tags at all with BK, the "official ones" excepted. 
It basically meant that nobody else than me could ever tag a tree. Do you 
not see how that violates the very notion of "distributed"?

This is _exactly_ the same thing as if you said "I want to merge with ALL
BRANCHES".  That notion doesn't exist. You can rsync the whole repository,
and you'll get all branches from that repository, that's really by virtue
of doing a filesystem operation, not because you asked git to get you all
branches.

A tag is even _implemented_ exactly like a branch, except it allows (but
does not require) that extra step of signing an object. The only
difference is literally whether it is in refs/branches or refs/tags.

> It is valuable to have a local copy of -all- tags, -before- you need 
> them.

You seem to not realize that "all tags" is a nonsensical statement in a 
distributed system.

If you want to have a list of official tags, why not just do exactly that? 
What's so hard with saying "ok, that place has a list of 'official' tags, 
let's fetch them".

How would you fetch them? You might use rsync, for example. Or maybe wget. 
Or whatever. The point is that this works already. You're asking for 
something nonsensical, outside of just a script that does

	rsync -r --ignore-existing repo/refs/tags/ .git/refs/tags/

See? What's your complaint with just doing that?

			Linus

^ permalink raw reply	[relevance 4%]

* Re: Updated git HOWTO for kernel hackers
  2005-06-23  3:24  4%                   ` Linus Torvalds
@ 2005-06-23  5:16  3%                     ` Jeff Garzik
  2005-06-23  5:58  5%                       ` Linus Torvalds
  2005-06-23 14:31  0%                       ` Horst von Brand
  0 siblings, 2 replies; 200+ results
From: Jeff Garzik @ 2005-06-23  5:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Greg KH, Linux Kernel, Git Mailing List


Linus Torvalds wrote:
> 	rsync -r --ignore-existing repo/refs/tags/ .git/refs/tags/
> 
> See? What's your complaint with just doing that?

No complaint with that operation.  The complaint is that it's an 
additional operation.  Re-read what Greg said:

> Is there some reason why git doesn't pull the
> tags in properly when doing a merge?  Chris and I just hit this when I
> pulled his 2.6.12.1 tree and and was wondering where the tag went.

Multiple users -- not just me -- would prefer that git-pull-script 
pulled the tags, too.

Suggested solution:  add '--tags' to git-pull-script 
(git-fetch-script?), which calls
	rsync -r --ignore-existing repo/refs/tags/ .git/refs/tags/
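
The effect of --ignore-existing on the tags directory can be simulated locally; tag names and contents below are invented:

```shell
# Copy only tags that do not already exist locally, i.e. what
# "rsync --ignore-existing" would do to refs/tags.
src=$(mktemp -d); dst=$(mktemp -d)
echo remote-v12 > "$src/v2.6.12"       # new on the remote side
echo remote-v11 > "$src/v2.6.11"
echo local-v11  > "$dst/v2.6.11"       # already present locally
for f in "$src"/*; do
	tag=$(basename "$f")
	[ -e "$dst/$tag" ] || cp "$f" "$dst/$tag"
done
v11=$(cat "$dst/v2.6.11"); v12=$(cat "$dst/v2.6.12")
echo "kept $v11, fetched $v12"
rm -rf "$src" "$dst"
```

This also shows why the operation is safe to run blindly: a private local tag is never overwritten by the remote copy.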


> You seem to not realize that "all tags" is a nonsensical statement in a 
> distributed system.
> 
> If you want to have a list of official tags, why not just do exactly that? 
> What's so hard with saying "ok, that place has a list of 'official' tags, 
> let's fetch them".

I know how tags work, and I like the new flexibility above and beyond BK.

Kernel hackers are surprised when the tags aren't pulled, along with the 
objects.  BK and CVS trained us that tags came with the repo, no 
additional steps needed.  Why not give us the OPTION of working like 
we've always worked?

Let the kernel hacker say "yes, I really do want to download the tags 
Linus publicly posted in linux-2.6.git/refs/tags" because this was a 
common operation in the previous workflow, a common operation that we 
-made use of-.

	Jeff



^ permalink raw reply	[relevance 3%]

* Re: Patch (apply) vs. Pull
  @ 2005-06-23  5:15  2% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-06-23  5:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

On Wed, 22 Jun 2005, Linus Torvalds wrote:

> On Wed, 22 Jun 2005, Daniel Barkalow wrote:
> > I think it would we worthwhile for the common case for the system to
> > recognize that a patch went in unmodified, but potentially after a
> > different patch which caused it to have fuzz, and have a header that would
> > get the developer's update to recognize what happened.
> 
> Note that I don't even apply patches with fuzz at all. "git-apply" refuses 
> to recognize anything with fuzz, although it _will_ move the patch around 
> to make it match.

I bet I'm misunderstanding fuzz; what I actually mean is that, if a patch
applies after moving it, then regenerating it from the result would give
the a patch with different line numbers; if these affect the hash, the
author's tools will be sad.

> So as far as I'm concerned, you really could just take the SHA1 of the
> patch (leave out the '@@' lines with line numbers), and you'd have a
> reliable ID for it.
>
> In fact, you could probably replace every run of contiguous whitespace
> with a single space, and then you'd not have to worry about whitespace
> differences either. That would be very simple to do, and quite workable: I
> certainly think it sounds more reliable than just hoping that people
> always pass on a "patch ID" in their emails..
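
The patch-ID idea quoted above can be sketched in a few lines of shell (assuming GNU sha1sum is available); the two toy diffs below differ only in hunk offsets and spacing:

```shell
# Hash a patch with @@ hunk headers dropped and whitespace runs collapsed,
# so offset-only and spacing-only differences yield the same ID.
patch_id() { grep -v '^@@' | tr -s ' \t\n' ' ' | sha1sum | cut -d' ' -f1; }
a=$(printf '@@ -1,2 +1,2 @@\n-old   line\n+new line\n' | patch_id)
b=$(printf '@@ -9,2 +9,2 @@\n-old line\n+new line\n' | patch_id)
[ "$a" = "$b" ] && echo "same patch ID"
```

Two transmissions of "the same" patch, moved around by fuzz or reflowed whitespace, hash identically, which is exactly the stability the scheme needs.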

That's actually quite plausible. The only case it wouldn't handle is when
you actually discard parts, and I'm not sure at this point what other
people should see there.

> But yes, if it's a nasty case, I'll just apply it, edit it, re-create a
> diff, and then re-apply it with the re-created diff (since all my tools
> are geared towards getting the log message etc with the patch, I don't
> just commit it after fixing it up: that would screw up the author
> information etc).

I think most people's scripts stash the information needed for the commit
somewhere, and pick it back up at commit time, at least for merges.

> > > Yes. And I think (1) is pretty useful on its own, and that git could 
> > > support that with a nice helper script.
> > 
> > I think that this, by itself, is likely to be a sufficiently common case
> > to be worth just doing. Once the script exists, it makes it worthwhile for
> > developers to organize things such that it works.
> 
> Yeah. It probably works well in 99% of the cases to just do a simple
> "export as patch" + "apply on top with old commit message, author and
> author-date".

I think that you'll get better results out of "merge with top" + "commit
with old commit info, but not listing old commit as a parent". At least,
that's what StGIT is doing, IIRC, and using merge instead of patch seems
like it'll make the remaining 1% a lot more pleasant. In fact, isn't it
necessary if you want to make sense out of "half of my patch got applied",
as a bunch of "still needed" hunks and a bunch of "already applied" hunks
that disappear after the merge?

> > It's not just that
> > they're machine-readable and can apply with fuzz; they're also pretty
> > easy for humans to read, which is why unified diffs are better than
> > context diffs, despite having the same expressive power. In this case,
> > then, they aren't being used for cherry-picking or any other history
> > cleaning; you'll tend to apply the patch straight (or reject it), and
> > then it would be useful to have it act like a merge, with respect to
> > further operations understanding what happened.
> 
> I've _occasionally_ wanted patches to work that way, just because they 
> don't apply, but they'd apply to the right version and then I could just 
> merge them. So yes, sometimes a patch might be more of a merge thing. Most 
> of the time, the patch has really been around the block several times, and 
> it's really lost its position in the history tree.. So it really ends up 
> being "just apply it to the top" 99% of the time anyway.

It should be fine as a merge if you apply it to the top; the case that's
cherry-picking is when you apply a patch that was second in a series
without applying the first. By "like a merge" I really mean "someone
changed <old> to <patched>; I want to make that change to <new>, such that
future merges aren't confused." In this case, you'd actually generate only
an "apply" commit, not recreate (or fetch) "patched-old" and generate a
merge commit. But, if you put the hash of the patch (as above) in a commit
header, other people who have the same patch (including the author) can
identify the commonality and not be confused. (I think putting
it in a header is likely necessary for efficiency reasons, so that it
isn't necessary to unpack/diff/hash all of the trees while updating.)
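Concretely, a commit carrying such a header might look like this (the layout and the "patch-id" field name are purely illustrative; hashes elided):

```
tree <tree sha1>
parent <parent sha1>
patch-id <hash of the normalized patch>
author ...
committer ...
```

Anyone else who applies the same patch computes the same patch-id, so the two commits can be matched up without diffing and hashing trees.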

	-Daniel
*This .sig left intentionally blank*


^ permalink raw reply	[relevance 2%]

* Re: Updated git HOWTO for kernel hackers
  2005-06-23  5:16  3%                     ` Jeff Garzik
@ 2005-06-23  5:58  5%                       ` Linus Torvalds
  2005-06-23  6:20  4%                         ` Greg KH
                                           ` (3 more replies)
  2005-06-23 14:31  0%                       ` Horst von Brand
  1 sibling, 4 replies; 200+ results
From: Linus Torvalds @ 2005-06-23  5:58 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Greg KH, Linux Kernel, Git Mailing List



On Thu, 23 Jun 2005, Jeff Garzik wrote:
>
> No complaint with that operation.  The complaint is that it's an 
> additional operation.  Re-read what Greg said:

Please re-read what I said.

Pulling a regular head _cannot_ and _must_not_ update tags. Tags are not 
associated with the tree, and they _cannot_ and _must_not_ be so, exactly 
because that would make them global instead of private, and it would 
fundamentally make them not be distributed, and would mean that they'd be 
pointless as anything but "Linus' official tags".

That's what we had in BK _AND IT DOES NOT WORK_!

Does it help when I scream?

> > Is there some reason why git doesn't pull the
> > tags in properly when doing a merge?  Chris and I just hit this when I
> > pulled his 2.6.12.1 tree and was wondering where the tag went.

And I suggested that if you want that, then you pull on the TAG. You take 
my modification, you test it, and you see if

	git fetch tag ..repo.. tagname

works.

That solves exactly the case that Greg is complaining about, and it solves
it in a _sane_ manner: you tell git that you want a tag, and git fetches
it for you. It's that simple, and it does not introduce the _BROKEN_
notion that tags are associated directly with the commit itself and
somehow visible to all.

> Multiple users -- not just me -- would prefer that git-pull-script 
> pulled the tags, too.

And multiple users -- clearly including you -- aren't listening to me. 
Tags are separate from the source they tag, and they HAVE TO BE. There is 
no "you automatically get the tags when you get the tree", because the two 
don't have a 1:1 relationship.

And not making them separate breaks a lot of things. As mentioned, it
fundamentally breaks the distributed nature, but that also means that it
breaks whenever two people use the same name for a tag, for example. You
can't "merge" tags. BK had a very strange form of merging, which was (I
think) to pick the one last in the BK ChangeSet file, but that didn't make
it "right". You just never noticed, because Linux could never use tags at
all due to the lack of privacy, except for big releases..

> Suggested solution:  add '--tags' to git-pull-script 
> (git-fetch-script?), which calls
> 	rsync -r --ignore-existing repo/refs/tags/ .git/refs/tags/

How is this AT ALL different from just having a separate script that does
this? You've introduced nothing but syntactic fluff, and you've made it
less flexible at the same time. First off, you might want to get new tags
_without_ fetching anything else, and you might indeed want to get the 
tags _first_ in order to decide what you want to fetch. In fact, in many 
cases that's exactly what you want, namely you want to fetch the data 
based on the tag.

Secondly, if your worry is that you forget, then hell, write a small shell 
function, and be done with it.

BUT DO NOT MESS UP THINGS FOR OTHER PEOPLE.

When I fetch somebody else's head, I had better not fetch his tags. His
tags may not even make _sense_ in what I have - he may tag things in other
branches that I'm not fetching at all. In fact, his tag-namespace might be
_different_ from mine, ie he might have tagged something "broken" in his
tree, and I tagged something _else_ "broken" in mine, just because it
happens to be a very useful tag for when you want to mark "ok, that was a
broken tree".

It is wrong, wrong, _wrong_ to think that fetching somebody else's tree
means that you should fetch his tags. The _only_ reason you think it's
right is because you've only ever seen centralized tags: tags were the one
thing that BK kept centralized.

But once people realize that they can use tags in their own trees, and 
nobody else will ever notice, they'll slowly start using them. Maybe it 
takes a few months or even longer. But it will happen. And I refuse to 
make stupid decisions that make it not work.

And thinking that "fetching a tree fetches all the tags from that tree"  
really _is_ a stupid decision. It's missing the big picture. It's missing
the fact that tags _should_ be normal every-day things that you just use
as "book-marks", and that the kind of big "synchronization point for many
people" tag should actually be the _rare_ case.

The fact that global tags make that private "bookmark" usage impossible
should be a big red blinking sign saying "don't do global tags".

> Let the kernel hacker say "yes, I really do want to download the tags 
> Linus publicly posted in linux-2.6.git/refs/tags" because this was a 
> common operation in the previous workflow, a common operation that we 
> -made use of-.

And I already suggested a trivial script. Send me the script patch,
instead of arguing for stupid things.

			Linus


* Re: Updated git HOWTO for kernel hackers
  2005-06-23  5:58  5%                       ` Linus Torvalds
@ 2005-06-23  6:20  4%                         ` Greg KH
  2005-06-23  6:51 10%                           ` Linus Torvalds
  2005-06-23  7:03  0%                         ` Jeff Garzik
                                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 200+ results
From: Greg KH @ 2005-06-23  6:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff Garzik, Linux Kernel, Git Mailing List

On Wed, Jun 22, 2005 at 10:58:13PM -0700, Linus Torvalds wrote:
> And I suggested that if you want that, then you pull on the TAG. You take 
> my modification, you test it, and you see if
> 
> 	git fetch tag ..repo.. tagname
> 
> works.

Hm, that doesn't work right now.  Both:
  git fetch rsync://rsync.kernel.org/pub/scm/linux/kernel/git/chrisw/linux-2.6.12.y.git tag v2.6.12.1
or
  git fetch tag rsync://rsync.kernel.org/pub/scm/linux/kernel/git/chrisw/linux-2.6.12.y.git v2.6.12.1

die.  Or am I just trying to take a point you were making about not
pulling all tags (which I can live with, just was not aware it was this
way, and I agree that it does offer up a lot of possibilities of me using
local tags in the future), and taking it literally?

thanks,

greg k-h


* Re: Updated git HOWTO for kernel hackers
  2005-06-23  6:20  4%                         ` Greg KH
@ 2005-06-23  6:51 10%                           ` Linus Torvalds
  2005-06-23  7:11  0%                             ` Greg KH
  0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-06-23  6:51 UTC (permalink / raw)
  To: Greg KH; +Cc: Jeff Garzik, Linux Kernel, Git Mailing List



On Wed, 22 Jun 2005, Greg KH wrote:
> 
> Hm, that doesn't work right now.

Yeah, my suggested mod sucks.

Try the following slightly modified version instead, with

	git fetch rsync://rsync.kernel.org/pub/scm/linux/kernel/git/chrisw/linux-2.6.12.y.git tag v2.6.12.1

and now it should work.

		Linus

---
diff --git a/git-fetch-script b/git-fetch-script
--- a/git-fetch-script
+++ b/git-fetch-script
@@ -1,7 +1,13 @@
 #!/bin/sh
 #
+destination=FETCH_HEAD
+
 merge_repo=$1
 merge_name=${2:-HEAD}
+if [ "$2" = "tag" ]; then
+	merge_name="refs/tags/$3"
+	destination="$merge_name"
+fi
 
 : ${GIT_DIR=.git}
 : ${GIT_OBJECT_DIRECTORY="${SHA1_FILE_DIRECTORY-"$GIT_DIR/objects"}"}
@@ -35,7 +41,7 @@ download_objects () {
 }
 
 echo "Getting remote $merge_name"
-download_one "$merge_repo/$merge_name" "$GIT_DIR"/FETCH_HEAD || exit 1
+download_one "$merge_repo/$merge_name" "$GIT_DIR/$destination" || exit 1
 
 echo "Getting object database"
-download_objects "$merge_repo" "$(cat "$GIT_DIR"/FETCH_HEAD)" || exit 1
+download_objects "$merge_repo" "$(cat "$GIT_DIR/$destination")" || exit 1


* Re: Updated git HOWTO for kernel hackers
  2005-06-23  5:58  5%                       ` Linus Torvalds
  2005-06-23  6:20  4%                         ` Greg KH
@ 2005-06-23  7:03  0%                         ` Jeff Garzik
  2005-06-23  7:38  3%                         ` Petr Baudis
  2005-06-23  8:30  0%                         ` Vojtech Pavlik
  3 siblings, 0 replies; 200+ results
From: Jeff Garzik @ 2005-06-23  7:03 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Greg KH, Linux Kernel, Git Mailing List

Linus Torvalds wrote:
> Pulling a regular head _cannot_ and _must_not_ update tags. Tags are not 
> associated with the tree, and they _cannot_ and _must_not_ be so, exactly 

For general git implementation, strongly agreed.


> And not making them separate breaks a lot of things. As mentioned, it
> fundamentally breaks the distributed nature, but that also means that it
> breaks whenever two people use the same name for a tag, for example. You
> can't "merge" tags. BK had a very strange form of merging, which was (I
> think) to pick the one last in the BK ChangeSet file, but that didn't make
> it "right". You just never noticed, because Linux could never use tags at
> all due to the lack of privacy, except for big releases..

Agreed.


> How is this AT ALL different from just having a separate script that does
> this? You've introduced nothing but syntactic fluff, and you've made it
> less flexible at the same time. First off, you might want to get new tags
> _without_ fetching anything else, and you might indeed want to get the 
> tags _first_ in order to decide what you want to fetch.

That's a fair point.  A separate script would be better.


> because that would make them global instead of private, and it would 
> fundamentally make them not be distributed, and would mean that they'd be 
> pointless as anything but "Linus' official tags".
[...]
> the fact that tags _should_ be normal every-day things that you just use
> as "book-marks", and that the kind of big "synchronization point for many
> people" tag should actually be the _rare_ case.

For my use, I require all "Linus official tags" to be present in all my 
kernel trees, precisely because it is a big sync point for many people.

User A sends me a patch against 2.6.12-rc2, user B sends me a patch 
against 2.6.12-rc3, user C sends me a patch against 2.6.12...  I create 
a branch with
	cp .git/refs/tags/$kversion .git/refs/heads/foo-net-drvr
	git checkout -f foo-net-drvr
apply the patch, then pull linux-2.6.git to merge up to the latest version.
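That recipe could be wrapped in a small helper. This is a hypothetical sketch (the function name is invented) that assumes the loose-ref layout of the day, where a head is simply a file holding a SHA1:

```shell
#!/bin/sh
# Start a topic branch at a released tag by copying the tag's loose
# ref into refs/heads; afterwards you would 'git checkout -f <branch>'
# and apply the patch as described above.
branch_from_tag() {
    git_dir=${GIT_DIR:-.git}
    cp "$git_dir/refs/tags/$1" "$git_dir/refs/heads/$2"
}
```

e.g. `branch_from_tag v2.6.12-rc2 foo-net-drvr`, then check out the branch, apply the patch, and pull as before.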

So in my case, the rare case is the 99% common case :)

I suppose this usage is just highly specific to me.

	Jeff




* Re: Updated git HOWTO for kernel hackers
  2005-06-23  6:51 10%                           ` Linus Torvalds
@ 2005-06-23  7:11  0%                             ` Greg KH
  0 siblings, 0 replies; 200+ results
From: Greg KH @ 2005-06-23  7:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff Garzik, Linux Kernel, Git Mailing List

On Wed, Jun 22, 2005 at 11:51:40PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 22 Jun 2005, Greg KH wrote:
> > 
> > Hm, that doesn't work right now.
> 
> Yeah, my suggested mod sucks.
> 
> Try the following slightly modified version instead, with
> 
> 	git fetch rsync://rsync.kernel.org/pub/scm/linux/kernel/git/chrisw/linux-2.6.12.y.git tag v2.6.12.1
> 
> and now it should work.

Yes, that patch works for me.

thanks,

greg k-h


* Re: Updated git HOWTO for kernel hackers
  2005-06-23  5:58  5%                       ` Linus Torvalds
  2005-06-23  6:20  4%                         ` Greg KH
  2005-06-23  7:03  0%                         ` Jeff Garzik
@ 2005-06-23  7:38  3%                         ` Petr Baudis
  2005-06-23  8:30  0%                         ` Vojtech Pavlik
  3 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-06-23  7:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff Garzik, Greg KH, Linux Kernel, Git Mailing List

Dear diary, on Thu, Jun 23, 2005 at 07:58:13AM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> told me that...
> Does it help when I scream?

Nope. I still think you are wrong. :-) (BTW, Cogito always fetches all
the tags now - but it's not that I would have a huge problem with
changing that to some better behaviour.)

> > Multiple users -- not just me -- would prefer that git-pull-script 
> > pulled the tags, too.
> 
> And multiple users -- clearly including you -- aren't listening to me. 
> Tags are separate from the source they tag, and they HAVE TO BE. There is 
> no "you automatically get the tags when you get the tree", because the two 
> don't have a 1:1 relationship.
> 
> And not making them separate breaks a lot of things. As mentioned, it
> fundamentally breaks the distributed nature, but that also means that it
> breaks whenever two people use the same name for a tag, for example. You
> can't "merge" tags. BK had a very strange form of merging, which was (I
> think) to pick the one last in the BK ChangeSet file, but that didn't make
> it "right". You just never noticed, because Linux could never use tags at
> all due to the lack of privacy, except for big releases..

I think there should simply be two namespaces - public tags and private
tags. Private tags for stuff like "broken", "merged", or "funnychange".
Other people don't care about those, and they certainly shouldn't get
them by default (but they should have a way to get them explicitly, if
you tell them). But then there are the official tags, like "v2.6.13" or
even "v2.6.12-ck2" - if you merge with those branches, you should always
get those precisely for what Jeff says - they are big syncing points for
a lot of people and you should be always able to refer to v2.6.13 if you
have the commit in your tree.

Since there should be _few_ of those tags, you might even want to get
tags only from branches marked "tagtrusted" (Cogito's origin branch
would be trusted by default), or want to interactively confirm new tag
additions during a pull. Also, ideally there would be no or only
extremely rare tag conflicts.

I think it would be simplest to use a special prefix for the private
tags. ~ and ! might get touched by shell, so what about %?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
<Espy> be careful, some twit might quote you out of context..


* Re: Updated git HOWTO for kernel hackers
  2005-06-23  5:58  5%                       ` Linus Torvalds
                                           ` (2 preceding siblings ...)
  2005-06-23  7:38  3%                         ` Petr Baudis
@ 2005-06-23  8:30  0%                         ` Vojtech Pavlik
  3 siblings, 0 replies; 200+ results
From: Vojtech Pavlik @ 2005-06-23  8:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff Garzik, Greg KH, Linux Kernel, Git Mailing List

On Wed, Jun 22, 2005 at 10:58:13PM -0700, Linus Torvalds wrote:

> And thinking that "fetching a tree fetches all the tags from that tree"  
> really _is_ a stupid decision. It's missing the big picture. It's missing
> the fact that tags _should_ be normal every-day things that you just use
> as "book-marks", and that the kind of big "synchronization point for many
> people" tag should actually be the _rare_ case.
> 
> The fact that global tags make that private "bookmark" usage impossible
> should be a big red blinking sign saying "don't do global tags".

Maybe it'd make sense to differentiate between the two types of tags?
To have local tags which don't propagate, and global (version) tags
which do? They could live in different namespaces and thus wouldn't
interfere.

-- 
Vojtech Pavlik
SuSE Labs, SuSE CR


* [RFC] Order of push/pull file transfers
@ 2005-06-23 10:12  3% Russell King
  2005-06-24 16:38  0% ` Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Russell King @ 2005-06-23 10:12 UTC (permalink / raw)
  To: git

Hi,

I'd like to start a discussion on the ordering of the various git files
being transferred.

Last night, I pulled Linus' kernel tree from k.o, but Linus was in the
middle of pushing an update to it.  The way cogito works, it grabs the
HEAD first, and then rsyncs the objects.

However, this retrieved the updated HEAD, and only some of the objects.
cogito happily tried to merge the result, and failed.  A later pull
and git-fsck-cache confirmed everything was fine _in this instance_.

Therefore, may I suggest the following two changes in the way git
works:

1. a push updates HEAD only after the rsync/upload of all objects is
   complete.  This means that any pull will not try to update to the
   new head with a partial object tree.

2. a pull only tries to fetch objects if HEAD has been updated since
   the last pull.

This gives the puller an additional safety margin, ensuring that merges
will not be attempted when a pull and a push happen simultaneously.
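Suggestion 1 amounts to ordering two transfers. Here's a sketch simulated with plain directory copies (a real push would use rsync over the network; `safe_push` is an invented name):

```shell
#!/bin/sh
# Simulate suggestion 1: publish every object before touching HEAD,
# so a reader who sees the new HEAD can find all the objects it needs.
safe_push() {
    src=$1 dst=$2
    cp -r "$src/objects/." "$dst/objects/" &&  # all objects first
    cp "$src/HEAD" "$dst/HEAD"                 # update HEAD last
}
```

A puller that reads HEAD before the object upload finishes simply sees the old HEAD, which still names a complete object set.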

-- 
Russell King



* Re: Updated git HOWTO for kernel hackers
  2005-06-23  5:16  3%                     ` Jeff Garzik
  2005-06-23  5:58  5%                       ` Linus Torvalds
@ 2005-06-23 14:31  0%                       ` Horst von Brand
  1 sibling, 0 replies; 200+ results
From: Horst von Brand @ 2005-06-23 14:31 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Linus Torvalds, Greg KH, Linux Kernel, Git Mailing List

Jeff Garzik <jgarzik@pobox.com> said:
> Linus Torvalds wrote:
> > 	rsync -r --ignore-existing repo/refs/tags/ .git/refs/tags/
> > 
> > See? What's your complaint with just doing that?
> 
> No complaint with that operation.  The complaint is that it's an 
> additional operation.  Re-read what Greg said:
> 
> > Is there some reason why git doesn't pull the
> > tags in properly when doing a merge?  Chris and I just hit this when I
> > pulled his 2.6.12.1 tree and was wondering where the tag went.
> 
> Multiple users -- not just me -- would prefer that git-pull-script 
> pulled the tags, too.
> 
> Suggested solution:  add '--tags' to git-pull-script 
> (git-fetch-script?), which calls
> 	rsync -r --ignore-existing repo/refs/tags/ .git/refs/tags/

I don't think either is really a solution. IMHO there should be a
distinction between "official tags" (that get passed around together with
everything else) and "private tags" for everybodys own home use (that could
be passed around, but only explicitly). Plus the possibility to erase,
move, &c private tags, and perhaps upgrade them to official status (thus
setting them in stone).
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* [PATCH 1/2] Add a bit of developer documentation to pull.h
  @ 2005-06-23 23:23 25% ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-23 23:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Describe what to implement in fetch() and fetch_ref() for
pull backend writers a bit better.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 pull.h |   21 +++++++++++++++------
 1 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/pull.h b/pull.h
--- a/pull.h
+++ b/pull.h
@@ -1,24 +1,33 @@
 #ifndef PULL_H
 #define PULL_H
 
-/** To be provided by the particular implementation. **/
+/*
+ * Fetch object given SHA1 from the remote, and store it locally under
+ * GIT_OBJECT_DIRECTORY.  Return 0 on success, -1 on failure.  To be
+ * provided by the particular implementation.
+ */
 extern int fetch(unsigned char *sha1);
 
+/*
+ * Fetch ref (relative to $GIT_DIR/refs) from the remote, and store
+ * the 20-byte SHA1 in sha1.  Return 0 on success, -1 on failure.  To
+ * be provided by the particular implementation.
+ */
 extern int fetch_ref(char *ref, unsigned char *sha1);
 
-/** If set, the ref filename to write the target value to. **/
+/* If set, the ref filename to write the target value to. */
 extern const char *write_ref;
 
-/** If set, the hash that the current value of write_ref must be. **/
+/* If set, the hash that the current value of write_ref must be. */
 extern const unsigned char *current_ref;
 
-/** Set to fetch the target tree. */
+/* Set to fetch the target tree. */
 extern int get_tree;
 
-/** Set to fetch the commit history. */
+/* Set to fetch the commit history. */
 extern int get_history;
 
-/** Set to fetch the trees in the commit history. **/
+/* Set to fetch the trees in the commit history. */
 extern int get_all;
 
 /* Set to zero to skip the check for delta object base;
------------



* [PATCH 2/3] git-cherry: find commits not merged upstream.
  @ 2005-06-23 23:28  4%       ` Junio C Hamano
  2005-06-23 23:29  4%       ` [PATCH 3/3] git-rebase-script: rebase local commits to new upstream head Junio C Hamano
  1 sibling, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-23 23:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This script helps the git-rebase script by finding commits that
have not been merged upstream.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 Makefile   |    2 +
 git-cherry |   90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 91 insertions(+), 1 deletions(-)
 create mode 100755 git-cherry

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -25,7 +25,7 @@ SCRIPTS=git git-apply-patch-script git-m
 	git-deltafy-script git-fetch-script git-status-script git-commit-script \
 	git-log-script git-shortlog git-cvsimport-script git-diff-script \
 	git-reset-script git-add-script git-checkout-script git-clone-script \
-	gitk
+	gitk git-cherry
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-cherry b/git-cherry
new file mode 100755
--- /dev/null
+++ b/git-cherry
@@ -0,0 +1,90 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano.
+#
+
+usage="usage: $0 "'<upstream> [<head>]
+
+             __*__*__*__*__> <upstream>
+            /
+  fork-point
+            \__+__+__+__+__+__+__+__> <head>
+
+Each commit between the fork-point and <head> is examined, and
+compared against the change each commit between the fork-point and
+<upstream> introduces.  If the change does not seem to be in the
+upstream, it is shown on the standard output.
+
+The output is intended to be used as:
+
+    OLD_HEAD=$(git-rev-parse HEAD)
+    git-rev-parse linus >${GIT_DIR-.}/HEAD
+    git-cherry linus OLD_HEAD |
+    while read commit
+    do
+        GIT_EXTERNAL_DIFF=git-apply-patch-script git-diff-tree -p "$commit" &&
+	git-commit-script -m "$commit"
+    done
+'
+
+case "$#" in
+1) linus=`git-rev-parse "$1"` &&
+   junio=`git-rev-parse HEAD` || exit
+   ;;
+2) linus=`git-rev-parse "$1"` &&
+   junio=`git-rev-parse "$2"` || exit
+   ;;
+*) echo >&2 "$usage"; exit 1 ;;
+esac
+
+# Note that these list commits in reverse order;
+# not that the order in inup matters...
+inup=`git-rev-list ^$junio $linus` &&
+ours=`git-rev-list $junio ^$linus` || exit
+
+tmp=.cherry-tmp$$
+patch=$tmp-patch
+diff=$tmp-diff
+mkdir $patch
+trap "rm -rf $tmp-*" 0 1 2 3 15
+
+_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
+_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
+
+for c in $inup $ours
+do
+	git-diff-tree -p $c |
+	sed -e "/^$_x40 (from $_x40)\$/d;/^--- /d;/^+++ /d;/^@@ /d" >$patch/$c
+	git-diff-tree -r $c |
+	sed -e "/^$_x40 (from $_x40)\$/d;s/ $_x40 $_x40 / X X /" >$patch/$c.s
+done
+
+LF='
+'
+O=
+for c in $ours
+do
+	found=
+	for d in $inup
+	do
+		cmp $patch/$c.s $patch/$d.s >/dev/null ||
+		continue
+
+		diff --unified=0 $patch/$c $patch/$d >$diff
+		cmp /dev/null $diff >/dev/null && {
+			found=t
+			break
+		}
+	done
+	case "$found,$O" in
+	t,*)	;;
+	,)
+		O="$c" ;;
+	,*)
+		O="$c$LF$O" ;;
+	esac
+done
+case "$O" in
+'') ;;
+*)  echo "$O" ;;
+esac
------------



* [PATCH 3/3] git-rebase-script: rebase local commits to new upstream head.
    2005-06-23 23:28  4%       ` [PATCH 2/3] git-cherry: find commits not merged upstream Junio C Hamano
@ 2005-06-23 23:29  4%       ` Junio C Hamano
  1 sibling, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-23 23:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Using git-cherry, forward port local commits missing from the
new upstream head.  This depends on "-m" flag support in
git-commit-script.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 Makefile          |    2 +-
 git-rebase-script |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 1 deletions(-)
 create mode 100755 git-rebase-script

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -25,7 +25,7 @@ SCRIPTS=git git-apply-patch-script git-m
 	git-deltafy-script git-fetch-script git-status-script git-commit-script \
 	git-log-script git-shortlog git-cvsimport-script git-diff-script \
 	git-reset-script git-add-script git-checkout-script git-clone-script \
-	gitk git-cherry
+	gitk git-cherry git-rebase-script
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-rebase-script b/git-rebase-script
new file mode 100755
--- /dev/null
+++ b/git-rebase-script
@@ -0,0 +1,46 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano.
+#
+
+usage="usage: $0 "'<upstream> [<head>]
+
+Uses output from git-cherry to rebase local commits to the new head of
+upstream tree.'
+
+: ${GIT_DIR=.git}
+
+case "$#" in
+1) linus=`git-rev-parse "$1"` &&
+   junio=`git-rev-parse HEAD` || exit
+   ;;
+2) linus=`git-rev-parse "$1"` &&
+   junio=`git-rev-parse "$2"` || exit
+   ;;
+*) echo >&2 "$usage"; exit 1 ;;
+esac
+
+git-read-tree -m -u $junio $linus &&
+echo "$linus" >"$GIT_DIR/HEAD" || exit
+
+tmp=.rebase-tmp$$
+fail=$tmp-fail
+trap "rm -rf $tmp-*" 0 1 2 3 15
+
+>$fail
+
+git-cherry $linus $junio |
+while read commit
+do
+	S=`cat "$GIT_DIR/HEAD"` &&
+        GIT_EXTERNAL_DIFF=git-apply-patch-script git-diff-tree -p $commit &&
+	git-commit-script -m "$commit" || {
+		echo $commit >>$fail
+		git-read-tree --reset -u $S
+	}
+done
+if test -s $fail
+then
+	echo Some commits could not be rebased, check by hand:
+	cat $fail
+fi
------------



* Re: Mercurial vs Updated git HOWTO for kernel hackers
  @ 2005-06-24  6:41  2%   ` Petr Baudis
  0 siblings, 0 replies; 200+ results
From: Petr Baudis @ 2005-06-24  6:41 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Jeff Garzik, Linux Kernel, Git Mailing List, mercurial

Dear diary, on Fri, Jun 24, 2005 at 01:56:34AM CEST, I got a letter
where Matt Mackall <mpm@selenic.com> told me that...
> On Wed, Jun 22, 2005 at 06:24:54PM -0400, Jeff Garzik wrote:
> > 
> > Things in git-land are moving at lightning speed, and usability has 
> > improved a lot since my post a month ago:  http://lkml.org/lkml/2005/5/26/11
> 
> And here's a quick comparison with the current state of Mercurial..

And here's a quick back-comparison with Cogito. ;-)

> > 1) installing git
> > 
> > git requires bootstrapping, since you must have git installed in order 
> > to check out git.git (git repo), and linux-2.6.git (kernel repo).  I 
> > have put together a bootstrap tarball of today's git repository.
> > 
> > Download tarball from:
> > http://www.kernel.org/pub/linux/kernel/people/jgarzik/git-20050622.tar.bz2
> > 
> > tarball build-deps:  zlib, libcurl, libcrypto (openssl)
> > 
> > install tarball:  unpack && make && sudo make prefix=/usr/local install
> > 
> > jgarzik helper scripts, not in official git distribution:
> > http://www.kernel.org/pub/linux/kernel/people/jgarzik/git-new-branch
> > http://www.kernel.org/pub/linux/kernel/people/jgarzik/git-changes-script
> > 
> > After reading the rest of this document, come back and update your copy 
> > of git to the latest:
> > rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/git.git
> 
> Download from: http://selenic.com/mercurial/mercurial-snapshot.tar.gz
> Build-deps: Python 2.3
> Install: unpack && python setup.py install [--home=/usr/local]

Download from: http://www.kernel.org/pub/software/scm/cogito/
Deps: git's + bash + reasonable shell environment
Install: edit Makefile, make + make install

> > 2) download a linux kernel tree for the very first time
> > 
> > $ mkdir -p linux-2.6/.git
> > $ cd linux-2.6
> > $ rsync -a --delete --verbose --stats --progress \
> > rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/ 
> > \          <- word-wrapped backslash; sigh
> >     .git/
> 
> $ mkdir linux-2.6
> $ cd linux-2.6
> $ hg init http://www.kernel.org/hg/    # obviously you can also browse this

$ cg-clone \
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/

(that will checkout to linux-2.6/ directory; you can specify the target
directory as the optional second parameter)

> > 3) update local kernel tree to latest 2.6.x upstream ("fast-forward merge")
> > 
> > $ cd linux-2.6
> > $ git-pull-script \
> > rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> 
> $ hg pull        # defaults to where you originally pulled from

$ cg-update	# defaults to where you originally pulled from

(cg-pull just gets the changes to your repository, but won't merge them
into your branch)

> > 4) check out files from the git repository into the working directory
> > 
> > $ git checkout -f
> 
> $ hg update      # or up or checkout or co, depending on your SCM habits

In Cogito, all files are always checked out.

> > 5) check in your own modifications (e.g. do some hacking, or apply a patch)
> > 
> > # go to repo
> > $ cd linux-2.6
> > 
> > # make some modifications
> > $ patch -sp1 < /tmp/my.patch
> > $ diffstat -p1 < /tmp/my.patch
> > 
> > # NOTE: add '--add' and/or '--remove' if files were added or removed
> > $ git-update-cache <list of all files changed>
> > 
> > # check in changes
> > $ git commit
> 
> $ hg commit [files]    # check in everything changed or just the named files

$ cg-commit [-m"Message"...] [files] # check in everything changed or just
                                     # the named files

If you pass multiple -m arguments, they get formatted as separate
paragraphs in the log message. It is customary for the first -m argument
to contain a short one-line summary.
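As a toy illustration (plain POSIX shell, not cg-commit's actual implementation), successive -m arguments can be joined into blank-line-separated paragraphs roughly like this:

```shell
# Hypothetical sketch: join the message arguments with blank lines,
# the way multiple -m options are formatted into one log message.
msg=""
for m in "Short summary" "Longer explanation of the change."; do
	# separate paragraphs with an empty line once msg is non-empty
	[ -n "$msg" ] && msg="$msg

"
	msg="$msg$m"
done
printf '%s\n' "$msg"    # summary paragraph, blank line, body paragraph
```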

Note that you must add/remove files by

$ cg-add files...

and

$ cg-rm files...

> 5.1) undo the last commit or pull
> 
> $ hg undo

$ cg-admin-uncommit

Note that you should never do this if you already pushed the changes
out, or someone might get them. (That holds for regular Git too.) See

$ cg-help cg-admin-uncommit   # (or cg-admin-uncommit --help)

for details. (That's another of Cogito's cool features. Handy docs! ;-)

> > 6) List all changes in working dir, in diff format.
> > 
> > $ git-diff-cache -p HEAD
> 
> $ hg status            # show changed files

$ cg-status		# show changed files
$ cg-diff [-c] [files]	# show the diffs, -c colourfully

> > 7) List all changesets (i.e. show each cset's description text) in local 
> > branch of local tree, that are not present in remote tree.
> > 
> > $ cd my-kernel-tree-2.6
> > $ git-changes-script -L ../linux-2.6 | less
> 
> $ hg history | less         # How does git know what's not in the
>                             # remote tree? Psychic?

# -c colourfully, -s prints only summaries, one line per changeset
$ cg-log [-c] [-s] -m -r linux-2.6 # List changes only in linux-2.6

Note that | less is unnecessary (even undesirable with -c).

> > 8) List all changesets:
> > 
> > $ git-whatchanged
> 
> $ hg history | less

$ cg-log [-c] [-s]

8.1) List all changesets in the origin branch:

$ cg-log [-c] [-s] -r origin

8.2) List all changesets concerning files CREDITS and fs/inode.c:

$ cg-log [-c] [-s] CREDITS fs/inode.c

> > 9) apply all patches in a Berkeley mbox-format file
> > 
> > First, download and add to your PATH Linus's git tools:
> > rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/git-tools.git
> > 
> > $ cd my-kernel-tree-2.6
> > $ dotest /path/to/mbox  # yes, Linus has no taste in naming scripts
> 
> hg doesn't do mboxes directly, but you can do:
> 
> $ cat patch-list | xargs hg import

Theoretically, dotest should work just fine even if you use Cogito.
Has anyone tested it?

> > 10) don't forget to download tags from time to time.
> > 
> > git-pull-script only downloads sha1-indexed object data, and the 
> > requested remote head.  This misses updates to the .git/refs/tags/ and 
> > .git/refs/heads directories.  It is advisable to update your kernel .git 
> > directories periodically with a full rsync command, to make sure you got 
> > everything:
> >
> > $ cd linux-2.6
> > $ rsync -a --delete --verbose --stats --progress \
> > rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/
> > \          <- word-wrapped backslash; sigh
> >     .git/
> 
> Tags in mercurial are properly version controlled and come along for
> the ride with pulls. Also, the right thing happens with merges.

cg-update and cg-pull fetch new tags during a pull.

> > 11) list all branches, such as those found in my netdev-2.6 or 
> > libata-dev trees.
> > 
> > Download
> > rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
> > 	or
> > rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git
> > 
> > 
> > $ cd netdev-2.6
> > $ ls .git/refs/heads/
> > 
> > { these are the current netdev-2.6 branches }
> > >8139cp       forcedeth    master     qeth           smc91x         we18
> > >8139too-iomap  for-linus    natsemi      r8169      smc91x-eeprom  wifi
> > >airo           hdlc         ns83820      register-netdev  starfire
> > >atmel          ieee80211    orinoco      remove-drivers   tlan
> > >chelsio        iff-running  orinoco-hch  sis900           veth
> > >dm9000         janitor      ppp          skge             viro
> 
> $ hg heads   # Has Andrew mentioned your git forest gives him a headache?

$ cg-branch-ls

# Note that Cogito only supports remote branches properly for now;
# that will evolve further (in some backwards-compatible way).

> > 12) make desired branch current in working directory
> > 
> > $ git checkout -f $branch
> 
> $ hg update -C <rev or id or tag>

You can check the desired branch out into another directory:

$ cg-clone path/to/linux-2.6/.git#branch anotherdir

Switching branches in place will be supported soon (although I have
doubts about its usefulness).

> > 13) create a new branch, and make it current
> > 
> > $ cp .git/refs/heads/master .git/refs/heads/my-new-branch-name
> > $ git checkout -f my-new-branch-name
> 
> Since the hg repo is lightweight, this is usually done by just having
> different directories. Thus we don't explicitly name branches.
> 
> $ mkdir new-branch
> $ cd new-branch
> $ hg init -u ../linux   # makes hardlinks and does a checkout

$ mkdir new-branch
$ cd new-branch
$ cg-clone -s ../linux-2.6

(Note that cg-clone given a local path will do hardlinks too.)

We don't explicitly name branches either. You can make the branch
visible from the other tree by

$ cg-branch-add new-branch ../new-branch

and then refer to it as new-branch.

> > 14) examine which branch is current
> > 
> > $ ls -l .git/HEAD
> 
> $ echo $PWD

Always the "master" branch.

> > 15) undo all local modifications (same as checkout):
> > 
> > $ git checkout -f
> 
> $ hg update -C

$ cg-cancel

> > 16) obtain a diff between current branch, and master branch
> > 
> > In most trees WITH BRANCHES, .git/refs/heads/master contains the current 
> > 'vanilla' upstream tree, for easy diffing and merging.  (in trees 
> > without branches, 'master' simply contains your latest changes)
> > 
> > $ git-diff-tree -p master HEAD
> 
> $ hg diff -r <rev> -r <rev> 

$ cg-diff -r <rev> -r <rev>

> 17) run a browsable, pullable repo server of the current repo on your
> local machine
> 
> $ hg serve

Make it accessible over HTTP, SSH, or rsync, or just to the local
users if that's all you want.

> 18) push your changes to a remote server
> 
> $ hg push ssh://user@host/path/  # aliases and defaults in .hgrc

Will be supported Real Soon (tm) (well, probably sometime next week).

> 19) get per-file history
> 
> $ hg log <file> | less

$ cg-log [-c] [-s] <file>

> 20) get annotated file contents
> 
> $ hg annotate [file]

Planned.

> 22) get online help
> 
> $ hg help [command]

$ cg-help [command]

Cool. Except where the concepts are just different, Cogito mostly
appears at least as simple to use as Mercurial. Yes, some features
are still missing. I hope to fix that soon. :-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
<Espy> be careful, some twit might quote you out of context..

^ permalink raw reply	[relevance 2%]

* Re: The coolest merge EVER!
  @ 2005-06-24 11:54  3%   ` Matthias Urlichs
  2005-06-24 17:49  0%     ` Daniel Barkalow
  2005-06-24 19:22  0%     ` Linus Torvalds
  0 siblings, 2 replies; 200+ results
From: Matthias Urlichs @ 2005-06-24 11:54 UTC (permalink / raw)
  To: git

Hi, Junio C Hamano wrote:

> I suspect there
> would be a massive additional support needed if you want to make it easy
> for Paul to pull changes made to gitk in your tree.

I don't think that's possible; after all, the trees are now merged, so any
pull would fetch all of Linus' tree.

Paul could do it as patches, or Linus could do it in a branch, or we could
write something entirely different from git that happens to support
cherrypicking. ;-)

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
What's so funny 'bout peace, love, and understanding?




* Re: [RFC] Order of push/pull file transfers
  2005-06-23 10:12  3% [RFC] Order of push/pull file transfers Russell King
@ 2005-06-24 16:38  0% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-06-24 16:38 UTC (permalink / raw)
  To: Russell King; +Cc: git

On Thu, 23 Jun 2005, Russell King wrote:

> Last night, I pulled Linus' kernel tree from k.o, but Linus was in the
> middle of pushing an update to it.  The way cogito works, it grabs the
> HEAD first, and then rsyncs the objects.

It needs to do this, in case HEAD changes after or during the rsync (to
include objects written after the rsync looked for them).

> However, this retrieved the updated HEAD, and only some of the objects.
> cogito happily tried to merge the result, and failed.  A later pull
> and git-fsck-cache confirmed everything was fine _in this instance_.

It should be fine in all instances; it makes no assumptions about the
presence or absence of objects in the local database before the pull, so
doing a pull after the previous one didn't work right should be just as
likely to result in a functional state as any other pull.

> Therefore, may I suggest the following two changes in the way git
> works:
> 
> 1. a push updates HEAD only after the rsync/upload of all objects is
>    complete.  This means that any pull will not try to update to the
>    new head with a partial object tree.

git-ssh-push only updates the HEAD (or, rather, the thing the HEAD is a
symlink to) afterwards. I'm not sure how Linus was getting things
there. It's also possible that the mirroring process is failing to
maintain this constraint.

> 2. a pull only tries to fetch objects if HEAD has been updated since
>    the last pull.

That's no good; if the only recent change is a new tag, you want to get 
the tag object. Also, having it not do this is what let it recover in your
case on the second try. The only risk is that you'll pick up some objects
that you don't need yet (but would need if you pulled again when the push
completes).

	-Daniel
*This .sig left intentionally blank*



* Re: The coolest merge EVER!
  2005-06-24 11:54  3%   ` Matthias Urlichs
@ 2005-06-24 17:49  0%     ` Daniel Barkalow
  2005-06-24 19:22  0%     ` Linus Torvalds
  1 sibling, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-06-24 17:49 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: git

On Fri, 24 Jun 2005, Matthias Urlichs wrote:

> Hi, Junio C Hamano wrote:
> 
> > I suspect there
> > would be a massive additional support needed if you want to make it easy
> > for Paul to pull changes made to gitk in your tree.
> 
> I don't think that's possible; after all, the trees are now merged, so any
> pull would fetch all of Linus' tree.

Linus could do:

 git-read-tree gitk-head
 git-update-cache gitk
 git-commit-tree `write-tree` -p gitk-head > gitk-patched-head
 git-read-tree HEAD
 git merge gitk-patched-head

(or, better, use a separate index file for the gitk index)

(to commit changes to the gitk script made in a git working directory)

The change I proposed earlier would be so that the system would know what
was going on and users wouldn't have to. Then someone who didn't know that
gitk was (also) a separate project and just committed changes to it would
still generate gitk commits when appropriate.

	-Daniel
*This .sig left intentionally blank*



* Re: The coolest merge EVER!
  2005-06-24 11:54  3%   ` Matthias Urlichs
  2005-06-24 17:49  0%     ` Daniel Barkalow
@ 2005-06-24 19:22  0%     ` Linus Torvalds
  1 sibling, 0 replies; 200+ results
From: Linus Torvalds @ 2005-06-24 19:22 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: git



On Fri, 24 Jun 2005, Matthias Urlichs wrote:

> Hi, Junio C Hamano wrote:
> 
> > I suspect there
> > would be a massive additional support needed if you want to make it easy
> > for Paul to pull changes made to gitk in your tree.
> 
> I don't think that's possible; after all, the trees are now merged, so any
> pull would fetch all of Linus' tree.

No no. 

A merge is a one-way thing. I merged Paul's tree, but Paul didn't merge 
mine. His is still independent, and you can still pull his tree without 
getting the rest of git, and Paul can still continue to work on his tree 
as if I never merged it at all.

Now, merging back isn't as easy: if any gitk changes get done in my 
"union" tree, Paul can't just pull those, because they now end up being
linked to the history of the unified thing, so when pulling, he'd now end 
up getting all of the regular git stuff too.

Which is probably acceptable, but Junio's point was that this is not a 
symmetric setup: git is like a black hole that never lets any information 
escape, and once you've been sucked into a git archive, you end up not 
being able to separate it.

Or rather, you _can_ separate out pieces of it, but now it's a matter of
cherry-picking, not automatic merges. Of course, people want to be able to 
do that anyway, and normally that will also merge back perfectly, so 
there's no huge downside, except that we should make it fairly easy.

		Linus


* Re: kernel.org and GIT tree rebuilding
  @ 2005-06-25  5:04  4% ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-25  5:04 UTC (permalink / raw)
  To: David S. Miller; +Cc: git

>>>>> "DSM" == David S Miller <davem@davemloft.net> writes:

DSM> To get a clean history to push to Linus, I typically blow
DSM> away my trees and make fresh ones to stick patches into
DSM> which I want to merge.

DSM> Should I:

DSM> 1) Do a git pull from Linus's tree once he takes my changes, then
DSM>    ask GIT to prune the tree?  How do I do that and how does it work?

DSM> 2) Should I use .git/object/ database symlinking?

DSM>    Are there any scripts out there which do this automatically?
DSM>    Something as simple to run as "git-pull-script" and it takes
DSM>    care of using links when possible on a local filesystem.

git-pull-script internally uses git-fetch-script which knows how
to do the local tree using hardlinks.  Presumably, the following
workflow would work:

 (1) You hack away in your private tree, while you keep a "to be
     published" clean tree, both on your local machine.

 (2) Do a GIT pull, merge in your private tree, to come up with
     a clean set of changes in your private tree.  This is the
     tree you "typically blow away".  Reordering the commits to
     come up with a clean history since you last pulled from
     Linus would also happen in this tree.

 (3) Once you have a commit that you want to publish (i.e. the
     commit chain between that commit and the point you last
     pulled from Linus is the "clean history to push to Linus"),
     you go to your "to be published" clean tree, and run
     git-fetch-script to fetch the commit you want to publish
     from your private tree.  When you give an absolute path as
     the "remote repo", git-local-pull with linking behaviour is
     used by git-fetch-script; otherwise rsync backend is used
     so you end up polluted object database.  This way you copy
     only the clean stuff from your private tree.  Your HEAD in
     this tree should be set to the commit you wanted to
     publish.  Running git-prune would be nicer but if your
     history is truly clean it should not be necessary.

 (4) Garbage collecting with git-prune your private tree is your
     business.



* [PATCH 3/9] git-cherry: find commits not merged upstream.
  @ 2005-06-25  9:22  4%         ` Junio C Hamano
  2005-06-25  9:23  4%         ` [PATCH 4/9] git-rebase-script: rebase local commits to new upstream head Junio C Hamano
  2005-06-25  9:26 25%         ` [PATCH 9/9] Add a bit of developer documentation to pull.h Junio C Hamano
  2 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-25  9:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

The git-cherry command helps the git-rebase script by finding
commits that have not been merged upstream.  Commits already
included in upstream are prefixed with '-' (meaning "drop from
my local pull"), while commits missing from upstream are
prefixed with '+' (meaning "add to the updated upstream").
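The comparison is content-based, not commit-ID-based. As a toy illustration (plain shell with sha1sum, not git-patch-id itself; file names are made up), two commits whose diffs are textually identical hash to the same ID, so the local one would be marked '-':

```shell
# Hypothetical sketch of the patch-ID idea: hash the diff text and
# compare; an upstream match means the local commit can be dropped.
printf 'diff a/f b/f\n-old\n+new\n' > upstream.diff
printf 'diff a/f b/f\n-old\n+new\n' > local.diff
up=$(sha1sum upstream.diff | cut -d' ' -f1)
lo=$(sha1sum local.diff | cut -d' ' -f1)
if [ "$up" = "$lo" ]; then
	echo "- local commit already upstream"
fi
```

(The real git-patch-id additionally normalizes the diff so whitespace and hunk offsets do not change the ID.)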

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 Makefile   |    2 +
 git-cherry |   86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+), 1 deletions(-)
 create mode 100755 git-cherry

350e3957925c9b4404977bbd6f65bf68ba28d26f
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -25,7 +25,7 @@ SCRIPTS=git git-apply-patch-script git-m
 	git-deltafy-script git-fetch-script git-status-script git-commit-script \
 	git-log-script git-shortlog git-cvsimport-script git-diff-script \
 	git-reset-script git-add-script git-checkout-script git-clone-script \
-	gitk
+	gitk git-cherry
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-cherry b/git-cherry
new file mode 100755
--- /dev/null
+++ b/git-cherry
@@ -0,0 +1,86 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano.
+#
+
+usage="usage: $0 "'<upstream> [<head>]
+
+             __*__*__*__*__> <upstream>
+            /
+  fork-point
+            \__+__+__+__+__+__+__+__> <head>
+
+Each commit between the fork-point and <head> is examined, and
+compared against the change each commit between the fork-point and
+<upstream> introduces.  If the change does not seem to be in the
+upstream, it is shown on the standard output.
+
+The output is intended to be used as:
+
+    OLD_HEAD=$(git-rev-parse HEAD)
+    git-rev-parse linus >${GIT_DIR-.}/HEAD
+    git-cherry linus OLD_HEAD |
+    while read commit
+    do
+        GIT_EXTERNAL_DIFF=git-apply-patch-script git-diff-tree -p "$commit" &&
+	git-commit-script -m "$commit"
+    done
+'
+
+case "$#" in
+1) linus=`git-rev-parse "$1"` &&
+   junio=`git-rev-parse HEAD` || exit
+   ;;
+2) linus=`git-rev-parse "$1"` &&
+   junio=`git-rev-parse "$2"` || exit
+   ;;
+*) echo >&2 "$usage"; exit 1 ;;
+esac
+
+# Note that these list commits in reverse order;
+# not that the order in inup matters...
+inup=`git-rev-list ^$junio $linus` &&
+ours=`git-rev-list $junio ^$linus` || exit
+
+tmp=.cherry-tmp$$
+patch=$tmp-patch
+mkdir $patch
+trap "rm -rf $tmp-*" 0 1 2 3 15
+
+_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
+_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
+
+for c in $inup
+do
+	git-diff-tree -p $c
+done | git-patch-id |
+while read id name
+do
+	echo $name >>$patch/$id
+done
+
+LF='
+'
+
+O=
+for c in $ours
+do
+	set x `git-diff-tree -p $c | git-patch-id`
+	if test "$2" != ""
+	then
+		if test -f "$patch/$2"
+		then
+			sign=-
+		else
+			sign=+
+		fi
+		case "$O" in
+		'')	O="$sign $c" ;;
+		*)	O="$sign $c$LF$O" ;;
+		esac
+	fi
+done
+case "$O" in
+'') ;;
+*)  echo "$O" ;;
+esac
------------



* [PATCH 4/9] git-rebase-script: rebase local commits to new upstream head.
    2005-06-25  9:22  4%         ` [PATCH 3/9] git-cherry: find commits not merged upstream Junio C Hamano
@ 2005-06-25  9:23  4%         ` Junio C Hamano
  2005-06-25  9:26 25%         ` [PATCH 9/9] Add a bit of developer documentation to pull.h Junio C Hamano
  2 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-25  9:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Using git-cherry, forward port local commits missing from the
new upstream head.  This also depends on "-m" flag support in
git-commit-script.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 Makefile          |    2 +-
 git-rebase-script |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 1 deletions(-)
 create mode 100755 git-rebase-script

39830aca0319e04ed6c45203614543418974f877
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -25,7 +25,7 @@ SCRIPTS=git git-apply-patch-script git-m
 	git-deltafy-script git-fetch-script git-status-script git-commit-script \
 	git-log-script git-shortlog git-cvsimport-script git-diff-script \
 	git-reset-script git-add-script git-checkout-script git-clone-script \
-	gitk git-cherry
+	gitk git-cherry git-rebase-script
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-rebase-script b/git-rebase-script
new file mode 100755
--- /dev/null
+++ b/git-rebase-script
@@ -0,0 +1,49 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano.
+#
+
+usage="usage: $0 "'<upstream> [<head>]
+
+Uses output from git-cherry to rebase local commits to the new head of
+upstream tree.'
+
+: ${GIT_DIR=.git}
+
+case "$#" in
+1) linus=`git-rev-parse "$1"` &&
+   junio=`git-rev-parse HEAD` || exit
+   ;;
+2) linus=`git-rev-parse "$1"` &&
+   junio=`git-rev-parse "$2"` || exit
+   ;;
+*) echo >&2 "$usage"; exit 1 ;;
+esac
+
+git-read-tree -m -u $junio $linus &&
+echo "$linus" >"$GIT_DIR/HEAD" || exit
+
+tmp=.rebase-tmp$$
+fail=$tmp-fail
+trap "rm -rf $tmp-*" 0 1 2 3 15
+
+>$fail
+
+git-cherry $linus $junio |
+while read sign commit
+do
+	case "$sign" in
+	-) continue ;;
+	esac
+	S=`cat "$GIT_DIR/HEAD"` &&
+        GIT_EXTERNAL_DIFF=git-apply-patch-script git-diff-tree -p $commit &&
+	git-commit-script -m "$commit" || {
+		echo $commit >>$fail
+		git-read-tree --reset -u $S
+	}
+done
+if test -s $fail
+then
+	echo Some commits could not be rebased, check by hand:
+	cat $fail
+fi
------------



* [PATCH 9/9] Add a bit of developer documentation to pull.h
    2005-06-25  9:22  4%         ` [PATCH 3/9] git-cherry: find commits not merged upstream Junio C Hamano
  2005-06-25  9:23  4%         ` [PATCH 4/9] git-rebase-script: rebase local commits to new upstream head Junio C Hamano
@ 2005-06-25  9:26 25%         ` Junio C Hamano
  2 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-25  9:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Describe what to implement in fetch() and fetch_ref() for
pull backend writers a bit better.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 pull.h |   21 +++++++++++++++------
 1 files changed, 15 insertions(+), 6 deletions(-)

190061e326b73dcf76d301d8b17ce96c783d7251
diff --git a/pull.h b/pull.h
--- a/pull.h
+++ b/pull.h
@@ -1,24 +1,33 @@
 #ifndef PULL_H
 #define PULL_H
 
-/** To be provided by the particular implementation. **/
+/*
+ * Fetch object given SHA1 from the remote, and store it locally under
+ * GIT_OBJECT_DIRECTORY.  Return 0 on success, -1 on failure.  To be
+ * provided by the particular implementation.
+ */
 extern int fetch(unsigned char *sha1);
 
+/*
+ * Fetch ref (relative to $GIT_DIR/refs) from the remote, and store
+ * the 20-byte SHA1 in sha1.  Return 0 on success, -1 on failure.  To
+ * be provided by the particular implementation.
+ */
 extern int fetch_ref(char *ref, unsigned char *sha1);
 
-/** If set, the ref filename to write the target value to. **/
+/* If set, the ref filename to write the target value to. */
 extern const char *write_ref;
 
-/** If set, the hash that the current value of write_ref must be. **/
+/* If set, the hash that the current value of write_ref must be. */
 extern const unsigned char *current_ref;
 
-/** Set to fetch the target tree. */
+/* Set to fetch the target tree. */
 extern int get_tree;
 
-/** Set to fetch the commit history. */
+/* Set to fetch the commit history. */
 extern int get_history;
 
-/** Set to fetch the trees in the commit history. **/
+/* Set to fetch the trees in the commit history. */
 extern int get_all;
 
 /* Set to zero to skip the check for delta object base;
------------



* [PATCH] Add git-relink-script to fix up missing hardlinks
@ 2005-06-26 18:15  3% Ryan Anderson
  0 siblings, 0 replies; 200+ results
From: Ryan Anderson @ 2005-06-26 18:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Junio C Hamano


Add git-relink-script

This will scan 2 or more object repositories and look for common objects, check
if they are hardlinked, and replace one with a hardlink to the other if not.

This version warns when skipping files because of size differences, and
handles more than 2 repositories automatically.
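A rough shell analogue of the per-file step (made-up paths; the real script also compares device numbers and sizes, and renames the old file before linking):

```shell
# Replace an identical copy of an object file with a hardlink so the
# two repositories share one inode instead of storing the data twice.
mkdir -p A/objects/ab B/objects/ab
echo 'object data' > A/objects/ab/cdef
cp A/objects/ab/cdef B/objects/ab/cdef          # same content, two inodes
if ! [ A/objects/ab/cdef -ef B/objects/ab/cdef ]; then
	ln -f A/objects/ab/cdef B/objects/ab/cdef   # now one shared inode
fi
[ A/objects/ab/cdef -ef B/objects/ab/cdef ] && echo linked
```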

Signed-off-by: Ryan Anderson <ryan@michonline.com>

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -25,7 +25,7 @@ SCRIPTS=git git-apply-patch-script git-m
 	git-deltafy-script git-fetch-script git-status-script git-commit-script \
 	git-log-script git-shortlog git-cvsimport-script git-diff-script \
 	git-reset-script git-add-script git-checkout-script git-clone-script \
-	gitk git-cherry git-rebase-script
+	gitk git-cherry git-rebase-script git-relink-script
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-relink-script b/git-relink-script
new file mode 100644
--- /dev/null
+++ b/git-relink-script
@@ -0,0 +1,173 @@
+#!/usr/bin/env perl
+# Copyright 2005, Ryan Anderson <ryan@michonline.com>
+# Distribution permitted under the GPL v2, as distributed
+# by the Free Software Foundation.
+# Later versions of the GPL at the discretion of Linus Torvalds
+#
+# Scan two git object-trees, and hardlink any common objects between them.
+
+use 5.006;
+use strict;
+use warnings;
+use Getopt::Long;
+
+sub get_canonical_form($);
+sub do_scan_directory($$$);
+sub compare_two_files($$);
+sub usage();
+sub link_two_files($$);
+
+# stats
+my $total_linked = 0;
+my $total_already = 0;
+my ($linked,$already);
+
+my $fail_on_different_sizes = 0;
+my $help = 0;
+GetOptions("safe" => \$fail_on_different_sizes,
+	   "help" => \$help);
+
+usage() if $help;
+
+my (@dirs) = @ARGV;
+
+usage() if (!defined $dirs[0] || !defined $dirs[1]);
+
+$_ = get_canonical_form($_) foreach (@dirs);
+
+my $master_dir = pop @dirs;
+
+opendir(D,$master_dir . "objects/")
+	or die "Failed to open $master_dir/objects/ : $!";
+
+my @hashdirs = grep !/^\.{1,2}$/, readdir(D);
+
+foreach my $repo (@dirs) {
+	$linked = 0;
+	$already = 0;
+	printf("Searching '%s' and '%s' for common objects and hardlinking them...\n",
+		$master_dir,$repo);
+
+	foreach my $hashdir (@hashdirs) {
+		do_scan_directory($master_dir, $hashdir, $repo);
+	}
+
+	printf("Linked %d files, %d were already linked.\n",$linked, $already);
+
+	$total_linked += $linked;
+	$total_already += $already;
+}
+
+printf("Totals: Linked %d files, %d were already linked.\n",
+	$total_linked, $total_already);
+
+
+sub do_scan_directory($$$) {
+	my ($srcdir, $subdir, $dstdir) = @_;
+
+	my $sfulldir = sprintf("%sobjects/%s/",$srcdir,$subdir);
+	my $dfulldir = sprintf("%sobjects/%s/",$dstdir,$subdir);
+
+	opendir(S,$sfulldir)
+		or die "Failed to opendir $sfulldir: $!";
+
+	foreach my $file (grep(!/\.{1,2}$/, readdir(S))) {
+		my $sfilename = $sfulldir . $file;
+		my $dfilename = $dfulldir . $file;
+
+		compare_two_files($sfilename,$dfilename);
+
+	}
+	closedir(S);
+}
+
+sub compare_two_files($$) {
+	my ($sfilename, $dfilename) = @_;
+
+	# Perl's stat returns relevant information as follows:
+	# 0 = dev number
+	# 1 = inode number
+	# 7 = size
+	my @sstatinfo = stat($sfilename);
+	my @dstatinfo = stat($dfilename);
+
+	if (@sstatinfo == 0 && @dstatinfo == 0) {
+		die sprintf("Stat of both %s and %s failed: %s\n",$sfilename, $dfilename, $!);
+
+	} elsif (@dstatinfo == 0) {
+		return;
+	}
+
+	if ( ($sstatinfo[0] == $dstatinfo[0]) &&
+	     ($sstatinfo[1] != $dstatinfo[1])) {
+		if ($sstatinfo[7] == $dstatinfo[7]) {
+			link_two_files($sfilename, $dfilename);
+
+		} else {
+			my $err = sprintf("ERROR: File sizes are not the same, cannot relink %s to %s.\n",
+				$sfilename, $dfilename);
+			if ($fail_on_different_sizes) {
+				die $err;
+			} else {
+				warn $err;
+			}
+		}
+
+	} elsif ( ($sstatinfo[0] == $dstatinfo[0]) &&
+	     ($sstatinfo[1] == $dstatinfo[1])) {
+		$already++;
+	}
+}
+
+sub get_canonical_form($) {
+	my $dir = shift;
+	my $original = $dir;
+
+	die "$dir is not a directory." unless -d $dir;
+
+	$dir .= "/" unless $dir =~ m#/$#;
+	$dir .= ".git/" unless $dir =~ m#\.git/$#;
+
+	die "$original does not have a .git/ subdirectory.\n" unless -d $dir;
+
+	return $dir;
+}
+
+sub link_two_files($$) {
+	my ($sfilename, $dfilename) = @_;
+	my $tmpdname = sprintf("%s.old",$dfilename);
+	rename($dfilename,$tmpdname)
+		or die sprintf("Failure renaming %s to %s: %s",
+			$dfilename, $tmpdname, $!);
+
+	if (! link($sfilename,$dfilename)) {
+		my $failtxt = "";
+		unless (rename($tmpdname,$dfilename)) {
+			$failtxt = sprintf(
+				"Git repository containing %s is probably corrupted, " .
+				"please copy '%s' to '%s' to fix.\n",
+				$dfilename, $tmpdname, $dfilename);
+		}
+
+		die sprintf("Failed to link %s to %s: %s\n%s",
+			$sfilename, $dfilename,
+			$!, $failtxt);
+	}
+
+	unlink($tmpdname)
+		or die sprintf("Unlink of %s failed: %s\n",
+			$tmpdname, $!);
+
+	$linked++;
+}
+
+
+sub usage() {
+	print("Usage: $0 [--safe] <dir> [<dir> ...] <master_dir> \n");
+	print("All directories should contain a .git/objects/ subdirectory.\n");
+	print("Options\n");
+	print("\t--safe\t" .
+		"Stops if two objects with the same hash exist but " .
+		"have different sizes.  Default is to warn and continue.\n");
+	exit(1);
+}
-- 

Ryan Anderson
  sometimes Pug Majere


* git-pull-branch script
@ 2005-06-27 12:27  3% Jeff Garzik
  0 siblings, 0 replies; 200+ results
From: Jeff Garzik @ 2005-06-27 12:27 UTC (permalink / raw)
  To: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 405 bytes --]


I've attached the 1-line git-pull-branch script that I just whipped 
together, in case anybody finds it useful.

On occasion, I need to pull one branch into another branch, inside the 
same repo.  For this case, the objects are already present, so we can 
skip the git-fetch-script step of git-pull-script completely.  I also 
modify the commit message so that we log the fact that we pulled a branch.



[-- Attachment #2: git-pull-branch --]
[-- Type: text/plain, Size: 59 bytes --]

#!/bin/sh

git-resolve-script HEAD $1 "`pwd` branch '$1'"



* [PATCH] cvsimport: rewritten in Perl
@ 2005-06-28 19:23  2% Matthias Urlichs
    0 siblings, 1 reply; 200+ results
From: Matthias Urlichs @ 2005-06-28 19:23 UTC (permalink / raw)
  To: git

I just got my machine blocked from a CVS server which didn't like
to get hammered with connections.

That was cvs2git's shell script. Which, by the way, is slow as hell.

Appended: a git-cvsimport script, written in Perl, which directly talks
to the CVS server. If the repository is local, it runs a "cvs server"
child. It produces the same git repository as Linus' version. It can do
incremental imports. And it's 20 times faster (on my system, with a
local CVS repository).

cvs2git is thus obsolete; this patch deletes it.

Signed-Off-By: Matthias Urlichs <smurf@smurf.noris.de>

--- 

diff --git a/Documentation/cvs-migration.txt b/Documentation/cvs-migration.txt
--- a/Documentation/cvs-migration.txt
+++ b/Documentation/cvs-migration.txt
@@ -63,18 +63,38 @@ Once you've gotten (and installed) cvsps
 any more familiar with it, but make sure it is in your path. After that,
 the magic command line is
 
-	git cvsimport <cvsroot> <module>
+	git cvsimport -v -d <cvsroot> <module> <destination>
 
 which will do exactly what you'd think it does: it will create a git
-archive of the named CVS module. The new archive will be created in a
-subdirectory named <module>.
+archive of the named CVS module. The new archive will be created in the
+subdirectory named <destination>; it'll be created if it doesn't exist.
+Default is the local directory.
 
 It can take some time to actually do the conversion for a large archive
 since it involves checking out from CVS every revision of every file,
-and the conversion script can be reasonably chatty, but on some not very
-scientific tests it averaged about eight revisions per second, so a
-medium-sized project should not take more than a couple of minutes.  For
-larger projects or remote repositories, the process may take longer.
+and the conversion script is reasonably chatty unless you omit the '-v'
+option, but on some not very scientific tests it averaged about twenty
+revisions per second, so a medium-sized project should not take more
+than a couple of minutes.  For larger projects or remote repositories,
+the process may take longer.
+
+After the (initial) import is done, the CVS archive's current head
+revision will be checked out -- thus, you can start adding your own
+changes right away.
+
+The import is incremental, i.e. if you call it again next month it'll
+fetch any CVS updates that have been happening in the meantime. The
+cut-off is date-based, so don't change the branches that were imported
+from CVS.
+
+You can merge those updates (or, in fact, a different CVS branch) into
+your main branch:
+
+	cg-merge <branch>
+
+The HEAD revision from CVS is named "origin", not "HEAD", because git
+already uses "HEAD". (If you don't like 'origin', use cvsimport's
+'-o' option to change it.)
 
 
 Emulating CVS behaviour
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -35,7 +35,7 @@ PROG=   git-update-cache git-diff-files 
 	git-http-pull git-ssh-push git-ssh-pull git-rev-list git-mktag \
 	git-diff-helper git-tar-tree git-local-pull git-write-blob \
 	git-get-tar-commit-id git-apply git-stripspace \
-	git-cvs2git git-diff-stages git-rev-parse git-patch-id \
+	git-diff-stages git-rev-parse git-patch-id \
 	git-pack-objects git-unpack-objects
 
 all: $(PROG)
@@ -118,7 +118,6 @@ git-diff-helper: diff-helper.c
 git-tar-tree: tar-tree.c
 git-write-blob: write-blob.c
 git-stripspace: stripspace.c
-git-cvs2git: cvs2git.c
 git-diff-stages: diff-stages.c
 git-rev-parse: rev-parse.c
 git-patch-id: patch-id.c
diff --git a/cvs2git.c b/cvs2git.c
deleted file mode 100644
--- a/cvs2git.c
+++ /dev/null
@@ -1,329 +0,0 @@
-/*
- * cvs2git
- *
- * Copyright (C) Linus Torvalds 2005
- */
-
-#include <stdio.h>
-#include <ctype.h>
-#include <string.h>
-#include <stdlib.h>
-#include <unistd.h>
-
-static int verbose = 0;
-
-/*
- * This is a really stupid program that takes cvsps output, and
- * generates a a long _shell_script_ that will create the GIT archive
- * from it. 
- *
- * You've been warned. I told you it was stupid.
- *
- * NOTE NOTE NOTE! In order to do branches correctly, this needs
- * the fixed cvsps that has the "Ancestor branch" tag output.
- * Hopefully David Mansfield will update his distribution soon
- * enough (he's the one who wrote the patch, so at least we don't
- * have to figt maintainer issues ;)
- *
- * Usage:
- *
- *	TZ=UTC cvsps -A |
- *		git-cvs2git --cvsroot=[root] --module=[module] > script
- *
- * Creates a shell script that will generate the .git archive of
- * the names CVS repository.
- *
- *	TZ=UTC cvsps -s 1234- -A |
- *		git-cvs2git -u --cvsroot=[root] --module=[module] > script
- *
- * Creates a shell script that will update the .git archive with
- * CVS changes from patchset 1234 until the last one.
- *
- * IMPORTANT NOTE ABOUT "cvsps"! This requires version 2.1 or better,
- * and the "TZ=UTC" and the "-A" flag is required for sane results!
- */
-enum state {
-	Header,
-	Log,
-	Members
-};
-
-static const char *cvsroot;
-static const char *cvsmodule;
-
-static char date[100];
-static char author[100];
-static char branch[100];
-static char ancestor[100];
-static char tag[100];
-static char log[32768];
-static int loglen = 0;
-static int initial_commit = 1;
-
-static void lookup_author(char *n, char **name, char **email)
-{
-	/*
-	 * FIXME!!! I'm lazy and stupid.
-	 *
-	 * This could be something like
-	 *
-	 *	printf("lookup_author '%s'\n", n);
-	 *	*name = "$author_name";
-	 *	*email = "$author_email";
-	 *
-	 * and that would allow the script to do its own
-	 * lookups at run-time.
-	 */
-	*name = n;
-	*email = n;
-}
-
-static void prepare_commit(void)
-{
-	char *author_name, *author_email;
-	char *src_branch;
-
-	lookup_author(author, &author_name, &author_email);
-
-	printf("export GIT_COMMITTER_NAME=%s\n", author_name);
-	printf("export GIT_COMMITTER_EMAIL=%s\n", author_email);
-	printf("export GIT_COMMITTER_DATE='+0000 %s'\n", date);
-
-	printf("export GIT_AUTHOR_NAME=%s\n", author_name);
-	printf("export GIT_AUTHOR_EMAIL=%s\n", author_email);
-	printf("export GIT_AUTHOR_DATE='+0000 %s'\n", date);
-
-	if (initial_commit)
-		return;
-
-	src_branch = *ancestor ? ancestor : branch;
-	if (!strcmp(src_branch, "HEAD"))
-		src_branch = "master";
-	printf("ln -sf refs/heads/'%s' .git/HEAD\n", src_branch);
-
-	/*
-	 * Even if cvsps claims an ancestor, we'll let the new
-	 * branch name take precedence if it already exists
-	 */
-	if (*ancestor) {
-		src_branch = branch;
-		if (!strcmp(src_branch, "HEAD"))
-			src_branch = "master";
-		printf("[ -e .git/refs/heads/'%s' ] && ln -sf refs/heads/'%s' .git/HEAD\n",
-			src_branch, src_branch);
-	}
-
-	printf("git-read-tree -m HEAD || exit 1\n");
-	printf("git-checkout-cache -f -u -a\n");
-}
-
-static void commit(void)
-{
-	const char *cmit_parent = initial_commit ? "" : "-p HEAD";
-	const char *dst_branch;
-	char *space;
-	int i;
-
-	printf("tree=$(git-write-tree)\n");
-	printf("cat > .cmitmsg <<EOFMSG\n");
-
-	/* Escape $ characters, and remove control characters */
-	for (i = 0; i < loglen; i++) {
-		unsigned char c = log[i];
-
-		switch (c) {
-		case '$':
-		case '\\':
-		case '`':
-			putchar('\\');
-			break;
-		case 0 ... 31:
-			if (c == '\n' || c == '\t')
-				break;
-		case 128 ... 159:
-			continue;
-		}
-		putchar(c);
-	}
-	printf("\nEOFMSG\n");
-	printf("commit=$(cat .cmitmsg | git-commit-tree $tree %s)\n", cmit_parent);
-
-	dst_branch = branch;
-	if (!strcmp(dst_branch, "HEAD"))
-		dst_branch = "master";
-
-	printf("echo $commit > .git/refs/heads/'%s'\n", dst_branch);
-
-	space = strchr(tag, ' ');
-	if (space)
-		*space = 0;
-	if (strcmp(tag, "(none)"))
-		printf("echo $commit > .git/refs/tags/'%s'\n", tag);
-
-	printf("echo 'Committed (to %s):' ; cat .cmitmsg; echo\n", dst_branch);
-
-	*date = 0;
-	*author = 0;
-	*branch = 0;
-	*ancestor = 0;
-	*tag = 0;
-	loglen = 0;
-
-	initial_commit = 0;
-}
-
-static void update_file(char *line)
-{
-	char *name, *version;
-	char *dir;
-
-	while (isspace(*line))
-		line++;
-	name = line;
-	line = strchr(line, ':');
-	if (!line)
-		return;
-	*line++ = 0;
-	line = strchr(line, '>');
-	if (!line)
-		return;
-	*line++ = 0;
-	version = line;
-	line = strchr(line, '(');
-	if (line) {	/* "(DEAD)" */
-		printf("git-update-cache --force-remove '%s'\n", name);
-		return;
-	}
-
-	dir = strrchr(name, '/');
-	if (dir)
-		printf("mkdir -p %.*s\n", (int)(dir - name), name);
-
-	printf("cvs -q -d %s checkout -d .git-tmp -r%s '%s/%s'\n", 
-		cvsroot, version, cvsmodule, name);
-	printf("mv -f .git-tmp/%s %s\n", dir ? dir+1 : name, name);
-	printf("rm -rf .git-tmp\n");
-	printf("git-update-cache --add -- '%s'\n", name);
-}
-
-struct hdrentry {
-	const char *name;
-	char *dest;
-} hdrs[] = {
-	{ "Date:", date },
-	{ "Author:", author },
-	{ "Branch:", branch },
-	{ "Ancestor branch:", ancestor },
-	{ "Tag:", tag },
-	{ "Log:", NULL },
-	{ NULL, NULL }
-};
-
-int main(int argc, char **argv)
-{
-	static char line[1000];
-	enum state state = Header;
-	int i;
-
-	for (i = 1; i < argc; i++) {
-		const char *arg = argv[i];
-		if (!memcmp(arg, "--cvsroot=", 10)) {
-			cvsroot = arg + 10;
-			continue;
-		}
-		if (!memcmp(arg, "--module=", 9)) {
-			cvsmodule = arg+9;
-			continue;
-		} 
-		if (!strcmp(arg, "-v")) {
-			verbose = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-u")) {
-			initial_commit = 0;
-			continue;
-		}
-	}
-
-
-	if (!cvsroot)
-		cvsroot = getenv("CVSROOT");
-
-	if (!cvsmodule || !cvsroot) {
-		fprintf(stderr, "I need a CVSROOT and module name\n");
-		exit(1);
-	}
-
-	if (initial_commit) {
-		printf("[ -d .git ] && exit 1\n");
-		    printf("git-init-db\n");
-		printf("mkdir -p .git/refs/heads\n");
-		printf("mkdir -p .git/refs/tags\n");
-		printf("ln -sf refs/heads/master .git/HEAD\n");
-	}
-
-	while (fgets(line, sizeof(line), stdin) != NULL) {
-		int linelen = strlen(line);
-
-		while (linelen && isspace(line[linelen-1]))
-			line[--linelen] = 0;
-
-		switch (state) {
-		struct hdrentry *entry;
-
-		case Header:
-			if (verbose)
-				printf("# H: %s\n", line);
-			for (entry = hdrs ; entry->name ; entry++) {
-				int len = strlen(entry->name);
-				char *val;
-
-				if (memcmp(entry->name, line, len))
-					continue;
-				if (!entry->dest) {
-					state = Log;
-					break;
-				}
-				val = line + len;
-				linelen -= len;
-				while (isspace(*val)) {
-					val++;
-					linelen--;
-				}
-				memcpy(entry->dest, val, linelen+1);
-				break;
-			}
-			continue;
-
-		case Log:
-			if (verbose)
-				printf("# L: %s\n", line);
-			if (!strcmp(line, "Members:")) {
-				while (loglen && isspace(log[loglen-1]))
-					log[--loglen] = 0;
-				prepare_commit();
-				state = Members;
-				continue;
-			}
-				
-			if (loglen + linelen + 5 > sizeof(log))
-				continue;
-			memcpy(log + loglen, line, linelen);
-			loglen += linelen;
-			log[loglen++] = '\n';
-			continue;
-
-		case Members:
-			if (verbose)
-				printf("# M: %s\n", line);
-			if (!linelen) {
-				commit();
-				state = Header;
-				continue;
-			}
-			update_file(line);
-			continue;
-		}
-	}
-	return 0;
-}
diff --git a/git-cvsimport-script b/git-cvsimport-script
--- a/git-cvsimport-script
+++ b/git-cvsimport-script
@@ -1,38 +1,629 @@
-#!/bin/sh
+#!/usr/bin/perl -w
 
-usage () {
-	echo "Usage: git cvsimport [-v] [-z fuzz] <cvsroot> <module>"
-	exit 1
-}
-
-CVS2GIT=""
-CVSPS="--cvs-direct -x -A"
-while true; do
-	case "$1" in
-	-v) CVS2GIT="$1" ;;
-	-z) shift; CVSPS="$CVSPS -z $1" ;;
-	-*) usage ;;
-	*)  break ;;
-	esac
-	shift
-done
-
-export CVSROOT="$1"
-export MODULE="$2"
-if [ ! "$CVSROOT" ] || [ ! "$MODULE" ] ; then
-	usage
-fi
-
-cvsps -h 2>&1 | grep -q "cvsps version 2.1" >& /dev/null || {
-	echo "I need cvsps version 2.1"
-	exit 1
-}
-
-mkdir "$MODULE" || exit 1
-cd "$MODULE"
-
-TZ=UTC cvsps $CVSPS $MODULE > .git-cvsps-result
-[ -s .git-cvsps-result ] || exit 1
-git-cvs2git $CVS2GIT --cvsroot="$CVSROOT" --module="$MODULE" < .git-cvsps-result > .git-create-script || exit 1
-sh .git-create-script
+# This tool is copyright (c) 2005, Matthias Urlichs.
+# It is released under the Gnu Public License, version 2.
+#
+# The basic idea is to aggregate CVS check-ins into related changes.
+# Fortunately, "cvsps" does that for us; all we have to do is to parse
+# its output.
+#
+# Checking out the files is done by a single long-running CVS connection
+# / server process.
+#
+# The head revision is on branch "origin" by default.
+# You can change that with the '-o' option.
+
+use strict;
+use warnings;
+use Getopt::Std;
+use File::Path qw(mkpath);
+use File::Basename qw(basename dirname);
+use Time::Local;
+use IO::Socket;
+use IO::Pipe;
+use POSIX qw(strftime dup2);
+
+$SIG{'PIPE'}="IGNORE";
+$ENV{'TZ'}="UTC";
+
+our($opt_h,$opt_o,$opt_v,$opt_d);
+
+sub usage() {
+	print STDERR <<END;
+Usage: ${\basename $0}     # fetch/update GIT from CVS
+	   [ -o branch-for-HEAD ] [ -h ] [ -v ] [ -d CVSROOT ]
+       CVS_module [ GIT_repository ]
+END
+	exit(1);
+}
+
+getopts("hqvo:d:") or usage();
+usage if $opt_h;
+
+@ARGV == 1 or @ARGV == 2 or usage();
+
+my($cvs_tree, $git_tree) = @ARGV;
+
+if($opt_d) {
+	$ENV{"CVSROOT"} = $opt_d;
+} elsif($ENV{"CVSROOT"}) {
+	$opt_d = $ENV{"CVSROOT"};
+} else {
+	die "CVSROOT needs to be set";
+}
+$opt_o ||= "origin";
+$git_tree ||= ".";
+
+select(STDERR); $|=1; select(STDOUT);
+
+
+package CVSconn;
+# Basic CVS dialog.
+# We're only interested in connecting and downloading, so ...
+
+use POSIX qw(strftime dup2);
+
+sub new {
+	my($what,$repo,$subdir) = @_;
+	$what=ref($what) if ref($what);
+
+	my $self = {};
+	$self->{'buffer'} = "";
+	bless($self,$what);
+
+	$repo =~ s#/+$##;
+	$self->{'fullrep'} = $repo;
+	$self->conn();
+
+	$self->{'subdir'} = $subdir;
+	$self->{'lines'} = undef;
+
+	return $self;
+}
+
+sub conn {
+	my $self = shift;
+	my $repo = $self->{'fullrep'};
+	if($repo =~ s/^:pserver:(?:(.*?)(?::(.*?))?@)?([^:\/]*)(?::(\d*))?//) {
+		my($user,$pass,$serv,$port) = ($1,$2,$3,$4);
+		$user="anonymous" unless defined $user;
+		my $rr2 = "-";
+		unless($port) {
+			$rr2 = ":pserver:$user\@$serv:$repo";
+			$port=2401;
+		}
+		my $rr = ":pserver:$user\@$serv:$port$repo";
+
+		unless($pass) {
+			open(H,$ENV{'HOME'}."/.cvspass") and do {
+				# :pserver:cvs@mea.tmt.tele.fi:/cvsroot/zmailer Ah<Z
+				while(<H>) {
+					chomp;
+					s/^\/\d+\s+//;
+					my ($w,$p) = split(/\s/,$_,2);
+					if($w eq $rr or $w eq $rr2) {
+						$pass = $p;
+						last;
+					}
+				}
+			};
+		}
+		$pass="A" unless $pass;
+
+		my $s = IO::Socket::INET->new(PeerHost => $serv, PeerPort => $port);
+		die "Socket to $serv: $!\n" unless defined $s;
+		$s->write("BEGIN AUTH REQUEST\n$repo\n$user\n$pass\nEND AUTH REQUEST\n")
+			or die "Write to $serv: $!\n";
+		$s->flush();
+
+		my $rep = <$s>;
+
+		if($rep ne "I LOVE YOU\n") {
+			$rep="<unknown>" unless $rep;
+			die "AuthReply: $rep\n";
+		}
+		$self->{'socketo'} = $s;
+		$self->{'socketi'} = $s;
+	} else { # local: Fork off our own cvs server.
+		my $pr = IO::Pipe->new();
+		my $pw = IO::Pipe->new();
+		my $pid = fork();
+		die "Fork: $!\n" unless defined $pid;
+		unless($pid) {
+			$pr->writer();
+			$pw->reader();
+			dup2($pw->fileno(),0);
+			dup2($pr->fileno(),1);
+			$pr->close();
+			$pw->close();
+			exec("cvs","server");
+		}
+		$pw->writer();
+		$pr->reader();
+		$self->{'socketo'} = $pw;
+		$self->{'socketi'} = $pr;
+	}
+	$self->{'socketo'}->write("Root $repo\n");
+
+	# Trial and error says that this probably is the minimum set
+	$self->{'socketo'}->write("Valid-responses ok error Valid-requests Mode M Mbinary E F Checked-in Created Updated Merged Removed\n");
+
+	$self->{'socketo'}->write("valid-requests\n");
+	$self->{'socketo'}->flush();
+
+	chomp(my $rep=$self->readline());
+	if($rep !~ s/^Valid-requests\s*//) {
+		$rep="<unknown>" unless $rep;
+		die "Expected Valid-requests from server, but got: $rep\n";
+	}
+	chomp(my $res=$self->readline());
+	die "validReply: $res\n" if $res ne "ok";
+
+	$self->{'socketo'}->write("UseUnchanged\n") if $rep =~ /\bUseUnchanged\b/;
+	$self->{'repo'} = $repo;
+}
+
+sub readline {
+	my($self) = @_;
+	return $self->{'socketi'}->getline();
+}
+
+sub _file {
+	# Request a file with a given revision.
+	# Trial and error says this is a good way to do it. :-/
+	my($self,$fn,$rev) = @_;
+	$self->{'socketo'}->write("Argument -N\n") or return undef;
+	$self->{'socketo'}->write("Argument -P\n") or return undef;
+	# $self->{'socketo'}->write("Argument -ko\n") or return undef;
+	# -ko: Linus' version doesn't use it
+	$self->{'socketo'}->write("Argument -r\n") or return undef;
+	$self->{'socketo'}->write("Argument $rev\n") or return undef;
+	$self->{'socketo'}->write("Argument --\n") or return undef;
+	$self->{'socketo'}->write("Argument $self->{'subdir'}/$fn\n") or return undef;
+	$self->{'socketo'}->write("Directory .\n") or return undef;
+	$self->{'socketo'}->write("$self->{'repo'}\n") or return undef;
+	$self->{'socketo'}->write("Sticky T1.1\n") or return undef;
+	$self->{'socketo'}->write("co\n") or return undef;
+	$self->{'socketo'}->flush() or return undef;
+	$self->{'lines'} = 0;
+	return 1;
+}
+sub _line {
+	# Read a line from the server.
+	# ... except that 'line' may be an entire file. ;-)
+	my($self) = @_;
+	die "Not in lines" unless defined $self->{'lines'};
+
+	my $line;
+	my $res="";
+	while(defined($line = $self->readline())) {
+		# M U gnupg-cvs-rep/AUTHORS
+		# Updated gnupg-cvs-rep/
+		# /daten/src/rsync/gnupg-cvs-rep/AUTHORS
+		# /AUTHORS/1.1///T1.1
+		# u=rw,g=rw,o=rw
+		# 0
+		# ok
+
+		if($line =~ s/^(?:Created|Updated) //) {
+			$line = $self->readline(); # path
+			$line = $self->readline(); # Entries line
+			my $mode = $self->readline(); chomp $mode;
+			$self->{'mode'} = $mode;
+			defined (my $cnt = $self->readline())
+				or die "EOF from server after 'Changed'\n";
+			chomp $cnt;
+			die "Duh: Filesize $cnt" if $cnt !~ /^\d+$/;
+			$line="";
+			$res="";
+			while($cnt) {
+				my $buf;
+				my $num = $self->{'socketi'}->read($buf,$cnt);
+				die "Server: Filesize $cnt: $num: $!\n" if not defined $num or $num<=0;
+				$res .= $buf;
+				$cnt -= $num;
+			}
+		} elsif($line =~ s/^ //) {
+			$res .= $line;
+		} elsif($line =~ /^M\b/) {
+			# output, do nothing
+		} elsif($line =~ /^Mbinary\b/) {
+			my $cnt;
+			die "EOF from server after 'Mbinary'" unless defined ($cnt = $self->readline());
+			chomp $cnt;
+			die "Duh: Mbinary $cnt" if $cnt !~ /^\d+$/ or $cnt<1;
+			$line="";
+			while($cnt) {
+				my $buf;
+				my $num = $self->{'socketi'}->read($buf,$cnt);
+				die "S: Mbinary $cnt: $num: $!\n" if not defined $num or $num<=0;
+				$res .= $buf;
+				$cnt -= $num;
+			}
+		} else {
+			chomp $line;
+			if($line eq "ok") {
+				# print STDERR "S: ok (".length($res).")\n";
+				return $res;
+			} elsif($line =~ s/^E //) {
+				# print STDERR "S: $line\n";
+			} else {
+				die "Unknown: $line\n";
+			}
+		}
+	}
+}
+sub file {
+	my($self,$fn,$rev) = @_;
+	my $res;
+
+	if ($self->_file($fn,$rev)) {
+		$res = $self->_line();
+		return $res if defined $res;
+	}
+
+	# retry
+	$self->conn();
+	$self->_file($fn,$rev)
+		or die "No file command send\n";
+	$res = $self->_line();
+	die "No input: $fn $rev\n" unless defined $res;
+	return $res;
+}
+
+
+package main;
+
+my $cvs = CVSconn->new($opt_d, $cvs_tree);
+
+
+sub pdate($) {
+	my($d) = @_;
+	m#(\d{2,4})/(\d\d)/(\d\d)\s(\d\d):(\d\d)(?::(\d\d))?#
+		or die "Unparseable date: $d\n";
+	my $y=$1; $y-=1900 if $y>1900;
+	return timegm($6||0,$5,$4,$3,$2-1,$y);
+}
+
+sub pmode($) {
+	my($mode) = @_;
+	my $m = 0;
+	my $mm = 0;
+	my $um = 0;
+	for my $x(split(//,$mode)) {
+		if($x eq ",") {
+			$m |= $mm&$um;
+			$mm = 0;
+			$um = 0;
+		} elsif($x eq "u") { $um |= 0700;
+		} elsif($x eq "g") { $um |= 0070;
+		} elsif($x eq "o") { $um |= 0007;
+		} elsif($x eq "r") { $mm |= 0444;
+		} elsif($x eq "w") { $mm |= 0222;
+		} elsif($x eq "x") { $mm |= 0111;
+		} elsif($x eq "=") { # do nothing
+		} else { die "Unknown mode: $mode\n";
+		}
+	}
+	$m |= $mm&$um;
+	return $m;
+}
+
+my $tmpcv = "/var/cache/cvs";
+
+sub getwd() {
+	my $pwd = `pwd`;
+	chomp $pwd;
+	return $pwd;
+}
+
+-d $git_tree
+	or mkdir($git_tree,0777)
+	or die "Could not create $git_tree: $!";
+chdir($git_tree);
+
+my $last_branch = "";
+my $orig_branch = "";
+my %branch_date;
+
+my $git_dir = $ENV{"GIT_DIR"} || ".git";
+$git_dir = getwd()."/".$git_dir unless $git_dir =~ m#^/#;
+$ENV{"GIT_DIR"} = $git_dir;
+unless(-d $git_dir) {
+	system("git-init-db");
+	die "Cannot init the GIT db at $git_tree: $?\n" if $?;
+	system("git-read-tree");
+	die "Cannot init an empty tree: $?\n" if $?;
+
+	$last_branch = $opt_o;
+	$orig_branch = "";
+} else {
+	$last_branch = basename(readlink("$git_dir/HEAD"));
+	unless($last_branch) {
+		warn "Cannot read the last branch name: $! -- assuming 'master'\n";
+		$last_branch = "master";
+	}
+	$orig_branch = $last_branch;
+
+	# Get the last import timestamps
+	opendir(D,"$git_dir/refs/heads");
+	while(defined(my $head = readdir(D))) {
+		next if $head =~ /^\./;
+		open(F,"$git_dir/refs/heads/$head")
+			or die "Bad head branch: $head: $!\n";
+		chomp(my $ftag = <F>);
+		close(F);
+		open(F,"git-cat-file commit $ftag |");
+		while(<F>) {
+			next unless /^author\s.*\s(\d+)\s[-+]\d{4}$/;
+			$branch_date{$head} = $1;
+			last;
+		}
+		close(F);
+	}
+	closedir(D);
+}
+
+-d $git_dir
+	or die "Could not create git subdir ($git_dir).\n";
+
+my $pid = open(CVS,"-|");
+die "Cannot fork: $!\n" unless defined $pid;
+unless($pid) {
+	exec("cvsps","-A","--cvs-direct",$cvs_tree);
+	die "Could not start cvsps: $!\n";
+}
+
+
+## cvsps output:
+#---------------------
+#PatchSet 314
+#Date: 1999/09/18 13:03:59
+#Author: wkoch
+#Branch: STABLE-BRANCH-1-0
+#Ancestor branch: HEAD
+#Tag: (none)
+#Log:
+#    See ChangeLog: Sat Sep 18 13:03:28 CEST 1999  Werner Koch
+#Members:
+#	README:1.57->1.57.2.1
+#	VERSION:1.96->1.96.2.1
+#
+#---------------------
+
+my $state = 0;
+
+my($patchset,$date,$author,$branch,$ancestor,$tag,$logmsg);
+my(@old,@new);
+my $commit = sub {
+	my $pid;
+	system("git-update-cache","--force-remove","--",@old) if @old;
+	die "Cannot remove files: $?\n" if $?;
+	system("git-update-cache","--add","--",@new) if @new;
+	die "Cannot add files: $?\n" if $?;
+
+	$pid = open(C,"-|");
+	die "Cannot fork: $!" unless defined $pid;
+	unless($pid) {
+		exec("git-write-tree");
+		die "Cannot exec git-write-tree: $!\n";
+	}
+	chomp(my $tree = <C>);
+	length($tree) == 40
+		or die "Cannot get tree id ($tree): $!\n";
+	close(C)
+		or die "Error running git-write-tree: $?\n";
+	print "Tree ID $tree\n" if $opt_v;
+
+	my $parent = "";
+	if(open(C,"$git_dir/refs/heads/$last_branch")) {
+		chomp($parent = <C>);
+		close(C);
+		length($parent) == 40
+			or die "Cannot get parent id ($parent): $!\n";
+		print "Parent ID $parent\n" if $opt_v;
+	}
+
+	my $pr = IO::Pipe->new();
+	my $pw = IO::Pipe->new();
+	$pid = fork();
+	die "Fork: $!\n" unless defined $pid;
+	unless($pid) {
+		$pr->writer();
+		$pw->reader();
+		dup2($pw->fileno(),0);
+		dup2($pr->fileno(),1);
+		$pr->close();
+		$pw->close();
+
+		my @par = ();
+		@par = ("-p",$parent) if $parent;
+		exec("env",
+			"GIT_AUTHOR_NAME=$author",
+			"GIT_AUTHOR_EMAIL=$author",
+			"GIT_AUTHOR_DATE=".strftime("+0000 %Y-%m-%d %H:%M:%S",gmtime($date)),
+			"GIT_COMMITTER_NAME=$author",
+			"GIT_COMMITTER_EMAIL=$author",
+			"GIT_COMMITTER_DATE=".strftime("+0000 %Y-%m-%d %H:%M:%S",gmtime($date)),
+			"git-commit-tree", $tree,@par);
+		die "Cannot exec git-commit-tree: $!\n";
+	}
+	$pw->writer();
+	$pr->reader();
+	print $pw $logmsg
+		or die "Error writing to git-commit-tree: $!\n";
+	$pw->close();
+
+	print "Committed patch $patchset ($branch)\n" if $opt_v;
+	chomp(my $cid = <$pr>);
+	length($cid) == 40
+		or die "Cannot get commit id ($cid): $!\n";
+	print "Commit ID $cid\n" if $opt_v;
+	$pr->close();
+
+	waitpid($pid,0);
+	die "Error running git-commit-tree: $?\n" if $?;
+
+	open(C,">$git_dir/refs/heads/$branch")
+		or die "Cannot open branch $branch for update: $!\n";
+	print C "$cid\n"
+		or die "Cannot write branch $branch for update: $!\n";
+	close(C)
+		or die "Cannot write branch $branch for update: $!\n";
+
+	if($tag) {
+		open(C,">$git_dir/refs/tags/$tag")
+			or die "Cannot create tag $tag: $!\n";
+		print C "$cid\n"
+			or die "Cannot write tag $branch: $!\n";
+		close(C)
+			or die "Cannot write tag $branch: $!\n";
+		print "Created tag '$tag' on '$branch'\n" if $opt_v;
+	}
+
+	@old = ();
+	@new = ();
+};
+
+while(<CVS>) {
+	chomp;
+	if($state == 0 and /^-+$/) {
+		$state = 1;
+	} elsif($state == 0) {
+		$state = 1;
+		redo;
+	} elsif(($state==0 or $state==1) and s/^PatchSet\s+//) {
+		$patchset = 0+$_;
+		$state=2;
+	} elsif($state == 2 and s/^Date:\s+//) {
+		$date = pdate($_);
+		unless($date) {
+			print STDERR "Could not parse date: $_\n";
+			$state=0;
+			next;
+		}
+		$state=3;
+	} elsif($state == 3 and s/^Author:\s+//) {
+		s/\s+$//;
+		$author = $_;
+		$state = 4;
+	} elsif($state == 4 and s/^Branch:\s+//) {
+		s/\s+$//;
+		$branch = $_;
+		$state = 5;
+	} elsif($state == 5 and s/^Ancestor branch:\s+//) {
+		s/\s+$//;
+		$ancestor = $_;
+		$ancestor = $opt_o if $ancestor == "HEAD";
+		$state = 6;
+	} elsif($state == 5) {
+		$ancestor = undef;
+		$state = 6;
+		redo;
+	} elsif($state == 6 and s/^Tag:\s+//) {
+		s/\s+$//;
+		if($_ eq "(none)") {
+			$tag = undef;
+		} else {
+			$tag = $_;
+		}
+		$state = 7;
+	} elsif($state == 7 and /^Log:/) {
+		$logmsg = "";
+		$state = 8;
+	} elsif($state == 8 and /^Members:/) {
+		$branch = $opt_o if $branch eq "HEAD";
+		if(defined $branch_date{$branch} and $branch_date{$branch} >= $date) {
+			# skip
+			print "skip patchset $patchset: $date before $branch_date{$branch}\n";
+			$state = 11;
+			next;
+		}
+		if($ancestor) {
+			if(-f "$git_dir/refs/heads/$branch") {
+				print STDERR "Branch $branch already exists!\n";
+				$state=11;
+				next;
+			}
+			unless(open(H,"$git_dir/refs/heads/$ancestor")) {
+				print STDERR "Branch $ancestor does not exist!\n";
+				$state=11;
+				next;
+			}
+			chomp(my $id = <H>);
+			close(H);
+			unless(open(H,"> $git_dir/refs/heads/$branch")) {
+				print STDERR "Could not create branch $branch: $!\n";
+				$state=11;
+				next;
+			}
+			print H "$id\n"
+				or die "Could not write branch $branch: $!";
+			close(H)
+				or die "Could not write branch $branch: $!";
+		}
+		if(($ancestor || $branch) ne $last_branch) {
+			print "Switching from $last_branch to $branch\n" if $opt_v;
+			system("git-read-tree","-m","-u","$last_branch","$branch");
+			die "read-tree failed: $?\n" if $?;
+		}
+		if($branch ne $last_branch) {
+			unlink("$git_dir/HEAD");
+			symlink("refs/heads/$branch","$git_dir/HEAD");
+			$last_branch = $branch;
+		}
+		$state = 9;
+	} elsif($state == 8) {
+		$logmsg .= "$_\n";
+	} elsif($state == 9 and /^\s+(\S+):(INITIAL|\d(?:\.\d+)+)->(\d(?:\.\d+)+)\s*$/) {
+#	VERSION:1.96->1.96.2.1
+		my $init = ($2 eq "INITIAL");
+		my $fn = $1;
+		my $rev = $3;
+		$fn =~ s#^/+##;
+		my $data = $cvs->file($fn,$rev);
+		print "".($init ? "New" : "Update")." $fn: ".length($data)." bytes.\n";
+		mkpath(dirname($fn),$opt_v);
+		open(F,"> ./$fn")
+			or die "Cannot create '$fn': $!\n";
+		print F $data
+			or die "Cannot write to '$fn': $!\n";
+		close(F)
+			or die "Cannot write to '$fn': $!\n";
+		chmod(pmode($cvs->{'mode'}), $fn);
+		push(@new,$fn); # may be resurrected!
+	} elsif($state == 9 and /^\s+(\S+):\d(?:\.\d+)+->(\d(?:\.\d+)+)\(DEAD\)\s*$/) {
+		my $fn = $1;
+		$fn =~ s#^/+##;
+		push(@old,$fn);
+	} elsif($state == 9 and /^\s*$/) {
+		$state = 10;
+	} elsif(($state == 9 or $state == 10) and /^-+$/) {
+		&$commit();
+		$state = 1;
+	} elsif($state == 11 and /^-+$/) {
+		$state = 1;
+	} elsif(/^-+$/) { # end of unknown-line processing
+		$state = 1;
+	} elsif($state != 11) { # ignore stuff when skipping
+		print "* UNKNOWN LINE * $_\n";
+	}
+}
+&$commit() if $branch and $state != 11;
+
+# Now switch back to the branch we were in before all of this happened
+if($orig_branch) {
+	print "DONE; switching back to $orig_branch\n" if $opt_v;
+} else {
+	$orig_branch = "master";
+	print "DONE; creating $orig_branch branch\n" if $opt_v;
+	system("cp","$git_dir/refs/heads/$opt_o","$git_dir/refs/heads/master")
+		unless -f "$git_dir/refs/heads/master";
+}
+
+system("git-read-tree","-m","-u","$last_branch","$orig_branch");
+die "read-tree failed: $?\n" if $?;
+
+unlink("$git_dir/HEAD");
+symlink("refs/heads/$orig_branch","$git_dir/HEAD");
 

^ permalink raw reply	[relevance 2%]

* [PATCH] git-format-patch: Prepare patches for e-mail submission.
  @ 2005-06-30  7:36  3%       ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2005-06-30  7:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: jon, git

This is the script I use to prepare patches for e-mail submission.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** Jon, by default it does not say [PATCH N/M] for any M;
*** you can turn numbering on with the --numbered flag.  Even with
*** --numbered, " N/M" is not shown if M==1.
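
For reference, the subject-to-filename massaging that titleScript does
can be sketched as a standalone function; this mirrors the sed program
in the script but leaves out the commit-header skipping:

```shell
# Turn a commit subject into a pathname-safe filename, with the same
# sed substitutions titleScript uses: map unsafe characters to '-',
# collapse runs of dots and dashes, trim the ends, append a dot.
title_to_filename() {
	printf '%s\n' "$1" | sed -e '
		s/[^-a-z.A-Z_0-9]/-/g
		s/\.\.\.*/\./g
		s/\.*$//
		s/--*/-/g
		s/^-//
		s/-$//
		s/$/./
	'
}

title_to_filename 'Fix: handle spaces & tabs...'
# prints: Fix-handle-spaces-tabs.
```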

 Makefile                |    3 +
 git-format-patch-script |  122 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 124 insertions(+), 1 deletions(-)
 create mode 100755 git-format-patch-script

16daa73282d5aa1cc7d227945aea193553fdfaad
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -25,7 +25,8 @@ SCRIPTS=git git-apply-patch-script git-m
 	git-fetch-script git-status-script git-commit-script \
 	git-log-script git-shortlog git-cvsimport-script git-diff-script \
 	git-reset-script git-add-script git-checkout-script git-clone-script \
-	gitk git-cherry git-rebase-script git-relink-script
+	gitk git-cherry git-rebase-script git-relink-script \
+	git-format-patch-script
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-format-patch-script b/git-format-patch-script
new file mode 100755
--- /dev/null
+++ b/git-format-patch-script
@@ -0,0 +1,122 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+usage () {
+    echo >&2 "usage: $0"' [-n] [-o dir] [-<diff options>...] upstream [ our-head ]
+
+Prepare each commit with its patch since our-head forked from upstream,
+one file per patch, for e-mail submission.  Each output file is
+numbered sequentially from 1, and uses the first line of the commit
+message (massaged for pathname safety) as the filename.
+
+When -o is specified, output files are created in that directory; otherwise in
+the current working directory.
+
+When -n is specified, instead of "[PATCH] Subject", the first line is formatted
+as "[PATCH N/M] Subject", unless you have only one patch.
+'
+    exit 1
+}
+
+diff_opts=
+IFS='
+'
+LF='
+'
+outdir=./
+
+while case "$#" in 0) break;; esac
+do
+    case "$1" in
+    -n|--n|--nu|--num|--numb|--numbe|--number|--numbere|--numbered)
+    numbered=t ;;
+    -o=*|--o=*|--ou=*|--out=*|--outp=*|--outpu=*|--output=*|--output-=*|\
+    --output-d=*|--output-di=*|--output-dir=*|--output-dire=*|\
+    --output-direc=*|--output-direct=*|--output-directo=*|\
+    --output-director=*|--output-directory=*)
+    outdir=`expr "$1" : '-[^=]*=\(.*\)'` ;;
+    -o|--o|--ou|--out|--outp|--outpu|--output|--output-|--output-d|\
+    --output-di|--output-dir|--output-dire|--output-direc|--output-direct|\
+    --output-directo|--output-director|--output-directory)
+    case "$#" in 1) usage ;; esac; shift
+    outdir="$1" ;;
+    -*)	diff_opts="$diff_opts$LF$1" ;;
+    *) break ;;
+    esac
+    shift
+done
+
+case "$#" in
+2)    linus="$1" junio="$2" ;;
+1)    linus="$1" junio=HEAD ;;
+*)    usage ;;
+esac
+
+case "$outdir" in
+*/) ;;
+*) outdir="$outdir/" ;;
+esac
+test -d "$outdir" || mkdir -p "$outdir" || exit
+
+tmp=.tmp-series$$
+trap 'rm -f $tmp-*' 0 1 2 3 15
+
+series=$tmp-series
+
+titleScript='
+	1,/^$/d
+	: loop
+	/^$/b loop
+	s/[^-a-z.A-Z_0-9]/-/g
+        s/\.\.\.*/\./g
+	s/\.*$//
+	s/--*/-/g
+	s/^-//
+	s/-$//
+	s/$/./
+	q
+'
+
+_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
+_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
+stripCommitHead='/^'"$_x40"' (from '"$_x40"')$/d'
+
+git-rev-list "$junio" "^$linus" >$series
+total=`wc -l <$series`
+i=$total
+while read commit
+do
+    title=`git-cat-file commit "$commit" | sed -e "$titleScript"`
+    case "$numbered" in
+    '') num= ;;
+    *)
+	case $total in
+	1) num= ;;
+	*) num=' '`printf "%d/%d" $i $total` ;;
+	esac
+    esac
+    file=`printf '%04d-%stxt' $i "$title"`
+    i=`expr "$i" - 1`
+    echo "$file"
+    {
+	mailScript='
+	1,/^$/d
+	: loop
+	/^$/b loop
+	s|^|[PATCH'"$num"'] |
+	: body
+	p
+	n
+	b body'
+
+	git-cat-file commit "$commit" | sed -ne "$mailScript"
+	echo '---'
+	echo
+	git-diff-tree -p $diff_opts "$commit" | git-apply --stat --summary
+	echo
+	git-diff-tree -p $diff_opts "$commit" | sed -e "$stripCommitHead"
+	echo '------------'
+    } >"$outdir$file"
+done <$series
------------

^ permalink raw reply	[relevance 3%]

* Re: "git-send-pack"
  @ 2005-06-30 19:49  4% ` Daniel Barkalow
  2005-06-30 20:12  3%   ` "git-send-pack" Linus Torvalds
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-06-30 19:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano, ftpadmin

On Thu, 30 Jun 2005, Linus Torvalds wrote:

> Anyway, what are the limitations? Here's a few obvious ones:
> 
>  - I really hate how "ssh" apparently cannot be told to have alternate 
>    paths. For example, on master.kernel.org, I don't control the setup, so 
>    I can't install my own git binaries anywhere except in my ~/bin
>    directory, but I also cannot get ssh to accept that that is a valid 
>    path. This one really bums me out, and I think it's an ssh deficiency. 
> 
>    You apparently have to compile in the paths at compile-time into sshd, 
>    and PermitUserEnvironment is disabled by default (not that it even 
>    seems to work for the PATH environment, but that may have been my 
>    testing that didn't re-start sshd).
> 
>    That just sucks.

The easiest thing might be to have a centrally-installed wrapper script
that could run programs installed in your home directory. E.g., if
"git" had a "source ~/.git-env" at the beginning, and your ~/.git-env
fixed your PATH, then "git receive-pack ARGS" should work, for a generic
centrally installed git and special stuff in your home directory.
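
Such a wrapper might look roughly like this, written as a shell function so the dispatch is easy to exercise (the `~/.git-env` name and the whole mechanism are hypothetical — a sketch of the suggestion, not an existing git facility):

```shell
# Hypothetical centrally-installed "git" wrapper: source the user's
# ~/.git-env (which can, e.g., put ~/bin first in PATH) and then
# dispatch "git receive-pack ARGS" to "git-receive-pack ARGS", which
# is now found via the user's adjusted PATH.
git_wrapper() {
	test -f "$HOME/.git-env" && . "$HOME/.git-env"
	cmd="git-$1"
	shift
	"$cmd" "$@"
}
```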

>  - It doesn't update the working directory at the other end. This is fine 
>    for what it's intended for (pushing to a central "raw" git archives), 
>    so this could be considered a feature, but it's worth pointing out. 
>    Only a "pull" will update your working directory, and this pack sending 
>    really is meant to be used in a kind of "push to central archive" way.

I thought only "resolve" (as part of "fetch") updated your working
directory, so this is completely consistent.

>  - this is also (at least once we've tested it a lot more and added the
>    code to allow it to create new refs on the remote side) meant to be a
>    good way to mirror things out, since clearly rsync isn't scaling. 
> 
>    However, I don't know what the rules for acceptable mirroring 
>    approaches are, and it's entirely possible (nay, probable) that an ssh
>    connection from the "master" ain't it. It would be good to know what 
>    (if any) would be acceptable solutions..

The right solution probably involves getting each pack file you push to
the mirrors as well as to the master. They'll probably update no less
frequently than you push, and they should go through a series of states
which matches the master, so it's not necessary to have anything smart on
master sending them, and they only have to unpack the files they get (and
update the refs afterward). That should make the cross-system trust
requirements relatively minimal; the mirror can fetch things from master,
and neither side has to allow the other to specify a command line.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[relevance 4%]

* Re: "git-send-pack"
  2005-06-30 19:49  4% ` "git-send-pack" Daniel Barkalow
@ 2005-06-30 20:12  3%   ` Linus Torvalds
    2005-06-30 20:49  0%     ` "git-send-pack" Daniel Barkalow
  0 siblings, 2 replies; 200+ results
From: Linus Torvalds @ 2005-06-30 20:12 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Git Mailing List, Junio C Hamano, ftpadmin



On Thu, 30 Jun 2005, Daniel Barkalow wrote:
> 
> The right solution probably involves getting each pack file you push to
> the mirrors as well as to the master. They'll probably update no less
> frequently than you push, and they should go through a series of states
> which matches the master, so it's not necessary to have anything smart on
> master sending them, and they only have to unpack the files they get (and
> update the refs afterward).

Hmm, yes. That would work, together with just fetching the heads.

It won't _really_ solve the problem, since the pushed pack objects will
grow at a rate proportional to the current objects - it's just a constant
factor (admittedly a potentially fairly _big_ constant factor)  
improvement both in size and in number of files.

So the mirroring ends up getting slowly slower and slower as the number of 
pack files go up. In contrast, a git-aware thing can be basically 
constant-time, and mirroring expense ends up being relative to the size of 
the change rather than the size of the repository.

But mirroring just pack-files might solve the problem for the foreseeable 
future, so..

"git-receive-pack" would need to take a flag telling it to just check the
object instead of unpacking it (ie call "git-unpack-object" with the "-n"
flag - it will check that everything looks ok, including the embedded
protecting SHA1 hash), write it out to the filesystem (as it comes in),
and then rename it to the right place.
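
That "write it out and then rename" step is the usual atomic-install pattern; a rough sketch (the helper name is made up, and naming the file after its own SHA1 here merely stands in for the real verification via "git-unpack-object -n"):

```shell
# Stream a pack "as it comes in" to a temporary file, checksum it,
# and only rename it into place once it is complete, so readers
# never observe a partial pack. receive_pack_file is hypothetical.
receive_pack_file() {
	dir="$1"
	tmp="$dir/tmp_pack.$$"
	cat > "$tmp" || return 1
	# stand-in for the real verification step: name the finished
	# file after its own SHA1, roughly as git does for packs
	sha1=$(sha1sum < "$tmp" | cut -d' ' -f1) || return 1
	mv "$tmp" "$dir/pack-$sha1.pack"
}
```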

			Linus

^ permalink raw reply	[relevance 3%]

* Re: "git-send-pack"
  @ 2005-06-30 20:52  3%       ` Linus Torvalds
    0 siblings, 1 reply; 200+ results
From: Linus Torvalds @ 2005-06-30 20:52 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Daniel Barkalow, Git Mailing List, Junio C Hamano, ftpadmin



On Thu, 30 Jun 2005, H. Peter Anvin wrote:
> 
> If I've understood this correctly, it's not a constant factor 
> improvement in the number of files (in the size, yes); it's changing it 
> from O(t*c) to O(t) where t is number of trees and c is number of 
> changesets.  That's key.

No, it _is_ a constant factor even in number of files, if you just keep 
the pack objects around without re-packing them.

Basically, you'd get one new pack-file every time I push. That's better
than getting <n> "raw object" files (where <n> can be anything from just a
couple to several thousand, depending on whether I had pulled things), but
it's still just a constant factor on both number of files and size of
files.

Now, you could re-pack the objects every once in a while: it would force a
whole new "epoch", of course, and then the mirrorers would have to fetch
the whole repacked file, but that might be fine. Especially if you stop
re-packing after you've hit a certain size (say, a couple of megs), and
then start on the next pack.

> For the purposes of rsync, storing the objects in a single append-only 
> file would be a very efficient method, since the rsync algorithm will 
> quickly discover an invariant head and only transmit the tail.

Actually, it won't be "quick" - it will have to read the whole file and do 
its hash window thing.

You _could_ append the pack-files into one single "superpack" file (since
you can figure out where the pack boundaries are), but it would be
extremely big after a while, and rsync would spend all its time doing over
the hash window. You'd definitely be better off with re-packing.

		Linus

^ permalink raw reply	[relevance 3%]

* Re: "git-send-pack"
  2005-06-30 20:12  3%   ` "git-send-pack" Linus Torvalds
  @ 2005-06-30 20:49  0%     ` Daniel Barkalow
  1 sibling, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-06-30 20:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano, ftpadmin

On Thu, 30 Jun 2005, Linus Torvalds wrote:

> On Thu, 30 Jun 2005, Daniel Barkalow wrote:
> > 
> > The right solution probably involves getting each pack file you push to
> > the mirrors as well as to the master. They'll probably update no less
> > frequently than you push, and they should go through a series of states
> > which matches the master, so it's not necessary to have anything smart on
> > master sending them, and they only have to unpack the files they get (and
> > update the refs afterward).
> 
> Hmm, yes. That would work, together with just fetching the heads.
> 
> It won't _really_ solve the problem, since the pushed pack objects will
> grow at a rate proportional to the current objects - it's just a constant
> factor (admittedly a potentially fairly _big_ constant factor)  
> improvement both in size and in number of files.
>
> So the mirroring ends up getting slowly slower and slower as the number of 
> pack files go up. In contrast, a git-aware thing can be basically 
> constant-time, and mirroring expense ends up being relative to the size of 
> the change rather than the size of the repository.
> 
> But mirroring just pack-files might solve the problem for the foreseeable 
> future, so..

Whenever it gets slow, you could replace all the old packs with a single
new pack containing all the old objects; and master could repack whenever
it has a lot of pack files. That's pretty close to O(n) in change size.

Alternatively, having a reverse-ordered list of pack files would mean that
mirrors could just go through that list until they found one they already
had, and stop there, which would really be O(n).
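
That catch-up scheme can be sketched in a few lines of shell (all the names here are hypothetical, and a real mirror would fetch over the network rather than cp):

```shell
# Walk a newest-first list of pack files, copying each one until we
# hit a pack the mirror already has; everything older than that is
# already mirrored, so the work is proportional to the change.
sync_packs() {
	list="$1" remote="$2" mirror="$3"
	while read -r pack; do
		if [ -f "$mirror/$pack" ]; then
			break
		fi
		cp "$remote/$pack" "$mirror/$pack"
	done < "$list"
}
```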

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[relevance 0%]

* Re: Tags
    @ 2005-07-02 20:38  3%                           ` Jan Harkes
  2005-07-02 22:32  4%                             ` Tags Jan Harkes
  1 sibling, 1 reply; 200+ results
From: Jan Harkes @ 2005-07-02 20:38 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Linus Torvalds, Daniel Barkalow,
	Git Mailing List, Junio C Hamano, ftpadmin

On Fri, Jul 01, 2005 at 05:06:15PM -0700, H. Peter Anvin wrote:
> Eric W. Biederman wrote:
> >
> >If I really care what developer xyz tagged I will pull from them,
> >or a mirror I trust.  And since developer xyz doesn't pull his
> >own global tags from other repositories that should be sufficient.
> >
> 
> You're missing something totally and utterly fundamental here: I'm 
> talking about creating an infrastructure (think sourceforge) where there 
> is only one git repository for the whole system, period, full stop, end 
> of story.

I'm not entirely sure what you are envisioning, but it is definitely
doable in a secure way.

- Assume that each developer will have one or more private trees with one
  or more branches on kernel.org; let's say all these private repositories
  are stored under /scm/git/<user>/

- Now you create a single 'global repository' which is going to be the
  publicly visible one that will be mirrored out,

- Then you run the following script (untested)
  #!/bin/sh
  GIT_DIR=$global_repo
  for user in `(cd /scm/git ; ls)`; do
    for tree in `find /scm/git/$user -name *.git` ; do
	for ref in `find $tree/refs -type f`  ; do
	    type=`echo $ref | sed 'sX^.*/refs/\([^/]*\)/.*$X\1X'`
	    name=`echo $ref | sed 'sX^.*/refs/[^/]*/\(.*\)$X\1X'`
	    git fetch /scm/git/$tree $branch 
	    mkdir -p $GIT_DIR/refs/$type/$user/$name
	    cat $GIT_DIR/FETCH_HEAD > $GIT_DIR/refs/$type/$user/$name
	done
    done
  done

- You can repack the global repository whenever you want.
- Finally, once a user knows that all his changes are available from the
  global repository, he can remove any objects from his tree and use
  GIT_ALTERNATE_OBJECT_DIRECTORIES=$global_repo/objects
  (maybe there should be a flag for git prune to remove local objects
  that are already available in the alternate object directories)

Jan

^ permalink raw reply	[relevance 3%]

* Re: Tags
  2005-07-02 20:38  3%                           ` Tags Jan Harkes
@ 2005-07-02 22:32  4%                             ` Jan Harkes
  0 siblings, 0 replies; 200+ results
From: Jan Harkes @ 2005-07-02 22:32 UTC (permalink / raw)
  To: Git Mailing List
  Cc: H. Peter Anvin, Eric W. Biederman, Linus Torvalds,
	Daniel Barkalow, Junio C Hamano, ftpadmin

On Sat, Jul 02, 2005 at 04:38:06PM -0400, Jan Harkes wrote:
> - Then you run the following script (untested)

Ok, I tested it and it was pretty broken; I had assumed that git-fetch-script
accepted the same arguments as git-pull-script.

Here is one that actually seems to work.

Jan


#!/bin/sh
#
# combine per-user private trees into a single repository.
# assumes that user repositories are stored as "$repos/<user>/<tree>.git"
#
global=global.git
repos=/path/to/user/repositories

export GIT_DIR="$global"

# create global repository if it doesn't exist
git-init-db

for tree in $(cd "$repos" && find . -name '*.git' -prune | sed 'sX./XX')
do
    root="$repos/$tree"
    for ref in $(cd "$root" && find refs -type f)  ; do
	echo Synchronizing $tree
	git fetch "$root" "$ref"

	type=$(echo "$ref" | sed -ne 'sX^refs/\([^/]*\)/.*$X\1Xp')
	name=$(echo "$ref" | sed -ne 'sX^refs/[^/]*/\(.*\)$X\1Xp')
	dest="$GIT_DIR/refs/$type/$tree/$name"
	mkdir -p $(dirname "$dest")
	cat "$GIT_DIR/FETCH_HEAD" > "$dest"
    done
done
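
The two sed substitutions above split a ref path into its type and its (possibly slash-containing) name; a quick standalone check of that parsing:

```shell
# Standalone check of the ref-path parsing used in the script:
# "refs/<type>/<name>", where <name> may itself contain slashes.
ref="refs/heads/topic/foo"
type=$(echo "$ref" | sed -ne 'sX^refs/\([^/]*\)/.*$X\1Xp')
name=$(echo "$ref" | sed -ne 'sX^refs/[^/]*/\(.*\)$X\1Xp')
echo "$type $name"	# prints "heads topic/foo"
```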

^ permalink raw reply	[relevance 4%]

* Re: Tags
  @ 2005-07-03  0:17  2%                                               ` Linus Torvalds
  0 siblings, 0 replies; 200+ results
From: Linus Torvalds @ 2005-07-03  0:17 UTC (permalink / raw)
  To: A Large Angry SCM
  Cc: Git Mailing List, H. Peter Anvin, Eric W. Biederman,
	Daniel Barkalow, Junio C Hamano, ftpadmin



On Sat, 2 Jul 2005, A Large Angry SCM wrote:
>
> Linus Torvalds wrote:
> > 
> > None of git itself normally has any "trust". The SHA1 means that the 
> > _integrity_ of the archive is ensured, but for some things (notably 
> > releases), you want to have something else. That's the "tag object".
> > 
> 
> But can't the commit object do this just as well by signing the commit text?

Yes and no.

Technically yes, absolutely, you could add a signature to the commit text.

However, that's just wrong for several reasons:

First off, the signing is not necessarily done by the person committing
something. Think of any paperwork: the person that signs the paperwork is 
not necessarily the same person that _wrote_ the paperwork. A signature is 
a "witness".

For an example of this, look at the signatures that we've had for a long 
time on kernel.org: check out the files like "patch-2.6.8.1.sign". That's 
a signature, but it's not a signature by _me_. It's kernel.org signing the 
thing so that downstream people can verify things.

And it would be not only wrong, but literally _impossible_ for me to do it 
in the commit. I don't have (or want to have) the kernel.org private key. 
That's not what the signature is about. kernel.org is signing that "this 
is what I got, and what I passed on". It's not signing that "this is what 
I wrote".

In a lot of systems, you tag something good after it has passed a
regression test. Ie the _tag_ may happen days or even weeks after the
commit has been done.

So any system that signs commits directly is doing something _wrong_. 

Secondly, you can say that you trust other things. In git, you can tag 
individual blobs, and you can tag individual trees. For an example of 
where it makes sense to tag (sign) individual file versions, we've 
actually had things like ISDN drivers (or firmware) that passed some telco 
verification suite, and in certain countries it used to be that you 
weren't legally supposed to use hardware that hadn't passed that suite. In 
cases like that, you could sign the particular version of the driver, and 
say "this one is good".

(Yeah, those laws are happily going away, but I think the ISDN people in 
Germany actually ended up doing exactly that, except they obviously didn't 
use git signatures. I think they had a list of file+md5sum).

Finally, it's a tools issue. It's wrong to mix up the notion of committing 
and signing in the same thing, because that just complicates a tool that 
has to be able to do both. Now you can have a nice graphical commit tool, 
and it doesn't need to know about public keys etc to be useful - you can 
use another tool to do the signing.

Small is beautiful, but "independent" is even more so.

> Your tendency is to use tag objects as a permanent, public label of some 
> state. Signing the commit text or the email stating that commit 
> ${COMMIT_SHA} would work just as well for verification purposes.

Well, according to that logic, you'd never need signatures at all - you 
can always keep them totally outside the system.

But if they are totally outside the system, then you have to have some
other mechanism to track them, and you can never trust a git archive on
its own. My goal with the tag objects was that you can just get my git
archive, and the archive is _inherently_ trustworthy, because if you care,
you can verify it without any external input at all (except you need to
know my public key, of course, but that's not a tools issue any more,
that's about how signatures work).

So by having tag objects, I can just have refs to them, and anything that 
can fetch a ref (which implies _any_ kind of "pull" functionality) can get 
it. No special cases. No crap.

Do one thing, and do it well. Git does objects with relationships. That's 
really what git is all about, and the "tag object" fits very well into 
that mentality.

		Linus

^ permalink raw reply	[relevance 2%]

* [PATCH] Make specification of CVS module to convert optional.
  @ 2005-07-03 10:36  4%                   ` Sven Verdoolaege
  0 siblings, 0 replies; 200+ results
From: Sven Verdoolaege @ 2005-07-03 10:36 UTC (permalink / raw)
  To: Matthias Urlichs, git

Make specification of CVS module to convert optional.

If we're inside a checked out CVS repository, there is
no need to explicitly specify the module as it is
available in CVS/Repository.
Also read CVS/Root if it's available and -d is not specified.
Finally, explicitly pass root to cvsps as CVS/Root takes
precedence over CVSROOT.

Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

---
commit f9714a4a0cd4ed0ccca3833743d98ea874a2232d
tree de5d7bba63538f29b8ea2b801d932b7679289b96
parent 1cd3674add10d1e511446f3034a1d233a3da7eab
author Sven Verdoolaege <skimo@kotnet.org> Sun, 03 Jul 2005 11:34:59 +0200
committer Sven Verdoolaege <skimo@kotnet.org> Sun, 03 Jul 2005 11:40:44 +0200

 Documentation/git-cvsimport-script.txt |    2 +-
 git-cvsimport-script                   |   34 ++++++++++++++++++++++++--------
 2 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-cvsimport-script.txt b/Documentation/git-cvsimport-script.txt
--- a/Documentation/git-cvsimport-script.txt
+++ b/Documentation/git-cvsimport-script.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 --------
 'git-cvsimport-script' [ -o <branch-for-HEAD> ] [ -h ] [ -v ]
 			[ -d <CVSROOT> ] [ -p <options-for-cvsps> ]
-			<CVS_module> [ <GIT_repository> ]
+			[ -C <GIT_repository> ] [ <CVS_module> ]
 
 
 DESCRIPTION
diff --git a/git-cvsimport-script b/git-cvsimport-script
--- a/git-cvsimport-script
+++ b/git-cvsimport-script
@@ -26,35 +26,53 @@ use POSIX qw(strftime dup2);
 $SIG{'PIPE'}="IGNORE";
 $ENV{'TZ'}="UTC";
 
-our($opt_h,$opt_o,$opt_v,$opt_d,$opt_p);
+our($opt_h,$opt_o,$opt_v,$opt_d,$opt_p,$opt_C);
 
 sub usage() {
 	print STDERR <<END;
 Usage: ${\basename $0}     # fetch/update GIT from CVS
 	   [ -o branch-for-HEAD ] [ -h ] [ -v ] [ -d CVSROOT ]
-       [ -p opts-for-cvsps ]
-       CVS_module [ GIT_repository ]
+       [ -p opts-for-cvsps ] [ -C GIT_repository ]
+       [ CVS_module ]
 END
 	exit(1);
 }
 
-getopts("hqvo:d:p:") or usage();
+getopts("hqvo:d:p:C:") or usage();
 usage if $opt_h;
 
-@ARGV == 1 or @ARGV == 2 or usage();
-
-my($cvs_tree, $git_tree) = @ARGV;
+@ARGV <= 1 or usage();
 
 if($opt_d) {
 	$ENV{"CVSROOT"} = $opt_d;
+} elsif(-f 'CVS/Root') {
+	open my $f, '<', 'CVS/Root' or die 'Failed to open CVS/Root';
+	$opt_d = <$f>;
+	chomp $opt_d;
+	close $f;
+	$ENV{"CVSROOT"} = $opt_d;
 } elsif($ENV{"CVSROOT"}) {
 	$opt_d = $ENV{"CVSROOT"};
 } else {
 	die "CVSROOT needs to be set";
 }
 $opt_o ||= "origin";
+my $git_tree = $opt_C;
 $git_tree ||= ".";
 
+my $cvs_tree;
+if ($#ARGV == 0) {
+	$cvs_tree = $ARGV[0];
+} elsif (-f 'CVS/Repository') {
+	open my $f, '<', 'CVS/Repository' or 
+	    die 'Failed to open CVS/Repository';
+	$cvs_tree = <$f>;
+	chomp $cvs_tree;
+	close $f
+} else {
+	usage();
+}
+
 select(STDERR); $|=1; select(STDOUT);
 
 
@@ -378,7 +396,7 @@ die "Cannot fork: $!\n" unless defined $
 unless($pid) {
 	my @opt;
 	@opt = split(/,/,$opt_p) if defined $opt_p;
-	exec("cvsps",@opt,"-x","-A","--cvs-direct",$cvs_tree);
+	exec("cvsps",@opt,"-x","-A","--cvs-direct",'--root',$opt_d,$cvs_tree);
 	die "Could not start cvsps: $!\n";
 }
 

^ permalink raw reply	[relevance 4%]

* [PATCH] Use MAP_FAILED instead of double cast
@ 2005-07-04  3:05  6% Pavel Roskin
  0 siblings, 0 replies; 200+ results
From: Pavel Roskin @ 2005-07-04  3:05 UTC (permalink / raw)
  To: git

Hello!

I have found some really ugly code used to check the result of
mmap():

if (-1 == (int)(long)map)
	return;

Double cast almost always indicates that something fishy is going on.
Indeed, mmap() could (in theory) return 0xffffffff on a 64-bit platform,
and that would be misinterpreted as a failure.

There is a preprocessor symbol MAP_FAILED that should be used instead.
It's used already in some places, so let's use it consistently.

Signed-off-by: Pavel Roskin <proski@gnu.org>

diff --git a/diffcore-order.c b/diffcore-order.c
--- a/diffcore-order.c
+++ b/diffcore-order.c
@@ -28,7 +28,7 @@ static void prepare_order(const char *or
 	}
 	map = mmap(NULL, st.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
 	close(fd);
-	if (-1 == (int)(long)map)
+	if (map == MAP_FAILED)
 		return;
 	endp = map + st.st_size;
 	for (pass = 0; pass < 2; pass++) {
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -54,7 +54,7 @@ int fetch(unsigned char *sha1)
 		}
 		map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, ifd, 0);
 		close(ifd);
-		if (-1 == (int)(long)map) {
+		if (map == MAP_FAILED) {
 			fprintf(stderr, "cannot mmap %s\n", filename);
 			return -1;
 		}
diff --git a/read-cache.c b/read-cache.c
--- a/read-cache.c
+++ b/read-cache.c
@@ -376,7 +376,7 @@ int read_cache(void)
 			map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
 	}
 	close(fd);
-	if (-1 == (int)(long)map)
+	if (map == MAP_FAILED)
 		return error("mmap failed");
 
 	hdr = map;
diff --git a/sha1_file.c b/sha1_file.c
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -507,7 +507,7 @@ static void *map_sha1_file_internal(cons
 	}
 	map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
 	close(fd);
-	if (-1 == (int)(long)map)
+	if (map == MAP_FAILED)
 		return NULL;
 	*size = st.st_size;
 	return map;
@@ -1293,7 +1293,7 @@ int index_fd(unsigned char *sha1, int fd
 	if (size)
 		buf = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
 	close(fd);
-	if ((int)(long)buf == -1)
+	if (buf == MAP_FAILED)
 		return -1;
 
 	ret = write_sha1_file(buf, size, "blob", sha1);


-- 
Regards,
Pavel Roskin

^ permalink raw reply	[relevance 6%]

* Re: [PATCH] cvsimport: rewritten in Perl
  @ 2005-07-04 15:52  5%                     ` Sven Verdoolaege
  0 siblings, 0 replies; 200+ results
From: Sven Verdoolaege @ 2005-07-04 15:52 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: git

On Mon, Jul 04, 2005 at 04:36:37PM +0200, Matthias Urlichs wrote:
> Ideally, I'd prefer to recycle standard CVS options as much as possible, 
> but given that the confusion is already there (worse: cvs' -z wants an
> argument (compression level), cvsps' -Z doesn't) that may not actually
> make sense. *Shrug*
> 
> I'm too happy when other people improve my tools to get hung up on
> details like that. ;-)

Here it is, then.

skimo
--
git-cvsimport-script: provide direct support for cvsps -z option

---
commit 28537171e7ec23c8677ea6e77c208583f95caa28
tree ca80ed2fad05b150984c14a5364dac8d3e307120
parent 6e7e37b0bfc921aa1f0cb30560fc128e87a41966
author Sven Verdoolaege <skimo@kotnet.org> Mon, 04 Jul 2005 17:10:06 +0200
committer Sven Verdoolaege <skimo@kotnet.org> Mon, 04 Jul 2005 17:10:06 +0200

 git-cvsimport-script |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/git-cvsimport-script b/git-cvsimport-script
--- a/git-cvsimport-script
+++ b/git-cvsimport-script
@@ -28,19 +28,19 @@ use POSIX qw(strftime dup2);
 $SIG{'PIPE'}="IGNORE";
 $ENV{'TZ'}="UTC";
 
-our($opt_h,$opt_o,$opt_v,$opt_d,$opt_p,$opt_C);
+our($opt_h,$opt_o,$opt_v,$opt_d,$opt_p,$opt_C,$opt_z);
 
 sub usage() {
 	print STDERR <<END;
 Usage: ${\basename $0}     # fetch/update GIT from CVS
-	   [ -o branch-for-HEAD ] [ -h ] [ -v ] [ -d CVSROOT ]
-       [ -p opts-for-cvsps ] [ -C GIT_repository ]
+       [ -o branch-for-HEAD ] [ -h ] [ -v ] [ -d CVSROOT ]
+       [ -p opts-for-cvsps ] [ -C GIT_repository ] [ -z fuzz ]
        [ CVS_module ]
 END
 	exit(1);
 }
 
-getopts("hqvo:d:p:C:") or usage();
+getopts("hqvo:d:p:C:z:") or usage();
 usage if $opt_h;
 
 @ARGV <= 1 or usage();
@@ -436,6 +436,7 @@ die "Cannot fork: $!\n" unless defined $
 unless($pid) {
 	my @opt;
 	@opt = split(/,/,$opt_p) if defined $opt_p;
+	unshift @opt, '-z', $opt_z if defined $opt_z;
 	exec("cvsps",@opt,"-u","-A","--cvs-direct",'--root',$opt_d,$cvs_tree);
 	die "Could not start cvsps: $!\n";
 }

^ permalink raw reply	[relevance 5%]

* [PATCH 0/2] Support for transferring pack files in git-ssh-*
@ 2005-07-04 18:48  3% Daniel Barkalow
  2005-07-04 18:50 10% ` [PATCH 1/2] Specify object not useful to pull Daniel Barkalow
  0 siblings, 1 reply; 200+ results
From: Daniel Barkalow @ 2005-07-04 18:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This series allows git-ssh-* to transfer objects packed into pack files in
the case of updating a ref file. It is a proof-of-concept for transferring
pack files in any situation where it's useful.

The general method is that the fetch() method has the option of getting
other objects in addition to the one specified; objects which aren't
needed are specified with
dont_fetch() (when it makes sense to exclude them). In this version, it
only excludes an object when it is the current value of a ref file that is
being updated, but further exclusions are clearly possible.

In the case of git-ssh-*, the target specifies objects to exclude, and the
source responds (asynchronously) with whether or not it knows how to
exclude them (i.e., whether or not it has them). If the target has gotten
an object excluded, it requests a pack file instead of a single object,
and the source provides all objects referenced from the given hash,
excluding those specified for exclusion.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[relevance 3%]

* [PATCH 1/2] Specify object not useful to pull
  2005-07-04 18:48  3% [PATCH 0/2] Support for transferring pack files in git-ssh-* Daniel Barkalow
@ 2005-07-04 18:50 10% ` Daniel Barkalow
  0 siblings, 0 replies; 200+ results
From: Daniel Barkalow @ 2005-07-04 18:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Add support for the pull common code to specify to a pull implementation
hashes which wouldn't be useful to fetch implicitly. This can be used to
infer (possibly) what hashes would be useful to fetch implicitly, such that
a later call to fetch can also fetch extra stuff.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>

---
commit 9bc0256d2e834e101d7bd3f4867a330c92104929
tree a2b4cfdaac954c46babffd585a61f3faf68969e1
parent d0efc8a71da1855c705fd2074b219bcb158b6dbd
author Daniel Barkalow <barkalow@iabervon.org> 1120411439 -0400
committer Daniel Barkalow <barkalow@silva-tulga.(none)> 1120411439 -0400

Index: http-pull.c
===================================================================
--- f26b700095ec30154fede14638a099f49744981d/http-pull.c  (mode:100644 sha1:1f9d60b9b1d5eed85b24d96c240666bbfc5a22ed)
+++ a2b4cfdaac954c46babffd585a61f3faf68969e1/http-pull.c  (mode:100644 sha1:f252a4b9d5448afa7af8a62176808a631429b9cd)
@@ -139,6 +139,10 @@
         return 0;
 }
 
+void dont_fetch(const unsigned char *sha1)
+{
+}
+
 int main(int argc, char **argv)
 {
 	char *commit_id;
Index: local-pull.c
===================================================================
--- f26b700095ec30154fede14638a099f49744981d/local-pull.c  (mode:100644 sha1:2f06fbee8b840a7ae642f5a22e2cb993687f3470)
+++ a2b4cfdaac954c46babffd585a61f3faf68969e1/local-pull.c  (mode:100644 sha1:270e3a0b8405793cd70e6efa70ec6aa4b1674141)
@@ -73,6 +73,10 @@
 	return -1;
 }
 
+void dont_fetch(const unsigned char *sha1)
+{
+}
+
 int fetch_ref(char *ref, unsigned char *sha1)
 {
 	static int ref_name_start = -1;
Index: pull.c
===================================================================
--- f26b700095ec30154fede14638a099f49744981d/pull.c  (mode:100644 sha1:ed3078e3b27c62c07558fd94f339801cbd685593)
+++ a2b4cfdaac954c46babffd585a61f3faf68969e1/pull.c  (mode:100644 sha1:f7f5a89aef36ffc2436dbd30170e4c8dbb2ba3a3)
@@ -155,6 +155,10 @@
 	unsigned char sha1[20];
 	int fd = -1;
 
+	if (current_ref) {
+		dont_fetch(current_ref);
+	}
+
 	if (write_ref && current_ref) {
 		fd = lock_ref_sha1(write_ref, current_ref);
 		if (fd < 0)
Index: pull.h
===================================================================
--- f26b700095ec30154fede14638a099f49744981d/pull.h  (mode:100644 sha1:e173ae3337c4465da87d849f4e5c9da203fdf01d)
+++ a2b4cfdaac954c46babffd585a61f3faf68969e1/pull.h  (mode:100644 sha1:6a35d39fd69bb884faa2d5e70c79e5c40b3ba436)
@@ -15,6 +15,12 @@
  */
 extern int fetch_ref(char *ref, unsigned char *sha1);
 
+/*
+ * Specify that the given SHA1, and everything it references, need not
+ * be fetched.  To be provided by the particular implementation. 
+ */
+extern void dont_fetch(const unsigned char *sha1);
+
 /* If set, the ref filename to write the target value to. */
 extern const char *write_ref;
 
Index: ssh-pull.c
===================================================================
--- f26b700095ec30154fede14638a099f49744981d/ssh-pull.c  (mode:100644 sha1:87d523899a83d8c0d3c5ff721208ded30c1a38f4)
+++ a2b4cfdaac954c46babffd585a61f3faf68969e1/ssh-pull.c  (mode:100644 sha1:362318071333420a7cf2450ada7269a94ec2cc7c)
@@ -53,6 +53,10 @@
 	return 0;
 }
 
+void dont_fetch(const unsigned char *sha1)
+{
+}
+
 int main(int argc, char **argv)
 {
 	char *commit_id;

^ permalink raw reply	[relevance 10%]

2005-04-15 17:19  2% space compression (again) C. Scott Ananian
2005-04-15 18:34     ` Linus Torvalds
2005-04-15 18:45       ` C. Scott Ananian
2005-04-15 19:11         ` Linus Torvalds
2005-04-16 14:39           ` Martin Uecker
2005-04-16 15:11  3%         ` C. Scott Ananian
2005-04-16 17:37  0%           ` Martin Uecker
2005-04-15 19:33  2% Ray Heasman
2005-04-16 12:29  0% ` David Lang
2005-04-16 13:15     full kernel history, in patchset format Ingo Molnar
2005-04-16 17:04     ` Linus Torvalds
2005-04-17 23:31       ` David Woodhouse
2005-04-17 23:39         ` Petr Baudis
2005-04-18  0:06  2%       ` David Woodhouse
2005-04-18  0:35  0%         ` Petr Baudis
2005-04-18  0:45               ` David Woodhouse
2005-04-18  0:50                 ` Petr Baudis
2005-04-18  0:51  3%               ` David Woodhouse
2005-04-18  0:59  3%                 ` Petr Baudis
2005-04-16 22:03  7% [PATCH] Get commits from remote repositories by HTTP Daniel Barkalow
2005-04-16 22:24  3% ` Tony Luck
2005-04-16 22:33  0%   ` Daniel Barkalow
2005-04-16 23:01     Re-done kernel archive - real one? Linus Torvalds
2005-04-17 15:24     ` Russell King
2005-04-17 16:36       ` Linus Torvalds
2005-04-17 18:57         ` Russell King
2005-04-17 19:33  3%       ` Linus Torvalds
2005-04-17 19:51  0%         ` Russell King
2005-04-17  0:14  7% [PATCH] Use libcurl to use HTTP to get repositories Daniel Barkalow
2005-04-17 15:20     [0/5] Patch set for various things Daniel Barkalow
2005-04-17 15:31  7% ` [3/5] Add http-pull Daniel Barkalow
2005-04-17 18:10  4%   ` Petr Baudis
2005-04-17 18:49  0%     ` Daniel Barkalow
2005-04-17 19:08  0%       ` Petr Baudis
2005-04-17 19:24  3%         ` Daniel Barkalow
2005-04-17 19:59  0%           ` Petr Baudis
2005-04-21  3:27  3%             ` Brad Roberts
2005-04-21  4:28  3%               ` Daniel Barkalow
2005-04-21 22:05  0%                 ` tony.luck
2005-04-22 19:46  0%                   ` Daniel Barkalow
2005-04-22 22:40  0%                     ` Petr Baudis
2005-04-17 18:58  7%   ` [3.1/5] " Daniel Barkalow
2005-04-18 20:28     SCSI trees, merges and git status James Bottomley
2005-04-18 21:39     ` Linus Torvalds
2005-04-18 23:14       ` James Bottomley
2005-04-19  0:03         ` Linus Torvalds
2005-04-19  0:10  3%       ` David Woodhouse
2005-04-19  0:16  0%         ` James Bottomley
2005-04-19  1:48  3% More patches Daniel Barkalow
2005-04-19  1:57  8% ` [3/4] Add http-pull Daniel Barkalow
2005-04-19  4:39     [GIT PATCH] I2C and W1 bugfixes for 2.6.12-rc2 Greg KH
2005-04-19 18:58     ` Greg KH
2005-04-19 19:40       ` Linus Torvalds
2005-04-19 19:47         ` Greg KH
2005-04-19 20:20           ` Linus Torvalds
2005-04-19 21:40             ` Greg KH
2005-04-19 22:00               ` Linus Torvalds
2005-04-19 22:27  3%             ` Daniel Jacobowitz
2005-04-19 22:33  0%               ` Greg KH
2005-04-19 22:47  0%               ` Linus Torvalds
2005-04-20 20:56  2% [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes Petr Baudis
     [not found]     <Pine.LNX.4.58.0504201728110.2344@ppc970.osdl.org>
     [not found]     ` <20050421112022.GB2160@elf.ucw.cz>
     [not found]       ` <20050421120327.GA13834@elf.ucw.cz>
     [not found]         ` <20050421162220.GD30991@pasky.ji.cz>
     [not found]           ` <20050421190009.GC475@openzaurus.ucw.cz>
2005-04-21 19:09  3%         ` Linux 2.6.12-rc3 Petr Baudis
2005-04-21 21:38  0%           ` Pavel Machek
2005-04-21 21:41  0%             ` Petr Baudis
2005-04-22  9:03     proposal: delta based git archival Michel Lespinasse
2005-04-22  9:49  3% ` Jaime Medrano
2005-04-22 10:41     First web interface and service API draft Christian Meder
2005-04-22 12:10  3% ` Petr Baudis
     [not found]       ` <1114176579.3233.42.camel@localhost>
2005-04-22 22:57  0%     ` Petr Baudis
2005-04-24 22:29  3%       ` Christian Meder
2005-04-22 14:23     ` Jan Harkes
2005-04-22 20:57       ` Christian Meder
2005-04-23  6:39  3%     ` Jon Seymour
     [not found]     <200504210422.j3L4Mo8L021495@hera.kernel.org>
     [not found]     ` <42674724.90005@ppp0.net>
     [not found]       ` <20050422002922.GB6829@kroah.com>
     [not found]         ` <426A4669.7080500@ppp0.net>
     [not found]           ` <1114266083.3419.40.camel@localhost.localdomain>
     [not found]             ` <426A5BFC.1020507@ppp0.net>
     [not found]               ` <1114266907.3419.43.camel@localhost.localdomain>
2005-04-23 17:31                 ` Git-commits mailing list feed Linus Torvalds
2005-04-23 18:34                   ` Jan Harkes
2005-04-23 19:30  3%                 ` Linus Torvalds
2005-04-23 20:49  4%                   ` Jan Harkes
2005-04-23 21:28  0%                     ` Git transfer protocols (was: Re: Git-commits mailing list feed) Mike Taht
2005-04-23 22:22  0%                       ` Jan Harkes
2005-04-23 23:29  3%                     ` Git-commits mailing list feed Linus Torvalds
2005-04-24  0:03  3% [PATCH 0/5] Better merge-base, alternative transport programs Daniel Barkalow
2005-04-24  0:24  8% ` [PATCH 5/5] Various " Daniel Barkalow
2005-04-26  1:22     Revised PPC assembly implementation Paul Mackerras
2005-04-27  1:47  1% ` linux
2005-04-26  3:24  3% [ANNOUNCE] Cogito-0.8 (former git-pasky, big changes!) Petr Baudis
2005-04-26  7:30 26% [PATCH COGITO] Do not make cross device hard links Alexey Nezhdanov
2005-04-29 12:57 11% [PATCH] cogito: honour SHA1_FILE_DIRECTORY env var Rene Scharfe
2005-04-29 18:31  3% [PATCH] GIT: Honour SHA1_FILE_DIRECTORY env var in git-pull-script Rene Scharfe
2005-04-29 20:42     git network protocol David Lang
2005-04-29 21:15  3% ` Daniel Barkalow
2005-04-29 21:07     More problems Junio C Hamano
2005-04-29 21:27  3% ` Daniel Barkalow
2005-04-29 22:01     Junio C Hamano
2005-04-30  5:36 20% ` [PATCH] Split out "pull" from particular methods Daniel Barkalow
2005-04-30  0:53     questions about cg-update, cg-pull, and cg-clone Zack Brown
2005-05-02 19:58  3% ` Petr Baudis
2005-05-03 15:22  3%   ` Zack Brown
2005-05-03 16:30  0%     ` Daniel Barkalow
2005-05-01 12:58  3% Quick command reference Paul Mackerras
2005-05-01 17:33  5% [PATCH] Make pull not assume anything about current objects Daniel Barkalow
2005-05-01 21:49  3% [0/2] Complete http-pull Daniel Barkalow
2005-05-01 21:56 20% ` [2/2] " Daniel Barkalow
2005-05-02  0:10 33% [PATCH] Do not call fetch() when we have it Junio C Hamano
2005-05-02  0:11  4% [PATCH] Add git-local-pull Junio C Hamano
2005-05-02  3:41  7% [PATCH] git-local-pull updates Junio C Hamano
2005-05-02  5:33  1% semi-useful git perl file Joshua T. Corbin
2005-05-03  0:13 10% [PATCH] Make git-*-pull say who wants it for missing objects Junio C Hamano
2005-05-03  0:26  9% [PATCH] Short-cut error return path in git-local-pull Junio C Hamano
2005-05-03  3:24     cogito "origin" vs. HEAD Benjamin Herrenschmidt
2005-05-03  6:49  3% ` Petr Baudis
2005-05-03  7:13  0%   ` Benjamin Herrenschmidt
2005-05-03  9:06  0%     ` Alexey Nezhdanov
2005-05-03  9:47  0%     ` Petr Baudis
2005-05-03 23:49  0%       ` Benjamin Herrenschmidt
2005-05-03  3:57     RFC: adding xdelta compression to git Alon Ziv
2005-05-03  4:52     ` Linus Torvalds
2005-05-03  8:06  1%   ` [PATCH] add the ability to create and retrieve delta objects Nicolas Pitre
2005-05-03 14:48         ` Linus Torvalds
2005-05-03 15:52  1%       ` Nicolas Pitre
2005-05-03 19:15     Careful object writing Linus Torvalds
2005-05-03 19:27     ` Chris Wedgwood
2005-05-03 19:47  3%   ` Linus Torvalds
2005-05-04 23:03     git and symlinks as tracked content Daniel Barkalow
2005-05-05  6:09     ` Alan Chandler
2005-05-05  9:51  3%   ` read-only git repositories David Lang
2005-05-04 23:19     [PATCH] Fix memory leaks in read_tree_recursive() Jonas Fonseca
2005-05-04 23:43     ` Junio C Hamano
2005-05-05  0:08  3%   ` Jonas Fonseca
2005-05-05  0:28     Kernel nightly snapshots Linus Torvalds
2005-05-05  2:06     ` H. Peter Anvin
2005-05-05 14:38       ` David Woodhouse
2005-05-05 14:44  3%     ` Linus Torvalds
2005-05-05 15:10  0%       ` David Woodhouse
2005-05-06 23:35     [PATCH] Introduce SHA1_FILE_DIRECTORIES Junio C Hamano
2005-05-07  0:20     ` Sean
2005-05-07  0:24       ` Junio C Hamano
2005-05-07  0:32         ` Sean
2005-05-07  6:31           ` Junio C Hamano
2005-05-07 19:51             ` Junio C Hamano
2005-05-09 13:33               ` H. Peter Anvin
2005-05-09 16:38                 ` Junio C Hamano
2005-05-09 16:41                   ` Sean
2005-05-09 18:03                     ` H. Peter Anvin
2005-05-09 18:50                       ` Junio C Hamano
2005-05-09 20:05  1%                     ` [RFC] Renaming environment variables Junio C Hamano
2005-05-08 19:35  3% Make errors John Kacur
2005-05-12  3:51     [PATCH] improved delta support for git Nicolas Pitre
2005-05-12  4:36  8% ` Junio C Hamano
2005-05-12 14:27  3%   ` Chris Mason
     [not found]         ` <2cfc403205051207467755cdf@mail.gmail.com>
2005-05-12 14:47  0%       ` Jon Seymour
2005-05-12 15:18  0%         ` Nicolas Pitre
2005-05-12 17:16  0%           ` Junio C Hamano
2005-05-13 11:44  0%             ` Chris Mason
2005-05-12  9:44     Mercurial 0.4e vs git network pull Matt Mackall
2005-05-12 18:23     ` Petr Baudis
2005-05-12 20:11       ` Matt Mackall
2005-05-12 20:14  3%     ` Petr Baudis
2005-05-12 20:57  0%       ` Matt Mackall
2005-05-12 16:51     [RFC] Support projects including other projects Daniel Barkalow
2005-05-12 17:24  2% ` David Lang
2005-05-13  6:49  4% [PATCH 0/4] Pulling refs files Daniel Barkalow
2005-05-13  6:56  6% ` [PATCH 2/4] Generic support for pulling refs Daniel Barkalow
2005-05-13  6:57  4% ` [PATCH 3/4] Pull refs by HTTP Daniel Barkalow
2005-05-13  7:01  3% ` [PATCH 4/4] Pulling refs by ssh Daniel Barkalow
2005-05-13 22:19  0% ` [PATCH 0/4] Pulling refs files Petr Baudis
2005-05-13 23:14  3%   ` Daniel Barkalow
2005-05-13 23:37  0%     ` Petr Baudis
2005-05-15  3:23  3%       ` Daniel Barkalow
2005-05-17 20:14  0%         ` Petr Baudis
2005-05-14  9:56     [PATCH] Add --author match to git-rev-list and git-rev-tree (Documentation) Junio C Hamano
2005-05-15  4:57  2% ` [PATCH] Add --author and --committer match to rev-list and rev-tree Junio C Hamano
2005-05-15 11:22     Mercurial 0.4e vs git network pull Adam J. Richter
2005-05-15 12:40  3% ` Petr Baudis
2005-05-16 22:22  0%   ` Tristan Wibberley
2005-05-15 11:52  0% Adam J. Richter
2005-05-15 14:23  0% ` Petr Baudis
2005-05-15 21:18  2% [PATCH 1/4] Add --author and --committer match to git-rev-list and git-rev-tree Junio C Hamano
2005-05-17 22:20     [PATCH 0/4] Pulling refs files Daniel Barkalow
2005-05-19  3:19  4% ` Daniel Barkalow
2005-05-19  6:52  0%   ` Petr Baudis
2005-05-19 16:00  0%     ` Daniel Barkalow
2005-05-19  5:11  4% [PATCH] Make rsh protocol extensible Daniel Barkalow
2005-05-21 21:47     cogito - how do I ??? Sam Ravnborg
2005-05-21 22:06     ` Sean
2005-05-21 23:41       ` Linus Torvalds
2005-05-22  7:14  5%     ` Sam Ravnborg
2005-05-22 16:23  5%       ` Linus Torvalds
2005-05-21 23:12     updated design for the diff-raw format Junio C Hamano
2005-05-22  2:40     ` [PATCH] Prepare diffcore interface for diff-tree header supression Junio C Hamano
2005-05-22  2:42       ` [PATCH] The diff-raw format updates Junio C Hamano
2005-05-22 18:35  2%     ` Linus Torvalds
2005-05-30 20:00     I want to release a "git-1.0" Linus Torvalds
2005-05-30 20:49     ` Nicolas Pitre
2005-06-01  6:52       ` Junio C Hamano
2005-06-01  8:24 15%     ` [PATCH] Add -d flag to git-pull-* family Junio C Hamano
2005-05-31 13:45     ` I want to release a "git-1.0" Eric W. Biederman
2005-06-01  3:04       ` Linus Torvalds
2005-06-01 22:00  3%     ` Daniel Barkalow
2005-06-03  9:47  0%       ` Petr Baudis
2005-06-03 15:09  0%         ` Daniel Barkalow
2005-06-02 23:40         ` Adam Kropelin
2005-06-03  0:06           ` Linus Torvalds
2005-06-03  0:47             ` Linus Torvalds
2005-06-03  1:34  3%           ` Adam Kropelin
2005-05-31 19:00     [SCRIPT] cg-rpush & locking Tony Lindgren
2005-06-02 19:15  1% ` Dan Holmsand
2005-05-31 21:50 10% [PATCH] Add git-format-patch-script Junio C Hamano
2005-05-31 22:00 16% [COGITO PATCH] Heads and tags in subdirectories Santi Béjar
2005-06-01 12:59     ` Santi Béjar
2005-06-01 16:17 16%   ` Santi Béjar
2005-06-01  1:58 20% [PATCH] cg-clone fails to clone tags Miguel Bazdresch
2005-06-01  5:59 25% [PATCH] One Git To Rule Them All - Prep 2 Jason McMullan
2005-06-01  6:00  3% [PATCH] One Git To Rule Them All - Final Jason McMullan
2005-06-01 18:22 16% [PATCH] One-Git Part 2 (Patch 1/3) Jason McMullan
2005-06-01 18:23  7% [PATCH] One-Git Part 2 (Patch 3/3) Jason McMullan
     [not found]     <7vy89ums2l.fsf@assigned-by-dhcp.cox.net>
2005-06-01 18:38     ` [PATCH] diff: mode bits fixes Junio C Hamano
2005-06-02 16:46 10%   ` [PATCH] Handle deltified object correctly in git-*-pull family Junio C Hamano
2005-06-02 17:03         ` Linus Torvalds
2005-06-02 18:55 10%       ` Junio C Hamano
2005-06-02 21:31             ` Nicolas Pitre
2005-06-02 21:36               ` Nicolas Pitre
2005-06-02 22:19 10%             ` [PATCH 1/2] " Junio C Hamano
2005-06-02 17:03         ` [PATCH] " McMullan, Jason
2005-06-02 18:02  3%       ` Junio C Hamano
2005-06-02 22:23  3% [ANNOUNCE] cogito-0.11 Petr Baudis
2005-06-03 21:43  4% [PATCH] ssh-protocol version, command types, response code Daniel Barkalow
2005-06-05  6:11 15% [PATCH] pull: gracefully recover from delta retrieval failure Junio C Hamano
2005-06-05 17:46     Junio C Hamano
2005-06-05 20:02  2% ` Daniel Barkalow
2005-06-06 20:27     [PATCH 0/4] Writing refs in git-ssh-push Daniel Barkalow
2005-06-06 20:38  6% ` [PATCH 3/4] Generic support for pulling refs Daniel Barkalow
2005-06-07 14:17 12% [PATCH] Add missing Documentation/* Jason McMullan
2005-06-07 14:19  4% [PATCH] Add git-help-script Jason McMullan
2005-06-07 14:25  3% ` McMullan, Jason
2005-06-08 23:07  1% [ANNOUNCE] Cogito-0.11.2 Petr Baudis
2005-06-10 18:53  5% do people use the 'git' command? Sebastian Kuzminsky
2005-06-11  3:15     ` Junio C Hamano
2005-06-11  5:26  4%   ` Sebastian Kuzminsky
2005-06-11  1:32  4% [PATCH] Add script for patch submission via e-mail Junio C Hamano
2005-06-11 11:47     reducing line crossings in gitk Paul Mackerras
2005-06-11 18:26  2% ` Junio C Hamano
2005-06-14  1:47  5% [PATCH] Adding Correct Useage Notification and -h Help flag James Purser
2005-06-14  2:23     ` Junio C Hamano
2005-06-14 21:49  4%   ` James Purser
2005-06-18 10:38  3% qgit-0.6 Marco Costalba
2005-06-19 13:00  0% ` qgit-0.6 Ingo Molnar
2005-06-20 16:19     Patch (apply) vs. Pull Darrin Thompson
2005-06-20 17:22     ` Junio C Hamano
2005-06-21 22:09  3%   ` Linus Torvalds
2005-06-23 23:21         ` [PATCH 0/3] Rebasing for "individual developer" usage Junio C Hamano
2005-06-23 23:28  4%       ` [PATCH 2/3] git-cherry: find commits not merged upstream Junio C Hamano
2005-06-23 23:29  4%       ` [PATCH 3/3] git-rebase-script: rebase local commits to new upstream head Junio C Hamano
     [not found]     <20050617181156.GT6957@suse.de>
     [not found]     ` <Pine.LNX.4.58.0506171132390.2268@ppc970.osdl.org>
     [not found]       ` <20050617183914.GX6957@suse.de>
2005-06-17 18:50         ` git merging Linus Torvalds
2005-06-17 23:08           ` Jeff Garzik
2005-06-17 23:31             ` Linus Torvalds
2005-06-17 23:51               ` Jeff Garzik
2005-06-18  0:13                 ` Linus Torvalds
2005-06-20 12:30                   ` Jens Axboe
2005-06-20 15:58                     ` Linus Torvalds
2005-06-20 20:38                       ` Jens Axboe
2005-06-20 21:15  4%                     ` Linus Torvalds
2005-06-21 14:59  0%                       ` Jens Axboe
2005-06-21  5:10     ORIG_HEAD David S. Miller
2005-06-21 21:06  4% ` ORIG_HEAD Linus Torvalds
2005-06-22  0:33     [PATCH 0/2] Pull objects of various types Daniel Barkalow
2005-06-22  0:35  3% ` [PATCH 2/2] Pull misc objects Daniel Barkalow
2005-06-22  0:45     [PATCH] Pull refs by HTTP Daniel Barkalow
2005-06-22  8:52 10% ` [PATCH] local-pull: implement fetch_ref() Junio C Hamano
2005-06-22 21:46  2% The coolest merge EVER! Linus Torvalds
2005-06-24  0:44     ` Junio C Hamano
2005-06-24 11:54  3%   ` Matthias Urlichs
2005-06-24 17:49  0%     ` Daniel Barkalow
2005-06-24 19:22  0%     ` Linus Torvalds
2005-06-22 22:24     Updated git HOWTO for kernel hackers Jeff Garzik
2005-06-22 23:09     ` Greg KH
2005-06-22 23:25       ` Linus Torvalds
2005-06-23  0:05         ` Jeff Garzik
2005-06-23  0:29           ` Linus Torvalds
2005-06-23  1:47             ` Jeff Garzik
2005-06-23  1:56               ` Linus Torvalds
2005-06-23  2:16                 ` Jeff Garzik
2005-06-23  2:39 10%               ` Linus Torvalds
2005-06-23  3:06  0%                 ` Jeff Garzik
2005-06-23  3:24  4%                   ` Linus Torvalds
2005-06-23  5:16  3%                     ` Jeff Garzik
2005-06-23  5:58  5%                       ` Linus Torvalds
2005-06-23  6:20  4%                         ` Greg KH
2005-06-23  6:51 10%                           ` Linus Torvalds
2005-06-23  7:11  0%                             ` Greg KH
2005-06-23  7:03  0%                         ` Jeff Garzik
2005-06-23  7:38  3%                         ` Petr Baudis
2005-06-23  8:30  0%                         ` Vojtech Pavlik
2005-06-23 14:31  0%                       ` Horst von Brand
2005-06-23 23:56     ` Mercurial vs " Matt Mackall
2005-06-24  6:41  2%   ` Petr Baudis
2005-06-23  4:23     Patch (apply) vs. Pull Linus Torvalds
2005-06-23  5:15  2% ` Daniel Barkalow
2005-06-23 10:12  3% [RFC] Order of push/pull file transfers Russell King
2005-06-24 16:38  0% ` Daniel Barkalow
2005-06-23 23:18     [PATCH 0/2] Finishing touches for pull "-w ref" Junio C Hamano
2005-06-23 23:23 25% ` [PATCH 1/2] Add a bit of developer documentation to pull.h Junio C Hamano
2005-06-23 23:20     [PATCH 0/2] D/F conflicts fixes Junio C Hamano
2005-06-24 23:40     ` [PATCH 2/2] Fix oversimplified optimization for add_cache_entry() Junio C Hamano
2005-06-25  0:57       ` Linus Torvalds
2005-06-25  2:40         ` Junio C Hamano
2005-06-25  9:16           ` [PATCH 0/9] " Junio C Hamano
2005-06-25  9:22  4%         ` [PATCH 3/9] git-cherry: find commits not merged upstream Junio C Hamano
2005-06-25  9:23  4%         ` [PATCH 4/9] git-rebase-script: rebase local commits to new upstream head Junio C Hamano
2005-06-25  9:26 25%         ` [PATCH 9/9] Add a bit of developer documentation to pull.h Junio C Hamano
2005-06-25  4:20     kernel.org and GIT tree rebuilding David S. Miller
2005-06-25  5:04  4% ` Junio C Hamano
2005-06-26 18:15  3% [PATCH] Add git-relink-script to fix up missing hardlinks Ryan Anderson
2005-06-27 12:27  3% git-pull-branch script Jeff Garzik
2005-06-28 19:23  2% [PATCH] cvsimport: rewritten in Perl Matthias Urlichs
2005-06-30 15:02     ` Sven Verdoolaege
2005-06-30 15:21       ` Matthias Urlichs
2005-06-30 15:44         ` Sven Verdoolaege
2005-06-30 16:10           ` Matthias Urlichs
2005-06-30 16:14             ` Sven Verdoolaege
2005-06-30 16:30               ` Matthias Urlichs
2005-06-30 17:22                 ` Nicolas Pitre
2005-07-01  9:43                   ` Matthias Urlichs
2005-07-03 10:35                     ` Sven Verdoolaege
2005-07-03 10:36  4%                   ` [PATCH] Make specification of CVS module to convert optional Sven Verdoolaege
2005-06-30 19:38             ` [PATCH] cvsimport: rewritten in Perl Sven Verdoolaege
2005-06-30 21:00               ` Matthias Urlichs
2005-07-04 13:03                 ` Sven Verdoolaege
2005-07-04 13:53                   ` Matthias Urlichs
2005-07-04 13:46                     ` Sven Verdoolaege
2005-07-04 14:36                       ` Matthias Urlichs
2005-07-04 15:52  5%                     ` Sven Verdoolaege
2005-06-30  5:58     [PATCH 1/1] Add a topological sort procedure to commit.c Jon Seymour
2005-06-30  6:52     ` Junio C Hamano
2005-06-30  7:00       ` Jon Seymour
2005-06-30  7:13         ` Junio C Hamano
2005-06-30  7:36  3%       ` [PATCH] git-format-patch: Prepare patches for e-mail submission Junio C Hamano
2005-06-30 17:54     "git-send-pack" Linus Torvalds
2005-06-30 19:49  4% ` "git-send-pack" Daniel Barkalow
2005-06-30 20:12  3%   ` "git-send-pack" Linus Torvalds
2005-06-30 20:23         ` "git-send-pack" H. Peter Anvin
2005-06-30 20:52  3%       ` "git-send-pack" Linus Torvalds
2005-06-30 21:23             ` "git-send-pack" H. Peter Anvin
2005-06-30 21:42               ` "git-send-pack" Linus Torvalds
2005-06-30 22:00                 ` "git-send-pack" H. Peter Anvin
2005-07-01 13:56                   ` Tags Eric W. Biederman
2005-07-01 16:37                     ` Tags H. Peter Anvin
2005-07-01 22:38                       ` Tags Eric W. Biederman
2005-07-01 22:44                         ` Tags H. Peter Anvin
2005-07-01 23:07                           ` Tags Eric W. Biederman
2005-07-02  0:06                             ` Tags H. Peter Anvin
2005-07-02  7:00                               ` Tags Eric W. Biederman
2005-07-02 17:47                                 ` Tags H. Peter Anvin
2005-07-02 17:54                                   ` Tags Eric W. Biederman
2005-07-02 17:58                                     ` Tags H. Peter Anvin
2005-07-02 18:31                                       ` Tags Eric W. Biederman
2005-07-02 21:16                                         ` Tags H. Peter Anvin
2005-07-02 21:39                                           ` Tags Linus Torvalds
2005-07-02 21:42                                             ` Tags H. Peter Anvin
2005-07-02 22:02                                               ` Tags A Large Angry SCM
2005-07-02 23:49                                                 ` Tags A Large Angry SCM
2005-07-03  0:17  2%                                               ` Tags Linus Torvalds
2005-07-02 20:38  3%                           ` Tags Jan Harkes
2005-07-02 22:32  4%                             ` Tags Jan Harkes
2005-06-30 20:49  0%     ` "git-send-pack" Daniel Barkalow
2005-07-04  3:05  6% [PATCH] Use MAP_FAILED instead of double cast Pavel Roskin
2005-07-04 18:48  3% [PATCH 0/2] Support for transferring pack files in git-ssh-* Daniel Barkalow
2005-07-04 18:50 10% ` [PATCH 1/2] Specify object not useful to pull Daniel Barkalow

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git
