git@vger.kernel.org mailing list mirror (one of many)
* git network protocol
  2005-04-29 20:21   ` Noel Maddy
@ 2005-04-29 20:42     ` David Lang
  2005-04-29 21:15       ` Daniel Barkalow
  0 siblings, 1 reply; 7+ messages in thread
From: David Lang @ 2005-04-29 20:42 UTC
  To: git

would it make sense for the network git protocol to be something along the 
lines of

client contacts server and sends
the tag you want to sync with (defaults to head)
the local index file

then the server can use the git tools locally to figure out what objects 
need to be sent to do the merge and only send those objects.

no, this isn't as efficient as only sending diffs, but it avoids sending 
any objects that aren't needed (which would be sent if you just did a 
straight rsync)

David Lang

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare


* Re: git network protocol
  2005-04-29 20:42     ` git network protocol David Lang
@ 2005-04-29 21:15       ` Daniel Barkalow
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Barkalow @ 2005-04-29 21:15 UTC
  To: David Lang; +Cc: git

On Fri, 29 Apr 2005, David Lang wrote:

> would it make sense for the network git protocol to be something along the 
> lines of
> 
> client contacts server and sends
> the tag you want to sync with (defaults to head)
> the local index file

Actually, you really want to have a bidirectional interaction, where the
client first fetches the info to determine where to start, and then goes
through the reachable space, asking for anything it doesn't already have.

(In the long run, we want to keep track of some things we already have all
of, or know we're missing, etc., so the receiver side doesn't have to
look over its whole tree.)
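
A minimal sketch of the client-driven walk described above, assuming
objects are modelled as plain dicts that map an object id to the ids it
references. The names `fetch_missing`, `local`, `remote` and
`known_complete` are hypothetical and not part of git; `known_complete`
stands for the "things we already have all of" bookkeeping, which lets
the walk skip whole subgraphs.

```python
from collections import deque

def fetch_missing(start, local, remote, known_complete=frozenset()):
    """Walk everything reachable from `start`; fetch what `local` lacks.

    `local` and `remote` map an object id to the list of ids it
    references. Returns the ids that had to be fetched, in walk order.
    """
    fetched = []
    seen = set()
    queue = deque([start])
    while queue:
        oid = queue.popleft()
        if oid in seen or oid in known_complete:
            continue  # already visited, or known to be fully present
        seen.add(oid)
        if oid not in local:
            local[oid] = remote[oid]  # stands in for one request to the server
            fetched.append(oid)
        queue.extend(local[oid])  # keep walking, even through local objects
    return fetched

# Hypothetical three-commit history: c3 -> c2 -> c1.
remote = {"c3": ["c2"], "c2": ["c1"], "c1": []}
local = {"c1": []}
print(fetch_missing("c3", local, remote))
```

Without `known_complete` the walk still visits every reachable object
the receiver already has, which is the cost the paragraph above wants
to avoid in the long run.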

git already includes two versions of this protocol; the first runs against
a static HTTP server, and the second uses ssh to get a socket. At some
point, I'm going to enable these programs to read and write
.git/refs/?/? to figure out what they're supposed to get.

	-Daniel
*This .sig left intentionally blank*



* Git network protocol
@ 2006-08-14  6:21 Josef "Jeff" Sipek
  2006-08-14  6:42 ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Josef "Jeff" Sipek @ 2006-08-14  6:21 UTC
  To: git

Hello,

I'm trying to implement the git protocol, and I am having a bit of an issue
with the lack of information available about it (please correct me if I
missed some source of information.)

I understand the basic format of the protocol, however I'm not sure what
"command" can follow what. I also noticed some odd inconsistencies (or maybe
I just don't see the pattern yet.) For example, a git-clone generates this
traffic:

C: git-upload-pack ....
S: SHA1 HEAD....
S: SHA1 refs/heads/master
S: flush
C: want SHA1...
C: want SHA1... (it wants the same SHA1 twice!)
C: flush
C: done
S: NAK
S: the pack...

Then, when it is time to git-fetch a new commit, I get:

C: git-upload-pack ....
S: SHA1 HEAD....
S: SHA1 refs/heads/master
S: flush
C: want SHA1...
C: flush
C: have SHA1
C: done
S: ACK SHA1 continue
S: ACK SHA1 (same hash)
S: the pack...

Then, if I try to git-fetch but there is nothing new, I get:

C: git-upload-pack ....
S: SHA1 HEAD....
S: SHA1 refs/heads/master
S: flush
C: flush
<client closes connection>

So, I can _assume_ that "done" tells the server that it is time to make a
pack. Why does the server use NAK during the clone operation, but ACK
during fetch? Why does the server ACK the same SHA1 twice? And why does the
client "want" the same SHA1 twice? It just seems odd.
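
For reference, each "C:"/"S:" entry in the traces above travels as one
pkt-line: four hex digits giving the total length (the four digits
included) followed by the payload, with "flush" being the special
packet "0000". A minimal sketch of that framing follows; the helper
names `pkt_line` and `read_pkt_lines` are made up for illustration.

```python
FLUSH = b"0000"  # flush-pkt: a length of zero and no payload

def pkt_line(payload: bytes) -> bytes:
    """Frame one payload; the length prefix counts its own 4 bytes."""
    return b"%04x" % (len(payload) + 4) + payload

def read_pkt_lines(data: bytes):
    """Yield each payload in `data`; a flush-pkt is yielded as None."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            yield None
            pos += 4
        else:
            yield data[pos + 4:pos + length]
            pos += length

stream = pkt_line(b"have abc\n") + FLUSH + pkt_line(b"done\n")
print(list(read_pkt_lines(stream)))
```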

I think it would be great if there was some kind of description somewhere
that detailed the protocol. Also, the daemon source isn't the prettiest
thing in the world.

Thanks,
Josef "Jeff" Sipek.

-- 
Reality is merely an illusion, albeit a very persistent one.
		- Albert Einstein


* Re: Git network protocol
  2006-08-14  6:21 Git network protocol Josef "Jeff" Sipek
@ 2006-08-14  6:42 ` Junio C Hamano
  2006-08-14 23:48   ` Josef "Jeff" Sipek
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2006-08-14  6:42 UTC
  To: Josef "Jeff" Sipek; +Cc: git

jeffpc@josefsipek.net (Josef "Jeff" Sipek) writes:

> I'm trying to implement the git protocol, and I am having a bit of an issue
> with the lack of information available about it (please correct me if I
> missed some source of information.)

Documentation/technical/pack-protocol.txt, and "git show 1bd8c8f0".

> So, I can _assume_ that "done" tells the server that it is time to make a
> pack. Why does the server use NAK during the clone operation, but ACK
> during fetch? Why does the server ACK the same SHA1 twice? And why does the
> client "want" the same SHA1 twice? It just seems odd.

During the initial "SHA-1 name"/"want" exchange, the server and
the client negotiate the protocol extension (the document needs
to be updated at least for "multi-ack" extension).

After that, the client and the server try to determine what
commits the client has that are recent ancestors of "want"
commits.  This exchange is done by the client sending a bunch of
"have" lines to the server.  The server responds "ACK" (in the
original protocol) to say "I've seen enough and know a common
ancestor to use".  In the multi-ack protocol (which is used in
modern git, v0.99.9 and later), the server can respond "ACK
continue" to say "I've seen enough on that branch so do not
bother sending 'have's for its ancestors, but do keep sending
from other branches."  If the server does not feel it saw enough,
it sends neither.

The client can send "flush" -- this asks the server to give NAK
if more "have"s need to be sent.  If there are no more "have"s to
be sent, the client sends "done".

The protocol streams; if you see an ACK from the server it does not
usually mean it is ACKing the last 'have' the client has sent.
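
A rough sketch of the server side of this exchange in the original
(non-multi-ack) protocol, under simplifying assumptions: messages
arrive as already-decoded strings, the "want" lines and the flush that
ends them have been consumed, and the server answers in lockstep
rather than streaming. The function name `negotiate` is hypothetical
and this is not how upload-pack is actually written.

```python
def negotiate(client_lines, server_has):
    """Return (replies, common) for a sequence of have/flush/done lines.

    `server_has` is the set of commit ids the server knows about.
    `common` is the first commit both sides have, or None.
    """
    replies = []
    common = None
    for line in client_lines:
        if line.startswith("have "):
            oid = line.split()[1]
            if common is None and oid in server_has:
                common = oid
                replies.append("ACK " + oid)  # "seen enough"
        elif line == "flush":
            if common is None:
                replies.append("NAK")  # keep the "have"s coming
        elif line == "done":
            if common is None:
                replies.append("NAK")  # nothing in common: send everything
            break  # time to build and send the pack
    return replies, common

# A clone has no "have"s at all, so the only reply is NAK.
print(negotiate(["done"], {"a", "b"}))
# A fetch where the second batch of "have"s hits a shared commit.
print(negotiate(["have x", "flush", "have a", "done"], {"a", "b"}))
```

This matches the traces earlier in the thread in one respect: a clone
sends no "have" lines, so the server can only answer NAK before
sending the pack.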


* Re: Git network protocol
  2006-08-14  6:42 ` Junio C Hamano
@ 2006-08-14 23:48   ` Josef "Jeff" Sipek
  2006-08-15  0:59     ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Josef "Jeff" Sipek @ 2006-08-14 23:48 UTC
  To: Junio C Hamano; +Cc: git

On Sun, Aug 13, 2006 at 11:42:33PM -0700, Junio C Hamano wrote:
> jeffpc@josefsipek.net (Josef "Jeff" Sipek) writes:
> 
> > I'm trying to implement the git protocol, and I am having a bit of an issue
> > with the lack of information available about it (please correct me if I
> > missed some source of information.)
> 
> Documentation/technical/pack-protocol.txt

This is pretty much a vague example - or at least it feels vague if you
don't know the protocol. After reading the source, and your description, the
example makes a lot more sense.

> "git show 1bd8c8f0".

Neat.

> After that, the client and the server try to determine what
> commits the client has that are recent ancestors of "want"
> commits.  This exchange is done by the client sending a bunch of
> "have" lines to the server.  The server responds "ACK" (in the
> original protocol) to say "I've seen enough and know a common
> ancestor to use".  In the multi-ack protocol (which is used in
> modern git, v0.99.9 and later), the server can respond "ACK
> continue" to say "I've seen enough on that branch so do not
> bother sending 'have's for its ancestors, but do keep sending
> from other branches."  If the server does not feel it saw enough,
> it sends neither.

So, if I understand this correctly, multi_ack allows for multiple branches
to be fetched using the same connection?

> The protocol streams; if you see an ACK from the server it does not
> usually mean it is ACKing the last 'have' the client has sent.

Yeah, makes sense. Doing things over lo does things like that - I'm not sure
why I didn't realize it earlier.

Thanks,
Josef "Jeff" Sipek.

-- 
Don't drink and derive. Alcohol and algebra don't mix.


* Re: Git network protocol
  2006-08-14 23:48   ` Josef "Jeff" Sipek
@ 2006-08-15  0:59     ` Junio C Hamano
  2006-08-15  9:14       ` Josef "Jeff" Sipek
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2006-08-15  0:59 UTC
  To: Josef "Jeff" Sipek; +Cc: git

jeffpc@josefsipek.net (Josef "Jeff" Sipek) writes:

> So, if I understand this correctly, multi_ack allows for multiple branches
> to be fetched using the same connection?

The original protocol without extension already allowed it.
Suppose the global history was like this:

       o-----------------------o---o---o---o---o---o---o---c---c---c
      /
 o---o---o---o---o---o---o---o---s---s---s

where the server side had 's', the client side had 'c', and both
of them had 'o'.  The objective is to update client with three 's'
commits.

The exchange would go like this without multi-ack:

	S: SHA-1 name1 -- for the rightmost 'o' on the top branch
	S: SHA-2 name2 -- for the rightmost 's' on the bottom branch
	C: want SHA-2 -- ask for the second branch tip
	C: have SHA1 -- the rightmost 'c' on the top branch
	C: have SHA1 -- the parent of the above
	C: have SHA1 -- the parent of the above
	C: have SHA1 -- the parent of the above, rightmost 'o' on the top.
	C: ... more have from the top branch.
	S: ACK -- for the rightmost 'o' on the top branch

During this exchange, the server learns that the client and the
server share the rightmost 'o' on the top branch, but does not
learn about all the 'o' commits on the bottom branch, so it ends
up sending everything from the fork point to complete three 's'
commits.

The multi-ack extension was invented by Johannes to improve this
exchange.  It changes the protocol to let the server ACK with
"ACK continue".  After that, the client is expected to stop
traversing the parents of the ACK-continue'd commit -- so it has a
chance to send the rightmost 'o' on the bottom branch.  When the
server sees it, it again gives an ACK, and the client soon runs
out of 'have's to send and says "done".  In this case, the server
would have the rightmost 'o' commits on both branches to work
out the minimum set of objects to complete three 's' commits
that are missing from the client.
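
The difference can be checked with a small simulation of the history
drawn above. This is only a sketch under stated assumptions: commit
names (b1..b8 and s1..s3 on the bottom branch, t1..t8 and c1..c3 on
the top branch, forking at b2) are made up, the client walks one tip
at a time in the order given instead of in git's real order, and the
exchange is lockstep rather than streamed. `chain`, `ancestors` and
`negotiate` are hypothetical helpers.

```python
def chain(names, base=None):
    """Build a linear history: each name's only parent is the previous one."""
    parents, prev = {}, base
    for name in names:
        parents[name] = [prev] if prev else []
        prev = name
    return parents

def ancestors(parents, tips):
    """Every commit reachable from `tips`, the tips included."""
    seen, stack = set(), list(tips)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def negotiate(parents, client_tips, server_has, multi_ack):
    """Return the commits the server learns it shares with the client."""
    common, skip, sent = set(), set(), set()
    queue = list(client_tips)
    while queue:
        c = queue.pop(0)
        if c in sent or c in skip:
            continue
        sent.add(c)                      # C: have c
        if c in server_has:
            common.add(c)                # S: ACK c [continue]
            if not multi_ack:
                break                    # original protocol: first ACK ends it
            skip |= ancestors(parents, parents[c])  # stop walking below c
        else:
            queue[0:0] = parents[c]      # keep walking this branch first
    return common

parents = {}
parents.update(chain([f"b{i}" for i in range(1, 9)] + ["s1", "s2", "s3"]))
parents.update(chain([f"t{i}" for i in range(1, 9)] + ["c1", "c2", "c3"], base="b2"))
server_has = ancestors(parents, ["s3", "t8"])

for multi_ack in (False, True):
    common = negotiate(parents, ["c3", "b8"], server_has, multi_ack)
    to_send = ancestors(parents, ["s3"]) - ancestors(parents, common)
    print(multi_ack, sorted(common), sorted(to_send))
```

Without multi-ack the server only learns about t8 and sends b3..b8
along with the three 's' commits; with it, the server also learns
about b8 and sends just s1, s2 and s3.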


* Re: Git network protocol
  2006-08-15  0:59     ` Junio C Hamano
@ 2006-08-15  9:14       ` Josef "Jeff" Sipek
  0 siblings, 0 replies; 7+ messages in thread
From: Josef "Jeff" Sipek @ 2006-08-15  9:14 UTC
  To: Junio C Hamano; +Cc: git

On Mon, Aug 14, 2006 at 05:59:37PM -0700, Junio C Hamano wrote:
> jeffpc@josefsipek.net (Josef "Jeff" Sipek) writes:
> 
> > So, if I understand this correctly, multi_ack allows for multiple branches
> > to be fetched using the same connection?
> 
> The original protocol without extension already allowed it.
> Suppose the global history was like this:
...

Thanks. That helped a lot.

Josef "Jeff" Sipek.

-- 
Linux, n.:
  Generous programmers from around the world all join forces to help you
  shoot yourself in the foot for free. 
