git@vger.kernel.org mailing list mirror (one of many)
* [RFC/PATCH v4 00/49] Add initial experimental external ODB support
@ 2017-06-20  7:54 Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 01/49] builtin/clone: get rid of 'value' strbuf Christian Couder
                   ` (51 more replies)
  0 siblings, 52 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Goal
~~~~

Git can store its objects only in the form of loose objects in
separate files or packed objects in a pack file.

To be able to better handle some kind of objects, for example big
blobs, it would be nice if Git could store its objects in other object
databases (ODB).

To do that, this patch series makes it possible to register commands,
also called "helpers", using "odb.<odbname>.command" config variables,
to access external ODBs where objects can be stored and retrieved.

External ODBs should be able to transfer information about the blobs
they store. This patch series shows how this is possible using a kind
of replace ref.

Design
~~~~~~

* The "helpers" (registered commands)

Each helper manages access to one external ODB.

There are now 2 different modes for a helper:

  - When "odb.<odbname>.scriptMode" is set to "true", the helper is
    launched each time Git wants to communicate with the <odbname>
    external ODB.

  - When "odb.<odbname>.scriptMode" is not set or set to "false", then
    the helper is launched once as a sub-process (using
    sub-process.h), and Git communicates with it using packet lines.
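
As a rough sketch of the two modes above, a helper could be registered
with plain "git config" calls. The ODB name "magic", the helper path
and the configure_odb wrapper are placeholders for illustration, not
names used by this series; only the "odb.<odbname>.command" and
"odb.<odbname>.scriptMode" variables come from the text above.

```sh
# Hypothetical wrapper: register <helper> for the external ODB <odbname>.
# Pass "true" as the third argument for script mode, "false" (or leave
# scriptMode unset) for the long-running sub-process mode.
configure_odb() {
	odbname=$1 helper=$2 script_mode=$3
	git config "odb.$odbname.command" "$helper" &&
	git config "odb.$odbname.scriptMode" "$script_mode"
}

# Example (run inside a repository):
#   configure_odb magic /usr/local/bin/odb-helper true
```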

A helper can be given different instructions by Git. The instructions
that are supported are negotiated at the beginning of the
communication using a capability mechanism.
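
For readers unfamiliar with packet lines, the framing can be sketched
in a few lines of shell. Each packet is 4 hex digits giving the total
length (payload plus the 4 length bytes) followed by the payload, and
"0000" is a flush packet. The function names below are made up for
this sketch, and the "git-filter" / version 2 pair in the example is
the one used by the existing filter protocol in t0021; the welcome
name an odb helper would use is not shown here.

```sh
# Assumes ASCII payloads: ${#1} counts characters, not bytes.
NL='
'

# Write one binary packet: length header, then the payload as is.
pkt_bin() {
	printf '%04x%s' $(( ${#1} + 4 )) "$1"
}

# Write one text packet: same framing, payload terminated by LF.
pkt_txt() {
	pkt_bin "$1$NL"
}

# Write a flush packet.
pkt_flush() {
	printf '0000'
}

# Server side of the welcome exchange: "<name>-server", "version=<n>",
# then a flush packet.
pkt_handshake_reply() {
	pkt_txt "$1-server"
	pkt_txt "version=$2"
	pkt_flush
}

# Example: pkt_handshake_reply git-filter 2
```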

For now the following instructions are supported: 

  - "have": the helper should respond with the sha1, size and type of
    all the objects the external ODB contains, one object per line.

  - "get <sha1>": the helper should then read from the external ODB
    the content of the object corresponding to <sha1> and pass it to Git.

  - "put <sha1> <size> <type>": the helper should then read from
    Git an object and store it in the external ODB.

Currently "have" and "put" are optional.
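
To make the three instructions more concrete, here is a minimal sketch
of a script-mode helper backed by a plain directory. It assumes that
the instruction and its parameters arrive as command-line arguments
and that object content flows over stdin/stdout; the ODB_DIR variable,
the ".meta" side files and the odb_helper function name are all
invented for this example and are not what the test helpers in the
series use.

```sh
# Directory acting as the external ODB (placeholder location).
ODB_DIR=${ODB_DIR:-/tmp/example-odb}

odb_helper() {
	mkdir -p "$ODB_DIR" || return 1
	case "$1" in
	have)
		# One "<sha1> <size> <type>" line per stored object.
		for meta in "$ODB_DIR"/*.meta
		do
			test -f "$meta" || continue
			sha1=$(basename "$meta" .meta)
			read -r size type <"$meta"
			printf '%s %s %s\n' "$sha1" "$size" "$type"
		done
		;;
	get)
		# get <sha1>: pass the stored content to Git on stdout.
		cat "$ODB_DIR/$2"
		;;
	put)
		# put <sha1> <size> <type>: read the object from stdin.
		cat >"$ODB_DIR/$2" &&
		printf '%s %s\n' "$3" "$4" >"$ODB_DIR/$2.meta"
		;;
	*)
		echo >&2 "unknown instruction: $1"
		return 1
		;;
	esac
}

# When installed as a standalone helper script, end the file with:
#   odb_helper "$@"
```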

There are 3 different kinds of "get" instructions depending on how the
helper passes objects to Git:

  - "fault_in": the helper will write the requested objects directly
    into the regular Git object database, and then Git will retry
    reading them from there.

  - "git_object": the helper will send the object as a Git object.

  - "plain_object": the helper will send the object (a blob) as a raw
    object. (The blob content will be sent as is.)

For now the kind of "get" that is supported is read from the
"odb.<odbname>.fetchKind" configuration variable, but in the future it
should be decided as part of the capability negotiation.

* Transferring information

To transfer information about the blobs stored in external ODB, some
special refs, called "odb refs", similar to replace refs, are used in
the tests of this series, but in general nothing forces the helper to
use that mechanism.

The external odb helper is responsible for using and creating the refs
in refs/odbs/<odbname>/, if it wants to do that. It is free for example
to just create one ref, as it is also free to create many refs. Git
would just transmit the refs that have been created by this helper, if
Git is asked to do so.

For now in the tests there is one odb ref per blob, as it is simple
and as it is similar to what git-lfs does. Each ref name is
refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
in the external odb named <odbname>.

These odb refs point to a blob that is stored in the Git
repository and contain information about the blob stored in the
external odb. This information can be specific to the external odb.
The repos can then share this information using commands like:

`git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`

At the end of the current patch series, "git clone" is taught an
"--initial-refspec" option that asks it to first fetch some specified
refs. This is used in the tests to fetch the odb refs first.

This way a single "git clone" command can set up a repo using the
external ODB mechanism as long as the right helper is installed on the
machine and as long as the following options are used:

  - "--initial-refspec <odbrefspec>" to fetch the odb refspec
  - "-c odb.<odbname>.command=<helper>" to configure the helper
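
Putting the two options together, the resulting command line might
look like the output of the sketch below. It only prints the command
instead of running it, because "--initial-refspec" exists only with
this series applied. The odb_clone_cmd function name, the "magic" ODB
name, the helper name and the URL are placeholders.

```sh
# Print the clone command for external ODB <odbname> using <helper>.
odb_clone_cmd() {
	odbname=$1 helper=$2 url=$3
	printf 'git clone --initial-refspec "%s" -c odb.%s.command=%s %s\n' \
		"refs/odbs/$odbname/*:refs/odbs/$odbname/*" \
		"$odbname" "$helper" "$url"
}

# Example:
#   odb_clone_cmd magic odb-helper https://example.com/repo.git
```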

There is also a test script that shows that the "--initial-refspec"
option along with the external ODB mechanism can be used to implement
cloning using bundles.

* External object database

This RFC patch series shows in the tests:

  - how to use another git repository as an external ODB (storing Git objects)
  - how to use an http server as an external ODB (storing plain objects)

(This works in both script mode and sub-process mode.)

* Performance

So the sub-process mode, which is now the default, has been
implemented in this new version of this patch series.

This has been implemented using the refactoring that Ben Peart did on
top of Lars Schneider's work on using sub-processes and packet lines
in the smudge/clean filters for git-lfs. This also uses further work
from Ben Peart called "read object process".

See:

http://public-inbox.org/git/20170113155253.1644-1-benpeart@microsoft.com/
http://public-inbox.org/git/20170322165220.5660-1-benpeart@microsoft.com/

Thanks to this, the external ODB mechanism should in the end perform
as well as the git-lfs mechanism when many objects have to be
transferred.

Implementation
~~~~~~~~~~~~~~

* Mechanism to call the registered commands

This series adds a set of functions in external-odb.{c,h} that are
called by the rest of Git to manage all the external ODBs.

These functions use 'struct odb_helper' and its associated functions
defined in odb-helper.{c,h} to talk to the different external ODBs by
launching the configured "odb.<odbname>.command" commands and writing
to or reading from them.

* ODB refs

For now odb ref management is only implemented in a helper in t0410.

When a new blob is added to an external odb, its sha1, size and type
are written in another new blob and the odb ref is created.

When the list of existing blobs is requested from the external odb,
the content of the blobs pointed to by the odb refs can also be used
by the odb to claim that it can get the objects.

When a blob is actually requested from the external odb, it can use
the content stored in the blobs pointed to by the odb refs to get the
actual blobs and then pass them.
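
The first of the steps above, recording a newly stored blob as an odb
ref, could be sketched as follows. "git hash-object" and
"git update-ref" are regular Git commands, but the add_odb_ref
function name and the exact "<sha1> <size> <type>" layout of the
information blob are assumptions made for this example; the helper in
t0410 may do this differently.

```sh
# Record a blob stored in external ODB <odbname> as an odb ref.
add_odb_ref() {
	odbname=$1 sha1=$2 size=$3 type=$4
	# Write the metadata into a new blob in the regular Git ODB.
	info=$(printf '%s %s %s\n' "$sha1" "$size" "$type" |
		git hash-object -w -t blob --stdin) || return 1
	# Point refs/odbs/<odbname>/<sha1> at that information blob.
	git update-ref "refs/odbs/$odbname/$sha1" "$info"
}

# Example (run inside a repository):
#   add_odb_ref magic "$sha1" "$size" blob
```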

High-level view of the patches in the series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    - Patch 1/49 is a small code cleanup that I already sent to the
      mailing list but will probably be removed in the end due to
      ongoing work on "git clone".

    - Patches 02/49 to 08/49 create a Git/Packet.pm module by
      refactoring "t0021/rot13-filter.pl". Functions from this new
      module will be used later in test scripts.

    - Patches 09/49 to 16/49 create the external ODB infrastructure
      in external-odb.{c,h} and odb-helper.{c,h} for the script mode.

    - Patches 17/49 to 23/49 improve lib-http to make it possible to
      use it as an external ODB to test storing blobs in an HTTP
      server.

    - Patches 24/49 to 44/49 improve the external ODB infrastructure
      to support sub-processes and make everything work using them.

    - Patches 45/49 to 49/49 add the --initial-refspec option to
      git clone along with tests.

Future work
~~~~~~~~~~~

First, sorry about the state of this patch series. It is not as clean
as I would have liked, but I think it is interesting to get feedback
from the mailing list at this point, because the previous RFC was sent
a long time ago and a lot of things have changed.

So a big part of the future work will be about cleaning up this patch series.

Other things I think I am going to do:

  -   

Previous work and discussions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(Sorry for the old Gmane links, I will try to replace them with
public-inbox.org at one point.)

Peff started to work on this and discuss this some years ago:

http://thread.gmane.org/gmane.comp.version-control.git/206886/focus=207040
http://thread.gmane.org/gmane.comp.version-control.git/247171
http://thread.gmane.org/gmane.comp.version-control.git/202902/focus=203020

His work, which is not compile-tested any more, is still there:

https://github.com/peff/git/commits/jk/external-odb-wip

Initial discussions about this new series are there:

http://thread.gmane.org/gmane.comp.version-control.git/288151/focus=295160

Versions 1, 2 and 3 of this RFC/PATCH series are here:

https://public-inbox.org/git/20160613085546.11784-1-chriscool@tuxfamily.org/
https://public-inbox.org/git/20160628181933.24620-1-chriscool@tuxfamily.org/
https://public-inbox.org/git/20161130210420.15982-1-chriscool@tuxfamily.org/

Some of the discussions related to Ben Peart's work that is used by
this series are here:

http://public-inbox.org/git/20170113155253.1644-1-benpeart@microsoft.com/
http://public-inbox.org/git/20170322165220.5660-1-benpeart@microsoft.com/

Links
~~~~~

This patch series is available here:

https://github.com/chriscool/git/commits/external-odb

Versions 1, 2 and 3 are here:

https://github.com/chriscool/git/commits/gl-external-odb12
https://github.com/chriscool/git/commits/gl-external-odb22
https://github.com/chriscool/git/commits/gl-external-odb61


Ben Peart (4):
  Documentation: add read-object-protocol.txt
  contrib: add long-running-read-object/example.pl
  Add t0410 to test read object mechanism
  odb-helper: add read_object_process()

Christian Couder (43):
  builtin/clone: get rid of 'value' strbuf
  t0021/rot13-filter: refactor packet reading functions
  t0021/rot13-filter: improve 'if .. elsif .. else' style
  Add Git/Packet.pm from parts of t0021/rot13-filter.pl
  t0021/rot13-filter: use Git/Packet.pm
  Git/Packet.pm: improve error message
  Git/Packet.pm: add packet_initialize()
  Git/Packet: add capability functions
  t0400: add 'put' command to odb-helper script
  external odb: add write support
  external-odb: accept only blobs for now
  t0400: add test for external odb write support
  Add GIT_NO_EXTERNAL_ODB env variable
  Add t0410 to test external ODB transfer
  lib-httpd: pass config file to start_httpd()
  lib-httpd: add upload.sh
  lib-httpd: add list.sh
  lib-httpd: add apache-e-odb.conf
  odb-helper: add 'store_plain_objects' to 'struct odb_helper'
  pack-objects: don't pack objects in external odbs
  t0420: add test with HTTP external odb
  odb-helper: start fault in implementation
  external-odb: add external_odb_fault_in_object()
  odb-helper: add script_mode
  external-odb: add external_odb_get_capabilities()
  t04*: add 'get_cap' support to helpers
  odb-helper: call odb_helper_lookup() with 'have' capability
  odb-helper: fix odb_helper_fetch_object() for read_object
  Add t0460 to test passing git objects
  odb-helper: add read_packetized_git_object_to_fd()
  odb-helper: add read_packetized_plain_object_to_fd()
  Add t0470 to test passing plain objects
  odb-helper: add write_object_process()
  Add t0480 to test "have" capability and plain objects
  external-odb: add external_odb_do_fetch_object()
  odb-helper: advertise 'have' capability
  odb-helper: advertise 'put' capability
  odb-helper: add have_object_process()
  clone: add initial param to write_remote_refs()
  clone: add --initial-refspec option
  clone: disable external odb before initial clone
  Add test for 'clone --initial-refspec'
  t: add t0430 to test cloning using bundles

Jeff King (2):
  Add initial external odb support
  external odb foreach

 Documentation/technical/read-object-protocol.txt | 102 +++
 Makefile                                         |   2 +
 builtin/clone.c                                  |  91 ++-
 builtin/pack-objects.c                           |   4 +
 cache.h                                          |  18 +
 contrib/long-running-read-object/example.pl      | 114 +++
 environment.c                                    |   4 +
 external-odb.c                                   | 220 +++++
 external-odb.h                                   |  17 +
 odb-helper.c                                     | 987 +++++++++++++++++++++++
 odb-helper.h                                     |  47 ++
 perl/Git/Packet.pm                               | 118 +++
 sha1_file.c                                      | 117 ++-
 t/lib-httpd.sh                                   |   8 +-
 t/lib-httpd/apache-e-odb.conf                    | 214 +++++
 t/lib-httpd/list.sh                              |  41 +
 t/lib-httpd/upload.sh                            |  45 ++
 t/t0021/rot13-filter.pl                          |  97 +--
 t/t0400-external-odb.sh                          |  85 ++
 t/t0410-transfer-e-odb.sh                        | 148 ++++
 t/t0420-transfer-http-e-odb.sh                   | 159 ++++
 t/t0430-clone-bundle-e-odb.sh                    |  91 +++
 t/t0450-read-object.sh                           |  30 +
 t/t0450/read-object                              |  56 ++
 t/t0460-read-object-git.sh                       |  29 +
 t/t0460/read-object-git                          |  67 ++
 t/t0470-read-object-http-e-odb.sh                | 123 +++
 t/t0470/read-object-plain                        |  93 +++
 t/t0480-read-object-have-http-e-odb.sh           | 123 +++
 t/t0480/read-object-plain-have                   | 116 +++
 t/t5616-clone-initial-refspec.sh                 |  48 ++
 31 files changed, 3296 insertions(+), 118 deletions(-)
 create mode 100644 Documentation/technical/read-object-protocol.txt
 create mode 100644 contrib/long-running-read-object/example.pl
 create mode 100644 external-odb.c
 create mode 100644 external-odb.h
 create mode 100644 odb-helper.c
 create mode 100644 odb-helper.h
 create mode 100644 perl/Git/Packet.pm
 create mode 100644 t/lib-httpd/apache-e-odb.conf
 create mode 100644 t/lib-httpd/list.sh
 create mode 100644 t/lib-httpd/upload.sh
 create mode 100755 t/t0400-external-odb.sh
 create mode 100755 t/t0410-transfer-e-odb.sh
 create mode 100755 t/t0420-transfer-http-e-odb.sh
 create mode 100755 t/t0430-clone-bundle-e-odb.sh
 create mode 100755 t/t0450-read-object.sh
 create mode 100755 t/t0450/read-object
 create mode 100755 t/t0460-read-object-git.sh
 create mode 100755 t/t0460/read-object-git
 create mode 100755 t/t0470-read-object-http-e-odb.sh
 create mode 100755 t/t0470/read-object-plain
 create mode 100755 t/t0480-read-object-have-http-e-odb.sh
 create mode 100755 t/t0480/read-object-plain-have
 create mode 100755 t/t5616-clone-initial-refspec.sh

-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 01/49] builtin/clone: get rid of 'value' strbuf
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 02/49] t0021/rot13-filter: refactor packet reading functions Christian Couder
                   ` (50 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

This makes the code simpler by removing a few lines, and getting
rid of one variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/clone.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index a2ea019c59..370a233d22 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -870,7 +870,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	const struct ref *our_head_points_at;
 	struct ref *mapped_refs;
 	const struct ref *ref;
-	struct strbuf key = STRBUF_INIT, value = STRBUF_INIT;
+	struct strbuf key = STRBUF_INIT;
 	struct strbuf branch_top = STRBUF_INIT, reflog_msg = STRBUF_INIT;
 	struct transport *transport = NULL;
 	const char *src_ref_prefix = "refs/heads/";
@@ -1035,7 +1035,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		strbuf_addf(&branch_top, "refs/remotes/%s/", option_origin);
 	}
 
-	strbuf_addf(&value, "+%s*:%s*", src_ref_prefix, branch_top.buf);
 	strbuf_addf(&key, "remote.%s.url", option_origin);
 	git_config_set(key.buf, repo);
 	strbuf_reset(&key);
@@ -1049,10 +1048,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (option_required_reference.nr || option_optional_reference.nr)
 		setup_reference();
 
-	fetch_pattern = value.buf;
+	fetch_pattern = xstrfmt("+%s*:%s*", src_ref_prefix, branch_top.buf);
 	refspec = parse_fetch_refspec(1, &fetch_pattern);
-
-	strbuf_reset(&value);
+	free((char *)fetch_pattern);
 
 	remote = remote_get(option_origin);
 	transport = transport_get(remote, remote->url[0]);
@@ -1191,7 +1189,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&reflog_msg);
 	strbuf_release(&branch_top);
 	strbuf_release(&key);
-	strbuf_release(&value);
 	junk_mode = JUNK_LEAVE_ALL;
 
 	free(refspec);
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 02/49] t0021/rot13-filter: refactor packet reading functions
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 01/49] builtin/clone: get rid of 'value' strbuf Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 03/49] t0021/rot13-filter: improve 'if .. elsif .. else' style Christian Couder
                   ` (49 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

To make it possible in a following commit to move packet
reading and writing functions into a Packet.pm module,
let's refactor these functions, so they don't handle
printing debug output and exiting.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0021/rot13-filter.pl | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 617f581e56..d6411ca523 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -39,8 +39,7 @@ sub packet_bin_read {
 	my $bytes_read = read STDIN, $buffer, 4;
 	if ( $bytes_read == 0 ) {
 		# EOF - Git stopped talking to us!
-		print $debug "STOP\n";
-		exit();
+		return ( -1, "" );
 	}
 	elsif ( $bytes_read != 4 ) {
 		die "invalid packet: '$buffer'";
@@ -64,7 +63,7 @@ sub packet_bin_read {
 
 sub packet_txt_read {
 	my ( $res, $buf ) = packet_bin_read();
-	unless ( $buf =~ s/\n$// ) {
+	unless ( $res == -1 || $buf =~ s/\n$// ) {
 		die "A non-binary line MUST be terminated by an LF.";
 	}
 	return ( $res, $buf );
@@ -109,7 +108,12 @@ print $debug "init handshake complete\n";
 $debug->flush();
 
 while (1) {
-	my ($command) = packet_txt_read() =~ /^command=(.+)$/;
+	my ($res, $command) = packet_txt_read();
+	if ( $res == -1 ) {
+		print $debug "STOP\n";
+		exit();
+	}
+	$command =~ s/^command=//;
 	print $debug "IN: $command";
 	$debug->flush();
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 03/49] t0021/rot13-filter: improve 'if .. elsif .. else' style
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 01/49] builtin/clone: get rid of 'value' strbuf Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 02/49] t0021/rot13-filter: refactor packet reading functions Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 04/49] Add Git/Packet.pm from parts of t0021/rot13-filter.pl Christian Couder
                   ` (48 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0021/rot13-filter.pl | 27 +++++++++------------------
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index d6411ca523..1fc581c814 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -40,23 +40,20 @@ sub packet_bin_read {
 	if ( $bytes_read == 0 ) {
 		# EOF - Git stopped talking to us!
 		return ( -1, "" );
-	}
-	elsif ( $bytes_read != 4 ) {
+	} elsif ( $bytes_read != 4 ) {
 		die "invalid packet: '$buffer'";
 	}
 	my $pkt_size = hex($buffer);
 	if ( $pkt_size == 0 ) {
 		return ( 1, "" );
-	}
-	elsif ( $pkt_size > 4 ) {
+	} elsif ( $pkt_size > 4 ) {
 		my $content_size = $pkt_size - 4;
 		$bytes_read = read STDIN, $buffer, $content_size;
 		if ( $bytes_read != $content_size ) {
 			die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
 		}
 		return ( 0, $buffer );
-	}
-	else {
+	} else {
 		die "invalid packet size: $pkt_size";
 	}
 }
@@ -144,14 +141,11 @@ while (1) {
 	my $output;
 	if ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
 		$output = "";
-	}
-	elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
+	} elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
 		$output = rot13($input);
-	}
-	elsif ( $command eq "smudge" and grep( /^smudge$/, @capabilities ) ) {
+	} elsif ( $command eq "smudge" and grep( /^smudge$/, @capabilities ) ) {
 		$output = rot13($input);
-	}
-	else {
+	} else {
 		die "bad command '$command'";
 	}
 
@@ -163,14 +157,12 @@ while (1) {
 		$debug->flush();
 		packet_txt_write("status=error");
 		packet_flush();
-	}
-	elsif ( $pathname eq "abort.r" ) {
+	} elsif ( $pathname eq "abort.r" ) {
 		print $debug "[ABORT]\n";
 		$debug->flush();
 		packet_txt_write("status=abort");
 		packet_flush();
-	}
-	else {
+	} else {
 		packet_txt_write("status=success");
 		packet_flush();
 
@@ -187,8 +179,7 @@ while (1) {
 			print $debug ".";
 			if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) {
 				$output = substr( $output, $MAX_PACKET_CONTENT_SIZE );
-			}
-			else {
+			} else {
 				$output = "";
 			}
 		}
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 04/49] Add Git/Packet.pm from parts of t0021/rot13-filter.pl
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (2 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 03/49] t0021/rot13-filter: improve 'if .. elsif .. else' style Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 05/49] t0021/rot13-filter: use Git/Packet.pm Christian Couder
                   ` (47 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

This will make it possible to reuse packet reading and writing
functions in other test scripts.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 perl/Git/Packet.pm | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)
 create mode 100644 perl/Git/Packet.pm

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
new file mode 100644
index 0000000000..aaffecbe2a
--- /dev/null
+++ b/perl/Git/Packet.pm
@@ -0,0 +1,71 @@
+package Git::Packet;
+use 5.008;
+use strict;
+use warnings;
+BEGIN {
+	require Exporter;
+	if ($] < 5.008003) {
+		*import = \&Exporter::import;
+	} else {
+		# Exporter 5.57 which supports this invocation was
+		# released with perl 5.8.3
+		Exporter->import('import');
+	}
+}
+
+our @EXPORT = qw(
+			packet_bin_read
+			packet_txt_read
+			packet_bin_write
+			packet_txt_write
+			packet_flush
+		);
+our @EXPORT_OK = @EXPORT;
+
+sub packet_bin_read {
+	my $buffer;
+	my $bytes_read = read STDIN, $buffer, 4;
+	if ( $bytes_read == 0 ) {
+		# EOF - Git stopped talking to us!
+		return ( -1, "" );
+	} elsif ( $bytes_read != 4 ) {
+		die "invalid packet: '$buffer'";
+	}
+	my $pkt_size = hex($buffer);
+	if ( $pkt_size == 0 ) {
+		return ( 1, "" );
+	} elsif ( $pkt_size > 4 ) {
+		my $content_size = $pkt_size - 4;
+		$bytes_read = read STDIN, $buffer, $content_size;
+		if ( $bytes_read != $content_size ) {
+			die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
+		}
+		return ( 0, $buffer );
+	} else {
+		die "invalid packet size: $pkt_size";
+	}
+}
+
+sub packet_txt_read {
+	my ( $res, $buf ) = packet_bin_read();
+	unless ( $res == -1 || $buf =~ s/\n$// ) {
+		die "A non-binary line MUST be terminated by an LF.";
+	}
+	return ( $res, $buf );
+}
+
+sub packet_bin_write {
+	my $buf = shift;
+	print STDOUT sprintf( "%04x", length($buf) + 4 );
+	print STDOUT $buf;
+	STDOUT->flush();
+}
+
+sub packet_txt_write {
+	packet_bin_write( $_[0] . "\n" );
+}
+
+sub packet_flush {
+	print STDOUT sprintf( "%04x", 0 );
+	STDOUT->flush();
+}
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 05/49] t0021/rot13-filter: use Git/Packet.pm
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (3 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 04/49] Add Git/Packet.pm from parts of t0021/rot13-filter.pl Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 06/49] Git/Packet.pm: improve error message Christian Couder
                   ` (46 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

After creating Git/Packet.pm from part of t0021/rot13-filter.pl,
we can now simplify this script by using Git/Packet.pm.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0021/rot13-filter.pl | 51 +++----------------------------------------------
 1 file changed, 3 insertions(+), 48 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 1fc581c814..36a9eb3608 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -19,9 +19,12 @@
 #     same command.
 #
 
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
 use strict;
 use warnings;
 use IO::File;
+use Git::Packet;
 
 my $MAX_PACKET_CONTENT_SIZE = 65516;
 my @capabilities            = @ARGV;
@@ -34,54 +37,6 @@ sub rot13 {
 	return $str;
 }
 
-sub packet_bin_read {
-	my $buffer;
-	my $bytes_read = read STDIN, $buffer, 4;
-	if ( $bytes_read == 0 ) {
-		# EOF - Git stopped talking to us!
-		return ( -1, "" );
-	} elsif ( $bytes_read != 4 ) {
-		die "invalid packet: '$buffer'";
-	}
-	my $pkt_size = hex($buffer);
-	if ( $pkt_size == 0 ) {
-		return ( 1, "" );
-	} elsif ( $pkt_size > 4 ) {
-		my $content_size = $pkt_size - 4;
-		$bytes_read = read STDIN, $buffer, $content_size;
-		if ( $bytes_read != $content_size ) {
-			die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
-		}
-		return ( 0, $buffer );
-	} else {
-		die "invalid packet size: $pkt_size";
-	}
-}
-
-sub packet_txt_read {
-	my ( $res, $buf ) = packet_bin_read();
-	unless ( $res == -1 || $buf =~ s/\n$// ) {
-		die "A non-binary line MUST be terminated by an LF.";
-	}
-	return ( $res, $buf );
-}
-
-sub packet_bin_write {
-	my $buf = shift;
-	print STDOUT sprintf( "%04x", length($buf) + 4 );
-	print STDOUT $buf;
-	STDOUT->flush();
-}
-
-sub packet_txt_write {
-	packet_bin_write( $_[0] . "\n" );
-}
-
-sub packet_flush {
-	print STDOUT sprintf( "%04x", 0 );
-	STDOUT->flush();
-}
-
 print $debug "START\n";
 $debug->flush();
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 06/49] Git/Packet.pm: improve error message
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (4 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 05/49] t0021/rot13-filter: use Git/Packet.pm Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 07/49] Git/Packet.pm: add packet_initialize() Christian Couder
                   ` (45 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Try to give a bit more information when we die()
because there is no new line at the end of something
we receive.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 perl/Git/Packet.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
index aaffecbe2a..2ad6b00d6c 100644
--- a/perl/Git/Packet.pm
+++ b/perl/Git/Packet.pm
@@ -49,7 +49,8 @@ sub packet_bin_read {
 sub packet_txt_read {
 	my ( $res, $buf ) = packet_bin_read();
 	unless ( $res == -1 || $buf =~ s/\n$// ) {
-		die "A non-binary line MUST be terminated by an LF.";
+		die "A non-binary line MUST be terminated by an LF.\n"
+		    . "Received: '$buf'";
 	}
 	return ( $res, $buf );
 }
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 07/49] Git/Packet.pm: add packet_initialize()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (5 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 06/49] Git/Packet.pm: improve error message Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-23 18:55   ` Ben Peart
  2017-06-20  7:54 ` [RFC/PATCH v4 08/49] Git/Packet: add capability functions Christian Couder
                   ` (44 subsequent siblings)
  51 siblings, 1 reply; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Add a function to initialize the communication. And use this
function in 't/t0021/rot13-filter.pl'.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 perl/Git/Packet.pm      | 13 +++++++++++++
 t/t0021/rot13-filter.pl |  8 +-------
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
index 2ad6b00d6c..b0233caf37 100644
--- a/perl/Git/Packet.pm
+++ b/perl/Git/Packet.pm
@@ -19,6 +19,7 @@ our @EXPORT = qw(
 			packet_bin_write
 			packet_txt_write
 			packet_flush
+			packet_initialize
 		);
 our @EXPORT_OK = @EXPORT;
 
@@ -70,3 +71,15 @@ sub packet_flush {
 	print STDOUT sprintf( "%04x", 0 );
 	STDOUT->flush();
 }
+
+sub packet_initialize {
+	my ($name, $version) = @_;
+
+	( packet_txt_read() eq ( 0, $name . "-client" ) )	|| die "bad initialize";
+	( packet_txt_read() eq ( 0, "version=" . $version ) )	|| die "bad version";
+	( packet_bin_read() eq ( 1, "" ) )			|| die "bad version end";
+
+	packet_txt_write( $name . "-server" );
+	packet_txt_write( "version=" . $version );
+	packet_flush();
+}
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 36a9eb3608..5b05518640 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -40,13 +40,7 @@ sub rot13 {
 print $debug "START\n";
 $debug->flush();
 
-( packet_txt_read() eq ( 0, "git-filter-client" ) ) || die "bad initialize";
-( packet_txt_read() eq ( 0, "version=2" ) )         || die "bad version";
-( packet_bin_read() eq ( 1, "" ) )                  || die "bad version end";
-
-packet_txt_write("git-filter-server");
-packet_txt_write("version=2");
-packet_flush();
+packet_initialize("git-filter", 2);
 
 ( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
 ( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 08/49] Git/Packet: add capability functions
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (6 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 07/49] Git/Packet.pm: add packet_initialize() Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 09/49] Add initial external odb support Christian Couder
                   ` (43 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Add packet_read_capabilities(), packet_read_and_check_capabilities()
and packet_write_capabilities() to help read and write capabilities,
and use them in 't/t0021/rot13-filter.pl'.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
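To make the intended check easier to review, here is a rough standalone
C sketch of the same idea: strip "capability=" and the trailing LF from
each text packet, then verify that every capability we require was
announced by the other side. It is an illustration only, not part of
the patch. The names parse_capability() and caps_all_available() are
made up, and unlike the Perl substitution in this patch the sketch
insists that "capability=" is at the start of the line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Strip the "capability=" prefix and the trailing LF from one text
 * packet payload. Returns 0 and fills name on success, -1 if the line
 * is malformed or the name does not fit.
 */
int parse_capability(const char *line, char *name, size_t namesz)
{
	static const char prefix[] = "capability=";
	size_t plen = sizeof(prefix) - 1;
	size_t len;

	assert(line && name);
	len = strlen(line);
	if (len <= plen || strncmp(line, prefix, plen) || line[len - 1] != '\n')
		return -1;
	len -= plen + 1;	/* drop the prefix and the LF */
	if (len == 0 || len >= namesz)
		return -1;
	memcpy(name, line + plen, len);
	name[len] = '\0';
	return 0;
}

/* Return 1 if every entry of required[] also appears in remote[], else 0. */
int caps_all_available(const char **remote, size_t nremote,
		       const char **required, size_t nrequired)
{
	size_t i, j;

	for (i = 0; i < nrequired; i++) {
		for (j = 0; j < nremote; j++)
			if (!strcmp(required[i], remote[j]))
				break;
		if (j == nremote)
			return 0;	/* required[i] was never announced */
	}
	return 1;
}
```
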
 perl/Git/Packet.pm      | 33 +++++++++++++++++++++++++++++++++
 t/t0021/rot13-filter.pl |  9 ++-------
 2 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
index b0233caf37..4443b67724 100644
--- a/perl/Git/Packet.pm
+++ b/perl/Git/Packet.pm
@@ -20,6 +20,9 @@ our @EXPORT = qw(
 			packet_txt_write
 			packet_flush
 			packet_initialize
+			packet_read_capabilities
+			packet_write_capabilities
+			packet_read_and_check_capabilities
 		);
 our @EXPORT_OK = @EXPORT;
 
@@ -83,3 +86,33 @@ sub packet_initialize {
 	packet_txt_write( "version=" . $version );
 	packet_flush();
 }
+
+sub packet_read_capabilities {
+	my @cap;
+	while (1) {
+		my ( $res, $buf ) = packet_bin_read();
+		return ( $res, @cap ) if ( $res != 0 );
+		unless ( $buf =~ s/\n$// ) {
+			die "A non-binary line MUST be terminated by an LF.\n"
+			    . "Received: '$buf'";
+		}
+		die "bad capability buf: '$buf'" unless ( $buf =~ s/capability=// );
+		push @cap, $buf;
+	}
+}
+
+sub packet_read_and_check_capabilities {
+	my @local_caps = @_;
+	my @remote_res_caps = packet_read_capabilities();
+	my $res = shift @remote_res_caps;
+	my %remote_caps = map { $_ => 1 } @remote_res_caps;
+	foreach (@local_caps) {
+	    die "'$_' capability not available" unless (exists($remote_caps{$_}));
+	}
+	return $res;
+}
+
+sub packet_write_capabilities {
+	packet_txt_write( "capability=" . $_ ) foreach (@_);
+	packet_flush();
+}
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 5b05518640..bbfd52619d 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -42,14 +42,9 @@ $debug->flush();
 
 packet_initialize("git-filter", 2);
 
-( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
-( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
-( packet_bin_read() eq ( 1, "" ) )                  || die "bad capability end";
+packet_read_and_check_capabilities("clean", "smudge");
+packet_write_capabilities(@capabilities);
 
-foreach (@capabilities) {
-	packet_txt_write( "capability=" . $_ );
-}
-packet_flush();
 print $debug "init handshake complete\n";
 $debug->flush();
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 09/49] Add initial external odb support
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (7 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 08/49] Git/Packet: add capability functions Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-23 19:49   ` Ben Peart
  2017-06-20  7:54 ` [RFC/PATCH v4 10/49] external odb foreach Christian Couder
                   ` (42 subsequent siblings)
  51 siblings, 1 reply; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

From: Jeff King <peff@peff.net>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
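As a reading aid for the 'have' handling: the helper is expected to
print one line per object in the form "<sha1> <size> <type>", which is
why the test helper reorders the `git cat-file --batch-check` output
with awk. Below is a minimal standalone C sketch of parsing such a
line. It is an illustration under assumptions, not the code in this
patch: struct have_entry and parse_have_line() are made-up names, the
sha1 is kept as hex text instead of being converted with
get_sha1_hex(), and the sketch is stricter than parse_object_line() in
that it rejects unknown type names and sizes that do not start with a
digit.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct have_entry {
	char hex[41];		/* 40 hex digits plus NUL */
	unsigned long size;
	char type[16];		/* "blob", "tree", "commit" or "tag" */
};

static const char *const known_types[] = { "blob", "tree", "commit", "tag" };

/* Parse "<40-hex sha1> <decimal size> <type>"; return 0 on success, -1 otherwise. */
int parse_have_line(struct have_entry *e, const char *line)
{
	char *end;
	size_t i;

	assert(e && line);

	/* Stops at the first non-hex byte, so short lines are never overrun. */
	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)line[i]))
			return -1;
	memcpy(e->hex, line, 40);
	e->hex[40] = '\0';

	line += 40;
	if (*line++ != ' ')
		return -1;

	/* strtoul() alone would also accept leading blanks or a sign. */
	if (!isdigit((unsigned char)*line))
		return -1;
	e->size = strtoul(line, &end, 10);
	if (line == end || *end++ != ' ')
		return -1;

	for (i = 0; i < sizeof(known_types) / sizeof(*known_types); i++) {
		if (!strcmp(end, known_types[i])) {
			strcpy(e->type, known_types[i]);
			return 0;
		}
	}
	return -1;
}
```
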
 Makefile                |   2 +
 cache.h                 |   9 ++
 external-odb.c          | 115 +++++++++++++++++++++++
 external-odb.h          |   8 ++
 odb-helper.c            | 245 ++++++++++++++++++++++++++++++++++++++++++++++++
 odb-helper.h            |  25 +++++
 sha1_file.c             |  79 +++++++++++-----
 t/t0400-external-odb.sh |  46 +++++++++
 8 files changed, 507 insertions(+), 22 deletions(-)
 create mode 100644 external-odb.c
 create mode 100644 external-odb.h
 create mode 100644 odb-helper.c
 create mode 100644 odb-helper.h
 create mode 100755 t/t0400-external-odb.sh

diff --git a/Makefile b/Makefile
index f484801638..b488874d60 100644
--- a/Makefile
+++ b/Makefile
@@ -776,6 +776,7 @@ LIB_OBJS += ewah/ewah_bitmap.o
 LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
+LIB_OBJS += external-odb.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
 LIB_OBJS += gettext.o
@@ -808,6 +809,7 @@ LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
+LIB_OBJS += odb-helper.o
 LIB_OBJS += oidset.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-bitmap-write.o
diff --git a/cache.h b/cache.h
index d6ba8a2f11..391a69e9c5 100644
--- a/cache.h
+++ b/cache.h
@@ -954,6 +954,12 @@ const char *git_path_shallow(void);
  */
 extern const char *sha1_file_name(const unsigned char *sha1);
 
+/*
+ * Like sha1_file_name, but return the filename within a specific alternate
+ * object directory. Shares the same static buffer with sha1_file_name.
+ */
+extern const char *sha1_file_name_alt(const char *objdir, const unsigned char *sha1);
+
 /*
  * Return the name of the (local) packfile with the specified sha1 in
  * its name.  The return value is a pointer to memory that is
@@ -1265,6 +1271,8 @@ extern int do_check_packed_object_crc;
 
 extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type);
 
+extern int create_object_tmpfile(struct strbuf *tmp, const char *filename);
+extern void close_sha1_file(int fd);
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
 extern int has_sha1_pack(const unsigned char *sha1);
@@ -1600,6 +1608,7 @@ extern void read_info_alternates(const char * relative_base, int depth);
 extern char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
 extern int foreach_alt_odb(alt_odb_fn, void*);
+extern void prepare_external_alt_odb(void);
 
 /*
  * Allocate a "struct alternate_object_database" but do _not_ actually
diff --git a/external-odb.c b/external-odb.c
new file mode 100644
index 0000000000..1ccfa99a01
--- /dev/null
+++ b/external-odb.c
@@ -0,0 +1,115 @@
+#include "cache.h"
+#include "external-odb.h"
+#include "odb-helper.h"
+
+static struct odb_helper *helpers;
+static struct odb_helper **helpers_tail = &helpers;
+
+static struct odb_helper *find_or_create_helper(const char *name, int len)
+{
+	struct odb_helper *o;
+
+	for (o = helpers; o; o = o->next)
+		if (!strncmp(o->name, name, len) && !o->name[len])
+			return o;
+
+	o = odb_helper_new(name, len);
+	*helpers_tail = o;
+	helpers_tail = &o->next;
+
+	return o;
+}
+
+static int external_odb_config(const char *var, const char *value, void *data)
+{
+	struct odb_helper *o;
+	const char *key, *dot;
+
+	if (!skip_prefix(var, "odb.", &key))
+		return 0;
+	dot = strrchr(key, '.');
+	if (!dot)
+		return 0;
+
+	o = find_or_create_helper(key, dot - key);
+	key = dot + 1;
+
+	if (!strcmp(key, "command"))
+		return git_config_string(&o->cmd, var, value);
+
+	return 0;
+}
+
+static void external_odb_init(void)
+{
+	static int initialized;
+
+	if (initialized)
+		return;
+	initialized = 1;
+
+	git_config(external_odb_config, NULL);
+}
+
+const char *external_odb_root(void)
+{
+	static const char *root;
+	if (!root)
+		root = git_pathdup("objects/external");
+	return root;
+}
+
+int external_odb_has_object(const unsigned char *sha1)
+{
+	struct odb_helper *o;
+
+	external_odb_init();
+
+	for (o = helpers; o; o = o->next)
+		if (odb_helper_has_object(o, sha1))
+			return 1;
+	return 0;
+}
+
+int external_odb_fetch_object(const unsigned char *sha1)
+{
+	struct odb_helper *o;
+	const char *path;
+
+	if (!external_odb_has_object(sha1))
+		return -1;
+
+	path = sha1_file_name_alt(external_odb_root(), sha1);
+	safe_create_leading_directories_const(path);
+	prepare_external_alt_odb();
+
+	for (o = helpers; o; o = o->next) {
+		struct strbuf tmpfile = STRBUF_INIT;
+		int ret;
+		int fd;
+
+		if (!odb_helper_has_object(o, sha1))
+			continue;
+
+		fd = create_object_tmpfile(&tmpfile, path);
+		if (fd < 0) {
+			strbuf_release(&tmpfile);
+			return -1;
+		}
+
+		if (odb_helper_fetch_object(o, sha1, fd) < 0) {
+			close(fd);
+			unlink(tmpfile.buf);
+			strbuf_release(&tmpfile);
+			continue;
+		}
+
+		close_sha1_file(fd);
+		ret = finalize_object_file(tmpfile.buf, path);
+		strbuf_release(&tmpfile);
+		if (!ret)
+			return 0;
+	}
+
+	return -1;
+}
diff --git a/external-odb.h b/external-odb.h
new file mode 100644
index 0000000000..2397477684
--- /dev/null
+++ b/external-odb.h
@@ -0,0 +1,8 @@
+#ifndef EXTERNAL_ODB_H
+#define EXTERNAL_ODB_H
+
+const char *external_odb_root(void);
+int external_odb_has_object(const unsigned char *sha1);
+int external_odb_fetch_object(const unsigned char *sha1);
+
+#endif /* EXTERNAL_ODB_H */
diff --git a/odb-helper.c b/odb-helper.c
new file mode 100644
index 0000000000..de5562da9c
--- /dev/null
+++ b/odb-helper.c
@@ -0,0 +1,245 @@
+#include "cache.h"
+#include "object.h"
+#include "argv-array.h"
+#include "odb-helper.h"
+#include "run-command.h"
+#include "sha1-lookup.h"
+
+struct odb_helper *odb_helper_new(const char *name, int namelen)
+{
+	struct odb_helper *o;
+
+	o = xcalloc(1, sizeof(*o));
+	o->name = xmemdupz(name, namelen);
+
+	return o;
+}
+
+struct odb_helper_cmd {
+	struct argv_array argv;
+	struct child_process child;
+};
+
+static void prepare_helper_command(struct argv_array *argv, const char *cmd,
+				   const char *fmt, va_list ap)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_addstr(&buf, cmd);
+	strbuf_addch(&buf, ' ');
+	strbuf_vaddf(&buf, fmt, ap);
+
+	argv_array_push(argv, buf.buf);
+	strbuf_release(&buf);
+}
+
+__attribute__((format (printf,3,4)))
+static int odb_helper_start(struct odb_helper *o,
+			    struct odb_helper_cmd *cmd,
+			    const char *fmt, ...)
+{
+	va_list ap;
+
+	memset(cmd, 0, sizeof(*cmd));
+	argv_array_init(&cmd->argv);
+
+	if (!o->cmd)
+		return -1;
+
+	va_start(ap, fmt);
+	prepare_helper_command(&cmd->argv, o->cmd, fmt, ap);
+	va_end(ap);
+
+	cmd->child.argv = cmd->argv.argv;
+	cmd->child.use_shell = 1;
+	cmd->child.no_stdin = 1;
+	cmd->child.out = -1;
+
+	if (start_command(&cmd->child) < 0) {
+		argv_array_clear(&cmd->argv);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int odb_helper_finish(struct odb_helper *o,
+			     struct odb_helper_cmd *cmd)
+{
+	int ret = finish_command(&cmd->child);
+	argv_array_clear(&cmd->argv);
+	if (ret) {
+		warning("odb helper '%s' reported failure", o->name);
+		return -1;
+	}
+	return 0;
+}
+
+static int parse_object_line(struct odb_helper_object *o, const char *line)
+{
+	char *end;
+	if (get_sha1_hex(line, o->sha1) < 0)
+		return -1;
+
+	line += 40;
+	if (*line++ != ' ')
+		return -1;
+
+	o->size = strtoul(line, &end, 10);
+	if (line == end || *end++ != ' ')
+		return -1;
+
+	o->type = type_from_string(end);
+	return 0;
+}
+
+static int add_have_entry(struct odb_helper *o, const char *line)
+{
+	ALLOC_GROW(o->have, o->have_nr+1, o->have_alloc);
+	if (parse_object_line(&o->have[o->have_nr], line) < 0) {
+		warning("bad 'have' input from odb helper '%s': %s",
+			o->name, line);
+		return 1;
+	}
+	o->have_nr++;
+	return 0;
+}
+
+static int odb_helper_object_cmp(const void *va, const void *vb)
+{
+	const struct odb_helper_object *a = va, *b = vb;
+	return hashcmp(a->sha1, b->sha1);
+}
+
+static void odb_helper_load_have(struct odb_helper *o)
+{
+	struct odb_helper_cmd cmd;
+	FILE *fh;
+	struct strbuf line = STRBUF_INIT;
+
+	if (o->have_valid)
+		return;
+	o->have_valid = 1;
+
+	if (odb_helper_start(o, &cmd, "have") < 0)
+		return;
+
+	fh = xfdopen(cmd.child.out, "r");
+	while (strbuf_getline(&line, fh) != EOF)
+		if (add_have_entry(o, line.buf))
+			break;
+
+	strbuf_release(&line);
+	fclose(fh);
+	odb_helper_finish(o, &cmd);
+
+	qsort(o->have, o->have_nr, sizeof(*o->have), odb_helper_object_cmp);
+}
+
+static struct odb_helper_object *odb_helper_lookup(struct odb_helper *o,
+						   const unsigned char *sha1)
+{
+	int idx;
+
+	odb_helper_load_have(o);
+	idx = sha1_entry_pos(o->have, sizeof(*o->have), 0,
+			     0, o->have_nr, o->have_nr,
+			     sha1);
+	if (idx < 0)
+		return NULL;
+	return &o->have[idx];
+}
+
+int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1)
+{
+	return !!odb_helper_lookup(o, sha1);
+}
+
+int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
+			    int fd)
+{
+	struct odb_helper_object *obj;
+	struct odb_helper_cmd cmd;
+	unsigned long total_got;
+	git_zstream stream;
+	int zret = Z_STREAM_END;
+	git_SHA_CTX hash;
+	unsigned char real_sha1[20];
+
+	obj = odb_helper_lookup(o, sha1);
+	if (!obj)
+		return -1;
+
+	if (odb_helper_start(o, &cmd, "get %s", sha1_to_hex(sha1)) < 0)
+		return -1;
+
+	memset(&stream, 0, sizeof(stream));
+	git_inflate_init(&stream);
+	git_SHA1_Init(&hash);
+	total_got = 0;
+
+	for (;;) {
+		unsigned char buf[4096];
+		int r;
+
+		r = xread(cmd.child.out, buf, sizeof(buf));
+		if (r < 0) {
+			error("unable to read from odb helper '%s': %s",
+			      o->name, strerror(errno));
+			close(cmd.child.out);
+			odb_helper_finish(o, &cmd);
+			git_inflate_end(&stream);
+			return -1;
+		}
+		if (r == 0)
+			break;
+
+		write_or_die(fd, buf, r);
+
+		stream.next_in = buf;
+		stream.avail_in = r;
+		do {
+			unsigned char inflated[4096];
+			unsigned long got;
+
+			stream.next_out = inflated;
+			stream.avail_out = sizeof(inflated);
+			zret = git_inflate(&stream, Z_SYNC_FLUSH);
+			got = sizeof(inflated) - stream.avail_out;
+
+			git_SHA1_Update(&hash, inflated, got);
+			/* skip header when counting size */
+			if (!total_got) {
+				const unsigned char *p = memchr(inflated, '\0', got);
+				if (p)
+					got -= p - inflated + 1;
+				else
+					got = 0;
+			}
+			total_got += got;
+		} while (stream.avail_in && zret == Z_OK);
+	}
+
+	close(cmd.child.out);
+	git_inflate_end(&stream);
+	git_SHA1_Final(real_sha1, &hash);
+	if (odb_helper_finish(o, &cmd))
+		return -1;
+	if (zret != Z_STREAM_END) {
+		warning("bad zlib data from odb helper '%s' for %s",
+			o->name, sha1_to_hex(sha1));
+		return -1;
+	}
+	if (total_got != obj->size) {
+		warning("size mismatch from odb helper '%s' for %s (%lu != %lu)",
+			o->name, sha1_to_hex(sha1), total_got, obj->size);
+		return -1;
+	}
+	if (hashcmp(real_sha1, sha1)) {
+		warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+			o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+		return -1;
+	}
+
+	return 0;
+}
diff --git a/odb-helper.h b/odb-helper.h
new file mode 100644
index 0000000000..0f704f9452
--- /dev/null
+++ b/odb-helper.h
@@ -0,0 +1,25 @@
+#ifndef ODB_HELPER_H
+#define ODB_HELPER_H
+
+struct odb_helper {
+	const char *name;
+	const char *cmd;
+
+	struct odb_helper_object {
+		unsigned char sha1[20];
+		unsigned long size;
+		enum object_type type;
+	} *have;
+	int have_nr;
+	int have_alloc;
+	int have_valid;
+
+	struct odb_helper *next;
+};
+
+struct odb_helper *odb_helper_new(const char *name, int namelen);
+int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
+int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
+			    int fd);
+
+#endif /* ODB_HELPER_H */
diff --git a/sha1_file.c b/sha1_file.c
index 59a4ed2ed3..f87c59d711 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -27,6 +27,7 @@
 #include "list.h"
 #include "mergesort.h"
 #include "quote.h"
+#include "external-odb.h"
 
 #define SZ_FMT PRIuMAX
 static inline uintmax_t sz_fmt(size_t s) { return s; }
@@ -252,12 +253,12 @@ static void fill_sha1_path(struct strbuf *buf, const unsigned char *sha1)
 	}
 }
 
-const char *sha1_file_name(const unsigned char *sha1)
+const char *sha1_file_name_alt(const char *objdir, const unsigned char *sha1)
 {
 	static struct strbuf buf = STRBUF_INIT;
 
 	strbuf_reset(&buf);
-	strbuf_addf(&buf, "%s/", get_object_directory());
+	strbuf_addf(&buf, "%s/", objdir);
 
 	fill_sha1_path(&buf, sha1);
 	return buf.buf;
@@ -277,9 +278,14 @@ static const char *alt_sha1_path(struct alternate_object_database *alt,
 	return buf->buf;
 }
 
- char *odb_pack_name(struct strbuf *buf,
-		     const unsigned char *sha1,
-		     const char *ext)
+const char *sha1_file_name(const unsigned char *sha1)
+{
+	return sha1_file_name_alt(get_object_directory(), sha1);
+}
+
+char *odb_pack_name(struct strbuf *buf,
+		    const unsigned char *sha1,
+		    const char *ext)
 {
 	strbuf_reset(buf);
 	strbuf_addf(buf, "%s/pack/pack-%s.%s", get_object_directory(),
@@ -631,6 +637,21 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
 	return r;
 }
 
+void prepare_external_alt_odb(void)
+{
+	static int linked_external;
+	const char *path;
+
+	if (linked_external)
+		return;
+
+	path = external_odb_root();
+	if (!access(path, F_OK)) {
+		link_alt_odb_entry(path, NULL, 0, "");
+		linked_external = 1;
+	}
+}
+
 void prepare_alt_odb(void)
 {
 	const char *alt;
@@ -645,6 +666,7 @@ void prepare_alt_odb(void)
 	link_alt_odb_entries(alt, strlen(alt), PATH_SEP, NULL, 0);
 
 	read_info_alternates(get_object_directory(), 0);
+	prepare_external_alt_odb();
 }
 
 /* Returns 1 if we have successfully freshened the file, 0 otherwise. */
@@ -685,7 +707,7 @@ static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
 		if (check_and_freshen_file(path, freshen))
 			return 1;
 	}
-	return 0;
+	return external_odb_has_object(sha1);
 }
 
 static int check_and_freshen(const unsigned char *sha1, int freshen)
@@ -1727,24 +1749,14 @@ static int stat_sha1_file(const unsigned char *sha1, struct stat *st,
 	return -1;
 }
 
-/*
- * Like stat_sha1_file(), but actually open the object and return the
- * descriptor. See the caveats on the "path" parameter above.
- */
-static int open_sha1_file(const unsigned char *sha1, const char **path)
+static int open_sha1_file_alt(const unsigned char *sha1, const char **path)
 {
-	int fd;
 	struct alternate_object_database *alt;
-	int most_interesting_errno;
-
-	*path = sha1_file_name(sha1);
-	fd = git_open(*path);
-	if (fd >= 0)
-		return fd;
-	most_interesting_errno = errno;
+	int most_interesting_errno = errno;
 
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
+		int fd;
 		*path = alt_sha1_path(alt, sha1);
 		fd = git_open(*path);
 		if (fd >= 0)
@@ -1756,6 +1768,29 @@ static int open_sha1_file(const unsigned char *sha1, const char **path)
 	return -1;
 }
 
+/*
+ * Like stat_sha1_file(), but actually open the object and return the
+ * descriptor. See the caveats on the "path" parameter above.
+ */
+static int open_sha1_file(const unsigned char *sha1, const char **path)
+{
+	int fd;
+
+	*path = sha1_file_name(sha1);
+	fd = git_open(*path);
+	if (fd >= 0)
+		return fd;
+
+	fd = open_sha1_file_alt(sha1, path);
+	if (fd >= 0)
+		return fd;
+
+	if (!external_odb_fetch_object(sha1))
+		fd = open_sha1_file_alt(sha1, path);
+
+	return fd;
+}
+
 /*
  * Map the loose object at "path" if it is not NULL, or the path found by
  * searching for a loose object named "sha1".
@@ -3268,7 +3303,7 @@ int hash_sha1_file(const void *buf, unsigned long len, const char *type,
 }
 
 /* Finalize a file on disk, and close it. */
-static void close_sha1_file(int fd)
+void close_sha1_file(int fd)
 {
 	if (fsync_object_files)
 		fsync_or_die(fd, "sha1 file");
@@ -3292,7 +3327,7 @@ static inline int directory_size(const char *filename)
  * We want to avoid cross-directory filename renames, because those
  * can have problems on various filesystems (FAT, NFS, Coda).
  */
-static int create_tmpfile(struct strbuf *tmp, const char *filename)
+int create_object_tmpfile(struct strbuf *tmp, const char *filename)
 {
 	int fd, dirlen = directory_size(filename);
 
@@ -3332,7 +3367,7 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 	static struct strbuf tmp_file = STRBUF_INIT;
 	const char *filename = sha1_file_name(sha1);
 
-	fd = create_tmpfile(&tmp_file, filename);
+	fd = create_object_tmpfile(&tmp_file, filename);
 	if (fd < 0) {
 		if (errno == EACCES)
 			return error("insufficient permission for adding an object to repository database %s", get_object_directory());
diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
new file mode 100755
index 0000000000..fe85413725
--- /dev/null
+++ b/t/t0400-external-odb.sh
@@ -0,0 +1,46 @@
+#!/bin/sh
+
+test_description='basic tests for external object databases'
+
+. ./test-lib.sh
+
+ALT_SOURCE="$PWD/alt-repo/.git"
+export ALT_SOURCE
+write_script odb-helper <<\EOF
+GIT_DIR=$ALT_SOURCE; export GIT_DIR
+case "$1" in
+have)
+	git cat-file --batch-check --batch-all-objects |
+	awk '{print $1 " " $3 " " $2}'
+	;;
+get)
+	cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+	;;
+esac
+EOF
+HELPER="\"$PWD\"/odb-helper"
+
+test_expect_success 'setup alternate repo' '
+	git init alt-repo &&
+	(cd alt-repo &&
+	 test_commit one &&
+	 test_commit two
+	) &&
+	alt_head=`cd alt-repo && git rev-parse HEAD`
+'
+
+test_expect_success 'alt objects are missing' '
+	test_must_fail git log --format=%s $alt_head
+'
+
+test_expect_success 'helper can retrieve alt objects' '
+	test_config odb.magic.command "$HELPER" &&
+	cat >expect <<-\EOF &&
+	two
+	one
+	EOF
+	git log --format=%s $alt_head >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 10/49] external odb foreach
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (8 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 09/49] Add initial external odb support Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 11/49] t0400: add 'put' command to odb-helper script Christian Couder
                   ` (41 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

From: Jeff King <peff@peff.net>

---
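The iteration contract used here is the usual one: the callback is
called once per object, and the first non-zero return value stops the
walk and is handed back to the caller. A minimal standalone C sketch of
that contract follows. It is only an illustration; struct ext_object,
for_each_ext_object() and sum_sizes() are made-up names, and the real
callback in this patch takes (sha1, type, size, data) instead of a
struct pointer.

```c
#include <assert.h>
#include <stddef.h>

struct ext_object {
	unsigned long size;
	int type;
};

typedef int (*each_object_fn)(const struct ext_object *obj, void *data);

/*
 * Call fn for each object in order. Stop at the first non-zero return
 * value and propagate it; return 0 if every call returned 0.
 */
int for_each_ext_object(const struct ext_object *objs, size_t nr,
			each_object_fn fn, void *data)
{
	size_t i;

	assert(fn);
	for (i = 0; i < nr; i++) {
		int r = fn(&objs[i], data);
		if (r)
			return r;
	}
	return 0;
}

/* Example callback: add up sizes, abort the walk on a zero-sized entry. */
int sum_sizes(const struct ext_object *obj, void *data)
{
	unsigned long *total = data;

	if (!obj->size)
		return -1;
	*total += obj->size;
	return 0;
}
```
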
 external-odb.c | 14 ++++++++++++++
 external-odb.h |  6 ++++++
 odb-helper.c   | 15 +++++++++++++++
 odb-helper.h   |  4 ++++
 4 files changed, 39 insertions(+)

diff --git a/external-odb.c b/external-odb.c
index 1ccfa99a01..42978a3298 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -113,3 +113,17 @@ int external_odb_fetch_object(const unsigned char *sha1)
 
 	return -1;
 }
+
+int external_odb_for_each_object(each_external_object_fn fn, void *data)
+{
+	struct odb_helper *o;
+
+	external_odb_init();
+
+	for (o = helpers; o; o = o->next) {
+		int r = odb_helper_for_each_object(o, fn, data);
+		if (r)
+			return r;
+	}
+	return 0;
+}
diff --git a/external-odb.h b/external-odb.h
index 2397477684..cea8570a49 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -5,4 +5,10 @@ const char *external_odb_root(void);
 int external_odb_has_object(const unsigned char *sha1);
 int external_odb_fetch_object(const unsigned char *sha1);
 
+typedef int (*each_external_object_fn)(const unsigned char *sha1,
+				       enum object_type type,
+				       unsigned long size,
+				       void *data);
+int external_odb_for_each_object(each_external_object_fn, void *);
+
 #endif /* EXTERNAL_ODB_H */
diff --git a/odb-helper.c b/odb-helper.c
index de5562da9c..d8ef5cbf4b 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -243,3 +243,18 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 
 	return 0;
 }
+
+int odb_helper_for_each_object(struct odb_helper *o,
+			       each_external_object_fn fn,
+			       void *data)
+{
+	int i;
+	for (i = 0; i < o->have_nr; i++) {
+		struct odb_helper_object *obj = &o->have[i];
+		int r = fn(obj->sha1, obj->type, obj->size, data);
+		if (r)
+			return r;
+	}
+
+	return 0;
+}
diff --git a/odb-helper.h b/odb-helper.h
index 0f704f9452..8c3916d215 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -1,6 +1,8 @@
 #ifndef ODB_HELPER_H
 #define ODB_HELPER_H
 
+#include "external-odb.h"
+
 struct odb_helper {
 	const char *name;
 	const char *cmd;
@@ -21,5 +23,7 @@ struct odb_helper *odb_helper_new(const char *name, int namelen);
 int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
 int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 			    int fd);
+int odb_helper_for_each_object(struct odb_helper *o,
+			       each_external_object_fn, void *);
 
 #endif /* ODB_HELPER_H */
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 11/49] t0400: add 'put' command to odb-helper script
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (9 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 10/49] external odb foreach Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 12/49] external odb: add write support Christian Couder
                   ` (40 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
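Both the 'get' command and the new test locate a loose object with
`sed 's#..#&/#'`, which turns a 40-character hex sha1 into the
"xx/xxxx..." path below the objects directory. For readers who want
that mapping spelled out, here is a small standalone C sketch. It is an
illustration only; loose_object_relpath() is a made-up name and is not
an existing Git function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the loose-object path for a 40-character hex sha1, relative to
 * the objects directory: the first two hex digits name the directory,
 * the remaining 38 name the file.
 * Returns 0 on success, -1 if hex has the wrong length or out is too small.
 */
int loose_object_relpath(char *out, size_t outsz, const char *hex)
{
	assert(out && hex);
	if (strlen(hex) != 40 || outsz < 42)	/* 2 + '/' + 38 + NUL */
		return -1;
	snprintf(out, outsz, "%.2s/%s", hex, hex + 2);
	return 0;
}
```
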
 t/t0400-external-odb.sh | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index fe85413725..6c6da5cf4f 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -7,6 +7,10 @@ test_description='basic tests for external object databases'
 ALT_SOURCE="$PWD/alt-repo/.git"
 export ALT_SOURCE
 write_script odb-helper <<\EOF
+die() {
+	printf >&2 "%s\n" "$@"
+	exit 1
+}
 GIT_DIR=$ALT_SOURCE; export GIT_DIR
 case "$1" in
 have)
@@ -16,6 +20,16 @@ have)
 get)
 	cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
 	;;
+put)
+	sha1="$2"
+	size="$3"
+	kind="$4"
+	written=$(git hash-object -w -t "$kind" --stdin)
+	test "$written" = "$sha1" || die "bad sha1 passed '$sha1' vs written '$written'"
+	;;
+*)
+	die "unknown command '$1'"
+	;;
 esac
 EOF
 HELPER="\"$PWD\"/odb-helper"
@@ -43,4 +57,13 @@ test_expect_success 'helper can retrieve alt objects' '
 	test_cmp expect actual
 '
 
+test_expect_success 'helper can add objects to alt repo' '
+	hash=$(echo "Hello odb!" | git hash-object -w -t blob --stdin) &&
+	test -f .git/objects/$(echo $hash | sed "s#..#&/#") &&
+	size=$(git cat-file -s "$hash") &&
+	git cat-file blob "$hash" | ./odb-helper put "$hash" "$size" blob &&
+	alt_size=$(cd alt-repo && git cat-file -s "$hash") &&
+	test "$size" -eq "$alt_size"
+'
+
 test_done
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 12/49] external odb: add write support
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (10 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 11/49] t0400: add 'put' command to odb-helper script Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 13/49] external-odb: accept only blobs for now Christian Couder
                   ` (39 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
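One thing worth a second look in review: the write loop in
odb_helper_write_object() decrements 'len' after each xwrite() but
passes the same 'buf' on every iteration, so a short write would resend
the start of the buffer. Below is a minimal standalone sketch of a loop
that advances the pointer as well. It is an illustration only, using
plain POSIX write() instead of xwrite(); write_all() is a made-up name.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Write exactly len bytes from buf to fd, coping with short writes by
 * advancing the source pointer by the amount actually written.
 * Returns 0 on success, -1 on error (errno is left as set by write()).
 */
int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	assert(fd >= 0);
	while (len > 0) {
		ssize_t w = write(fd, p, len);
		if (w < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, just retry */
			return -1;
		}
		p += w;			/* skip what was already sent */
		len -= (size_t)w;
	}
	return 0;
}
```
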
 external-odb.c | 15 +++++++++++++++
 external-odb.h |  2 ++
 odb-helper.c   | 41 +++++++++++++++++++++++++++++++++++++----
 odb-helper.h   |  3 +++
 sha1_file.c    |  2 ++
 5 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 42978a3298..893937a7d4 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -127,3 +127,18 @@ int external_odb_for_each_object(each_external_object_fn fn, void *data)
 	}
 	return 0;
 }
+
+int external_odb_write_object(const void *buf, size_t len,
+			      const char *type, unsigned char *sha1)
+{
+	struct odb_helper *o;
+
+	external_odb_init();
+
+	for (o = helpers; o; o = o->next) {
+		int r = odb_helper_write_object(o, buf, len, type, sha1);
+		if (r <= 0)
+			return r;
+	}
+	return 1;
+}
diff --git a/external-odb.h b/external-odb.h
index cea8570a49..53879e900d 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -10,5 +10,7 @@ typedef int (*each_external_object_fn)(const unsigned char *sha1,
 				       unsigned long size,
 				       void *data);
 int external_odb_for_each_object(each_external_object_fn, void *);
+int external_odb_write_object(const void *buf, size_t len,
+			      const char *type, unsigned char *sha1);
 
 #endif /* EXTERNAL_ODB_H */
diff --git a/odb-helper.c b/odb-helper.c
index d8ef5cbf4b..af7cc55ca2 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -33,9 +33,10 @@ static void prepare_helper_command(struct argv_array *argv, const char *cmd,
 	strbuf_release(&buf);
 }
 
-__attribute__((format (printf,3,4)))
+__attribute__((format (printf,4,5)))
 static int odb_helper_start(struct odb_helper *o,
 			    struct odb_helper_cmd *cmd,
+			    int use_stdin,
 			    const char *fmt, ...)
 {
 	va_list ap;
@@ -52,7 +53,10 @@ static int odb_helper_start(struct odb_helper *o,
 
 	cmd->child.argv = cmd->argv.argv;
 	cmd->child.use_shell = 1;
-	cmd->child.no_stdin = 1;
+	if (use_stdin)
+		cmd->child.in = -1;
+	else
+		cmd->child.no_stdin = 1;
 	cmd->child.out = -1;
 
 	if (start_command(&cmd->child) < 0) {
@@ -121,7 +125,7 @@ static void odb_helper_load_have(struct odb_helper *o)
 		return;
 	o->have_valid = 1;
 
-	if (odb_helper_start(o, &cmd, "have") < 0)
+	if (odb_helper_start(o, &cmd, 0, "have") < 0)
 		return;
 
 	fh = xfdopen(cmd.child.out, "r");
@@ -170,7 +174,7 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 	if (!obj)
 		return -1;
 
-	if (odb_helper_start(o, &cmd, "get %s", sha1_to_hex(sha1)) < 0)
+	if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
 		return -1;
 
 	memset(&stream, 0, sizeof(stream));
@@ -258,3 +262,32 @@ int odb_helper_for_each_object(struct odb_helper *o,
 
 	return 0;
 }
+
+int odb_helper_write_object(struct odb_helper *o,
+			    const void *buf, size_t len,
+			    const char *type, unsigned char *sha1)
+{
+	struct odb_helper_cmd cmd;
+
+	if (odb_helper_start(o, &cmd, 1, "put %s %"PRIuMAX" %s",
+			     sha1_to_hex(sha1), (uintmax_t)len, type) < 0)
+		return -1;
+
+	do {
+		int w = xwrite(cmd.child.in, buf, len);
+		if (w < 0) {
+			error("unable to write to odb helper '%s': %s",
+			      o->name, strerror(errno));
+			close(cmd.child.in);
+			close(cmd.child.out);
+			odb_helper_finish(o, &cmd);
+			return -1;
+		}
+		len -= w;
+	} while (len > 0);
+
+	close(cmd.child.in);
+	close(cmd.child.out);
+	odb_helper_finish(o, &cmd);
+	return 0;
+}
diff --git a/odb-helper.h b/odb-helper.h
index 8c3916d215..4e321195e8 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -25,5 +25,8 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 			    int fd);
 int odb_helper_for_each_object(struct odb_helper *o,
 			       each_external_object_fn, void *);
+int odb_helper_write_object(struct odb_helper *o,
+			    const void *buf, size_t len,
+			    const char *type, unsigned char *sha1);
 
 #endif /* ODB_HELPER_H */
diff --git a/sha1_file.c b/sha1_file.c
index f87c59d711..8dd09334cf 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3450,6 +3450,8 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign
 	 * it out into .git/objects/??/?{38} file.
 	 */
 	write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen);
+	if (!external_odb_write_object(buf, len, type, sha1))
+		return 0;
 	if (freshen_packed_object(sha1) || freshen_loose_object(sha1))
 		return 0;
 	return write_loose_object(sha1, hdr, hdrlen, buf, len, 0);
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread
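
A note on the write loop in odb_helper_write_object() in patch 12/49
above: it passes the same 'buf' to xwrite() on every iteration and only
decrements 'len', so after a short write it would appear to resend the
start of the buffer. Below is a minimal sketch of a loop that also
advances the pointer. It uses plain POSIX write(2) instead of Git's
xwrite(), and write_all() is a hypothetical name, not a function from
this series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the series): write all 'len' bytes
 * of 'buf' to 'fd', advancing the pointer after each partial write.
 * Returns 0 on success, -1 on error with errno set.
 */
int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	assert(buf || !len);
	while (len > 0) {
		ssize_t w = write(fd, p, len);

		if (w < 0) {
			if (errno == EINTR)
				continue; /* interrupted before any byte was written */
			return -1;
		}
		if (w == 0) {
			/* no progress on a non-empty write: report it as an error */
			errno = ENOSPC;
			return -1;
		}
		p += w;   /* skip the bytes that were already written */
		len -= w;
	}
	return 0;
}
```

Git's own write_in_full() already behaves this way, so the loop in the
patch could probably be replaced by a single call to it.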

* [RFC/PATCH v4 13/49] external-odb: accept only blobs for now
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (11 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 12/49] external odb: add write support Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 14/49] t0400: add test for external odb write support Christian Couder
                   ` (38 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 external-odb.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/external-odb.c b/external-odb.c
index 893937a7d4..6d4fdd0bc1 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -133,6 +133,10 @@ int external_odb_write_object(const void *buf, size_t len,
 {
 	struct odb_helper *o;
 
+	/* For now accept only blobs */
+	if (strcmp(type, "blob"))
+		return 1;
+
 	external_odb_init();
 
 	for (o = helpers; o; o = o->next) {
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread
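
The return convention that patch 13/49 relies on can be sketched as
follows. write_sha1_file() in patch 12/49 returns early only when
external_odb_write_object() returns 0; any other value falls through to
the normal loose-object path, which is why returning 1 for non-blobs is
enough. All names below are hypothetical stand-ins, and the
'helper_accepts' flag is an assumption standing in for the loop over
the registered helpers.

```c
#include <assert.h>
#include <string.h>

enum write_dest { DEST_EXTERNAL, DEST_LOOSE };

/* Stand-in for external_odb_write_object(): 0 means "stored externally". */
int try_external_write(const char *type, int helper_accepts)
{
	assert(type);
	/* For now accept only blobs */
	if (strcmp(type, "blob"))
		return 1;
	return helper_accepts ? 0 : 1;
}

/* Stand-in for the decision made in write_sha1_file(). */
enum write_dest write_destination(const char *type, int helper_accepts)
{
	if (!try_external_write(type, helper_accepts))
		return DEST_EXTERNAL;
	return DEST_LOOSE; /* fall back to the local object store */
}
```

With this convention trees, commits and tags always end up in the local
object store, whatever the helper would have done with them.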

* [RFC/PATCH v4 14/49] t0400: add test for external odb write support
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (12 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 13/49] external-odb: accept only blobs for now Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 15/49] Add GIT_NO_EXTERNAL_ODB env variable Christian Couder
                   ` (37 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0400-external-odb.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index 6c6da5cf4f..3c868cad4c 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -66,4 +66,12 @@ test_expect_success 'helper can add objects to alt repo' '
 	test "$size" -eq "$alt_size"
 '
 
+test_expect_success 'commit adds objects to alt repo' '
+	test_config odb.magic.command "$HELPER" &&
+	test_commit three &&
+	hash3=$(git ls-tree HEAD | grep three.t | cut -f1 | cut -d\  -f3) &&
+	content=$(cd alt-repo && git show "$hash3") &&
+	test "$content" = "three"
+'
+
 test_done
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [RFC/PATCH v4 15/49] Add GIT_NO_EXTERNAL_ODB env variable
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (13 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 14/49] t0400: add test for external odb write support Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 16/49] Add t0410 to test external ODB transfer Christian Couder
                   ` (36 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h        | 9 +++++++++
 environment.c  | 4 ++++
 external-odb.c | 6 ++++++
 sha1_file.c    | 3 +++
 4 files changed, 22 insertions(+)

diff --git a/cache.h b/cache.h
index 391a69e9c5..6047755629 100644
--- a/cache.h
+++ b/cache.h
@@ -428,6 +428,7 @@ static inline enum object_type object_type(unsigned int mode)
 #define CEILING_DIRECTORIES_ENVIRONMENT "GIT_CEILING_DIRECTORIES"
 #define NO_REPLACE_OBJECTS_ENVIRONMENT "GIT_NO_REPLACE_OBJECTS"
 #define GIT_REPLACE_REF_BASE_ENVIRONMENT "GIT_REPLACE_REF_BASE"
+#define NO_EXTERNAL_ODB_ENVIRONMENT "GIT_NO_EXTERNAL_ODB"
 #define GITATTRIBUTES_FILE ".gitattributes"
 #define INFOATTRIBUTES_FILE "info/attributes"
 #define ATTRIBUTE_MACRO_PREFIX "[attr]"
@@ -760,6 +761,14 @@ void reset_shared_repository(void);
 extern int check_replace_refs;
 extern char *git_replace_ref_base;
 
+/*
+ * Do external odbs need to be used this run?  This variable is
+ * initialized to true unless $GIT_NO_EXTERNAL_ODB is set, but it
+ * may be set to false by some commands that do not want external
+ * odbs to be active.
+ */
+extern int use_external_odb;
+
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
diff --git a/environment.c b/environment.c
index aa478e71de..8c4f52635c 100644
--- a/environment.c
+++ b/environment.c
@@ -46,6 +46,7 @@ const char *excludes_file;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
 int check_replace_refs = 1;
 char *git_replace_ref_base;
+int use_external_odb = 1;
 enum eol core_eol = EOL_UNSET;
 enum safe_crlf safe_crlf = SAFE_CRLF_WARN;
 unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;
@@ -120,6 +121,7 @@ const char * const local_repo_env[] = {
 	INDEX_ENVIRONMENT,
 	NO_REPLACE_OBJECTS_ENVIRONMENT,
 	GIT_REPLACE_REF_BASE_ENVIRONMENT,
+	NO_EXTERNAL_ODB_ENVIRONMENT,
 	GIT_PREFIX_ENVIRONMENT,
 	GIT_SUPER_PREFIX_ENVIRONMENT,
 	GIT_SHALLOW_FILE_ENVIRONMENT,
@@ -188,6 +190,8 @@ static void setup_git_env(void)
 	replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT);
 	git_replace_ref_base = xstrdup(replace_ref_base ? replace_ref_base
 							  : "refs/replace/");
+	if (getenv(NO_EXTERNAL_ODB_ENVIRONMENT))
+		use_external_odb = 0;
 	namespace = expand_namespace(getenv(GIT_NAMESPACE_ENVIRONMENT));
 	namespace_len = strlen(namespace);
 	shallow_file = getenv(GIT_SHALLOW_FILE_ENVIRONMENT);
diff --git a/external-odb.c b/external-odb.c
index 6d4fdd0bc1..a88837feda 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -63,6 +63,9 @@ int external_odb_has_object(const unsigned char *sha1)
 {
 	struct odb_helper *o;
 
+	if (!use_external_odb)
+		return 0;
+
 	external_odb_init();
 
 	for (o = helpers; o; o = o->next)
@@ -133,6 +136,9 @@ int external_odb_write_object(const void *buf, size_t len,
 {
 	struct odb_helper *o;
 
+	if (!use_external_odb)
+		return 1;
+
 	/* For now accept only blobs */
 	if (strcmp(type, "blob"))
 		return 1;
diff --git a/sha1_file.c b/sha1_file.c
index 8dd09334cf..9d8e37432e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -642,6 +642,9 @@ void prepare_external_alt_odb(void)
 	static int linked_external;
 	const char *path;
 
+	if (!use_external_odb)
+		return;
+
 	if (linked_external)
 		return;
 
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread
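
One consequence of the getenv() test in setup_git_env() in patch 15/49
is that only the presence of the variable matters, not its value:
GIT_NO_EXTERNAL_ODB=0 and an empty GIT_NO_EXTERNAL_ODB= both disable
external odbs. A minimal sketch of that check, with hypothetical
function names:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * 'value' is what getenv() returned: NULL when the variable is unset,
 * a (possibly empty) string when it is set.
 */
int external_odb_allowed(const char *value)
{
	return value == NULL;
}

/* Hypothetical wrapper: look the variable up in the environment. */
int use_external_odb_from_env(const char *name)
{
	assert(name && *name);
	return external_odb_allowed(getenv(name));
}
```

Parsing the value as a boolean would be a different design; the sketch
only mirrors the presence test used in the patch.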

* [RFC/PATCH v4 16/49] Add t0410 to test external ODB transfer
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (14 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 15/49] Add GIT_NO_EXTERNAL_ODB env variable Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 17/49] lib-httpd: pass config file to start_httpd() Christian Couder
                   ` (35 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0410-transfer-e-odb.sh | 136 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)
 create mode 100755 t/t0410-transfer-e-odb.sh

diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh
new file mode 100755
index 0000000000..868b55db94
--- /dev/null
+++ b/t/t0410-transfer-e-odb.sh
@@ -0,0 +1,136 @@
+#!/bin/sh
+
+test_description='basic tests for transferring external ODBs'
+
+. ./test-lib.sh
+
+ORIG_SOURCE="$PWD/.git"
+export ORIG_SOURCE
+
+ALT_SOURCE1="$PWD/alt-repo1/.git"
+export ALT_SOURCE1
+write_script odb-helper1 <<\EOF
+die() {
+	printf >&2 "%s\n" "$@"
+	exit 1
+}
+GIT_DIR=$ALT_SOURCE1; export GIT_DIR
+case "$1" in
+have)
+	git cat-file --batch-check --batch-all-objects |
+	awk '{print $1 " " $3 " " $2}'
+	;;
+get)
+	cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+	;;
+put)
+	sha1="$2"
+	size="$3"
+	kind="$4"
+	writen=$(git hash-object -w -t "$kind" --stdin)
+	test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'"
+	ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$ORIG_SOURCE GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+	GIT_DIR=$ORIG_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+	;;
+*)
+	die "unknown command '$1'"
+	;;
+esac
+EOF
+HELPER1="\"$PWD\"/odb-helper1"
+
+OTHER_SOURCE="$PWD/.git"
+export OTHER_SOURCE
+
+ALT_SOURCE2="$PWD/alt-repo2/.git"
+export ALT_SOURCE2
+write_script odb-helper2 <<\EOF
+die() {
+	printf >&2 "%s\n" "$@"
+	exit 1
+}
+GIT_DIR=$ALT_SOURCE2; export GIT_DIR
+case "$1" in
+have)
+	GIT_DIR=$OTHER_SOURCE git for-each-ref --format='%(objectname)' refs/odbs/magic/ | GIT_DIR=$OTHER_SOURCE xargs git show
+	;;
+get)
+	OBJ_FILE="$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+	if ! test -f "$OBJ_FILE"
+	then
+		# "Download" the missing object by copying it from alt-repo1
+		OBJ_DIR=$(echo $2 | sed 's/\(..\).*/\1/')
+		OBJ_BASE=$(basename "$OBJ_FILE")
+		ALT_OBJ_DIR1="$ALT_SOURCE1/objects/$OBJ_DIR"
+		ALT_OBJ_DIR2="$ALT_SOURCE2/objects/$OBJ_DIR"
+		mkdir -p "$ALT_OBJ_DIR2" || die "Could not mkdir '$ALT_OBJ_DIR2'"
+		OBJ_SRC="$ALT_OBJ_DIR1/$OBJ_BASE"
+		cp "$OBJ_SRC" "$ALT_OBJ_DIR2" ||
+		die "Could not cp '$OBJ_SRC' into '$ALT_OBJ_DIR2'"
+	fi
+	cat "$OBJ_FILE" || die "Could not cat '$OBJ_FILE'"
+	;;
+put)
+	sha1="$2"
+	size="$3"
+	kind="$4"
+	writen=$(git hash-object -w -t "$kind" --stdin)
+	test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'"
+	ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$OTHER_SOURCE GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+	GIT_DIR=$OTHER_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+	;;
+*)
+	die "unknown command '$1'"
+	;;
+esac
+EOF
+HELPER2="\"$PWD\"/odb-helper2"
+
+test_expect_success 'setup first alternate repo' '
+	git init alt-repo1 &&
+	test_commit zero &&
+	git config odb.magic.command "$HELPER1"
+'
+
+test_expect_success 'setup other repo and its alternate repo' '
+	git init other-repo &&
+	git init alt-repo2 &&
+	(cd other-repo &&
+	 git remote add origin .. &&
+	 git pull origin master &&
+	 git checkout master &&
+	 git log)
+'
+
+test_expect_success 'new blobs are put in first object store' '
+	test_commit one &&
+	hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+	content=$(cd alt-repo1 && git show "$hash1") &&
+	test "$content" = "one" &&
+	test_commit two &&
+	hash2=$(git ls-tree HEAD | grep two.t | cut -f1 | cut -d\  -f3) &&
+	content=$(cd alt-repo1 && git show "$hash2") &&
+	test "$content" = "two"
+'
+
+test_expect_success 'other repo gets the blobs from object store' '
+	(cd other-repo &&
+	 git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+	 test_must_fail git cat-file blob "$hash1" &&
+	 test_must_fail git cat-file blob "$hash2" &&
+	 git config odb.magic.command "$HELPER2" &&
+	 git cat-file blob "$hash1" &&
+	 git cat-file blob "$hash2"
+	)
+'
+
+test_expect_success 'other repo gets everything else' '
+	(cd other-repo &&
+	 git fetch origin &&
+	 content=$(git show "$hash1") &&
+	 test "$content" = "one" &&
+	 content=$(git show "$hash2") &&
+	 test "$content" = "two")
+'
+
+test_done
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread
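
In the "get" branches of the helpers in patch 16/49, the loose object
file is located with sed 's#..#&/#', which inserts a slash after the
first two hex digits of the sha1. The same mapping can be sketched in C
as follows; loose_object_path() is a hypothetical name used only for
illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<objects_dir>/xx/yyyy..." from a hex object name.
 * Returns 0 on success, -1 if 'hex' is too short or 'out' is too small.
 */
int loose_object_path(char *out, size_t outlen,
		      const char *objects_dir, const char *hex)
{
	int n;

	assert(out && objects_dir && hex);
	if (strlen(hex) < 3)
		return -1;
	/* first two hex digits name the directory, the rest the file */
	n = snprintf(out, outlen, "%s/%.2s/%s", objects_dir, hex, hex + 2);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return 0;
}
```

This only covers loose objects, like the test helpers themselves: an
object that has been packed in the alternate repo would not be found
this way.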

* [RFC/PATCH v4 17/49] lib-httpd: pass config file to start_httpd()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (15 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 16/49] Add t0410 to test external ODB transfer Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 18/49] lib-httpd: add upload.sh Christian Couder
                   ` (34 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

This makes it possible to start an apache web server with different
config files.

This will be used in a later patch to pass a config file that makes
apache store external objects.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/lib-httpd.sh | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 435a37465a..2e659a8ee2 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -171,12 +171,14 @@ prepare_httpd() {
 }
 
 start_httpd() {
+	APACHE_CONF_FILE=${1-apache.conf}
+
 	prepare_httpd >&3 2>&4
 
 	trap 'code=$?; stop_httpd; (exit $code); die' EXIT
 
 	"$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \
-		-f "$TEST_PATH/apache.conf" $HTTPD_PARA \
+		-f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA \
 		-c "Listen 127.0.0.1:$LIB_HTTPD_PORT" -k start \
 		>&3 2>&4
 	if test $? -ne 0
@@ -191,7 +193,7 @@ stop_httpd() {
 	trap 'die' EXIT
 
 	"$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \
-		-f "$TEST_PATH/apache.conf" $HTTPD_PARA -k stop
+		-f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA -k stop
 }
 
 test_http_push_nonff () {
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [RFC/PATCH v4 18/49] lib-httpd: add upload.sh
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (16 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 17/49] lib-httpd: pass config file to start_httpd() Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 19/49] lib-httpd: add list.sh Christian Couder
                   ` (33 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

This cgi will be used to upload objects to, or to delete
objects from, an apache web server.

This way the apache server can work as an external object
database.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/lib-httpd.sh        |  1 +
 t/lib-httpd/upload.sh | 45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)
 create mode 100644 t/lib-httpd/upload.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 2e659a8ee2..d80b004549 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -132,6 +132,7 @@ prepare_httpd() {
 	cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
 	install_script broken-smart-http.sh
 	install_script error.sh
+	install_script upload.sh
 
 	ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/upload.sh b/t/lib-httpd/upload.sh
new file mode 100644
index 0000000000..172be0f73f
--- /dev/null
+++ b/t/lib-httpd/upload.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+# In part from http://codereview.stackexchange.com/questions/79549/bash-cgi-upload-file
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+    key=${1%=*}
+    val=${1#*=}
+
+    case "$key" in
+	"sha1") sha1="$val" ;;
+	"type") type="$val" ;;
+	"size") size="$val" ;;
+	"delete") delete=1 ;;
+	*) echo >&2 "unknown key '$key'" ;;
+    esac
+
+    shift
+done
+
+case "$REQUEST_METHOD" in
+  POST)
+    if test "$delete" = "1"
+    then
+	rm -f "$FILES_DIR/$sha1-$size-$type"
+    else
+	mkdir -p "$FILES_DIR"
+	cat >"$FILES_DIR/$sha1-$size-$type"
+    fi
+
+    echo 'Status: 204 No Content'
+    echo
+    ;;
+
+  *)
+    echo 'Status: 405 Method Not Allowed'
+    echo
+esac
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread
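
The query string handling in upload.sh from patch 18/49 splits
QUERY_STRING on '&' and then each pair on '='. A rough C equivalent is
sketched below. The struct, the field sizes and the function names are
assumptions made for illustration. Like the script, it does no URL
decoding and merely counts unknown keys instead of failing on them.

```c
#include <assert.h>
#include <string.h>

struct upload_params {
	char sha1[41];
	char type[16];
	char size[24];
	int do_delete;
};

/* Copy at most dstlen - 1 bytes and always NUL-terminate. */
static void set_field(char *dst, size_t dstlen, const char *src, size_t n)
{
	if (n >= dstlen)
		n = dstlen - 1; /* truncate rather than overflow */
	memcpy(dst, src, n);
	dst[n] = '\0';
}

/* Returns the number of unknown keys that were seen. */
int parse_upload_query(const char *query, struct upload_params *p)
{
	const char *s = query;
	int unknown = 0;

	assert(query && p);
	memset(p, 0, sizeof(*p));
	while (*s) {
		size_t plen = strcspn(s, "&"); /* one "key=value" pair */
		const char *eq = memchr(s, '=', plen);
		size_t klen = eq ? (size_t)(eq - s) : plen;
		const char *val = eq ? eq + 1 : "";
		size_t vlen = eq ? plen - klen - 1 : 0;

		if (klen == 4 && !memcmp(s, "sha1", 4))
			set_field(p->sha1, sizeof(p->sha1), val, vlen);
		else if (klen == 4 && !memcmp(s, "type", 4))
			set_field(p->type, sizeof(p->type), val, vlen);
		else if (klen == 4 && !memcmp(s, "size", 4))
			set_field(p->size, sizeof(p->size), val, vlen);
		else if (klen == 6 && !memcmp(s, "delete", 6))
			p->do_delete = 1;
		else
			unknown++;

		s += plen;
		if (*s == '&')
			s++;
	}
	return unknown;
}
```

The sketch splits each pair at the first '=', which is equivalent to
the script for the hex, decimal and type values used here.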

* [RFC/PATCH v4 19/49] lib-httpd: add list.sh
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (17 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 18/49] lib-httpd: add upload.sh Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 20/49] lib-httpd: add apache-e-odb.conf Christian Couder
                   ` (32 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

This cgi script can list Git objects that have been uploaded as
files to an apache web server. This script can also retrieve
the content of each of these files.

This will help make apache work as an external object database.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/lib-httpd.sh      |  1 +
 t/lib-httpd/list.sh | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)
 create mode 100644 t/lib-httpd/list.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index d80b004549..f31ea261f5 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -133,6 +133,7 @@ prepare_httpd() {
 	install_script broken-smart-http.sh
 	install_script error.sh
 	install_script upload.sh
+	install_script list.sh
 
 	ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/list.sh b/t/lib-httpd/list.sh
new file mode 100644
index 0000000000..7e520e507a
--- /dev/null
+++ b/t/lib-httpd/list.sh
@@ -0,0 +1,41 @@
+#!/bin/sh
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+    key=${1%=*}
+    val=${1#*=}
+
+    case "$key" in
+	"sha1") sha1="$val" ;;
+	*) echo >&2 "unknown key '$key'" ;;
+    esac
+
+    shift
+done
+
+if test -d "$FILES_DIR"
+then
+    if test -z "$sha1"
+    then
+	echo 'Status: 200 OK'
+	echo
+	ls "$FILES_DIR" | tr '-' ' '
+    else
+	if test -f "$FILES_DIR/$sha1"-*
+	then
+	    echo 'Status: 200 OK'
+	    echo
+	    cat "$FILES_DIR/$sha1"-*
+	else
+	    echo 'Status: 404 Not Found'
+	    echo
+	fi
+    fi
+fi
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread
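
list.sh in patch 19/49 turns each "<sha1>-<size>-<type>" file name
into a "<sha1> <size> <type>" line, which is the field order that the
"have" instruction is described as returning (sha1, size and type). A
consumer of that listing could parse one line as sketched below; the
struct and the function name are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct listed_object {
	char sha1[41];
	unsigned long size;
	char type[16];
};

/* Parse one "<sha1> <size> <type>" line. Returns 0 on success, -1 otherwise. */
int parse_list_line(const char *line, struct listed_object *obj)
{
	assert(line && obj);
	if (sscanf(line, "%40[0-9a-f] %lu %15s",
		   obj->sha1, &obj->size, obj->type) != 3)
		return -1;
	if (strlen(obj->sha1) != 40)
		return -1; /* not a full sha1 */
	return 0;
}
```

Replacing every '-' with a space is safe here only because none of the
three fields can contain a dash.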

* [RFC/PATCH v4 20/49] lib-httpd: add apache-e-odb.conf
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (18 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 19/49] lib-httpd: add list.sh Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 21/49] odb-helper: add 'store_plain_objects' to 'struct odb_helper' Christian Couder
                   ` (31 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

This is an apache config file to test external object databases.
It uses the upload.sh and list.sh cgi scripts that were added
previously to make apache store external objects.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/lib-httpd/apache-e-odb.conf | 214 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 214 insertions(+)
 create mode 100644 t/lib-httpd/apache-e-odb.conf

diff --git a/t/lib-httpd/apache-e-odb.conf b/t/lib-httpd/apache-e-odb.conf
new file mode 100644
index 0000000000..19a1540c82
--- /dev/null
+++ b/t/lib-httpd/apache-e-odb.conf
@@ -0,0 +1,214 @@
+ServerName dummy
+PidFile httpd.pid
+DocumentRoot www
+LogFormat "%h %l %u %t \"%r\" %>s %b" common
+CustomLog access.log common
+ErrorLog error.log
+<IfModule !mod_log_config.c>
+	LoadModule log_config_module modules/mod_log_config.so
+</IfModule>
+<IfModule !mod_alias.c>
+	LoadModule alias_module modules/mod_alias.so
+</IfModule>
+<IfModule !mod_cgi.c>
+	LoadModule cgi_module modules/mod_cgi.so
+</IfModule>
+<IfModule !mod_env.c>
+	LoadModule env_module modules/mod_env.so
+</IfModule>
+<IfModule !mod_rewrite.c>
+	LoadModule rewrite_module modules/mod_rewrite.so
+</IFModule>
+<IfModule !mod_version.c>
+	LoadModule version_module modules/mod_version.so
+</IfModule>
+<IfModule !mod_headers.c>
+	LoadModule headers_module modules/mod_headers.so
+</IfModule>
+
+<IfVersion < 2.4>
+LockFile accept.lock
+</IfVersion>
+
+<IfVersion < 2.1>
+<IfModule !mod_auth.c>
+	LoadModule auth_module modules/mod_auth.so
+</IfModule>
+</IfVersion>
+
+<IfVersion >= 2.1>
+<IfModule !mod_auth_basic.c>
+	LoadModule auth_basic_module modules/mod_auth_basic.so
+</IfModule>
+<IfModule !mod_authn_file.c>
+	LoadModule authn_file_module modules/mod_authn_file.so
+</IfModule>
+<IfModule !mod_authz_user.c>
+	LoadModule authz_user_module modules/mod_authz_user.so
+</IfModule>
+<IfModule !mod_authz_host.c>
+	LoadModule authz_host_module modules/mod_authz_host.so
+</IfModule>
+</IfVersion>
+
+<IfVersion >= 2.4>
+<IfModule !mod_authn_core.c>
+	LoadModule authn_core_module modules/mod_authn_core.so
+</IfModule>
+<IfModule !mod_authz_core.c>
+	LoadModule authz_core_module modules/mod_authz_core.so
+</IfModule>
+<IfModule !mod_access_compat.c>
+	LoadModule access_compat_module modules/mod_access_compat.so
+</IfModule>
+<IfModule !mod_mpm_prefork.c>
+	LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
+</IfModule>
+<IfModule !mod_unixd.c>
+	LoadModule unixd_module modules/mod_unixd.so
+</IfModule>
+</IfVersion>
+
+PassEnv GIT_VALGRIND
+PassEnv GIT_VALGRIND_OPTIONS
+PassEnv GNUPGHOME
+PassEnv ASAN_OPTIONS
+PassEnv GIT_TRACE
+PassEnv GIT_CONFIG_NOSYSTEM
+
+Alias /dumb/ www/
+Alias /auth/dumb/ www/auth/dumb/
+
+<LocationMatch /smart/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+</LocationMatch>
+<LocationMatch /smart_noexport/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+</LocationMatch>
+<LocationMatch /smart_custom_env/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+	SetEnv GIT_COMMITTER_NAME "Custom User"
+	SetEnv GIT_COMMITTER_EMAIL custom@example.com
+</LocationMatch>
+<LocationMatch /smart_namespace/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+	SetEnv GIT_NAMESPACE ns
+</LocationMatch>
+<LocationMatch /smart_cookies/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+	Header set Set-Cookie name=value
+</LocationMatch>
+<LocationMatch /smart_headers/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+</LocationMatch>
+ScriptAlias /upload/ upload.sh/
+ScriptAlias /list/ list.sh/
+<Directory ${GIT_EXEC_PATH}>
+	Options FollowSymlinks
+</Directory>
+<Files upload.sh>
+  Options ExecCGI
+</Files>
+<Files list.sh>
+  Options ExecCGI
+</Files>
+<Files ${GIT_EXEC_PATH}/git-http-backend>
+	Options ExecCGI
+</Files>
+
+RewriteEngine on
+RewriteRule ^/smart-redir-perm/(.*)$ /smart/$1 [R=301]
+RewriteRule ^/smart-redir-temp/(.*)$ /smart/$1 [R=302]
+RewriteRule ^/smart-redir-auth/(.*)$ /auth/smart/$1 [R=301]
+RewriteRule ^/smart-redir-limited/(.*)/info/refs$ /smart/$1/info/refs [R=301]
+RewriteRule ^/ftp-redir/(.*)$ ftp://localhost:1000/$1 [R=302]
+
+RewriteRule ^/loop-redir/x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-(.*) /$1 [R=302]
+RewriteRule ^/loop-redir/(.*)$ /loop-redir/x-$1 [R=302]
+
+# Apache 2.2 does not understand <RequireAll>, so we use RewriteCond.
+# And as RewriteCond does not allow testing for non-matches, we match
+# the desired case first (one has abra, two has cadabra), and let it
+# pass by marking the RewriteRule as [L], "last rule, do not process
+# any other matching RewriteRules after this"), and then have another
+# RewriteRule that matches all other cases and lets them fail via '[F]',
+# "fail the request".
+RewriteCond %{HTTP:x-magic-one} =abra
+RewriteCond %{HTTP:x-magic-two} =cadabra
+RewriteRule ^/smart_headers/.* - [L]
+RewriteRule ^/smart_headers/.* - [F]
+
+<IfDefine SSL>
+LoadModule ssl_module modules/mod_ssl.so
+
+SSLCertificateFile httpd.pem
+SSLCertificateKeyFile httpd.pem
+SSLRandomSeed startup file:/dev/urandom 512
+SSLRandomSeed connect file:/dev/urandom 512
+SSLSessionCache none
+SSLMutex file:ssl_mutex
+SSLEngine On
+</IfDefine>
+
+<Location /auth/>
+	AuthType Basic
+	AuthName "git-auth"
+	AuthUserFile passwd
+	Require valid-user
+</Location>
+
+<LocationMatch "^/auth-push/.*/git-receive-pack$">
+	AuthType Basic
+	AuthName "git-auth"
+	AuthUserFile passwd
+	Require valid-user
+</LocationMatch>
+
+<LocationMatch "^/auth-fetch/.*/git-upload-pack$">
+	AuthType Basic
+	AuthName "git-auth"
+	AuthUserFile passwd
+	Require valid-user
+</LocationMatch>
+
+RewriteCond %{QUERY_STRING} service=git-receive-pack [OR]
+RewriteCond %{REQUEST_URI} /git-receive-pack$
+RewriteRule ^/half-auth-complete/ - [E=AUTHREQUIRED:yes]
+
+<Location /half-auth-complete/>
+  Order Deny,Allow
+  Deny from env=AUTHREQUIRED
+
+  AuthType Basic
+  AuthName "Git Access"
+  AuthUserFile passwd
+  Require valid-user
+  Satisfy Any
+</Location>
+
+<IfDefine DAV>
+	LoadModule dav_module modules/mod_dav.so
+	LoadModule dav_fs_module modules/mod_dav_fs.so
+
+	DAVLockDB DAVLock
+	<Location /dumb/>
+		Dav on
+	</Location>
+	<Location /auth/dumb>
+		Dav on
+	</Location>
+</IfDefine>
+
+<IfDefine SVN>
+	LoadModule dav_svn_module modules/mod_dav_svn.so
+
+	<Location /${LIB_HTTPD_SVN}>
+		DAV svn
+		SVNPath "${LIB_HTTPD_SVNPATH}"
+	</Location>
+</IfDefine>
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [RFC/PATCH v4 21/49] odb-helper: add 'store_plain_objects' to 'struct odb_helper'
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (19 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 20/49] lib-httpd: add apache-e-odb.conf Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 22/49] pack-objects: don't pack objects in external odbs Christian Couder
                   ` (30 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

This adds a configuration option odb.<helper>.plainObjects and the
corresponding boolean variable called 'store_plain_objects' in
'struct odb_helper' to make it possible for external object
databases to store objects as plain objects instead of Git objects.

The existing odb_helper_fetch_object() is renamed
odb_helper_fetch_git_object() and a new odb_helper_fetch_plain_object()
is introduced to deal with external objects that are not in Git format.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 external-odb.c |   2 +
 odb-helper.c   | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 odb-helper.h   |   1 +
 3 files changed, 114 insertions(+), 2 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index a88837feda..d11fc98719 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -36,6 +36,8 @@ static int external_odb_config(const char *var, const char *value, void *data)
 
 	if (!strcmp(key, "command"))
 		return git_config_string(&o->cmd, var, value);
+	if (!strcmp(key, "plainobjects"))
+		o->store_plain_objects = git_config_bool(var, value);
 
 	return 0;
 }
diff --git a/odb-helper.c b/odb-helper.c
index af7cc55ca2..b33ee81c97 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -159,8 +159,107 @@ int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1)
 	return !!odb_helper_lookup(o, sha1);
 }
 
-int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
-			    int fd)
+static int odb_helper_fetch_plain_object(struct odb_helper *o,
+					 const unsigned char *sha1,
+					 int fd)
+{
+	struct odb_helper_object *obj;
+	struct odb_helper_cmd cmd;
+	unsigned long total_got = 0;
+
+	char hdr[32];
+	int hdrlen;
+
+	int ret = Z_STREAM_END;
+	unsigned char compressed[4096];
+	git_zstream stream;
+	git_SHA_CTX hash;
+	unsigned char real_sha1[20];
+
+	obj = odb_helper_lookup(o, sha1);
+	if (!obj)
+		return -1;
+
+	if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
+		return -1;
+
+	/* Set it up */
+	git_deflate_init(&stream, zlib_compression_level);
+	stream.next_out = compressed;
+	stream.avail_out = sizeof(compressed);
+	git_SHA1_Init(&hash);
+
+	/* First header.. */
+	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(obj->type), obj->size) + 1;
+	stream.next_in = (unsigned char *)hdr;
+	stream.avail_in = hdrlen;
+	while (git_deflate(&stream, 0) == Z_OK)
+		; /* nothing */
+	git_SHA1_Update(&hash, hdr, hdrlen);
+
+	for (;;) {
+		unsigned char buf[4096];
+		int r;
+
+		r = xread(cmd.child.out, buf, sizeof(buf));
+		if (r < 0) {
+			error("unable to read from odb helper '%s': %s",
+			      o->name, strerror(errno));
+			close(cmd.child.out);
+			odb_helper_finish(o, &cmd);
+			git_deflate_end(&stream);
+			return -1;
+		}
+		if (r == 0)
+			break;
+
+		total_got += r;
+
+		/* Then the data itself.. */
+		stream.next_in = (void *)buf;
+		stream.avail_in = r;
+		do {
+			unsigned char *in0 = stream.next_in;
+			ret = git_deflate(&stream, Z_FINISH);
+			git_SHA1_Update(&hash, in0, stream.next_in - in0);
+			write_or_die(fd, compressed, stream.next_out - compressed);
+			stream.next_out = compressed;
+			stream.avail_out = sizeof(compressed);
+		} while (ret == Z_OK);
+	}
+
+	close(cmd.child.out);
+	if (ret != Z_STREAM_END) {
+		warning("bad zlib data from odb helper '%s' for %s",
+			o->name, sha1_to_hex(sha1));
+		return -1;
+	}
+	ret = git_deflate_end_gently(&stream);
+	if (ret != Z_OK) {
+		warning("deflateEnd on object %s from odb helper '%s' failed (%d)",
+			sha1_to_hex(sha1), o->name, ret);
+		return -1;
+	}
+	git_SHA1_Final(real_sha1, &hash);
+	if (hashcmp(sha1, real_sha1)) {
+		warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+			o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+		return -1;
+	}
+	if (odb_helper_finish(o, &cmd))
+		return -1;
+	if (total_got != obj->size) {
+		warning("size mismatch from odb helper '%s' for %s (%lu != %lu)",
+			o->name, sha1_to_hex(sha1), total_got, obj->size);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int odb_helper_fetch_git_object(struct odb_helper *o,
+				       const unsigned char *sha1,
+				       int fd)
 {
 	struct odb_helper_object *obj;
 	struct odb_helper_cmd cmd;
@@ -248,6 +347,16 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 	return 0;
 }
 
+int odb_helper_fetch_object(struct odb_helper *o,
+			    const unsigned char *sha1,
+			    int fd)
+{
+	if (o->store_plain_objects)
+		return odb_helper_fetch_plain_object(o, sha1, fd);
+	else
+		return odb_helper_fetch_git_object(o, sha1, fd);
+}
+
 int odb_helper_for_each_object(struct odb_helper *o,
 			       each_external_object_fn fn,
 			       void *data)
diff --git a/odb-helper.h b/odb-helper.h
index 4e321195e8..3953b9bbaf 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -6,6 +6,7 @@
 struct odb_helper {
 	const char *name;
 	const char *cmd;
+	int store_plain_objects;
 
 	struct odb_helper_object {
 		unsigned char sha1[20];
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 22/49] pack-objects: don't pack objects in external odbs
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (20 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 21/49] odb-helper: add 'store_plain_objects' to 'struct odb_helper' Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 23/49] t0420: add test with HTTP external odb Christian Couder
                   ` (29 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Objects managed by an external ODB should not be put into
pack files.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
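
Note (illustration only, not part of the patch): the check added below
runs before the loop over the known packs, so an object that an external
ODB reports as present is never selected for packing. A minimal
standalone sketch of that filter follows. The names
toy_external_odb_has() and toy_want_object_in_pack() are hypothetical
stand-ins, and the fixed list replaces the real helper lookup.

```c
#include <string.h>

/* Hypothetical stand-in for external_odb_has_object(): a fixed list. */
static const char *toy_external[] = { "aaaa", "bbbb" };

static int toy_external_odb_has(const char *hex)
{
	size_t i;

	for (i = 0; i < sizeof(toy_external) / sizeof(toy_external[0]); i++)
		if (!strcmp(toy_external[i], hex))
			return 1;
	return 0;
}

/* Returns 1 if the object may go into a pack, 0 if it must be left out. */
static int toy_want_object_in_pack(const char *hex)
{
	if (toy_external_odb_has(hex))
		return 0;	/* managed by an external ODB: never pack it */
	return 1;
}
```
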
 builtin/pack-objects.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f672225def..e423f685ff 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -24,6 +24,7 @@
 #include "sha1-array.h"
 #include "argv-array.h"
 #include "mru.h"
+#include "external-odb.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [<options>...] [< <ref-list> | < <object-list>]"),
@@ -1011,6 +1012,9 @@ static int want_object_in_pack(const unsigned char *sha1,
 			return want;
 	}
 
+	if (external_odb_has_object(sha1))
+		return 0;
+
 	for (entry = packed_git_mru->head; entry; entry = entry->next) {
 		struct packed_git *p = entry->item;
 		off_t offset;
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 23/49] t0420: add test with HTTP external odb
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (21 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 22/49] pack-objects: don't pack objects in external odbs Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 24/49] odb-helper: start fault in implementation Christian Couder
                   ` (28 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

This tests that an Apache web server can be used as an
external object database that stores files in their native
format instead of converting them to Git objects.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
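
Note (illustration only, not part of the patch): because the server
stores the raw file content, Git has to rebuild the object header itself
when it fetches a plain object. odb_helper_fetch_plain_object() in the
previous patch does this with xsnprintf(hdr, sizeof(hdr), "%s %lu", ...)
+ 1, where the "+ 1" accounts for the terminating NUL that is hashed
along with the header. A small sketch of that step is below;
toy_object_header() is a hypothetical name and plain snprintf() replaces
Git's xsnprintf().

```c
#include <stdio.h>

/*
 * Format an object header ("<type> <size>" followed by a NUL) into hdr.
 * Returns the header length including the NUL, or -1 if it does not fit.
 */
static int toy_object_header(char *hdr, size_t len,
			     const char *type, unsigned long size)
{
	int n = snprintf(hdr, len, "%s %lu", type, size);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n + 1;	/* the NUL is part of the hashed header */
}
```

For the 4-byte blob that the test expects to find stored as
"$hash1-4-blob", the header is "blob 4" plus the NUL, that is 7 bytes.
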
 t/t0420-transfer-http-e-odb.sh | 150 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 150 insertions(+)
 create mode 100755 t/t0420-transfer-http-e-odb.sh

diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
new file mode 100755
index 0000000000..716d722e97
--- /dev/null
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -0,0 +1,150 @@
+#!/bin/sh
+
+test_description='tests for transferring external objects to an HTTPD server'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number is used as the port.
+# That fails because it is below 1024, and such ports are reserved for root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+# odb helper script must see this
+export HTTPD_URL
+
+write_script odb-http-helper <<\EOF
+die() {
+	printf >&2 "%s\n" "$@"
+	exit 1
+}
+echo >&2 "odb-http-helper args:" "$@"
+case "$1" in
+have)
+	list_url="$HTTPD_URL/list/"
+	curl "$list_url" ||
+	die "curl '$list_url' failed"
+	;;
+get)
+	get_url="$HTTPD_URL/list/?sha1=$2"
+	curl "$get_url" ||
+	die "curl '$get_url' failed"
+	;;
+put)
+	sha1="$2"
+	size="$3"
+	kind="$4"
+	upload_url="$HTTPD_URL/upload/?sha1=$sha1&size=$size&type=$kind"
+	curl --data-binary @- --include "$upload_url" >out ||
+	die "curl '$upload_url' failed"
+	ref_hash=$(echo "$sha1 $size $kind" | GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+	git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+	;;
+*)
+	die "unknown command '$1'"
+	;;
+esac
+EOF
+HELPER="\"$PWD\"/odb-http-helper"
+
+
+test_expect_success 'setup repo with a root commit and the helper' '
+	test_commit zero &&
+	git config odb.magic.command "$HELPER" &&
+	git config odb.magic.plainObjects "true"
+'
+
+test_expect_success 'setup another repo from the first one' '
+	git init other-repo &&
+	(cd other-repo &&
+	 git remote add origin .. &&
+	 git pull origin master &&
+	 git checkout master &&
+	 git log)
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME&size=123&type=blob"
+
+test_expect_success 'can upload a file' '
+	echo "Hello Apache World!" >hello_to_send.txt &&
+	echo "How are you?" >>hello_to_send.txt &&
+	curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" >out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+	curl --include "$LIST_URL" >out_list &&
+	grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+	curl --data "delete" --include "$UPLOAD_URL&delete=1" >out_delete &&
+	curl --include "$LIST_URL" >out_list2 &&
+	! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transferred to the http server' '
+	test_commit one &&
+	hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+	echo "$hash1-4-blob" >expected &&
+	ls "$FILES_DIR" >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+	git cat-file blob "$hash1" &&
+	git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+	(cd other-repo &&
+	 git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+	 test_must_fail git cat-file blob "$hash1" &&
+	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.plainObjects "true" &&
+	 git cat-file blob "$hash1" &&
+	 git pull origin master)
+'
+
+test_expect_success 'local clone from the first repo' '
+	mkdir my-clone &&
+	(cd my-clone &&
+	 git clone .. . &&
+	 git cat-file blob "$hash1")
+'
+
+test_expect_success 'no-local clone from the first repo fails' '
+	mkdir my-other-clone &&
+	(cd my-other-clone &&
+	 test_must_fail git clone --no-local .. .) &&
+	rm -rf my-other-clone
+'
+
+test_expect_success 'no-local clone from the first repo with helper succeeds' '
+	mkdir my-other-clone &&
+	(cd my-other-clone &&
+	 git clone -c odb.magic.command="$HELPER" \
+		-c odb.magic.plainObjects="true" \
+		--no-local .. .) &&
+	rm -rf my-other-clone
+'
+
+test_expect_success 'no-local initial-refspec clone succeeds' '
+	mkdir my-other-clone &&
+	(cd my-other-clone &&
+	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.plainObjects "true" &&
+	 git -c odb.magic.command="$HELPER" -c odb.magic.plainObjects="true" \
+		clone --no-local --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
+stop_httpd
+
+test_done
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 24/49] odb-helper: start fault in implementation
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (22 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 23/49] t0420: add test with HTTP external odb Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:54 ` [RFC/PATCH v4 25/49] external-odb: add external_odb_fault_in_object() Christian Couder
                   ` (27 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
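
Note (illustration only, not part of the patch): the new
"odb.<odbname>.fetchKind" value is matched case-insensitively against
three keywords. The sketch below mirrors parse_fetch_kind() as a
standalone function. The toy_ names are hypothetical, and unlike the
patch, which calls die() on an unknown value, the sketch returns a
sentinel so that it can be exercised in isolation.

```c
#include <strings.h>

enum toy_fetch_kind {
	TOY_FETCH_KIND_UNKNOWN = -1,
	TOY_FETCH_KIND_PLAIN_OBJECT = 0,
	TOY_FETCH_KIND_GIT_OBJECT,
	TOY_FETCH_KIND_FAULT_IN
};

/* Map a config value to a fetch kind, ignoring case. */
static enum toy_fetch_kind toy_parse_fetch_kind(const char *value)
{
	if (!strcasecmp(value, "plainobject"))
		return TOY_FETCH_KIND_PLAIN_OBJECT;
	if (!strcasecmp(value, "gitobject"))
		return TOY_FETCH_KIND_GIT_OBJECT;
	if (!strcasecmp(value, "faultin"))
		return TOY_FETCH_KIND_FAULT_IN;
	return TOY_FETCH_KIND_UNKNOWN;
}
```

Note that the old "plainObjects" spelling is not one of the accepted
values; it is a different config key that this patch stops reading.
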
 external-odb.c                 | 24 ++++++++++++++++++++++--
 odb-helper.c                   | 30 ++++++++++++++++++++++++++++--
 odb-helper.h                   |  8 +++++++-
 t/t0400-external-odb.sh        |  2 ++
 t/t0410-transfer-e-odb.sh      |  4 +++-
 t/t0420-transfer-http-e-odb.sh |  6 +++---
 6 files changed, 65 insertions(+), 9 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index d11fc98719..0b6e443372 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -20,6 +20,19 @@ static struct odb_helper *find_or_create_helper(const char *name, int len)
 	return o;
 }
 
+static enum odb_helper_fetch_kind parse_fetch_kind(const char *key,
+						   const char *value)
+{
+	if (!strcasecmp(value, "plainobject"))
+		return ODB_FETCH_KIND_PLAIN_OBJECT;
+	else if (!strcasecmp(value, "gitobject"))
+		return ODB_FETCH_KIND_GIT_OBJECT;
+	else if (!strcasecmp(value, "faultin"))
+		return ODB_FETCH_KIND_FAULT_IN;
+
+	die("unknown value for config '%s': %s", key, value);
+}
+
 static int external_odb_config(const char *var, const char *value, void *data)
 {
 	struct odb_helper *o;
@@ -36,8 +49,15 @@ static int external_odb_config(const char *var, const char *value, void *data)
 
 	if (!strcmp(key, "command"))
 		return git_config_string(&o->cmd, var, value);
-	if (!strcmp(key, "plainobjects"))
-		o->store_plain_objects = git_config_bool(var, value);
+	if (!strcmp(key, "fetchkind")) {
+		const char *fetch_kind;
+		int ret = git_config_string(&fetch_kind, var, value);
+		if (!ret) {
+			o->fetch_kind = parse_fetch_kind(var, fetch_kind);
+			free((char *)fetch_kind);
+		}
+		return ret;
+	}
 
 	return 0;
 }
diff --git a/odb-helper.c b/odb-helper.c
index b33ee81c97..24dc5375cb 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -347,14 +347,40 @@ static int odb_helper_fetch_git_object(struct odb_helper *o,
 	return 0;
 }
 
+static int odb_helper_fetch_fault_in(struct odb_helper *o,
+				     const unsigned char *sha1,
+				     int fd)
+{
+	struct odb_helper_object *obj;
+	struct odb_helper_cmd cmd;
+
+	obj = odb_helper_lookup(o, sha1);
+	if (!obj)
+		return -1;
+
+	if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
+		return -1;
+
+	if (odb_helper_finish(o, &cmd))
+		return -1;
+
+	return 0;
+}
+
 int odb_helper_fetch_object(struct odb_helper *o,
 			    const unsigned char *sha1,
 			    int fd)
 {
-	if (o->store_plain_objects)
+	switch (o->fetch_kind) {
+	case ODB_FETCH_KIND_PLAIN_OBJECT:
 		return odb_helper_fetch_plain_object(o, sha1, fd);
-	else
+	case ODB_FETCH_KIND_GIT_OBJECT:
 		return odb_helper_fetch_git_object(o, sha1, fd);
+	case ODB_FETCH_KIND_FAULT_IN:
+		return odb_helper_fetch_fault_in(o, sha1, fd);
+	default:
+		BUG("invalid fetch kind '%d'", o->fetch_kind);
+	}
 }
 
 int odb_helper_for_each_object(struct odb_helper *o,
diff --git a/odb-helper.h b/odb-helper.h
index 3953b9bbaf..e3ad8e3316 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -3,10 +3,16 @@
 
 #include "external-odb.h"
 
+enum odb_helper_fetch_kind {
+	ODB_FETCH_KIND_PLAIN_OBJECT = 0,
+	ODB_FETCH_KIND_GIT_OBJECT,
+	ODB_FETCH_KIND_FAULT_IN
+};
+
 struct odb_helper {
 	const char *name;
 	const char *cmd;
-	int store_plain_objects;
+	enum odb_helper_fetch_kind fetch_kind;
 
 	struct odb_helper_object {
 		unsigned char sha1[20];
diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index 3c868cad4c..c3cb0fdc84 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -49,6 +49,7 @@ test_expect_success 'alt objects are missing' '
 
 test_expect_success 'helper can retrieve alt objects' '
 	test_config odb.magic.command "$HELPER" &&
+	test_config odb.magic.fetchKind "gitObject" &&
 	cat >expect <<-\EOF &&
 	two
 	one
@@ -68,6 +69,7 @@ test_expect_success 'helper can add objects to alt repo' '
 
 test_expect_success 'commit adds objects to alt repo' '
 	test_config odb.magic.command "$HELPER" &&
+	test_config odb.magic.fetchKind "gitObject" &&
 	test_commit three &&
 	hash3=$(git ls-tree HEAD | grep three.t | cut -f1 | cut -d\  -f3) &&
 	content=$(cd alt-repo && git show "$hash3") &&
diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh
index 868b55db94..cba89866e2 100755
--- a/t/t0410-transfer-e-odb.sh
+++ b/t/t0410-transfer-e-odb.sh
@@ -89,7 +89,8 @@ HELPER2="\"$PWD\"/odb-helper2"
 test_expect_success 'setup first alternate repo' '
 	git init alt-repo1 &&
 	test_commit zero &&
-	git config odb.magic.command "$HELPER1"
+	git config odb.magic.command "$HELPER1" &&
+	git config odb.magic.fetchKind "gitObject"
 '
 
 test_expect_success 'setup other repo and its alternate repo' '
@@ -119,6 +120,7 @@ test_expect_success 'other repo gets the blobs from object store' '
 	 test_must_fail git cat-file blob "$hash1" &&
 	 test_must_fail git cat-file blob "$hash2" &&
 	 git config odb.magic.command "$HELPER2" &&
+	 git config odb.magic.fetchKind "gitObject" &&
 	 git cat-file blob "$hash1" &&
 	 git cat-file blob "$hash2"
 	)
diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
index 716d722e97..8a5f3adaa7 100755
--- a/t/t0420-transfer-http-e-odb.sh
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -53,7 +53,7 @@ HELPER="\"$PWD\"/odb-http-helper"
 test_expect_success 'setup repo with a root commit and the helper' '
 	test_commit zero &&
 	git config odb.magic.command "$HELPER" &&
-	git config odb.magic.plainObjects "true"
+	git config odb.magic.fetchKind "plainObject"
 '
 
 test_expect_success 'setup another repo from the first one' '
@@ -108,7 +108,7 @@ test_expect_success 'update other repo from the first one' '
 	 git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
 	 test_must_fail git cat-file blob "$hash1" &&
 	 git config odb.magic.command "$HELPER" &&
-	 git config odb.magic.plainObjects "true" &&
+	 git config odb.magic.fetchKind "plainObject" &&
 	 git cat-file blob "$hash1" &&
 	 git pull origin master)
 '
@@ -140,7 +140,7 @@ test_expect_success 'no-local initial-refspec clone succeeds' '
 	mkdir my-other-clone &&
 	(cd my-other-clone &&
 	 git config odb.magic.command "$HELPER" &&
-	 git config odb.magic.plainObjects "true" &&
+	 git config odb.magic.fetchKind "plainObject" &&
 	 git -c odb.magic.command="$HELPER" -c odb.magic.plainObjects="true" \
 		clone --no-local --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
 '
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 25/49] external-odb: add external_odb_fault_in_object()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (23 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 24/49] odb-helper: start fault in implementation Christian Couder
@ 2017-06-20  7:54 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 26/49] odb-helper: add script_mode Christian Couder
                   ` (26 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
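
Note (illustration only, not part of the patch):
external_odb_fault_in_object() walks the list of helpers, skips every
helper whose fetch kind is not "faultIn", and stops at the first one
that succeeds. The sketch below shows only that loop. The toy_ names and
the function-pointer field are hypothetical, and the initial
external_odb_has_object() check is left out.

```c
#include <stddef.h>

enum toy_kind { TOY_KIND_PLAIN, TOY_KIND_GIT, TOY_KIND_FAULT_IN };

struct toy_helper {
	enum toy_kind kind;
	int (*fault_in)(const char *hex);	/* 0 on success, <0 on failure */
	struct toy_helper *next;
};

/* Ask each fault-in helper in turn; return 0 as soon as one succeeds. */
static int toy_fault_in_object(struct toy_helper *helpers, const char *hex)
{
	struct toy_helper *o;

	for (o = helpers; o; o = o->next) {
		if (o->kind != TOY_KIND_FAULT_IN)
			continue;
		if (o->fault_in(hex) < 0)
			continue;	/* this helper failed: try the next one */
		return 0;
	}
	return -1;
}

/* Two trivial helpers used to exercise the loop. */
static int toy_always_fail(const char *hex) { (void)hex; return -1; }
static int toy_always_ok(const char *hex) { (void)hex; return 0; }
```
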
 external-odb.c | 21 ++++++++++++++++++++-
 external-odb.h |  1 +
 odb-helper.c   |  7 +++----
 odb-helper.h   |  1 +
 4 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 0b6e443372..502380cac2 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -113,7 +113,8 @@ int external_odb_fetch_object(const unsigned char *sha1)
 		int ret;
 		int fd;
 
-		if (!odb_helper_has_object(o, sha1))
+		if (o->fetch_kind != ODB_FETCH_KIND_PLAIN_OBJECT &&
+		    o->fetch_kind != ODB_FETCH_KIND_GIT_OBJECT)
 			continue;
 
 		fd = create_object_tmpfile(&tmpfile, path);
@@ -139,6 +140,24 @@ int external_odb_fetch_object(const unsigned char *sha1)
 	return -1;
 }
 
+int external_odb_fault_in_object(const unsigned char *sha1)
+{
+	struct odb_helper *o;
+
+	if (!external_odb_has_object(sha1))
+		return -1;
+
+	for (o = helpers; o; o = o->next) {
+		if (o->fetch_kind != ODB_FETCH_KIND_FAULT_IN)
+			continue;
+		if (odb_helper_fault_in_object(o, sha1) < 0)
+			continue;
+		return 0;
+	}
+
+	return -1;
+}
+
 int external_odb_for_each_object(each_external_object_fn fn, void *data)
 {
 	struct odb_helper *o;
diff --git a/external-odb.h b/external-odb.h
index 53879e900d..1b46c49e25 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -4,6 +4,7 @@
 const char *external_odb_root(void);
 int external_odb_has_object(const unsigned char *sha1);
 int external_odb_fetch_object(const unsigned char *sha1);
+int external_odb_fault_in_object(const unsigned char *sha1);
 
 typedef int (*each_external_object_fn)(const unsigned char *sha1,
 				       enum object_type type,
diff --git a/odb-helper.c b/odb-helper.c
index 24dc5375cb..5fb56c6135 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -347,9 +347,8 @@ static int odb_helper_fetch_git_object(struct odb_helper *o,
 	return 0;
 }
 
-static int odb_helper_fetch_fault_in(struct odb_helper *o,
-				     const unsigned char *sha1,
-				     int fd)
+int odb_helper_fault_in_object(struct odb_helper *o,
+			       const unsigned char *sha1)
 {
 	struct odb_helper_object *obj;
 	struct odb_helper_cmd cmd;
@@ -377,7 +376,7 @@ int odb_helper_fetch_object(struct odb_helper *o,
 	case ODB_FETCH_KIND_GIT_OBJECT:
 		return odb_helper_fetch_git_object(o, sha1, fd);
 	case ODB_FETCH_KIND_FAULT_IN:
-		return odb_helper_fetch_fault_in(o, sha1, fd);
+		return 0;
 	default:
 		BUG("invalid fetch kind '%d'", o->fetch_kind);
 	}
diff --git a/odb-helper.h b/odb-helper.h
index e3ad8e3316..2dc6d96c40 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -30,6 +30,7 @@ struct odb_helper *odb_helper_new(const char *name, int namelen);
 int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
 int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 			    int fd);
+int odb_helper_fault_in_object(struct odb_helper *o, const unsigned char *sha1);
 int odb_helper_for_each_object(struct odb_helper *o,
 			       each_external_object_fn, void *);
 int odb_helper_write_object(struct odb_helper *o,
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 26/49] odb-helper: add script_mode
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (24 preceding siblings ...)
  2017-06-20  7:54 ` [RFC/PATCH v4 25/49] external-odb: add external_odb_fault_in_object() Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 27/49] Documentation: add read-object-protocol.txt Christian Couder
                   ` (25 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
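
Note (illustration only, not part of the patch): this patch only adds
the "script_mode" flag and its config parsing. According to the cover
letter, a helper in script mode is launched each time Git wants to talk
to the external ODB, while a helper without it is launched once as a
sub-process and then reused. The sketch below models that difference by
counting launches; the struct and function names are hypothetical.

```c
struct toy_helper_state {
	int script_mode;	/* value of odb.<odbname>.scriptMode */
	int running;		/* long-running sub-process already started? */
	int launches;		/* number of times the helper was spawned */
};

/* Record one instruction sent to the helper. */
static void toy_request(struct toy_helper_state *h)
{
	if (h->script_mode) {
		h->launches++;		/* one process per instruction */
		return;
	}
	if (!h->running) {
		h->running = 1;
		h->launches++;		/* spawn once, then reuse */
	}
}
```
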
 external-odb.c                 | 4 ++++
 odb-helper.h                   | 1 +
 t/t0400-external-odb.sh        | 2 ++
 t/t0410-transfer-e-odb.sh      | 2 ++
 t/t0420-transfer-http-e-odb.sh | 7 ++++++-
 5 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/external-odb.c b/external-odb.c
index 502380cac2..2efa805d12 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -49,6 +49,10 @@ static int external_odb_config(const char *var, const char *value, void *data)
 
 	if (!strcmp(key, "command"))
 		return git_config_string(&o->cmd, var, value);
+	if (!strcmp(key, "scriptmode")) {
+		o->script_mode = git_config_bool(var, value);
+		return 0;
+	}
 	if (!strcmp(key, "fetchkind")) {
 		const char *fetch_kind;
 		int ret = git_config_string(&fetch_kind, var, value);
diff --git a/odb-helper.h b/odb-helper.h
index 2dc6d96c40..44c98bbf56 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -13,6 +13,7 @@ struct odb_helper {
 	const char *name;
 	const char *cmd;
 	enum odb_helper_fetch_kind fetch_kind;
+	int script_mode;
 
 	struct odb_helper_object {
 		unsigned char sha1[20];
diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index c3cb0fdc84..18d8c38862 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -49,6 +49,7 @@ test_expect_success 'alt objects are missing' '
 
 test_expect_success 'helper can retrieve alt objects' '
 	test_config odb.magic.command "$HELPER" &&
+	test_config odb.magic.scriptMode true &&
 	test_config odb.magic.fetchKind "gitObject" &&
 	cat >expect <<-\EOF &&
 	two
@@ -69,6 +70,7 @@ test_expect_success 'helper can add objects to alt repo' '
 
 test_expect_success 'commit adds objects to alt repo' '
 	test_config odb.magic.command "$HELPER" &&
+	test_config odb.magic.scriptMode true &&
 	test_config odb.magic.fetchKind "gitObject" &&
 	test_commit three &&
 	hash3=$(git ls-tree HEAD | grep three.t | cut -f1 | cut -d\  -f3) &&
diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh
index cba89866e2..8de9a08d7c 100755
--- a/t/t0410-transfer-e-odb.sh
+++ b/t/t0410-transfer-e-odb.sh
@@ -90,6 +90,7 @@ test_expect_success 'setup first alternate repo' '
 	git init alt-repo1 &&
 	test_commit zero &&
 	git config odb.magic.command "$HELPER1" &&
+	git config odb.magic.scriptMode true &&
 	git config odb.magic.fetchKind "gitObject"
 '
 
@@ -120,6 +121,7 @@ test_expect_success 'other repo gets the blobs from object store' '
 	 test_must_fail git cat-file blob "$hash1" &&
 	 test_must_fail git cat-file blob "$hash2" &&
 	 git config odb.magic.command "$HELPER2" &&
+	 git config odb.magic.scriptMode true &&
 	 git config odb.magic.fetchKind "gitObject" &&
 	 git cat-file blob "$hash1" &&
 	 git cat-file blob "$hash2"
diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
index 8a5f3adaa7..b8062d14c0 100755
--- a/t/t0420-transfer-http-e-odb.sh
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -53,6 +53,7 @@ HELPER="\"$PWD\"/odb-http-helper"
 test_expect_success 'setup repo with a root commit and the helper' '
 	test_commit zero &&
 	git config odb.magic.command "$HELPER" &&
+	git config odb.magic.scriptMode true &&
 	git config odb.magic.fetchKind "plainObject"
 '
 
@@ -108,6 +109,7 @@ test_expect_success 'update other repo from the first one' '
 	 git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
 	 test_must_fail git cat-file blob "$hash1" &&
 	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.scriptMode true &&
 	 git config odb.magic.fetchKind "plainObject" &&
 	 git cat-file blob "$hash1" &&
 	 git pull origin master)
@@ -131,6 +133,7 @@ test_expect_success 'no-local clone from the first repo with helper succeeds' '
 	mkdir my-other-clone &&
 	(cd my-other-clone &&
 	 git clone -c odb.magic.command="$HELPER" \
+		-c odb.magic.scriptMode="true" \
 		-c odb.magic.plainObjects="true" \
 		--no-local .. .) &&
 	rm -rf my-other-clone
@@ -141,7 +144,9 @@ test_expect_success 'no-local initial-refspec clone succeeds' '
 	(cd my-other-clone &&
 	 git config odb.magic.command "$HELPER" &&
 	 git config odb.magic.fetchKind "plainObject" &&
-	 git -c odb.magic.command="$HELPER" -c odb.magic.plainObjects="true" \
+	 git -c odb.magic.command="$HELPER" \
+		-c odb.magic.plainObjects="true" \
+		-c odb.magic.scriptMode="true" \
 		clone --no-local --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
 '
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 27/49] Documentation: add read-object-protocol.txt
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (25 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 26/49] odb-helper: add script_mode Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 28/49] contrib: add long-running-read-object/example.pl Christian Couder
                   ` (24 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder,
	Ben Peart

From: Ben Peart <benpeart@microsoft.com>

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
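
Note (illustration only, not part of the patch): every line in the
traces of the document below is a pkt-line, that is, four hex digits
giving the total length (the four digits included) followed by the
payload, with "0000" as the flush packet. The sketch below frames a text
packet into a buffer. The toy_ names are hypothetical, and the 65520
limit is the maximum pkt-line length given in protocol-common.txt.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write one text packet ("<4 hex digits><text>\n") into out.
 * Returns the packet length, or -1 if it does not fit.
 */
static int toy_packet_write_txt(char *out, size_t outlen, const char *text)
{
	size_t len = strlen(text) + 1 + 4;	/* payload + LF + length prefix */

	if (len > 65520 || len + 1 > outlen)	/* + 1 for the NUL we add */
		return -1;
	snprintf(out, outlen, "%04x%s\n", (unsigned int)len, text);
	return (int)len;
}

/* Write a flush packet; returns its length, or -1 if it does not fit. */
static int toy_packet_flush(char *out, size_t outlen)
{
	if (outlen < 5)
		return -1;
	memcpy(out, "0000", 5);
	return 4;
}
```

With this framing, "version=1" goes on the wire as "000eversion=1\n"
and "command=get" as "0010command=get\n".
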
 Documentation/technical/read-object-protocol.txt | 102 +++++++++++++++++++++++
 1 file changed, 102 insertions(+)
 create mode 100644 Documentation/technical/read-object-protocol.txt

diff --git a/Documentation/technical/read-object-protocol.txt b/Documentation/technical/read-object-protocol.txt
new file mode 100644
index 0000000000..a893b46e7c
--- /dev/null
+++ b/Documentation/technical/read-object-protocol.txt
@@ -0,0 +1,102 @@
+Read Object Process
+^^^^^^^^^^^^^^^^^^^
+
+The read-object process enables Git to read all missing blobs with a
+single process invocation for the entire life of a single Git command.
+This is achieved by using a packet format (pkt-line, see technical/
+protocol-common.txt) based protocol over standard input and standard
+output as follows. All packets, except for the "*CONTENT" packets and
+the "0000" flush packet, are considered text and therefore are
+terminated by a LF.
+
+Git starts the process when it encounters the first missing object that
+needs to be retrieved. After the process is started, Git sends a welcome
+message ("git-read-object-client"), a list of supported protocol version
+numbers, and a flush packet. Git expects to read a welcome response
+message ("git-read-object-server"), exactly one protocol version number
+from the previously sent list, and a flush packet. All further
+communication will be based on the selected version.
+
+The remaining protocol description below documents "version=1". Please
+note that "version=42" in the example below does not exist and is only
+there to illustrate how the protocol would look with more than one
+version.
+
+After the version negotiation Git sends a list of all capabilities that
+it supports and a flush packet. Git expects to read a list of desired
+capabilities, which must be a subset of the supported capabilities list,
+and a flush packet as response:
+------------------------
+packet: git> git-read-object-client
+packet: git> version=1
+packet: git> version=42
+packet: git> 0000
+packet: git< git-read-object-server
+packet: git< version=1
+packet: git< 0000
+packet: git> capability=get
+packet: git> capability=have
+packet: git> capability=put
+packet: git> capability=not-yet-invented
+packet: git> 0000
+packet: git< capability=get
+packet: git< 0000
+------------------------
+The only supported capability in version 1 is "get".
+
+Afterwards Git sends a list of "key=value" pairs terminated with a flush
+packet. The list will contain at least the command (based on the
+supported capabilities) and the sha1 of the object to retrieve. Please
+note that the process must not send any response before it has received
+the final flush packet.
+
+When the process receives the "get" command, it should make the requested
+object available in the git object store and then return success. Git will
+then check the object store again and this time find it and proceed.
+------------------------
+packet: git> command=get
+packet: git> sha1=0a214a649e1b3d5011e14a3dc227753f2bd2be05
+packet: git> 0000
+------------------------
+
+The process is expected to respond with a list of "key=value" pairs
+terminated with a flush packet. If the process does not experience
+problems then the list must contain a "success" status.
+------------------------
+packet: git< status=success
+packet: git< 0000
+------------------------
+
+In case the process cannot or does not want to process the content, it
+is expected to respond with an "error" status.
+------------------------
+packet: git< status=error
+packet: git< 0000
+------------------------
+
+In case the process cannot or does not want to process the content as
+well as any future content for the lifetime of the Git process, then it
+is expected to respond with an "abort" status at any point in the
+protocol.
+------------------------
+packet: git< status=abort
+packet: git< 0000
+------------------------
+
+Git neither stops nor restarts the process in case the "error"/"abort"
+status is set.
+
+If the process dies during the communication or does not adhere to the
+protocol then Git will stop the process and restart it with the next
+object that needs to be processed.
+
+After the read-object process has processed an object it is expected to
+wait for the next "key=value" list containing a command. Git will close
+the command pipe on exit. The process is expected to detect EOF and exit
+gracefully on its own. Git will wait until the process has stopped.
+
+A long running read-object process demo implementation can be found in
+`contrib/long-running-read-object/example.pl` located in the Git core
+repository. If you develop your own long running process then the
+`GIT_TRACE_PACKET` environment variable can be very helpful for
+debugging (see linkgit:git[1]).
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 28/49] contrib: add long-running-read-object/example.pl
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (26 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 27/49] Documentation: add read-object-protocol.txt Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 29/49] Add t0410 to test read object mechanism Christian Couder
                   ` (23 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder,
	Ben Peart

From: Ben Peart <benpeart@microsoft.com>

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
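
Note (illustration only, not part of the patch): packet_bin_read() in
the Perl example reads a 4-digit hex length, treats 0 as a flush packet,
rejects lengths from 1 to 4, and otherwise reads length minus 4 bytes of
payload. The sketch below applies the same rules to an in-memory buffer
in C. All toy_ names are hypothetical, and reading from a buffer instead
of STDIN is a simplification.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum toy_pkt { TOY_PKT_DATA, TOY_PKT_FLUSH, TOY_PKT_ERROR };

/*
 * Parse one packet at the start of buf (len bytes available).
 * On TOY_PKT_DATA, *payload and *payload_len describe the content.
 * On TOY_PKT_DATA and TOY_PKT_FLUSH, *consumed is the packet length.
 */
static enum toy_pkt toy_packet_read(const char *buf, size_t len,
				    const char **payload, size_t *payload_len,
				    size_t *consumed)
{
	char hex[5];
	unsigned long size;
	int i;

	if (len < 4)
		return TOY_PKT_ERROR;
	for (i = 0; i < 4; i++)
		if (!isxdigit((unsigned char)buf[i]))
			return TOY_PKT_ERROR;
	memcpy(hex, buf, 4);
	hex[4] = '\0';
	size = strtoul(hex, NULL, 16);

	if (size == 0) {
		*consumed = 4;
		return TOY_PKT_FLUSH;
	}
	if (size <= 4 || size > len)
		return TOY_PKT_ERROR;	/* too short, or truncated input */
	*payload = buf + 4;
	*payload_len = size - 4;
	*consumed = size;
	return TOY_PKT_DATA;
}
```
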
 contrib/long-running-read-object/example.pl | 114 ++++++++++++++++++++++++++++
 1 file changed, 114 insertions(+)
 create mode 100644 contrib/long-running-read-object/example.pl

diff --git a/contrib/long-running-read-object/example.pl b/contrib/long-running-read-object/example.pl
new file mode 100644
index 0000000000..6587333b87
--- /dev/null
+++ b/contrib/long-running-read-object/example.pl
@@ -0,0 +1,114 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git read-object protocol version 1
+# See Documentation/technical/read-object-protocol.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand."  Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# A lazy clone can be simulated via the following commands from the host repo
+# you wish to create a lazy clone of:
+#
+# cd /host_repo
+# git rev-parse HEAD
+# git init /guest_repo
+# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
+#	cut -d' ' -f1 | git pack-objects /guest_repo/.git/objects/pack/noblobs
+# cd /guest_repo
+# git config core.virtualizeobjects true
+# git reset --hard <sha from rev-parse call above>
+#
+# Please note, this sample is a minimal skeleton. No proper error handling
+# was implemented.
+#
+
+use strict;
+use warnings;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = "/host_repo/.git/";
+
+sub packet_bin_read {
+	my $buffer;
+	my $bytes_read = read STDIN, $buffer, 4;
+	if ( $bytes_read == 0 ) {
+
+		# EOF - Git stopped talking to us!
+		exit();
+	}
+	elsif ( $bytes_read != 4 ) {
+		die "invalid packet: '$buffer'";
+	}
+	my $pkt_size = hex($buffer);
+	if ( $pkt_size == 0 ) {
+		return ( 1, "" );
+	}
+	elsif ( $pkt_size > 4 ) {
+		my $content_size = $pkt_size - 4;
+		$bytes_read = read STDIN, $buffer, $content_size;
+		if ( $bytes_read != $content_size ) {
+			die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
+		}
+		return ( 0, $buffer );
+	}
+	else {
+		die "invalid packet size: $pkt_size";
+	}
+}
+
+sub packet_txt_read {
+	my ( $res, $buf ) = packet_bin_read();
+	unless ( $buf =~ s/\n$// ) {
+		die "A non-binary line MUST be terminated by an LF.";
+	}
+	return ( $res, $buf );
+}
+
+sub packet_bin_write {
+	my $buf = shift;
+	print STDOUT sprintf( "%04x", length($buf) + 4 );
+	print STDOUT $buf;
+	STDOUT->flush();
+}
+
+sub packet_txt_write {
+	packet_bin_write( $_[0] . "\n" );
+}
+
+sub packet_flush {
+	print STDOUT sprintf( "%04x", 0 );
+	STDOUT->flush();
+}
+
+( packet_txt_read() eq ( 0, "git-read-object-client" ) ) || die "bad initialize";
+( packet_txt_read() eq ( 0, "version=1" ) )              || die "bad version";
+( packet_bin_read() eq ( 1, "" ) )                       || die "bad version end";
+
+packet_txt_write("git-read-object-server");
+packet_txt_write("version=1");
+packet_flush();
+
+( packet_txt_read() eq ( 0, "capability=get" ) )    || die "bad capability";
+( packet_bin_read() eq ( 1, "" ) )                  || die "bad capability end";
+
+packet_txt_write("capability=get");
+packet_flush();
+
+while (1) {
+	my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+	if ( $command eq "get" ) {
+		my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+		packet_bin_read();
+
+		system ('git --git-dir="' . $DIR . '" cat-file blob ' . $sha1 . ' | git -c core.virtualizeobjects=false hash-object -w --stdin >/dev/null 2>&1');
+		packet_txt_write(($?) ? "status=error" : "status=success");
+		packet_flush();
+	} else {
+		die "bad command '$command'";
+	}
+}
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 29/49] Add t0410 to test read object mechanism
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (27 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 28/49] contrib: add long-running-read-object/example.pl Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 30/49] odb-helper: add read_object_process() Christian Couder
                   ` (22 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder,
	Ben Peart

From: Ben Peart <benpeart@microsoft.com>

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0450-read-object.sh | 30 +++++++++++++++++++++++++++
 t/t0450/read-object    | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 86 insertions(+)
 create mode 100755 t/t0450-read-object.sh
 create mode 100755 t/t0450/read-object

diff --git a/t/t0450-read-object.sh b/t/t0450-read-object.sh
new file mode 100755
index 0000000000..18d726fe28
--- /dev/null
+++ b/t/t0450-read-object.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+test_description='tests for long running read-object process'
+
+. ./test-lib.sh
+
+PATH="$PATH:$TEST_DIRECTORY/t0450"
+
+test_expect_success 'setup host repo with a root commit' '
+	test_commit zero &&
+	hash1=$(git ls-tree HEAD | grep zero.t | cut -f1 | cut -d\  -f3)
+'
+
+HELPER="read-object"
+
+test_expect_success 'blobs can be retrieved from the host repo' '
+	git init guest-repo &&
+	(cd guest-repo &&
+	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.fetchKind "faultin" &&
+	 git cat-file blob "$hash1")
+'
+
+test_expect_success 'invalid blobs generate errors' '
+	(cd guest-repo &&
+	 test_must_fail git cat-file blob "invalid")
+'
+
+
+test_done
diff --git a/t/t0450/read-object b/t/t0450/read-object
new file mode 100755
index 0000000000..bf5fa2652b
--- /dev/null
+++ b/t/t0450/read-object
@@ -0,0 +1,56 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git read-object protocol version 1
+# See Documentation/technical/read-object-protocol.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand."  Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# A lazy clone can be simulated via the following commands from the host repo
+# you wish to create a lazy clone of:
+#
+# cd /host_repo
+# git rev-parse HEAD
+# git init /guest_repo
+# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
+#	cut -d' ' -f1 | git pack-objects /guest_repo/.git/objects/pack/noblobs
+# cd /guest_repo
+# git config core.virtualizeobjects true
+# git reset --hard <sha from rev-parse call above>
+#
+# Please note, this sample is a minimal skeleton. No proper error handling
+# was implemented.
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = "../.git/";
+
+packet_initialize("git-read-object", 1);
+
+packet_read_and_check_capabilities("get");
+packet_write_capabilities("get");
+
+while (1) {
+	my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+	if ( $command eq "get" ) {
+		my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+		packet_bin_read();
+
+		system ('git --git-dir="' . $DIR . '" cat-file blob ' . $sha1 . ' | GIT_NO_EXTERNAL_ODB=1 git hash-object -w --stdin >/dev/null 2>&1');
+		packet_txt_write(($?) ? "status=error" : "status=success");
+		packet_flush();
+	} else {
+		die "bad command '$command'";
+	}
+}
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 30/49] odb-helper: add read_object_process()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (28 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 29/49] Add t0410 to test read object mechanism Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-07-10 15:57   ` Ben Peart
  2017-06-20  7:55 ` [RFC/PATCH v4 31/49] external-odb: add external_odb_get_capabilities() Christian Couder
                   ` (21 subsequent siblings)
  51 siblings, 1 reply; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder,
	Ben Peart

From: Ben Peart <benpeart@microsoft.com>

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
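The capability handling added here can be sketched in isolation as below. It
is a minimal illustration, not the code in this patch: cap_parse_line() and
the CAP_* names are hypothetical stand-ins for parse_capabilities() and the
ODB_HELPER_CAP_* flags, and unlike the patch it reports unknown lines through
a return value instead of calling warning().

```c
#include <string.h>

/* Hypothetical flag names mirroring ODB_HELPER_CAP_GET/PUT/HAVE. */
#define CAP_GET  (1u << 0)
#define CAP_PUT  (1u << 1)
#define CAP_HAVE (1u << 2)

/*
 * Fold one "capability=<name>" line into the bitmask `caps`.
 * Returns 0 if the line named a known capability, -1 otherwise
 * (in which case `caps` is left untouched).
 */
static int cap_parse_line(const char *line, unsigned int *caps)
{
	static const char prefix[] = "capability=";
	const char *name;

	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return -1;
	name = line + sizeof(prefix) - 1;

	if (!strcmp(name, "get"))
		*caps |= CAP_GET;
	else if (!strcmp(name, "put"))
		*caps |= CAP_PUT;
	else if (!strcmp(name, "have"))
		*caps |= CAP_HAVE;
	else
		return -1;
	return 0;
}
```

Keeping the capabilities in a bitmask is what lets the "abort" status clear
a single capability with `caps &= ~CAP_GET` for the rest of the Git process.
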
 odb-helper.c | 202 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 odb-helper.h |   5 ++
 sha1_file.c  |  33 +++++++++-
 3 files changed, 227 insertions(+), 13 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 5fb56c6135..20e83cb55a 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -4,6 +4,187 @@
 #include "odb-helper.h"
 #include "run-command.h"
 #include "sha1-lookup.h"
+#include "sub-process.h"
+#include "pkt-line.h"
+#include "sigchain.h"
+
+struct read_object_process {
+	struct subprocess_entry subprocess;
+	unsigned int supported_capabilities;
+};
+
+static int subprocess_map_initialized;
+static struct hashmap subprocess_map;
+
+static void parse_capabilities(char *cap_buf,
+			       unsigned int *supported_capabilities,
+			       const char *process_name)
+{
+	struct string_list cap_list = STRING_LIST_INIT_NODUP;
+
+	string_list_split_in_place(&cap_list, cap_buf, '=', 1);
+
+	if (cap_list.nr == 2 && !strcmp(cap_list.items[0].string, "capability")) {
+		const char *cap_name = cap_list.items[1].string;
+
+		if (!strcmp(cap_name, "get")) {
+			*supported_capabilities |= ODB_HELPER_CAP_GET;
+		} else if (!strcmp(cap_name, "put")) {
+			*supported_capabilities |= ODB_HELPER_CAP_PUT;
+		} else if (!strcmp(cap_name, "have")) {
+			*supported_capabilities |= ODB_HELPER_CAP_HAVE;
+		} else {
+			warning("external process '%s' requested unsupported read-object capability '%s'",
+				process_name, cap_name);
+		}
+	}
+
+	string_list_clear(&cap_list, 0);
+}
+
+static int start_read_object_fn(struct subprocess_entry *subprocess)
+{
+	int err;
+	struct read_object_process *entry = (struct read_object_process *)subprocess;
+	struct child_process *process = &subprocess->process;
+	char *cap_buf;
+
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	err = packet_writel(process->in, "git-read-object-client", "version=1", NULL);
+	if (err)
+		goto done;
+
+	err = strcmp(packet_read_line(process->out, NULL), "git-read-object-server");
+	if (err) {
+		error("external process '%s' does not support read-object protocol version 1", subprocess->cmd);
+		goto done;
+	}
+	err = strcmp(packet_read_line(process->out, NULL), "version=1");
+	if (err)
+		goto done;
+	err = packet_read_line(process->out, NULL) != NULL;
+	if (err)
+		goto done;
+
+	err = packet_writel(process->in, "capability=get", NULL);
+	if (err)
+		goto done;
+
+	while ((cap_buf = packet_read_line(process->out, NULL)))
+		parse_capabilities(cap_buf, &entry->supported_capabilities, subprocess->cmd);
+
+done:
+	sigchain_pop(SIGPIPE);
+
+	return err;
+}
+
+static struct read_object_process *launch_read_object_process(const char *cmd)
+{
+	struct read_object_process *entry;
+
+	if (!subprocess_map_initialized) {
+		subprocess_map_initialized = 1;
+		hashmap_init(&subprocess_map, (hashmap_cmp_fn) cmd2process_cmp, 0);
+		entry = NULL;
+	} else {
+		entry = (struct read_object_process *)subprocess_find_entry(&subprocess_map, cmd);
+	}
+
+	fflush(NULL);
+
+	if (!entry) {
+		entry = xmalloc(sizeof(*entry));
+		entry->supported_capabilities = 0;
+
+		if (subprocess_start(&subprocess_map, &entry->subprocess, cmd, start_read_object_fn)) {
+			free(entry);
+			return NULL;
+		}
+	}
+
+	return entry;
+}
+
+static int check_object_process_error(int err,
+				      const char *status,
+				      struct read_object_process *entry,
+				      const char *cmd,
+				      unsigned int capability)
+{
+	if (!err)
+		return 0;
+
+	if (!strcmp(status, "error")) {
+		/* The process signaled a problem with the file. */
+	} else if (!strcmp(status, "notfound")) {
+		/* Object was not found */
+		err = -1;
+	} else if (!strcmp(status, "abort")) {
+		/*
+		 * The process signaled a permanent problem. Don't try to read
+		 * objects with the same command for the lifetime of the current
+		 * Git process.
+		 */
+		if (capability)
+			entry->supported_capabilities &= ~capability;
+	} else {
+		/*
+		 * Something went wrong with the read-object process.
+		 * Force shutdown and restart if needed.
+		 */
+		error("external object process '%s' failed", cmd);
+		subprocess_stop(&subprocess_map, &entry->subprocess);
+		free(entry);
+	}
+
+	return err;
+}
+
+static int read_object_process(const unsigned char *sha1)
+{
+	int err;
+	struct read_object_process *entry;
+	struct child_process *process;
+	struct strbuf status = STRBUF_INIT;
+	const char *cmd = "read-object";
+	uint64_t start;
+
+	start = getnanotime();
+
+	entry = launch_read_object_process(cmd);
+	process = &entry->subprocess.process;
+
+	if (!(ODB_HELPER_CAP_GET & entry->supported_capabilities))
+		return -1;
+
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	err = packet_write_fmt_gently(process->in, "command=get\n");
+	if (err)
+		goto done;
+
+	err = packet_write_fmt_gently(process->in, "sha1=%s\n", sha1_to_hex(sha1));
+	if (err)
+		goto done;
+
+	err = packet_flush_gently(process->in);
+	if (err)
+		goto done;
+
+	subprocess_read_status(process->out, &status);
+	err = strcmp(status.buf, "success");
+
+done:
+	sigchain_pop(SIGPIPE);
+
+	err = check_object_process_error(err, status.buf, entry, cmd, ODB_HELPER_CAP_GET);
+
+	trace_performance_since(start, "read_object_process");
+
+	return err;
+}
 
 struct odb_helper *odb_helper_new(const char *name, int namelen)
 {
@@ -350,20 +531,21 @@ static int odb_helper_fetch_git_object(struct odb_helper *o,
 int odb_helper_fault_in_object(struct odb_helper *o,
 			       const unsigned char *sha1)
 {
-	struct odb_helper_object *obj;
-	struct odb_helper_cmd cmd;
+	struct odb_helper_object *obj = odb_helper_lookup(o, sha1);
 
-	obj = odb_helper_lookup(o, sha1);
 	if (!obj)
 		return -1;
 
-	if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
-		return -1;
-
-	if (odb_helper_finish(o, &cmd))
-		return -1;
-
-	return 0;
+	if (o->script_mode) {
+		struct odb_helper_cmd cmd;
+		if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
+			return -1;
+		if (odb_helper_finish(o, &cmd))
+			return -1;
+		return 0;
+	} else {
+		return read_object_process(sha1);
+	}
 }
 
 int odb_helper_fetch_object(struct odb_helper *o,
diff --git a/odb-helper.h b/odb-helper.h
index 44c98bbf56..b23544aa4a 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -9,11 +9,16 @@ enum odb_helper_fetch_kind {
 	ODB_FETCH_KIND_FAULT_IN
 };
 
+#define ODB_HELPER_CAP_GET    (1u<<0)
+#define ODB_HELPER_CAP_PUT    (1u<<1)
+#define ODB_HELPER_CAP_HAVE   (1u<<2)
+
 struct odb_helper {
 	const char *name;
 	const char *cmd;
 	enum odb_helper_fetch_kind fetch_kind;
 	int script_mode;
+	unsigned int supported_capabilities;
 
 	struct odb_helper_object {
 		unsigned char sha1[20];
diff --git a/sha1_file.c b/sha1_file.c
index 9d8e37432e..38a0404506 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -698,7 +698,17 @@ int check_and_freshen_file(const char *fn, int freshen)
 
 static int check_and_freshen_local(const unsigned char *sha1, int freshen)
 {
-	return check_and_freshen_file(sha1_file_name(sha1), freshen);
+	int ret;
+	int tried_hook = 0;
+
+retry:
+	ret = check_and_freshen_file(sha1_file_name(sha1), freshen);
+	if (!ret && !tried_hook) {
+		tried_hook = 1;
+		if (!external_odb_fault_in_object(sha1))
+			goto retry;
+	}
+	return ret;
 }
 
 static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
@@ -3000,7 +3010,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 	int rtype;
 	enum object_type real_type;
 	const unsigned char *real = lookup_replace_object_extended(sha1, flags);
+	int tried_hook = 0;
 
+retry:
 	co = find_cached_object(real);
 	if (co) {
 		if (oi->typep)
@@ -3026,8 +3038,14 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 
 		/* Not a loose object; someone else may have just packed it. */
 		reprepare_packed_git();
-		if (!find_pack_entry(real, &e))
+		if (!find_pack_entry(real, &e)) {
+			if (!tried_hook) {
+				tried_hook = 1;
+				if (!external_odb_fault_in_object(sha1))
+					goto retry;
+			}
 			return -1;
+		}
 	}
 
 	/*
@@ -3121,7 +3139,9 @@ static void *read_object(const unsigned char *sha1, enum object_type *type,
 	unsigned long mapsize;
 	void *map, *buf;
 	struct cached_object *co;
+	int tried_hook = 0;
 
+retry:
 	co = find_cached_object(sha1);
 	if (co) {
 		*type = co->type;
@@ -3139,7 +3159,14 @@ static void *read_object(const unsigned char *sha1, enum object_type *type,
 		return buf;
 	}
 	reprepare_packed_git();
-	return read_packed_sha1(sha1, type, size);
+	buf = read_packed_sha1(sha1, type, size);
+	if (!buf && !tried_hook) {
+		tried_hook = 1;
+		if (!external_odb_fault_in_object(sha1))
+			goto retry;
+	}
+
+	return buf;
 }
 
 /*
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 31/49] external-odb: add external_odb_get_capabilities()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (29 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 30/49] odb-helper: add read_object_process() Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 32/49] t04*: add 'get_cap' support to helpers Christian Couder
                   ` (20 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
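The effect of the new check in external_odb_has_object() can be sketched as
below: a helper that did not advertise "have" cannot be asked, so it is
assumed to possibly hold any object. This is a simplified illustration;
struct helper, any_helper_may_have() and the two callbacks are hypothetical
stand-ins for struct odb_helper and odb_helper_has_object().

```c
#include <stddef.h>

#define CAP_HAVE (1u << 2)	/* mirrors ODB_HELPER_CAP_HAVE */

/* Hypothetical, reduced stand-in for struct odb_helper. */
struct helper {
	unsigned int caps;
	int (*has_object)(const char *hex);	/* used only with CAP_HAVE */
	struct helper *next;
};

/* Example callbacks: a helper that lists nothing, and one that lists all. */
static int never_has(const char *hex)
{
	(void)hex;
	return 0;
}

static int always_has(const char *hex)
{
	(void)hex;
	return 1;
}

/*
 * Returns 1 if some helper has the object or cannot be asked,
 * 0 if every helper supports "have" and none of them lists it.
 */
static int any_helper_may_have(const struct helper *list, const char *hex)
{
	const struct helper *h;

	for (h = list; h; h = h->next) {
		if (!(h->caps & CAP_HAVE))
			return 1;	/* cannot ask: assume it might */
		if (h->has_object(hex))
			return 1;
	}
	return 0;
}
```

One consequence worth noting: as soon as one configured helper lacks "have",
the answer is 1 for every object name, so a missing object is only detected
later, when the "get" fails.
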
 external-odb.c | 15 ++++++++++++++-
 odb-helper.c   | 23 +++++++++++++++++++++++
 odb-helper.h   |  1 +
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/external-odb.c b/external-odb.c
index 2efa805d12..8c2570b2e7 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -66,6 +66,14 @@ static int external_odb_config(const char *var, const char *value, void *data)
 	return 0;
 }
 
+static void external_odb_get_capabilities(void)
+{
+	struct odb_helper *o;
+
+	for (o = helpers; o; o = o->next)
+		odb_helper_get_capabilities(o);
+}
+
 static void external_odb_init(void)
 {
 	static int initialized;
@@ -75,6 +83,8 @@ static void external_odb_init(void)
 	initialized = 1;
 
 	git_config(external_odb_config, NULL);
+
+	external_odb_get_capabilities();
 }
 
 const char *external_odb_root(void)
@@ -94,9 +104,12 @@ int external_odb_has_object(const unsigned char *sha1)
 
 	external_odb_init();
 
-	for (o = helpers; o; o = o->next)
+	for (o = helpers; o; o = o->next) {
+		if (!(o->supported_capabilities & ODB_HELPER_CAP_HAVE))
+			return 1;
 		if (odb_helper_has_object(o, sha1))
 			return 1;
+	}
 	return 0;
 }
 
diff --git a/odb-helper.c b/odb-helper.c
index 20e83cb55a..a6bf81af8d 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -260,6 +260,29 @@ static int odb_helper_finish(struct odb_helper *o,
 	return 0;
 }
 
+int odb_helper_get_capabilities(struct odb_helper *o)
+{
+	struct odb_helper_cmd cmd;
+	FILE *fh;
+	struct strbuf line = STRBUF_INIT;
+
+	if (!o->script_mode)
+		return 0;
+
+	if (odb_helper_start(o, &cmd, 0, "get_cap") < 0)
+		return -1;
+
+	fh = xfdopen(cmd.child.out, "r");
+	while (strbuf_getline(&line, fh) != EOF)
+		parse_capabilities(line.buf, &o->supported_capabilities, o->name);
+
+	strbuf_release(&line);
+	fclose(fh);
+	odb_helper_finish(o, &cmd);
+
+	return 0;
+}
+
 static int parse_object_line(struct odb_helper_object *o, const char *line)
 {
 	char *end;
diff --git a/odb-helper.h b/odb-helper.h
index b23544aa4a..8e0b0fc781 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -33,6 +33,7 @@ struct odb_helper {
 };
 
 struct odb_helper *odb_helper_new(const char *name, int namelen);
+int odb_helper_get_capabilities(struct odb_helper *o);
 int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
 int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 			    int fd);
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 32/49] t04*: add 'get_cap' support to helpers
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (30 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 31/49] external-odb: add external_odb_get_capabilities() Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 33/49] odb-helper: call odb_helper_lookup() with 'have' capability Christian Couder
                   ` (19 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0400-external-odb.sh        | 4 ++++
 t/t0410-transfer-e-odb.sh      | 8 ++++++++
 t/t0420-transfer-http-e-odb.sh | 4 ++++
 3 files changed, 16 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index 18d8c38862..efabf90a8b 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -13,6 +13,10 @@ die() {
 }
 GIT_DIR=$ALT_SOURCE; export GIT_DIR
 case "$1" in
+get_cap)
+	echo "capability=get"
+	echo "capability=have"
+	;;
 have)
 	git cat-file --batch-check --batch-all-objects |
 	awk '{print $1 " " $3 " " $2}'
diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh
index 8de9a08d7c..0c9cc3af7d 100755
--- a/t/t0410-transfer-e-odb.sh
+++ b/t/t0410-transfer-e-odb.sh
@@ -16,6 +16,10 @@ die() {
 }
 GIT_DIR=$ALT_SOURCE1; export GIT_DIR
 case "$1" in
+get_cap)
+	echo "capability=get"
+	echo "capability=have"
+	;;
 have)
 	git cat-file --batch-check --batch-all-objects |
 	awk '{print $1 " " $3 " " $2}'
@@ -51,6 +55,10 @@ die() {
 }
 GIT_DIR=$ALT_SOURCE2; export GIT_DIR
 case "$1" in
+get_cap)
+	echo "capability=get"
+	echo "capability=have"
+	;;
 have)
 	GIT_DIR=$OTHER_SOURCE git for-each-ref --format='%(objectname)' refs/odbs/magic/ | GIT_DIR=$OTHER_SOURCE xargs git show
 	;;
diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
index b8062d14c0..45e66e355c 100755
--- a/t/t0420-transfer-http-e-odb.sh
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -22,6 +22,10 @@ die() {
 }
 echo >&2 "odb-http-helper args:" "$@"
 case "$1" in
+get_cap)
+	echo "capability=get"
+	echo "capability=have"
+	;;
 have)
 	list_url="$HTTPD_URL/list/"
 	curl "$list_url" ||
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 33/49] odb-helper: call odb_helper_lookup() with 'have' capability
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (31 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 32/49] t04*: add 'get_cap' support to helpers Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 34/49] odb-helper: fix odb_helper_fetch_object() for read_object Christian Couder
                   ` (18 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 odb-helper.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index a6bf81af8d..910c87a482 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -142,19 +142,20 @@ static int check_object_process_error(int err,
 	return err;
 }
 
-static int read_object_process(const unsigned char *sha1)
+static int read_object_process(struct odb_helper *o, const unsigned char *sha1, int fd)
 {
 	int err;
 	struct read_object_process *entry;
 	struct child_process *process;
 	struct strbuf status = STRBUF_INIT;
-	const char *cmd = "read-object";
+	const char *cmd = o->cmd;
 	uint64_t start;
 
 	start = getnanotime();
 
 	entry = launch_read_object_process(cmd);
 	process = &entry->subprocess.process;
+	o->supported_capabilities = entry->supported_capabilities;
 
 	if (!(ODB_HELPER_CAP_GET & entry->supported_capabilities))
 		return -1;
@@ -173,6 +174,13 @@ static int read_object_process(const unsigned char *sha1)
 	if (err)
 		goto done;
 
+	if (o->fetch_kind != ODB_FETCH_KIND_FAULT_IN) {
+		struct strbuf buf = STRBUF_INIT;
+		err = read_packetized_to_strbuf(process->out, &buf) < 0;
+		if (err)
+			goto done;
+	}
+
 	subprocess_read_status(process->out, &status);
 	err = strcmp(status.buf, "success");
 
@@ -554,10 +562,11 @@ static int odb_helper_fetch_git_object(struct odb_helper *o,
 int odb_helper_fault_in_object(struct odb_helper *o,
 			       const unsigned char *sha1)
 {
-	struct odb_helper_object *obj = odb_helper_lookup(o, sha1);
-
-	if (!obj)
-		return -1;
+	if (o->supported_capabilities & ODB_HELPER_CAP_HAVE) {
+		struct odb_helper_object *obj = odb_helper_lookup(o, sha1);
+		if (!obj)
+			return -1;
+	}
 
 	if (o->script_mode) {
 		struct odb_helper_cmd cmd;
@@ -567,7 +576,7 @@ int odb_helper_fault_in_object(struct odb_helper *o,
 			return -1;
 		return 0;
 	} else {
-		return read_object_process(sha1);
+		return read_object_process(o, sha1, -1);
 	}
 }
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 34/49] odb-helper: fix odb_helper_fetch_object() for read_object
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (32 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 33/49] odb-helper: call odb_helper_lookup() with 'have' capability Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 35/49] Add t0460 to test passing git objects Christian Couder
                   ` (17 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 odb-helper.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 910c87a482..0017faa36e 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -584,15 +584,19 @@ int odb_helper_fetch_object(struct odb_helper *o,
 			    const unsigned char *sha1,
 			    int fd)
 {
-	switch(o->fetch_kind) {
-	case ODB_FETCH_KIND_PLAIN_OBJECT:
-		return odb_helper_fetch_plain_object(o, sha1, fd);
-	case ODB_FETCH_KIND_GIT_OBJECT:
-		return odb_helper_fetch_git_object(o, sha1, fd);
-	case ODB_FETCH_KIND_FAULT_IN:
-		return 0;
-	default:
-		BUG("invalid fetch kind '%d'", o->fetch_kind);
+	if (o->script_mode) {
+		switch(o->fetch_kind) {
+		case ODB_FETCH_KIND_PLAIN_OBJECT:
+			return odb_helper_fetch_plain_object(o, sha1, fd);
+		case ODB_FETCH_KIND_GIT_OBJECT:
+			return odb_helper_fetch_git_object(o, sha1, fd);
+		case ODB_FETCH_KIND_FAULT_IN:
+			return 0;
+		default:
+			BUG("invalid fetch kind '%d'", o->fetch_kind);
+		}
+	} else {
+		return read_object_process(o, sha1, fd);
 	}
 }
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 35/49] Add t0460 to test passing git objects
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (33 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 34/49] odb-helper: fix odb_helper_fetch_object() for read_object Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 36/49] odb-helper: add read_packetized_git_object_to_fd() Christian Couder
                   ` (16 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
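The helper in this test locates the loose object file by splitting the hex
object name after its first two characters. A minimal C sketch of that
mapping follows; loose_object_path() is a hypothetical name used only for
illustration, and the sketch assumes a 40-character SHA-1 hex name as the
helper's regex does.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<git_dir>/objects/<first 2 hex chars>/<remaining 38 hex chars>".
 * Returns 0 on success, -1 if `hex` is not 40 characters long or the
 * result does not fit in `out`.
 */
static int loose_object_path(const char *git_dir, const char *hex,
			     char *out, size_t outsz)
{
	int n;

	if (strlen(hex) != 40)
		return -1;
	n = snprintf(out, outsz, "%s/objects/%.2s/%s", git_dir, hex, hex + 2);
	if (n < 0 || (size_t)n >= outsz)
		return -1;
	return 0;
}
```

The file found this way is zlib-deflated, which is why the helper can send
its bytes unchanged when odb.<odbname>.fetchKind is "gitObject". An object
that only exists in a pack has no such file, so the lookup would fail.
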
 t/t0460-read-object-git.sh | 29 ++++++++++++++++++++
 t/t0460/read-object-git    | 67 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 96 insertions(+)
 create mode 100755 t/t0460-read-object-git.sh
 create mode 100755 t/t0460/read-object-git

diff --git a/t/t0460-read-object-git.sh b/t/t0460-read-object-git.sh
new file mode 100755
index 0000000000..d08b44cdce
--- /dev/null
+++ b/t/t0460-read-object-git.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+
+test_description='tests for long running read-object process passing git objects'
+
+. ./test-lib.sh
+
+PATH="$PATH:$TEST_DIRECTORY/t0460"
+
+test_expect_success 'setup host repo with a root commit' '
+	test_commit zero &&
+	hash1=$(git ls-tree HEAD | grep zero.t | cut -f1 | cut -d\  -f3)
+'
+
+HELPER="read-object-git"
+
+test_expect_success 'blobs can be retrieved from the host repo' '
+	git init guest-repo &&
+	(cd guest-repo &&
+	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.fetchKind "gitObject" &&
+	 git cat-file blob "$hash1")
+'
+
+test_expect_success 'invalid blobs generate errors' '
+	(cd guest-repo &&
+	 test_must_fail git cat-file blob "invalid")
+'
+
+test_done
diff --git a/t/t0460/read-object-git b/t/t0460/read-object-git
new file mode 100755
index 0000000000..356a22cd4c
--- /dev/null
+++ b/t/t0460/read-object-git
@@ -0,0 +1,67 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git read-object protocol version 1
+# See Documentation/technical/read-object-protocol.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand."  Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# A lazy clone can be simulated via the following commands from the host repo
+# you wish to create a lazy clone of:
+#
+# cd /host_repo
+# git rev-parse HEAD
+# git init /guest_repo
+# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
+#	cut -d' ' -f1 | git pack-objects /guest_repo/.git/objects/pack/noblobs
+# cd /guest_repo
+# git config core.virtualizeobjects true
+# git reset --hard <sha from rev-parse call above>
+#
+# Please note, this sample is a minimal skeleton. No proper error handling
+# was implemented.
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = "../.git/";
+
+packet_initialize("git-read-object", 1);
+
+packet_read_and_check_capabilities("get");
+packet_write_capabilities("get");
+
+while (1) {
+	my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+	if ( $command eq "get" ) {
+		my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+		packet_bin_read();
+
+		my $path = $sha1;
+		$path =~ s{..}{$&/};
+		$path = $DIR . "/objects/" . $path;
+
+		my $contents = do {
+		    local $/;
+		    open my $fh, '<:raw', $path or die "Can't open '$path': $!";
+		    <$fh>
+		};
+
+		packet_bin_write($contents);
+		packet_flush();
+		packet_txt_write("status=success");
+		packet_flush();
+	} else {
+		die "bad command '$command'";
+	}
+}
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 36/49] odb-helper: add read_packetized_git_object_to_fd()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (34 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 35/49] Add t0460 to test passing git objects Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 37/49] odb-helper: add read_packetized_plain_object_to_fd() Christian Couder
                   ` (15 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 odb-helper.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 6 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 0017faa36e..a27208463c 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -142,6 +142,82 @@ static int check_object_process_error(int err,
 	return err;
 }
 
+static ssize_t read_packetized_git_object_to_fd(struct odb_helper *o,
+						const unsigned char *sha1,
+						int fd_in, int fd_out)
+{
+	ssize_t total_read = 0;
+	unsigned long total_got = 0;
+	int packet_len;
+	git_zstream stream;
+	int zret = Z_STREAM_END;
+	git_SHA_CTX hash;
+	unsigned char real_sha1[20];
+
+	memset(&stream, 0, sizeof(stream));
+	git_inflate_init(&stream);
+	git_SHA1_Init(&hash);
+
+	for (;;) {
+		/* packet_read() writes a '\0' extra byte at the end */
+		char buf[LARGE_PACKET_DATA_MAX + 1];
+
+		packet_len = packet_read(fd_in, NULL, NULL,
+			buf, LARGE_PACKET_DATA_MAX + 1,
+			PACKET_READ_GENTLE_ON_EOF);
+
+		if (packet_len <= 0)
+			break;
+
+		write_or_die(fd_out, buf, packet_len);
+
+		stream.next_in = (unsigned char *)buf;
+		stream.avail_in = packet_len;
+		do {
+			unsigned char inflated[4096];
+			unsigned long got;
+
+			stream.next_out = inflated;
+			stream.avail_out = sizeof(inflated);
+			zret = git_inflate(&stream, Z_SYNC_FLUSH);
+			got = sizeof(inflated) - stream.avail_out;
+
+			git_SHA1_Update(&hash, inflated, got);
+			/* skip header when counting size */
+			if (!total_got) {
+				const unsigned char *p = memchr(inflated, '\0', got);
+				if (p)
+					got -= p - inflated + 1;
+				else
+					got = 0;
+			}
+			total_got += got;
+		} while (stream.avail_in && zret == Z_OK);
+
+		total_read += packet_len;
+	}
+
+	git_inflate_end(&stream);
+
+	if (packet_len < 0)
+		return packet_len;
+
+	git_SHA1_Final(real_sha1, &hash);
+
+	if (zret != Z_STREAM_END) {
+		warning("bad zlib data from odb helper '%s' for %s",
+			o->name, sha1_to_hex(sha1));
+		return -1;
+	}
+	if (hashcmp(real_sha1, sha1)) {
+		warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+			o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+		return -1;
+	}
+
+	return total_read;
+}
+
 static int read_object_process(struct odb_helper *o, const unsigned char *sha1, int fd)
 {
 	int err;
@@ -174,12 +250,8 @@ static int read_object_process(struct odb_helper *o, const unsigned char *sha1,
 	if (err)
 		goto done;
 
-	if (o->fetch_kind != ODB_FETCH_KIND_FAULT_IN) {
-		struct strbuf buf;
-		read_packetized_to_strbuf(process->out, &buf);
-		if (err)
-			goto done;
-	}
+	if (o->fetch_kind != ODB_FETCH_KIND_FAULT_IN)
+		err = read_packetized_git_object_to_fd(o, sha1, process->out, fd) < 0;
 
 	subprocess_read_status(process->out, &status);
 	err = strcmp(status.buf, "success");
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [RFC/PATCH v4 37/49] odb-helper: add read_packetized_plain_object_to_fd()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (35 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 36/49] odb-helper: add read_packetized_git_object_to_fd() Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 38/49] Add t0470 to test passing plain objects Christian Couder
                   ` (14 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
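Note for reviewers (not part of the commit): the function added here
rebuilds a loose object from plain content by prepending the usual
"<type> <size>" header plus a NUL byte, then hashing header and content
together. A rough shell sketch of that layout follows. The function
names are made up for illustration, and using sha1sum is an assumption
about the local system, not something this series relies on.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of this series.

# Emit the loose-object header: "<type> <size>" followed by one NUL byte.
plain_object_header () {
	printf '%s %s\0' "$1" "$2"
}

# Hash the header plus the content read from stdin.
# Assumes a sha1sum binary is available on the system.
plain_object_id () {
	{ plain_object_header "$1" "$2" && cat; } | sha1sum | cut -d' ' -f1
}
```

For the 4-byte blob used in t0470, "printf 'one\n' | plain_object_id blob 4"
should agree with what "git hash-object" reports for the same content.
The header alone is 7 bytes there, which is what "hdrlen" counts.
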
 odb-helper.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 118 insertions(+), 1 deletion(-)

diff --git a/odb-helper.c b/odb-helper.c
index a27208463c..b2d86a7928 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -142,6 +142,121 @@ static int check_object_process_error(int err,
 	return err;
 }
 
+static struct odb_helper_object *odb_helper_lookup(struct odb_helper *o,
+						   const unsigned char *sha1);
+
+static ssize_t read_packetized_plain_object_to_fd(struct odb_helper *o,
+						  const unsigned char *sha1,
+						  int fd_in, int fd_out)
+{
+	ssize_t total_read = 0;
+	unsigned long total_got = 0;
+	int packet_len;
+
+	char hdr[32];
+	int hdrlen;
+
+	int ret = Z_STREAM_END;
+	unsigned char compressed[4096];
+	git_zstream stream;
+	git_SHA_CTX hash;
+	unsigned char real_sha1[20];
+
+	off_t size;
+	enum object_type type;
+	const char *s;
+	int pkt_size;
+	char *size_buf;
+
+	size_buf = packet_read_line(fd_in, &pkt_size);
+	if (!skip_prefix(size_buf, "size=", &s))
+		return error("odb helper '%s' did not send size of plain object", o->name);
+	size = strtoumax(s, NULL, 10);
+	if (!skip_prefix(packet_read_line(fd_in, NULL), "kind=", &s))
+		return error("odb helper '%s' did not send kind of plain object", o->name);
+	/* Check if the object is not available */
+	if (!strcmp(s, "none"))
+		return -1;
+	type = type_from_string_gently(s, strlen(s), 1);
+	if (type < 0)
+		return error("odb helper '%s' sent bad type '%s'", o->name, s);
+
+	/* Set it up */
+	git_deflate_init(&stream, zlib_compression_level);
+	stream.next_out = compressed;
+	stream.avail_out = sizeof(compressed);
+	git_SHA1_Init(&hash);
+
+	/* First header.. */
+	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %"PRIuMAX, typename(type), (uintmax_t)size) + 1;
+	stream.next_in = (unsigned char *)hdr;
+	stream.avail_in = hdrlen;
+	while (git_deflate(&stream, 0) == Z_OK)
+		; /* nothing */
+	git_SHA1_Update(&hash, hdr, hdrlen);
+
+	for (;;) {
+		/* packet_read() writes a '\0' extra byte at the end */
+		char buf[LARGE_PACKET_DATA_MAX + 1];
+
+		packet_len = packet_read(fd_in, NULL, NULL,
+			buf, LARGE_PACKET_DATA_MAX + 1,
+			PACKET_READ_GENTLE_ON_EOF);
+
+		if (packet_len <= 0)
+			break;
+
+		total_got += packet_len;
+
+		/* Then the data itself.. */
+		stream.next_in = (void *)buf;
+		stream.avail_in = packet_len;
+		do {
+			unsigned char *in0 = stream.next_in;
+			ret = git_deflate(&stream, Z_FINISH);
+			git_SHA1_Update(&hash, in0, stream.next_in - in0);
+			write_or_die(fd_out, compressed, stream.next_out - compressed);
+			stream.next_out = compressed;
+			stream.avail_out = sizeof(compressed);
+		} while (ret == Z_OK);
+
+		total_read += packet_len;
+	}
+
+	if (packet_len < 0) {
+		error("unable to read from odb helper '%s': %s",
+		      o->name, strerror(errno));
+		git_deflate_end(&stream);
+		return packet_len;
+	}
+
+	if (ret != Z_STREAM_END) {
+		warning("bad zlib data from odb helper '%s' for %s",
+			o->name, sha1_to_hex(sha1));
+		return -1;
+	}
+
+	ret = git_deflate_end_gently(&stream);
+	if (ret != Z_OK) {
+		warning("deflateEnd on object %s from odb helper '%s' failed (%d)",
+			sha1_to_hex(sha1), o->name, ret);
+		return -1;
+	}
+	git_SHA1_Final(real_sha1, &hash);
+	if (hashcmp(sha1, real_sha1)) {
+		warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+			o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+		return -1;
+	}
+	if (total_got != size) {
+		warning("size mismatch from odb helper '%s' for %s (%lu != %"PRIuMAX")",
+			o->name, sha1_to_hex(sha1), total_got, (uintmax_t)size);
+		return -1;
+	}
+
+	return total_read;
+}
+
 static ssize_t read_packetized_git_object_to_fd(struct odb_helper *o,
 						const unsigned char *sha1,
 						int fd_in, int fd_out)
@@ -250,7 +365,9 @@ static int read_object_process(struct odb_helper *o, const unsigned char *sha1,
 	if (err)
 		goto done;
 
-	if (o->fetch_kind != ODB_FETCH_KIND_FAULT_IN)
+	if (o->fetch_kind == ODB_FETCH_KIND_PLAIN_OBJECT)
+		err = read_packetized_plain_object_to_fd(o, sha1, process->out, fd) < 0;
+	else if (o->fetch_kind == ODB_FETCH_KIND_GIT_OBJECT)
 		err = read_packetized_git_object_to_fd(o, sha1, process->out, fd) < 0;
 
 	subprocess_read_status(process->out, &status);
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [RFC/PATCH v4 38/49] Add t0470 to test passing plain objects
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (36 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 37/49] odb-helper: add read_packetized_plain_object_to_fd() Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 39/49] odb-helper: add write_object_process() Christian Couder
                   ` (13 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0470-read-object-http-e-odb.sh | 123 ++++++++++++++++++++++++++++++++++++++
 t/t0470/read-object-plain         |  93 ++++++++++++++++++++++++++++
 2 files changed, 216 insertions(+)
 create mode 100755 t/t0470-read-object-http-e-odb.sh
 create mode 100755 t/t0470/read-object-plain

diff --git a/t/t0470-read-object-http-e-odb.sh b/t/t0470-read-object-http-e-odb.sh
new file mode 100755
index 0000000000..3360a98ec3
--- /dev/null
+++ b/t/t0470-read-object-http-e-odb.sh
@@ -0,0 +1,123 @@
+#!/bin/sh
+
+test_description='tests for read-object process passing plain objects to an HTTPD server'
+
+. ./test-lib.sh
+
+# Without an explicit port, the test number would be used as the port. That
+# number is below 1024, so only root could bind it; pick a higher port instead.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+PATH="$PATH:$TEST_DIRECTORY/t0470"
+
+# odb helper script must see this
+export HTTPD_URL
+
+HELPER="read-object-plain"
+
+test_expect_success 'setup repo with a root commit' '
+	test_commit zero
+'
+
+test_expect_success 'setup another repo from the first one' '
+	git init other-repo &&
+	(cd other-repo &&
+	 git remote add origin .. &&
+	 git pull origin master &&
+	 git checkout master &&
+	 git log)
+'
+
+test_expect_success 'setup the helper in the root repo' '
+	git config odb.magic.command "$HELPER" &&
+	git config odb.magic.fetchKind "plainObject"
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME&size=123&type=blob"
+
+test_expect_success 'can upload a file' '
+	echo "Hello Apache World!" >hello_to_send.txt &&
+	echo "How are you?" >>hello_to_send.txt &&
+	curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" >out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+	curl --include "$LIST_URL" >out_list &&
+	grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+	curl --data "delete" --include "$UPLOAD_URL&delete=1" >out_delete &&
+	curl --include "$LIST_URL" >out_list2 &&
+	! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transferred to the http server' '
+	test_commit one &&
+	hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+	echo "$hash1-4-blob" >expected &&
+	ls "$FILES_DIR" >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+	git cat-file blob "$hash1" &&
+	git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+	(cd other-repo &&
+	 git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+	 test_must_fail git cat-file blob "$hash1" &&
+	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.fetchKind "plainObject" &&
+	 git cat-file blob "$hash1" &&
+	 git pull origin master)
+'
+
+test_expect_success 'local clone from the first repo' '
+	mkdir my-clone &&
+	(cd my-clone &&
+	 git clone .. . &&
+	 git cat-file blob "$hash1")
+'
+
+test_expect_success 'no-local clone from the first repo fails' '
+	mkdir my-other-clone &&
+	(cd my-other-clone &&
+	 test_must_fail git clone --no-local .. .) &&
+	rm -rf my-other-clone
+'
+
+test_expect_success 'no-local clone from the first repo with helper succeeds' '
+	mkdir my-other-clone &&
+	(cd my-other-clone &&
+	 git clone -c odb.magic.command="$HELPER" \
+		-c odb.magic.plainObjects="true" \
+		--no-local .. .) &&
+	rm -rf my-other-clone
+'
+
+test_expect_success 'no-local initial-refspec clone succeeds' '
+	mkdir my-other-clone &&
+	(cd my-other-clone &&
+	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.fetchKind "plainObject" &&
+	 git -c odb.magic.command="$HELPER" \
+		-c odb.magic.plainObjects="true" \
+		clone --no-local --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
+stop_httpd
+
+test_done
diff --git a/t/t0470/read-object-plain b/t/t0470/read-object-plain
new file mode 100755
index 0000000000..bb65ca908a
--- /dev/null
+++ b/t/t0470/read-object-plain
@@ -0,0 +1,93 @@
+#!/usr/bin/perl
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+use LWP::UserAgent;
+use HTTP::Request::Common;
+
+print STDERR "read-object-plain: starting\n";
+
+packet_initialize("git-read-object", 1);
+
+print STDERR "read-object-plain: after init\n";
+
+packet_read_and_check_capabilities("get", "put");
+packet_write_capabilities("get", "put");
+
+print STDERR "read-object-plain: after reading and writing get and put capabilities\n";
+
+my $http_url = $ENV{HTTPD_URL};
+
+while (1) {
+	my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+	print STDERR "read-object-plain: command: '$command'\n";
+
+	if ( $command eq "get" ) {
+		my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+		packet_bin_read();
+
+		my $get_url = $http_url . "/list/?sha1=" . $sha1;
+		print STDERR "read-object-plain: get_url: '$get_url'\n";
+
+		my $userAgent = LWP::UserAgent->new();
+
+		my $response = $userAgent->get( $get_url );
+
+		if ($response->is_error) {
+		    print STDERR $response->error_as_HTML . "\n";
+		    packet_txt_write("size=0");
+		    packet_txt_write("kind=none");
+		    packet_txt_write("status=notfound");
+		} else {
+		    print STDERR "content: \n";
+		    print STDERR $response->content;
+		    packet_txt_write("size=" . length($response->content));
+		    packet_txt_write("kind=blob");
+		    packet_bin_write($response->content);
+		    packet_flush();
+		    packet_txt_write("status=success");
+		}
+
+		packet_flush();
+	} elsif ( $command eq "put" ) {
+		my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+		print STDERR "read-object-plain: put sha1: '$sha1'\n";
+
+		my ($size) = packet_txt_read() =~ /^size=([0-9]+)$/;
+		print STDERR "read-object-plain: put size: '$size'\n";
+
+		my ($kind) = packet_txt_read() =~ /^kind=(\w+)$/;
+		print STDERR "read-object-plain: put kind: '$kind'\n";
+
+		packet_bin_read();
+
+		# We must read the content we are sent and send it to the right url
+		my ($res, $buf) = packet_bin_read();
+		die "bad packet_bin_read res ($res)" unless ($res eq 0);
+		(packet_bin_read())[0] == 1 || die "bad send end";
+
+		my $upload_url = $http_url . "/upload/?sha1=" . $sha1 . "&size=" . $size . "&type=blob";
+		print STDERR "read-object-plain: upload_url: '$upload_url'\n";
+		print STDERR "read-object-plain: upload buffer: '$buf'\n";
+
+		my $userAgent = LWP::UserAgent->new();
+		my $request = POST $upload_url, Content_Type => 'multipart/form-data', Content => $buf;
+
+		my $response = $userAgent->request($request);
+
+		if ($response->is_error) {
+			print STDERR $response->error_as_HTML . "\n";
+			packet_txt_write("status=failure");
+		} else {
+			packet_txt_write("status=success");
+		}
+		packet_flush();
+	} else {
+		die "bad command '$command'";
+	}
+}
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [RFC/PATCH v4 39/49] odb-helper: add write_object_process()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (37 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 38/49] Add t0470 to test passing plain objects Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 40/49] Add t0480 to test "have" capability and plain objects Christian Couder
                   ` (12 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
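Note for reviewers (not part of the commit): a rough shell sketch of how
the "put" request header written by write_object_process() is framed as
packet lines. Each text packet carries 4 hex digits of length, counting
the digits themselves and the trailing LF, and a flush packet is "0000".
The function names below are hypothetical, and the sketch assumes ASCII
payloads so that character count equals byte count.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of this series.

# Frame one text packet: 4 hex digits of length (payload + LF + the
# 4 digits themselves), then the payload and a trailing LF.
pkt_txt () {
	printf '%04x%s\n' "$((${#1} + 5))" "$1"
}

# A flush packet is the special length 0000 with no payload.
pkt_flush () {
	printf '0000'
}

# Header of a "put" request: $1 is the hex sha1, $2 the size in bytes.
put_header () {
	pkt_txt "command=put" &&
	pkt_txt "sha1=$1" &&
	pkt_txt "size=$2" &&
	pkt_txt "kind=blob" &&
	pkt_flush
}
```

After this header, the object content follows in binary packets with a
final flush, and the helper answers with a "status=..." packet.
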
 odb-helper.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 73 insertions(+), 3 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index b2d86a7928..e21113c0b8 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -383,6 +383,65 @@ static int read_object_process(struct odb_helper *o, const unsigned char *sha1,
 	return err;
 }
 
+static int write_object_process(struct odb_helper *o,
+				const void *buf, size_t len,
+				const char *type, unsigned char *sha1)
+{
+	int err;
+	struct read_object_process *entry;
+	struct child_process *process;
+	struct strbuf status = STRBUF_INIT;
+	const char *cmd = o->cmd;
+	uint64_t start;
+
+	start = getnanotime();
+
+	entry = launch_read_object_process(cmd);
+	process = &entry->subprocess.process;
+	o->supported_capabilities = entry->supported_capabilities;
+
+	if (!(ODB_HELPER_CAP_PUT & entry->supported_capabilities))
+		return -1;
+
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	err = packet_write_fmt_gently(process->in, "command=put\n");
+	if (err)
+		goto done;
+
+	err = packet_write_fmt_gently(process->in, "sha1=%s\n", sha1_to_hex(sha1));
+	if (err)
+		goto done;
+
+	err = packet_write_fmt_gently(process->in, "size=%"PRIuMAX"\n", (uintmax_t)len);
+	if (err)
+		goto done;
+
+	err = packet_write_fmt_gently(process->in, "kind=blob\n");
+	if (err)
+		goto done;
+
+	err = packet_flush_gently(process->in);
+	if (err)
+		goto done;
+
+	err = write_packetized_from_buf(buf, len, process->in);
+	if (err)
+		goto done;
+
+	subprocess_read_status(process->out, &status);
+	err = strcmp(status.buf, "success");
+
+done:
+	sigchain_pop(SIGPIPE);
+
+	err = check_object_process_error(err, status.buf, entry, cmd, ODB_HELPER_CAP_PUT);
+
+	trace_performance_since(start, "write_object_process");
+
+	return err;
+}
+
 struct odb_helper *odb_helper_new(const char *name, int namelen)
 {
 	struct odb_helper *o;
@@ -804,9 +863,9 @@ int odb_helper_for_each_object(struct odb_helper *o,
 	return 0;
 }
 
-int odb_helper_write_object(struct odb_helper *o,
-			    const void *buf, size_t len,
-			    const char *type, unsigned char *sha1)
+int odb_helper_write_plain_object(struct odb_helper *o,
+				  const void *buf, size_t len,
+				  const char *type, unsigned char *sha1)
 {
 	struct odb_helper_cmd cmd;
 
@@ -832,3 +891,14 @@ int odb_helper_write_object(struct odb_helper *o,
 	odb_helper_finish(o, &cmd);
 	return 0;
 }
+
+int odb_helper_write_object(struct odb_helper *o,
+			    const void *buf, size_t len,
+			    const char *type, unsigned char *sha1)
+{
+	if (o->script_mode) {
+		return odb_helper_write_plain_object(o, buf, len, type, sha1);
+	} else {
+		return write_object_process(o, buf, len, type, sha1);
+	}
+}
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [RFC/PATCH v4 40/49] Add t0480 to test "have" capability and plain objects
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (38 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 39/49] odb-helper: add write_object_process() Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 41/49] external-odb: add external_odb_do_fetch_object() Christian Couder
                   ` (11 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0480-read-object-have-http-e-odb.sh | 123 +++++++++++++++++++++++++++++++++
 t/t0480/read-object-plain-have         | 116 +++++++++++++++++++++++++++++++
 2 files changed, 239 insertions(+)
 create mode 100755 t/t0480-read-object-have-http-e-odb.sh
 create mode 100755 t/t0480/read-object-plain-have

diff --git a/t/t0480-read-object-have-http-e-odb.sh b/t/t0480-read-object-have-http-e-odb.sh
new file mode 100755
index 0000000000..52fb4d46c9
--- /dev/null
+++ b/t/t0480-read-object-have-http-e-odb.sh
@@ -0,0 +1,123 @@
+#!/bin/sh
+
+test_description='tests for read-object process with "have" cap and plain objects'
+
+. ./test-lib.sh
+
+# Without an explicit port, the test number would be used as the port. That
+# number is below 1024, so only root could bind it; pick a higher port instead.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+PATH="$PATH:$TEST_DIRECTORY/t0480"
+
+# odb helper script must see this
+export HTTPD_URL
+
+HELPER="read-object-plain-have"
+
+test_expect_success 'setup repo with a root commit' '
+	test_commit zero
+'
+
+test_expect_success 'setup another repo from the first one' '
+	git init other-repo &&
+	(cd other-repo &&
+	 git remote add origin .. &&
+	 git pull origin master &&
+	 git checkout master &&
+	 git log)
+'
+
+test_expect_success 'setup the helper in the root repo' '
+	git config odb.magic.command "$HELPER" &&
+	git config odb.magic.fetchKind "plainObject"
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME&size=123&type=blob"
+
+test_expect_success 'can upload a file' '
+	echo "Hello Apache World!" >hello_to_send.txt &&
+	echo "How are you?" >>hello_to_send.txt &&
+	curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" >out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+	curl --include "$LIST_URL" >out_list &&
+	grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+	curl --data "delete" --include "$UPLOAD_URL&delete=1" >out_delete &&
+	curl --include "$LIST_URL" >out_list2 &&
+	! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transferred to the http server' '
+	test_commit one &&
+	hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+	echo "$hash1-4-blob" >expected &&
+	ls "$FILES_DIR" >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+	git cat-file blob "$hash1" &&
+	git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+	(cd other-repo &&
+	 git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+	 test_must_fail git cat-file blob "$hash1" &&
+	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.fetchKind "plainObject" &&
+	 git cat-file blob "$hash1" &&
+	 git pull origin master)
+'
+
+test_expect_success 'local clone from the first repo' '
+	mkdir my-clone &&
+	(cd my-clone &&
+	 git clone .. . &&
+	 git cat-file blob "$hash1")
+'
+
+test_expect_success 'no-local clone from the first repo fails' '
+	mkdir my-other-clone &&
+	(cd my-other-clone &&
+	 test_must_fail git clone --no-local .. .) &&
+	rm -rf my-other-clone
+'
+
+test_expect_success 'no-local clone from the first repo with helper succeeds' '
+	mkdir my-other-clone &&
+	(cd my-other-clone &&
+	 git clone -c odb.magic.command="$HELPER" \
+		-c odb.magic.plainObjects="true" \
+		--no-local .. .) &&
+	rm -rf my-other-clone
+'
+
+test_expect_success 'no-local initial-refspec clone succeeds' '
+	mkdir my-other-clone &&
+	(cd my-other-clone &&
+	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.fetchKind "plainObject" &&
+	 git -c odb.magic.command="$HELPER" \
+		-c odb.magic.plainObjects="true" \
+		clone --no-local --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
+stop_httpd
+
+test_done
diff --git a/t/t0480/read-object-plain-have b/t/t0480/read-object-plain-have
new file mode 100755
index 0000000000..dbed8eaefb
--- /dev/null
+++ b/t/t0480/read-object-plain-have
@@ -0,0 +1,116 @@
+#!/usr/bin/perl
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+use LWP::UserAgent;
+use HTTP::Request::Common;
+
+print STDERR "read-object-plain-have: starting\n";
+
+packet_initialize("git-read-object", 1);
+
+print STDERR "read-object-plain-have: after init\n";
+
+packet_read_and_check_capabilities("get", "put", "have");
+packet_write_capabilities("get", "put", "have");
+
+print STDERR "read-object-plain-have: after reading and writing get, put and have capabilities\n";
+
+my $http_url = $ENV{HTTPD_URL};
+
+while (1) {
+	my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+	print STDERR "read-object-plain-have: command: '$command'\n";
+
+	if ( $command eq "have" ) {
+		# read the flush after the command
+		packet_bin_read();
+
+		my $have_url = $http_url . "/list/";
+		print STDERR "read-object-plain-have: have_url: '$have_url'\n";
+
+		my $userAgent = LWP::UserAgent->new();
+		my $response = $userAgent->get( $have_url );
+
+		if ($response->is_error) {
+		    print STDERR $response->error_as_HTML . "\n";
+		    packet_bin_write("");
+		    packet_flush();
+		    packet_txt_write("status=failure");
+		} else {
+		    print STDERR "content: \n";
+		    print STDERR $response->content;
+		    packet_bin_write($response->content);
+		    packet_flush();
+		    packet_txt_write("status=success");
+		}
+		packet_flush();
+	} elsif ( $command eq "get" ) {
+		my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+		packet_bin_read();
+
+		my $get_url = $http_url . "/list/?sha1=" . $sha1;
+		print STDERR "read-object-plain-have: get_url: '$get_url'\n";
+
+		my $userAgent = LWP::UserAgent->new();
+
+		my $response = $userAgent->get( $get_url );
+
+		if ($response->is_error) {
+		    print STDERR $response->error_as_HTML . "\n";
+		    packet_txt_write("size=0");
+		    packet_txt_write("kind=none");
+		    packet_txt_write("status=notfound");
+		} else {
+		    print STDERR "content: \n";
+		    print STDERR $response->content;
+		    packet_txt_write("size=" . length($response->content));
+		    packet_txt_write("kind=blob");
+		    packet_bin_write($response->content);
+		    packet_flush();
+		    packet_txt_write("status=success");
+		}
+
+		packet_flush();
+	} elsif ( $command eq "put" ) {
+		my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+		print STDERR "read-object-plain-have: put sha1: '$sha1'\n";
+
+		my ($size) = packet_txt_read() =~ /^size=([0-9]+)$/;
+		print STDERR "read-object-plain-have: put size: '$size'\n";
+
+		my ($kind) = packet_txt_read() =~ /^kind=(\w+)$/;
+		print STDERR "read-object-plain-have: put kind: '$kind'\n";
+
+		packet_bin_read();
+
+		# We must read the content we are sent and send it to the right url
+		my ($res, $buf) = packet_bin_read();
+		die "bad packet_bin_read res ($res)" unless ($res eq 0);
+		(packet_bin_read())[0] == 1 || die "bad send end";
+
+		my $upload_url = $http_url . "/upload/?sha1=" . $sha1 . "&size=" . $size . "&type=blob";
+		print STDERR "read-object-plain-have: upload_url: '$upload_url'\n";
+		print STDERR "read-object-plain-have: upload buffer: '$buf'\n";
+
+		my $userAgent = LWP::UserAgent->new();
+		my $request = POST $upload_url, Content_Type => 'multipart/form-data', Content => $buf;
+
+		my $response = $userAgent->request($request);
+
+		if ($response->is_error) {
+			print STDERR $response->error_as_HTML . "\n";
+			packet_txt_write("status=failure");
+		} else {
+			packet_txt_write("status=success");
+		}
+		packet_flush();
+	} else {
+		die "bad command '$command'";
+	}
+}
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [RFC/PATCH v4 41/49] external-odb: add external_odb_do_fetch_object()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (39 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 40/49] Add t0480 to test "have" capability and plain objects Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 42/49] odb-helper: advertise 'have' capability Christian Couder
                   ` (10 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 external-odb.c | 52 ++++++++++++++++++++++++++++++----------------------
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 8c2570b2e7..c39f207dd3 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -95,32 +95,11 @@ const char *external_odb_root(void)
 	return root;
 }
 
-int external_odb_has_object(const unsigned char *sha1)
-{
-	struct odb_helper *o;
-
-	if (!use_external_odb)
-		return 0;
-
-	external_odb_init();
-
-	for (o = helpers; o; o = o->next) {
-		if (!(o->supported_capabilities & ODB_HELPER_CAP_HAVE))
-			return 1;
-		if (odb_helper_has_object(o, sha1))
-			return 1;
-	}
-	return 0;
-}
-
-int external_odb_fetch_object(const unsigned char *sha1)
+static int external_odb_do_fetch_object(const unsigned char *sha1)
 {
 	struct odb_helper *o;
 	const char *path;
 
-	if (!external_odb_has_object(sha1))
-		return -1;
-
 	path = sha1_file_name_alt(external_odb_root(), sha1);
 	safe_create_leading_directories_const(path);
 	prepare_external_alt_odb();
@@ -175,6 +154,35 @@ int external_odb_fault_in_object(const unsigned char *sha1)
 	return -1;
 }
 
+int external_odb_has_object(const unsigned char *sha1)
+{
+	struct odb_helper *o;
+
+	if (!use_external_odb)
+		return 0;
+
+	external_odb_init();
+
+	for (o = helpers; o; o = o->next) {
+		if (!(o->supported_capabilities & ODB_HELPER_CAP_HAVE)) {
+			if (o->fetch_kind == ODB_FETCH_KIND_FAULT_IN)
+				return 1;
+			return !external_odb_do_fetch_object(sha1);
+		}
+		if (odb_helper_has_object(o, sha1))
+			return 1;
+	}
+	return 0;
+}
+
+int external_odb_fetch_object(const unsigned char *sha1)
+{
+	if (!external_odb_has_object(sha1))
+		return -1;
+
+	return external_odb_do_fetch_object(sha1);
+}
+
 int external_odb_for_each_object(each_external_object_fn fn, void *data)
 {
 	struct odb_helper *o;
-- 
2.13.1.565.gbfcd7a9048


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [RFC/PATCH v4 42/49] odb-helper: advertise 'have' capability
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (40 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 41/49] external-odb: add external_odb_do_fetch_object() Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 43/49] odb-helper: advertise 'put' capability Christian Couder
                   ` (9 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 odb-helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/odb-helper.c b/odb-helper.c
index e21113c0b8..2cd1f25e83 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -67,7 +67,7 @@ static int start_read_object_fn(struct subprocess_entry *subprocess)
 	if (err)
 		goto done;
 
-	err = packet_writel(process->in, "capability=get", NULL);
+	err = packet_writel(process->in, "capability=get", "capability=have", NULL);
 	if (err)
 		goto done;
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 43/49] odb-helper: advertise 'put' capability
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (41 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 42/49] odb-helper: advertise 'have' capability Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 44/49] odb-helper: add have_object_process() Christian Couder
                   ` (8 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 odb-helper.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/odb-helper.c b/odb-helper.c
index 2cd1f25e83..2e5d8af526 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -67,7 +67,11 @@ static int start_read_object_fn(struct subprocess_entry *subprocess)
 	if (err)
 		goto done;
 
-	err = packet_writel(process->in, "capability=get", "capability=have", NULL);
+	err = packet_writel(process->in,
+			    "capability=get",
+			    "capability=put",
+			    "capability=have",
+			    NULL);
 	if (err)
 		goto done;
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 44/49] odb-helper: add have_object_process()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (42 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 43/49] odb-helper: advertise 'put' capability Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 45/49] clone: add initial param to write_remote_refs() Christian Couder
                   ` (7 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 odb-helper.c | 103 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 91 insertions(+), 12 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 2e5d8af526..01cd6a713c 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -579,27 +579,106 @@ static int odb_helper_object_cmp(const void *va, const void *vb)
 	return hashcmp(a->sha1, b->sha1);
 }
 
+static int have_object_process(struct odb_helper *o)
+{
+	int err;
+	struct read_object_process *entry;
+	struct child_process *process;
+	struct strbuf status = STRBUF_INIT;
+	const char *cmd = o->cmd;
+	uint64_t start;
+	char *line;
+	int packet_len;
+	int total_got = 0;
+
+	start = getnanotime();
+
+	entry = launch_read_object_process(cmd);
+	process = &entry->subprocess.process;
+	o->supported_capabilities = entry->supported_capabilities;
+
+	if (!(ODB_HELPER_CAP_HAVE & entry->supported_capabilities))
+		return -1;
+
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	err = packet_write_fmt_gently(process->in, "command=have\n");
+	if (err)
+		goto done;
+
+	err = packet_flush_gently(process->in);
+	if (err)
+		goto done;
+
+	for (;;) {
+		/* packet_read() writes a '\0' extra byte at the end */
+		char buf[LARGE_PACKET_DATA_MAX + 1];
+		char *p = buf;
+		int more;
+
+		packet_len = packet_read(process->out, NULL, NULL,
+			buf, LARGE_PACKET_DATA_MAX + 1,
+			PACKET_READ_GENTLE_ON_EOF);
+
+		if (packet_len <= 0)
+			break;
+
+		total_got += packet_len;
+
+		do {
+			char *eol = strchrnul(p, '\n');
+			more = (*eol == '\n');
+			*eol = '\0';
+			if (add_have_entry(o, p))
+				break;
+			p = eol + 1;
+		} while (more);
+	}
+
+	if (packet_len < 0) {
+		err = packet_len;
+		goto done;
+	}
+
+	subprocess_read_status(process->out, &status);
+	err = strcmp(status.buf, "success");
+
+done:
+	sigchain_pop(SIGPIPE);
+
+	err = check_object_process_error(err, status.buf, entry, cmd, ODB_HELPER_CAP_HAVE);
+
+	trace_performance_since(start, "have_object_process");
+
+	return err;
+}
+
 static void odb_helper_load_have(struct odb_helper *o)
 {
-	struct odb_helper_cmd cmd;
-	FILE *fh;
-	struct strbuf line = STRBUF_INIT;
 
 	if (o->have_valid)
 		return;
 	o->have_valid = 1;
 
-	if (odb_helper_start(o, &cmd, 0, "have") < 0)
-		return;
+	if (o->script_mode) {
+		struct odb_helper_cmd cmd;
+		FILE *fh;
+		struct strbuf line = STRBUF_INIT;
 
-	fh = xfdopen(cmd.child.out, "r");
-	while (strbuf_getline(&line, fh) != EOF)
-		if (add_have_entry(o, line.buf))
-			break;
+		if (odb_helper_start(o, &cmd, 0, "have") < 0)
+			return;
 
-	strbuf_release(&line);
-	fclose(fh);
-	odb_helper_finish(o, &cmd);
+		fh = xfdopen(cmd.child.out, "r");
+		while (strbuf_getline(&line, fh) != EOF)
+			if (add_have_entry(o, line.buf))
+				break;
+
+		strbuf_release(&line);
+		fclose(fh);
+		odb_helper_finish(o, &cmd);
+	} else {
+		have_object_process(o);
+	}
 
 	qsort(o->have, o->have_nr, sizeof(*o->have), odb_helper_object_cmp);
 }
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 45/49] clone: add initial param to write_remote_refs()
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (43 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 44/49] odb-helper: add have_object_process() Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 46/49] clone: add --initial-refspec option Christian Couder
                   ` (6 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/clone.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 370a233d22..bd690576e6 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -572,7 +572,7 @@ static struct ref *wanted_peer_refs(const struct ref *refs,
 	return local_refs;
 }
 
-static void write_remote_refs(const struct ref *local_refs)
+static void write_remote_refs(const struct ref *local_refs, int initial)
 {
 	const struct ref *r;
 
@@ -591,8 +591,13 @@ static void write_remote_refs(const struct ref *local_refs)
 			die("%s", err.buf);
 	}
 
-	if (initial_ref_transaction_commit(t, &err))
-		die("%s", err.buf);
+	if (initial) {
+		if (initial_ref_transaction_commit(t, &err))
+			die("%s", err.buf);
+	} else {
+		if (ref_transaction_commit(t, &err))
+			die("%s", err.buf);
+	}
 
 	strbuf_release(&err);
 	ref_transaction_free(t);
@@ -639,7 +644,8 @@ static void update_remote_refs(const struct ref *refs,
 			       const char *branch_top,
 			       const char *msg,
 			       struct transport *transport,
-			       int check_connectivity)
+			       int check_connectivity,
+			       int initial)
 {
 	const struct ref *rm = mapped_refs;
 
@@ -654,7 +660,7 @@ static void update_remote_refs(const struct ref *refs,
 	}
 
 	if (refs) {
-		write_remote_refs(mapped_refs);
+		write_remote_refs(mapped_refs, initial);
 		if (option_single_branch && !option_no_tags)
 			write_followtags(refs, msg);
 	}
@@ -1163,7 +1169,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		transport_fetch_refs(transport, mapped_refs);
 
 	update_remote_refs(refs, mapped_refs, remote_head_points_at,
-			   branch_top.buf, reflog_msg.buf, transport, !is_local);
+			   branch_top.buf, reflog_msg.buf, transport,
+			   !is_local, 0);
 
 	update_head(our_head_points_at, remote_head, reflog_msg.buf);
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 46/49] clone: add --initial-refspec option
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (44 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 45/49] clone: add initial param to write_remote_refs() Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 47/49] clone: disable external odb before initial clone Christian Couder
                   ` (5 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/clone.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index bd690576e6..dda0ad360b 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -55,6 +55,7 @@ static enum transport_family family;
 static struct string_list option_config = STRING_LIST_INIT_NODUP;
 static struct string_list option_required_reference = STRING_LIST_INIT_NODUP;
 static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
+static struct string_list option_initial_refspec = STRING_LIST_INIT_NODUP;
 static int option_dissociate;
 static int max_jobs = -1;
 static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
@@ -105,6 +106,8 @@ static struct option builtin_clone_options[] = {
 			N_("reference repository")),
 	OPT_STRING_LIST(0, "reference-if-able", &option_optional_reference,
 			N_("repo"), N_("reference repository")),
+	OPT_STRING_LIST(0, "initial-refspec", &option_initial_refspec,
+			N_("refspec"), N_("fetch this refspec first")),
 	OPT_BOOL(0, "dissociate", &option_dissociate,
 		 N_("use --reference only while cloning")),
 	OPT_STRING('o', "origin", &option_origin, N_("name"),
@@ -864,6 +867,47 @@ static void dissociate_from_references(void)
 	free(alternates);
 }
 
+static struct refspec *parse_initial_refspecs(void)
+{
+	const char **refspecs;
+	struct refspec *initial_refspecs;
+	struct string_list_item *rs;
+	int i = 0;
+
+	if (!option_initial_refspec.nr)
+		return NULL;
+
+	refspecs = xcalloc(option_initial_refspec.nr, sizeof(const char *));
+
+	for_each_string_list_item(rs, &option_initial_refspec)
+		refspecs[i++] = rs->string;
+
+	initial_refspecs = parse_fetch_refspec(option_initial_refspec.nr, refspecs);
+
+	free(refspecs);
+
+	return initial_refspecs;
+}
+
+static void fetch_initial_refs(struct transport *transport,
+			       const struct ref *refs,
+			       struct refspec *initial_refspecs,
+			       const char *branch_top,
+			       const char *reflog_msg,
+			       int is_local)
+{
+	int i;
+
+	for (i = 0; i < option_initial_refspec.nr; i++) {
+		struct ref *init_refs = NULL;
+		struct ref **tail = &init_refs;
+		get_fetch_map(refs, &initial_refspecs[i], &tail, 0);
+		transport_fetch_refs(transport, init_refs);
+		update_remote_refs(refs, init_refs, NULL, branch_top, reflog_msg,
+				   transport, !is_local, 1);
+	}
+}
+
 int cmd_clone(int argc, const char **argv, const char *prefix)
 {
 	int is_bundle = 0, is_local;
@@ -887,6 +931,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	struct refspec *refspec;
 	const char *fetch_pattern;
 
+	struct refspec *initial_refspecs;
+	int is_initial;
+
 	packet_trace_identity("clone");
 	argc = parse_options(argc, argv, prefix, builtin_clone_options,
 			     builtin_clone_usage, 0);
@@ -1054,6 +1101,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (option_required_reference.nr || option_optional_reference.nr)
 		setup_reference();
 
+	initial_refspecs = parse_initial_refspecs();
+
 	fetch_pattern = xstrfmt("+%s*:%s*", src_ref_prefix, branch_top.buf);
 	refspec = parse_fetch_refspec(1, &fetch_pattern);
 	free((char *)fetch_pattern);
@@ -1109,6 +1158,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	refs = transport_get_remote_refs(transport);
 
 	if (refs) {
+		fetch_initial_refs(transport, refs, initial_refspecs,
+				   branch_top.buf, reflog_msg.buf, is_local);
+
 		mapped_refs = wanted_peer_refs(refs, refspec);
 		/*
 		 * transport_get_remote_refs() may return refs with null sha-1
@@ -1168,9 +1220,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	else if (refs && complete_refs_before_fetch)
 		transport_fetch_refs(transport, mapped_refs);
 
+	is_initial = !refs || option_initial_refspec.nr == 0;
 	update_remote_refs(refs, mapped_refs, remote_head_points_at,
 			   branch_top.buf, reflog_msg.buf, transport,
-			   !is_local, 0);
+			   !is_local, is_initial);
 
 	update_head(our_head_points_at, remote_head, reflog_msg.buf);
 
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 47/49] clone: disable external odb before initial clone
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (45 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 46/49] clone: add --initial-refspec option Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 48/49] Add test for 'clone --initial-refspec' Christian Couder
                   ` (4 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/clone.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index dda0ad360b..a0d7b2bd2f 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -933,6 +933,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 	struct refspec *initial_refspecs;
 	int is_initial;
+	int saved_use_external_odb;
 
 	packet_trace_identity("clone");
 	argc = parse_options(argc, argv, prefix, builtin_clone_options,
@@ -1078,6 +1079,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 
+	/* Temporarily disable external ODB before initial clone */
+	saved_use_external_odb = use_external_odb;
+	use_external_odb = 0;
+
 	if (option_bare) {
 		if (option_mirror)
 			src_ref_prefix = "refs/";
@@ -1161,6 +1166,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		fetch_initial_refs(transport, refs, initial_refspecs,
 				   branch_top.buf, reflog_msg.buf, is_local);
 
+		use_external_odb = saved_use_external_odb;
+
 		mapped_refs = wanted_peer_refs(refs, refspec);
 		/*
 		 * transport_get_remote_refs() may return refs with null sha-1
@@ -1202,6 +1209,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 					option_branch, option_origin);
 
 		warning(_("You appear to have cloned an empty repository."));
+
+		use_external_odb = saved_use_external_odb;
+
 		mapped_refs = NULL;
 		our_head_points_at = NULL;
 		remote_head_points_at = NULL;
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 48/49] Add test for 'clone --initial-refspec'
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (46 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 47/49] clone: disable external odb before initial clone Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20  7:55 ` [RFC/PATCH v4 49/49] t: add t0430 to test cloning using bundles Christian Couder
                   ` (3 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t5616-clone-initial-refspec.sh | 48 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100755 t/t5616-clone-initial-refspec.sh

diff --git a/t/t5616-clone-initial-refspec.sh b/t/t5616-clone-initial-refspec.sh
new file mode 100755
index 0000000000..ccbc27f83f
--- /dev/null
+++ b/t/t5616-clone-initial-refspec.sh
@@ -0,0 +1,48 @@
+#!/bin/sh
+
+test_description='test clone with --initial-refspec option'
+. ./test-lib.sh
+
+
+test_expect_success 'setup regular repo' '
+	# Make two branches, "master" and "side"
+	echo one >file &&
+	git add file &&
+	git commit -m one &&
+	echo two >file &&
+	git commit -a -m two &&
+	git tag two &&
+	echo three >file &&
+	git commit -a -m three &&
+	git checkout -b side &&
+	echo four >file &&
+	git commit -a -m four &&
+	git checkout master
+'
+
+test_expect_success 'add a special ref pointing to a blob' '
+	hash=$(echo "Hello world!" | git hash-object -w -t blob --stdin) &&
+	git update-ref refs/special/hello "$hash"
+'
+
+test_expect_success 'no-local clone from the first repo' '
+	mkdir my-clone &&
+	(cd my-clone &&
+	 git clone --no-local .. . &&
+	 test_must_fail git cat-file blob "$hash") &&
+	rm -rf my-clone
+'
+
+test_expect_success 'no-local clone with --initial-refspec' '
+	mkdir my-clone &&
+	(cd my-clone &&
+	 git clone --no-local --initial-refspec "refs/special/*:refs/special/*" .. . &&
+	 git cat-file blob "$hash" &&
+	 git rev-parse refs/special/hello >actual &&
+	 echo "$hash" >expected &&
+	 test_cmp expected actual) &&
+	rm -rf my-clone
+'
+
+test_done
+
-- 
2.13.1.565.gbfcd7a9048



* [RFC/PATCH v4 49/49] t: add t0430 to test cloning using bundles
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (47 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 48/49] Add test for 'clone --initial-refspec' Christian Couder
@ 2017-06-20  7:55 ` Christian Couder
  2017-06-20 13:48 ` [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (2 subsequent siblings)
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20  7:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0430-clone-bundle-e-odb.sh | 91 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 91 insertions(+)
 create mode 100755 t/t0430-clone-bundle-e-odb.sh

diff --git a/t/t0430-clone-bundle-e-odb.sh b/t/t0430-clone-bundle-e-odb.sh
new file mode 100755
index 0000000000..8934bea006
--- /dev/null
+++ b/t/t0430-clone-bundle-e-odb.sh
@@ -0,0 +1,91 @@
+#!/bin/sh
+
+test_description='tests for cloning using a bundle through e-odb'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+# odb helper script must see this
+export HTTPD_URL
+
+write_script odb-clone-bundle-helper <<\EOF
+die() {
+	printf >&2 "%s\n" "$@"
+	exit 1
+}
+echo >&2 "odb-clone-bundle-helper args:" "$@"
+case "$1" in
+get_cap)
+	echo "capability=get"
+	echo "capability=have"
+	;;
+have)
+	ref_hash=$(git rev-parse refs/odbs/magic/bundle) ||
+	die "couldn't find refs/odbs/magic/bundle"
+	GIT_NO_EXTERNAL_ODB=1 git cat-file blob "$ref_hash" >bundle_info ||
+	die "couldn't get blob $ref_hash"
+	bundle_url=$(sed -e 's/bundle url: //' bundle_info)
+	echo >&2 "bundle_url: '$bundle_url'"
+	curl "$bundle_url" -o bundle_file ||
+	die "curl '$bundle_url' failed"
+	GIT_NO_EXTERNAL_ODB=1 git bundle unbundle bundle_file >unbundling_info ||
+	die "unbundling 'bundle_file' failed"
+	;;
+get)
+	die "odb-clone-bundle-helper 'get' called"
+	;;
+put)
+	die "odb-clone-bundle-helper 'put' called"
+	;;
+*)
+	die "unknown command '$1'"
+	;;
+esac
+EOF
+HELPER="\"$PWD\"/odb-clone-bundle-helper"
+
+
+test_expect_success 'setup repo with a few commits' '
+	test_commit one &&
+	test_commit two &&
+	test_commit three &&
+	test_commit four
+'
+
+BUNDLE_FILE="file.bundle"
+FILES_DIR="httpd/www/files"
+GET_URL="$HTTPD_URL/files/$BUNDLE_FILE"
+
+test_expect_success 'create a bundle for this repo and check that it can be downloaded' '
+	git bundle create "$BUNDLE_FILE" master &&
+	mkdir "$FILES_DIR" &&
+	cp "$BUNDLE_FILE" "$FILES_DIR/" &&
+	curl "$GET_URL" --output actual &&
+	test_cmp "$BUNDLE_FILE" actual
+'
+
+test_expect_success 'create an e-odb ref for this bundle' '
+	ref_hash=$(echo "bundle url: $GET_URL" | GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) &&
+	git update-ref refs/odbs/magic/bundle "$ref_hash"
+'
+
+test_expect_success 'clone using the e-odb helper to download and install the bundle' '
+	mkdir my-clone &&
+	(cd my-clone &&
+	 git clone --no-local \
+		-c odb.magic.command="$HELPER" \
+		-c odb.magic.fetchKind="faultin" \
+		-c odb.magic.scriptMode="true" \
+		--initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
+stop_httpd
+
+test_done
-- 
2.13.1.565.gbfcd7a9048



* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (48 preceding siblings ...)
  2017-06-20  7:55 ` [RFC/PATCH v4 49/49] t: add t0430 to test cloning using bundles Christian Couder
@ 2017-06-20 13:48 ` Christian Couder
  2017-06-23 18:24 ` Ben Peart
  2017-07-12 19:06 ` Jonathan Tan
  51 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-06-20 13:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

On Tue, Jun 20, 2017 at 9:54 AM, Christian Couder
<christian.couder@gmail.com> wrote:
>
> Future work
> ~~~~~~~~~~~
>
> First sorry about the state of this patch series, it is not as clean
> as I would have liked, but I think it is interesting to get feedback
> from the mailing list at this point, because the previous RFC was sent
> a long time ago and a lot of things changed.
>
> So a big part of the future work will be about cleaning this patch series.
>
> Other things I think I am going to do:
>
>   -

Oops, I had not saved the Emacs buffer where I wrote this when I sent
the patch series.

This should have been:

Other things I think I may work on:

  - Remove the "odb.<odbname>.scriptMode" and "odb.<odbname>.command"
    options and instead have just "odb.<odbname>.scriptCommand" and
    "odb.<odbname>.subprocessCommand".

  - Use capabilities instead of "odb.<odbname>.fetchKind" to decide
    which kind of "get" will be used.

  - Better test all the combinations of the above modes with and
    without "have" and "put" instructions.

  - Maybe also have different kinds of "put" so that Git could pass
    either a git object or a plain object, or ask the helper to
    retrieve it directly from Git's object database.

  - Maybe add an "init" instruction as the script mode has something
    like this called "get_cap" and it would help the sub-process mode
    too, as it makes it possible for Git to know the capabilities
    before trying to send any instruction (that might not be supported
    by the helper). The "init" instruction would be the only required
    instruction for any helper to implement.

  - Add more long running tests and improve tests in general.
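
To make the "init" idea above a bit more concrete, here is a minimal
sketch in C of how the capability lines returned by a helper could be
folded into a bitmask before any other instruction is sent. The flag
values and the function name parse_capability_lines() are assumptions
made up for this illustration; only the "capability=get",
"capability=put" and "capability=have" strings come from the series.

```c
#include <string.h>

/* Hypothetical flag values; the series has its own ODB_HELPER_CAP_* bits. */
#define CAP_GET  (1u << 0)
#define CAP_PUT  (1u << 1)
#define CAP_HAVE (1u << 2)

/*
 * Fold an array of "capability=<name>" lines into a bitmask.
 * Lines without the prefix and unknown capability names are ignored.
 */
unsigned int parse_capability_lines(const char **lines, size_t nr)
{
	static const char prefix[] = "capability=";
	unsigned int caps = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		const char *name;

		if (strncmp(lines[i], prefix, sizeof(prefix) - 1))
			continue; /* not a capability line */
		name = lines[i] + sizeof(prefix) - 1;

		if (!strcmp(name, "get"))
			caps |= CAP_GET;
		else if (!strcmp(name, "put"))
			caps |= CAP_PUT;
		else if (!strcmp(name, "have"))
			caps |= CAP_HAVE;
	}
	return caps;
}
```

With such a mask available up front, Git could skip sending "have" or
"put" to a helper that never advertised them.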


* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (49 preceding siblings ...)
  2017-06-20 13:48 ` [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
@ 2017-06-23 18:24 ` Ben Peart
  2017-07-01 19:41   ` Christian Couder
  2017-07-12 19:06 ` Jonathan Tan
  51 siblings, 1 reply; 64+ messages in thread
From: Ben Peart @ 2017-06-23 18:24 UTC (permalink / raw)
  To: Christian Couder, git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder



On 6/20/2017 3:54 AM, Christian Couder wrote:
> Goal
> ~~~~
> 
> Git can store its objects only in the form of loose objects in
> separate files or packed objects in a pack file.
> 
> To be able to better handle some kind of objects, for example big
> blobs, it would be nice if Git could store its objects in other object
> databases (ODB).
> 
> To do that, this patch series makes it possible to register commands,
> also called "helpers", using "odb.<odbname>.command" config variables,
> to access external ODBs where objects can be stored and retrieved.
> 
> External ODBs should be able to transfer information about the blobs
> they store. This patch series shows how this is possible using a kind
> of replace refs.
> 

Great to see this making progress!

My thoughts and questions are mostly about the overall design tradeoffs.

Is your intention to enable the ODB to completely replace the regular 
object store or just to supplement it?  I think it would be good to 
ensure the interface is robust and performant enough to actually replace 
the current object store interface (even if we don't actually do that 
just yet).

Another way of asking this is: do the 3 verbs (have, get, put) and the 3 
types of "get" enable you to wrap the current loose object and pack file 
code as ODBs and run completely via the external ODB interface?  If not, 
what is missing and can it be added?

_Eventually_ it would be great to see the current object store(s) moved 
behind the new ODB interface.

When there are multiple ODB providers, in what order are they called?
If one fails a request (get, have, put), are the others called to see
if they can fulfill the request?

Can the order they are called for the various verbs be configured
explicitly?  For example, it would be nice to have a "large object ODB
handler" configured to get first try at all "put" verbs.  Then if the
object meets its size requirements, it will handle the verb, otherwise
it fails and git will try the other ODBs.
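
As a rough illustration of what I mean by an ordered "put", here is a
minimal sketch in C.  Everything in it is hypothetical (struct
odb_handler, odb_put_in_order() and the two example policies are not
part of the series); it only shows the "first handler that accepts
wins" dispatch.

```c
#include <stddef.h>

/* Hypothetical handler: put() returns 0 if it stored the object, -1 to decline. */
struct odb_handler {
	const char *name;
	int (*put)(const void *buf, size_t len);
};

/*
 * Offer the object to each handler in configured order and stop at the
 * first one that accepts it. Returns the accepting handler, or NULL if
 * every handler declined.
 */
const struct odb_handler *odb_put_in_order(const struct odb_handler *handlers,
					   size_t nr,
					   const void *buf, size_t len)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (handlers[i].put && !handlers[i].put(buf, len))
			return &handlers[i];
	return NULL;
}

/* Example policy: only take objects of at least 1 MiB. */
int large_object_put(const void *buf, size_t len)
{
	(void)buf;
	return len >= 1024 * 1024 ? 0 : -1;
}

/* Example fallback: accept everything (stands in for the regular store). */
int default_put(const void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return 0;
}
```

Listing the large-object handler first and the default one last gives
the behavior described above: small objects fall through to the
default store.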


> Design
> ~~~~~~
> 
> * The "helpers" (registered commands)
> 
> Each helper manages access to one external ODB.
> 
> There are now 2 different modes for helper:
> 
>    - When "odb.<odbname>.scriptMode" is set to "true", the helper is
>      launched each time Git wants to communicate with the <odbname>
>      external ODB.
> 
>    - When "odb.<odbname>.scriptMode" is not set or set to "false", then
>      the helper is launched once as a sub-process (using
>      sub-process.h), and Git communicates with it using packet lines.
> 

Is it worth supporting two different modes long term?  It seems that
this could be simplified (less code to write, debug, document, support)
by only supporting the second one, which uses the sub-process.  As far
as I can tell, the capabilities are the same; the second one is just
faster when multiple calls are made.
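
For reference, the packet-line framing used by the sub-process mode is
simple: each packet starts with its total length (payload plus the
4-byte header) written as four hexadecimal digits, and "0000" is the
flush packet.  A minimal sketch in C, where format_pkt_line() is a
hypothetical name and not Git's own pkt-line API:

```c
#include <stdio.h>
#include <string.h>

/*
 * Format one pkt-line into "out": a 4-hex-digit length (payload plus
 * the 4-byte header itself) followed by the payload.
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if the packet does not fit.
 * format_pkt_line() is a hypothetical name, not Git's own API.
 */
int format_pkt_line(char *out, size_t outsz, const char *payload)
{
	size_t len = strlen(payload) + 4;

	/* Four hex digits cannot express more than 0xffff; Git uses a lower cap. */
	if (len > 0xffff || len + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04x%s", (unsigned int)len, payload);
	return (int)len;
}
```

For example, the "command=have\n" line sent in patch 44/49 is 13 bytes
long, so it goes on the wire as "0011command=have\n".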

> A helper can be given different instructions by Git. The instructions
> that are supported are negotiated at the beginning of the
> communication using a capability mechanism.
> 
> For now the following instructions are supported:
> 
>    - "have": the helper should respond with the sha1, size and type of
>      all the objects the external ODB contains, one object per line.
> 
>    - "get <sha1>": the helper should then read from the external ODB
>      the content of the object corresponding to <sha1> and pass it to Git.
> 
>    - "put <sha1> <size> <type>": the helper should then read from
>      Git an object and store it in the external ODB.
> 
> Currently "have" and "put" are optional.

It's good the various verbs can be optional.  That way any particular 
ODB only has to handle those it needs to provide a different behavior for.

> 
> There are 3 different kinds of "get" instructions depending on how the
> helper passes objects to Git:
> 
>    - "fault_in": the helper will write the requested objects directly
>      into the regular Git object database, and then Git will retry
>      reading it from there.
> 

I think the "fault_in" behavior can be implemented efficiently without 
the overhead of a 3rd special "get" instruction if we enable some of the 
other capabilities discussed.

For example, assume an ODB is set up to handle missing objects (by
registering itself as "last" in the prioritized list of ODB handlers). 
If it is ever asked to retrieve a missing object, it can retrieve the 
object and return it as a "git_object" or "plain_object" and also cache 
it locally as a loose object, pack file, or any other ODB handler 
supported mechanism.  Future requests will then provide that object via 
the locally cached copy and its associated ODB handler.
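
A minimal sketch in C of that "fetch once, then serve locally" flow is
below.  All the names (struct local_cache, read_with_fault_in(),
example_remote_fetch()) are made up for illustration, and the tiny
in-memory table only stands in for the local object store.

```c
#include <string.h>

#define CACHE_MAX 16

/* Tiny in-memory stand-in for the local object store (hypothetical). */
struct local_cache {
	const char *ids[CACHE_MAX];
	const char *data[CACHE_MAX];
	int nr;
};

const char *cache_lookup(const struct local_cache *c, const char *id)
{
	int i;

	for (i = 0; i < c->nr; i++)
		if (!strcmp(c->ids[i], id))
			return c->data[i];
	return NULL;
}

/*
 * Read an object: try the local cache first; on a miss, ask the
 * last-resort "fetch" callback and remember what it returned, so
 * the next read is served locally. Pointers are stored as-is; the
 * caller keeps ownership of the strings.
 */
const char *read_with_fault_in(struct local_cache *c, const char *id,
			       const char *(*fetch)(const char *id))
{
	const char *data = cache_lookup(c, id);

	if (data)
		return data;
	data = fetch(id);
	if (data && c->nr < CACHE_MAX) {
		c->ids[c->nr] = id;
		c->data[c->nr] = data;
		c->nr++;
	}
	return data;
}

/* Example remote: counts how often it is asked, to show the caching. */
int remote_calls;

const char *example_remote_fetch(const char *id)
{
	remote_calls++;
	return strcmp(id, "missing") ? "object-content" : NULL;
}
```

Reading the same object twice only reaches the remote once, which is
the effect "fault_in" is after, without a dedicated kind of "get".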

>    - "git_object": the helper will send the object as a Git object.
> 
>    - "plain_object": the helper will send the object (a blob) as a raw
>      object. (The blob content will be sent as is.)
> 
> For now the kind of "get" that is supported is read from the
> "odb.<odbname>.fetchKind" configuration variable, but in the future it
> should be decided as part of the capability negotiation.
> 

I agree it makes sense to move this into the capability negotiation but 
I also wonder if we really need to support both.  Is there a reason we 
can't just choose one and force all ODBs to support it?

> * Transferring information

This whole section on "odb ref" feels out of place to me.  Below you 
state it is optional in which case I think it should be discussed in the 
patches that implement the tests that use it rather than here.  It seems 
to be a test ODB specific implementation detail.

> 
> To transfer information about the blobs stored in external ODB, some
> special refs, called "odb ref", similar to replace refs, are used in
> the tests of this series, but in general nothing forces the helper to
> use that mechanism.
> 
> The external odb helper is responsible for using and creating the refs
> in refs/odbs/<odbname>/, if it wants to do that. It is free for example
> to just create one ref, as it is also free to create many refs. Git
> would just transmit the refs that have been created by this helper, if
> Git is asked to do so.
> 
> For now in the tests there is one odb ref per blob, as it is simple
> and as it is similar to what git-lfs does. Each ref name is
> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
> in the external odb named <odbname>.
> 
> These odb refs point to a blob that is stored in the Git
> repository and contain information about the blob stored in the
> external odb. This information can be specific to the external odb.
> The repos can then share this information using commands like:
> 
> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
> 
> At the end of the current patch series, "git clone" is teached a
> "--initial-refspec" option, that asks it to first fetch some specified
> refs. This is used in the tests to fetch the odb refs first.
> 
> This way only one "git clone" command can setup a repo using the
> external ODB mechanism as long as the right helper is installed on the
> machine and as long as the following options are used:
> 
>    - "--initial-refspec <odbrefspec>" to fetch the odb refspec
>    - "-c odb.<odbname>.command=<helper>" to configure the helper
> 
> There is also a test script that shows that the "--initial-refspec"
> option along with the external ODB mechanism can be used to implement
> cloning using bundles.

The fact that "git clone" is taught a "--initial-refspec" option 
indicates this isn't just an ODB implementation detail.  Is there a 
general capability that is missing from the ODB interface that needs to 
be addressed here?

I don't believe there is.  Instead, I think we should allow the various 
"partial clone" patch series already in progress to solve the problem of 
how you do a partial clone of a repo.

[1] https://public-inbox.org/git/1488994685-37403-1-git-send-email-jeffhost@microsoft.com/
[2] https://public-inbox.org/git/20170309073117.g3br5btsfwntcdpe@sigill.intra.peff.net/
[3] https://public-inbox.org/git/cover.1496361873.git.jonathantanmy@google.com/
[4] https://public-inbox.org/git/20170602232508.GA21733@aiede.mtv.corp.google.com/ 


> 
> * External object database
> 
> This RFC patch series shows in the tests:
> 
>    - how to use another git repository as an external ODB (storing Git objects)
>    - how to use an http server as an external ODB (storing plain objects)
> 
> (This works in both script mode and sub-process mode.)
> 
> * Performance
> 
> So the sub-process mode, which is now the default, has been
> implemented in this new version of this patch series.
> 
> This has been implemented using the refactoring that Ben Peart did on
> top of Lars Schneider's work on using sub-processes and packet lines
> in the smudge/clean filters for git-lfs. This also uses further work
> from Ben Peart called "read object process".
> 
> See:
> 
> http://public-inbox.org/git/20170113155253.1644-1-benpeart@microsoft.com/
> http://public-inbox.org/git/20170322165220.5660-1-benpeart@microsoft.com/
> 
> Thanks to this, the external ODB mechanism should in the end perform
> as well as the git-lfs mechanism when many objects should be
> transfered.
> 
> Implementation
> ~~~~~~~~~~~~~~
> 
> * Mechanism to call the registered commands
> 
> This series adds a set of function in external-odb.{c,h} that are
> called by the rest of Git to manage all the external ODBs.
> 
> These functions use 'struct odb_helper' and its associated functions
> defined in odb-helper.{c,h} to talk to the different external ODBs by
> launching the configured "odb.<odbname>.command" commands and writing
> to or reading from them.
> 
> * ODB refs
> 
> For now odb ref management is only implemented in a helper in t0410.
> 
> When a new blob is added to an external odb, its sha1, size and type
> are writen in another new blob and the odb ref is created.
> 
> When the list of existing blobs is requested from the external odb,
> the content of the blobs pointed to by the odb refs can also be used
> by the odb to claim that it can get the objects.
> 
> When a blob is actually requested from the external odb, it can use
> the content stored in the blobs pointed to by the odb refs to get the
> actual blobs and then pass them.
> 
> Highlevel view of the patches in the series
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
>      - Patch 1/49 is a small code cleanup that I already sent to the
>        mailing list but will probably be removed in the end bdue to
>        ongoing work on "git clone"
> 
>      - Patches 02/49 to 08/49 create a Git/Packet.pm module by
>        refactoring "t0021/rot13-filter.pl". Functions from this new
>        module will be used later in test scripts.

Nice!

> 
>      - Patches 09/49 to 16/49 create the external ODB insfrastructure
>        in external-odb.{c,h} and odb-helper.{c,h} for the script mode.
> 
>      - Patches 17/49 to 23/49 improve lib-http to make it possible to
>        use it as an external ODB to test storing blobs in an HTTP
>        server.
> 
>      - Patches 24/49 to 44/49 improve the external ODB insfrastructure
>        to support sub-processes and make everything work using them.
> 

I understand why it is this way historically but it seems these should 
just be combined with patches 9-16. Instead of writing the new odb 
specific routines and then patching them to support sub-process, just 
write them that way the "first" time.

>      - Patches 45/49 to 49/49 add the --initial-refspec to git clone
>        along with tests.
> 

I'm hopeful the changes to git clone to add the --initial-refspec can 
eventually be dropped. It seems we should be able to test the ODB 
capabilities without having to add options to git clone.

> Future work
> ~~~~~~~~~~~
> 
> First sorry about the state of this patch series, it is not as clean
> as I would have liked, butI think it is interesting to get feedback
> from the mailing list at this point, because the previous RFC was sent
> a long time ago and a lot of things changed.
> 
> So a big part of the future work will be about cleaning this patch series.
> 
> Other things I think I am going to do:
> 
>    -
> 
> Previous work and discussions
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> (Sorry for the old Gmane links, I will try to replace them with
> public-inbox.org at one point.)
> 
> Peff started to work on this and discuss this some years ago:
> 
> http://thread.gmane.org/gmane.comp.version-control.git/206886/focus=207040
> http://thread.gmane.org/gmane.comp.version-control.git/247171
> http://thread.gmane.org/gmane.comp.version-control.git/202902/focus=203020
> 
> His work, which is not compile-tested any more, is still there:
> 
> https://github.com/peff/git/commits/jk/external-odb-wip
> 
> Initial discussions about this new series are there:
> 
> http://thread.gmane.org/gmane.comp.version-control.git/288151/focus=295160
> 
> Version 1, 2 and 3 of this RFC/PATCH series are here:
> 
> https://public-inbox.org/git/20160613085546.11784-1-chriscool@tuxfamily.org/
> https://public-inbox.org/git/20160628181933.24620-1-chriscool@tuxfamily.org/
> https://public-inbox.org/git/20161130210420.15982-1-chriscool@tuxfamily.org/
> 
> Some of the discussions related to Ben Peart's work that is used by
> this series are here:
> 
> http://public-inbox.org/git/20170113155253.1644-1-benpeart@microsoft.com/
> http://public-inbox.org/git/20170322165220.5660-1-benpeart@microsoft.com/
> 
> Links
> ~~~~~
> 
> This patch series is available here:
> 
> https://github.com/chriscool/git/commits/external-odb
> 
> Version 1, 2 and 3 are here:
> 
> https://github.com/chriscool/git/commits/gl-external-odb12
> https://github.com/chriscool/git/commits/gl-external-odb22
> https://github.com/chriscool/git/commits/gl-external-odb61
> 
> 
> Ben Peart (4):
>    Documentation: add read-object-protocol.txt
>    contrib: add long-running-read-object/example.pl
>    Add t0410 to test read object mechanism
>    odb-helper: add read_object_process()
> 
> Christian Couder (43):
>    builtin/clone: get rid of 'value' strbuf
>    t0021/rot13-filter: refactor packet reading functions
>    t0021/rot13-filter: improve 'if .. elsif .. else' style
>    Add Git/Packet.pm from parts of t0021/rot13-filter.pl
>    t0021/rot13-filter: use Git/Packet.pm
>    Git/Packet.pm: improve error message
>    Git/Packet.pm: add packet_initialize()
>    Git/Packet: add capability functions
>    t0400: add 'put' command to odb-helper script
>    external odb: add write support
>    external-odb: accept only blobs for now
>    t0400: add test for external odb write support
>    Add GIT_NO_EXTERNAL_ODB env variable
>    Add t0410 to test external ODB transfer
>    lib-httpd: pass config file to start_httpd()
>    lib-httpd: add upload.sh
>    lib-httpd: add list.sh
>    lib-httpd: add apache-e-odb.conf
>    odb-helper: add 'store_plain_objects' to 'struct odb_helper'
>    pack-objects: don't pack objects in external odbs
>    t0420: add test with HTTP external odb
>    odb-helper: start fault in implementation
>    external-odb: add external_odb_fault_in_object()
>    odb-helper: add script_mode
>    external-odb: add external_odb_get_capabilities()
>    t04*: add 'get_cap' support to helpers
>    odb-helper: call odb_helper_lookup() with 'have' capability
>    odb-helper: fix odb_helper_fetch_object() for read_object
>    Add t0460 to test passing git objects
>    odb-helper: add read_packetized_git_object_to_fd()
>    odb-helper: add read_packetized_plain_object_to_fd()
>    Add t0470 to test passing plain objects
>    odb-helper: add write_object_process()
>    Add t0480 to test "have" capability and plain objects
>    external-odb: add external_odb_do_fetch_object()
>    odb-helper: advertise 'have' capability
>    odb-helper: advertise 'put' capability
>    odb-helper: add have_object_process()
>    clone: add initial param to write_remote_refs()
>    clone: add --initial-refspec option
>    clone: disable external odb before initial clone
>    Add test for 'clone --initial-refspec'
>    t: add t0430 to test cloning using bundles
> 
> Jeff King (2):
>    Add initial external odb support
>    external odb foreach
> 
>   Documentation/technical/read-object-protocol.txt | 102 +++
>   Makefile                                         |   2 +
>   builtin/clone.c                                  |  91 ++-
>   builtin/pack-objects.c                           |   4 +
>   cache.h                                          |  18 +
>   contrib/long-running-read-object/example.pl      | 114 +++
>   environment.c                                    |   4 +
>   external-odb.c                                   | 220 +++++
>   external-odb.h                                   |  17 +
>   odb-helper.c                                     | 987 +++++++++++++++++++++++
>   odb-helper.h                                     |  47 ++
>   perl/Git/Packet.pm                               | 118 +++
>   sha1_file.c                                      | 117 ++-
>   t/lib-httpd.sh                                   |   8 +-
>   t/lib-httpd/apache-e-odb.conf                    | 214 +++++
>   t/lib-httpd/list.sh                              |  41 +
>   t/lib-httpd/upload.sh                            |  45 ++
>   t/t0021/rot13-filter.pl                          |  97 +--
>   t/t0400-external-odb.sh                          |  85 ++
>   t/t0410-transfer-e-odb.sh                        | 148 ++++
>   t/t0420-transfer-http-e-odb.sh                   | 159 ++++
>   t/t0430-clone-bundle-e-odb.sh                    |  91 +++
>   t/t0450-read-object.sh                           |  30 +
>   t/t0450/read-object                              |  56 ++
>   t/t0460-read-object-git.sh                       |  29 +
>   t/t0460/read-object-git                          |  67 ++
>   t/t0470-read-object-http-e-odb.sh                | 123 +++
>   t/t0470/read-object-plain                        |  93 +++
>   t/t0480-read-object-have-http-e-odb.sh           | 123 +++
>   t/t0480/read-object-plain-have                   | 116 +++
>   t/t5616-clone-initial-refspec.sh                 |  48 ++
>   31 files changed, 3296 insertions(+), 118 deletions(-)
>   create mode 100644 Documentation/technical/read-object-protocol.txt
>   create mode 100644 contrib/long-running-read-object/example.pl
>   create mode 100644 external-odb.c
>   create mode 100644 external-odb.h
>   create mode 100644 odb-helper.c
>   create mode 100644 odb-helper.h
>   create mode 100644 perl/Git/Packet.pm
>   create mode 100644 t/lib-httpd/apache-e-odb.conf
>   create mode 100644 t/lib-httpd/list.sh
>   create mode 100644 t/lib-httpd/upload.sh
>   create mode 100755 t/t0400-external-odb.sh
>   create mode 100755 t/t0410-transfer-e-odb.sh
>   create mode 100755 t/t0420-transfer-http-e-odb.sh
>   create mode 100755 t/t0430-clone-bundle-e-odb.sh
>   create mode 100755 t/t0450-read-object.sh
>   create mode 100755 t/t0450/read-object
>   create mode 100755 t/t0460-read-object-git.sh
>   create mode 100755 t/t0460/read-object-git
>   create mode 100755 t/t0470-read-object-http-e-odb.sh
>   create mode 100755 t/t0470/read-object-plain
>   create mode 100755 t/t0480-read-object-have-http-e-odb.sh
>   create mode 100755 t/t0480/read-object-plain-have
>   create mode 100755 t/t5616-clone-initial-refspec.sh
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [RFC/PATCH v4 07/49] Git/Packet.pm: add packet_initialize()
  2017-06-20  7:54 ` [RFC/PATCH v4 07/49] Git/Packet.pm: add packet_initialize() Christian Couder
@ 2017-06-23 18:55   ` Ben Peart
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Peart @ 2017-06-23 18:55 UTC (permalink / raw)
  To: Christian Couder, git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

I like where this ends up, but it seems to me that patches 6, 7 and 8 
should just get merged into patches 4 and 5.

On 6/20/2017 3:54 AM, Christian Couder wrote:
> Add a function to initialize the communication. And use this
> function in 't/t0021/rot13-filter.pl'.
> 
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>   perl/Git/Packet.pm      | 13 +++++++++++++
>   t/t0021/rot13-filter.pl |  8 +-------
>   2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
> index 2ad6b00d6c..b0233caf37 100644
> --- a/perl/Git/Packet.pm
> +++ b/perl/Git/Packet.pm
> @@ -19,6 +19,7 @@ our @EXPORT = qw(
>   			packet_bin_write
>   			packet_txt_write
>   			packet_flush
> +			packet_initialize
>   		);
>   our @EXPORT_OK = @EXPORT;
>   
> @@ -70,3 +71,15 @@ sub packet_flush {
>   	print STDOUT sprintf( "%04x", 0 );
>   	STDOUT->flush();
>   }
> +
> +sub packet_initialize {
> +	my ($name, $version) = @_;
> +
> +	( packet_txt_read() eq ( 0, $name . "-client" ) )	|| die "bad initialize";
> +	( packet_txt_read() eq ( 0, "version=" . $version ) )	|| die "bad version";
> +	( packet_bin_read() eq ( 1, "" ) )			|| die "bad version end";
> +
> +	packet_txt_write( $name . "-server" );
> +	packet_txt_write( "version=" . $version );
> +	packet_flush();
> +}
> diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
> index 36a9eb3608..5b05518640 100644
> --- a/t/t0021/rot13-filter.pl
> +++ b/t/t0021/rot13-filter.pl
> @@ -40,13 +40,7 @@ sub rot13 {
>   print $debug "START\n";
>   $debug->flush();
>   
> -( packet_txt_read() eq ( 0, "git-filter-client" ) ) || die "bad initialize";
> -( packet_txt_read() eq ( 0, "version=2" ) )         || die "bad version";
> -( packet_bin_read() eq ( 1, "" ) )                  || die "bad version end";
> -
> -packet_txt_write("git-filter-server");
> -packet_txt_write("version=2");
> -packet_flush();
> +packet_initialize("git-filter", 2);
>   
>   ( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
>   ( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [RFC/PATCH v4 09/49] Add initial external odb support
  2017-06-20  7:54 ` [RFC/PATCH v4 09/49] Add initial external odb support Christian Couder
@ 2017-06-23 19:49   ` Ben Peart
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Peart @ 2017-06-23 19:49 UTC (permalink / raw)
  To: Christian Couder, git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder



On 6/20/2017 3:54 AM, Christian Couder wrote:
> From: Jeff King <peff@peff.net>
> 
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>

I'd suggest you make the function names consistent with the capability 
flags (i.e. get, put, have) both here in odb-helper.{c,h} and in 
external-odb.{c,h}.

> +int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
> +int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
> +			    int fd);
> +
> +#endif /* ODB_HELPER_H */

The following patch fixes a few compiler warnings/errors.


diff --git a/odb-helper.c b/odb-helper.c
index 01cd6a713c..ffbbd2fc87 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -118,7 +118,7 @@ static int check_object_process_error(int err,
                                       unsigned int capability)
  {
         if (!err)
-               return;
+               return 0;

         if (!strcmp(status, "error")) {
                 /* The process signaled a problem with the file. */
@@ -192,7 +192,7 @@ static ssize_t read_packetized_plain_object_to_fd(struct odb_helper *o,
         git_SHA1_Init(&hash);

         /* First header.. */
-       hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), size) + 1;
+       hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %"PRIuMAX, typename(type), size) + 1;
         stream.next_in = (unsigned char *)hdr;
         stream.avail_in = hdrlen;
         while (git_deflate(&stream, 0) == Z_OK)
@@ -253,7 +253,7 @@ static ssize_t read_packetized_plain_object_to_fd(struct odb_helper *o,
                 return -1;
         }
         if (total_got != size) {
-               warning("size mismatch from odb helper '%s' for %s (%lu != %lu)",
+               warning("size mismatch from odb helper '%s' for %s (%lu != %"PRIuMAX")",
                         o->name, sha1_to_hex(sha1), total_got, size);
                 return -1;
         }
@@ -587,7 +587,6 @@ static int have_object_process(struct odb_helper *o)
         struct strbuf status = STRBUF_INIT;
         const char *cmd = o->cmd;
         uint64_t start;
-       char *line;
         int packet_len;
         int total_got = 0;

@@ -946,7 +945,7 @@ int odb_helper_for_each_object(struct odb_helper *o,
         return 0;
  }

-int odb_helper_write_plain_object(struct odb_helper *o,
+static int odb_helper_write_plain_object(struct odb_helper *o,
                                   const void *buf, size_t len,
                                   const char *type, unsigned char *sha1)
  {


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-06-23 18:24 ` Ben Peart
@ 2017-07-01 19:41   ` Christian Couder
  2017-07-01 20:12     ` Christian Couder
                       ` (2 more replies)
  0 siblings, 3 replies; 64+ messages in thread
From: Christian Couder @ 2017-07-01 19:41 UTC (permalink / raw)
  To: Ben Peart
  Cc: git, Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

On Fri, Jun 23, 2017 at 8:24 PM, Ben Peart <peartben@gmail.com> wrote:
>
>
> On 6/20/2017 3:54 AM, Christian Couder wrote:

>> To be able to better handle some kind of objects, for example big
>> blobs, it would be nice if Git could store its objects in other object
>> databases (ODB).
>>
>> To do that, this patch series makes it possible to register commands,
>> also called "helpers", using "odb.<odbname>.command" config variables,
>> to access external ODBs where objects can be stored and retrieved.
>>
>> External ODBs should be able to tranfer information about the blobs
>> they store. This patch series shows how this is possible using kind of
>> replace refs.
>
> Great to see this making progress!
>
> My thoughts and questions are mostly about the overall design tradeoffs.
>
> Is your intention to enable the ODB to completely replace the regular object
> store or just to supplement it?

It is to supplement it, as I think the regular object store works very
well most of the time.

> I think it would be good to ensure the
> interface is robust and performant enough to actually replace the current
> object store interface (even if we don't actually do that just yet).

I agree that it should be robust and performant, but I don't think it
needs to be as performant in all cases as the current object store
right now.

> Another way of asking this is: do the 3 verbs (have, get, put) and the 3
> types of "get" enable you to wrap the current loose object and pack file
> code as ODBs and run completely via the external ODB interface?  If not,
> what is missing and can it be added?

Right now the "put" verb only sends plain blobs, so the most logical
way to run completely via the external ODB interface would be to use
it to send and receive plain blobs. There are test scripts (t0420,
t0470 and t0480) that use an http server as the external ODB and all
the blobs are stored in it.

And yeah for now it works only for blobs. There is a temporary patch
in the series that limits it to blobs. For the non-RFC patch series, I
think it should either use the attribute system to tell which objects
should be run via the external ODB interface, or perhaps there should
be a way to ask each external ODB helper which kind of objects and
blobs it can handle. I should add that in the future work part.

> _Eventually_ it would be great to see the current object store(s) moved
> behind the new ODB interface.

This is not one of my goals and I think it could be a problem if we
want to keep the "fault in" mode.
In this mode the helper writes or reads directly to or from the
current object store, so it needs the current object store to be
available.

Also I think compatibility with other git implementations is important
and it is a good thing that they can all work on a common repository
format.

> When there are multiple ODB providers, what is the order they are called?

The external_odb_config() function creates the helpers for the
external ODBs in the order they are found in the config file, and then
these helpers are called in turn in the same order.

> If one fails a request (get, have, put) are the others called to see if they
> can fulfill the request?

Yes, but there are no tests to check that it works well. I will need
to add some.

> Can the order they are called for various verb be configured explicitly?

Right now, you can configure the order by changing the config file,
but the order will be the same for all the verbs.
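
As a rough shell model of the ordering and fallback described above
(this is only an illustration, not the C code in external-odb.c; the
`try_helpers` function and the helper commands passed to it are
hypothetical):

```shell
# Model only: try each helper in config order until one succeeds.
# $1 is the instruction, e.g. "get <sha1>"; the remaining arguments
# are hypothetical helper commands, listed in config-file order.
try_helpers () {
	instruction=$1
	shift
	for helper in "$@"
	do
		# Word splitting of $instruction is intended here.
		if $helper $instruction
		then
			return 0
		fi
	done
	# No helper could fulfill the request.
	return 1
}
```

With this model, `try_helpers "get <sha1>" first second` only calls
`second` if `first` fails, and the same order applies to every verb.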

> For
> example, it would be nice to have a "large object ODB handler" configured to
> get first try at all "put" verbs.  Then if it meets it's size requirements,
> it will handle the verb, otherwise it fail and git will try the other ODBs.

This can work if the "large object ODB handler" is configured first.

Also this is linked with how you define which objects are handled by
which helper. For example if the attribute system is used to describe
which external ODB is used for which files, there could be a way to
tell for example that blobs larger than 1MB are handled by the "large
object ODB handler" while those that are smaller are handled by
another helper.
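
A minimal sketch of how such a helper could decline small objects,
assuming that a non-zero exit status makes Git try the next helper
(the `large_odb_put` name, the threshold variable and the output line
are all made up for illustration):

```shell
# Hypothetical threshold: 1MB, as in the example above.
threshold=1048576

# Sketch of "put <sha1> <size> <type>" handling in a script-mode
# helper. Returns 0 if the helper accepts the object, 1 if it
# declines so that another helper can be tried.
large_odb_put () {
	sha1=$1 size=$2 type=$3
	test "$type" = blob || return 1
	test "$size" -ge "$threshold" || return 1
	# A real helper would read the object from stdin and store it.
	echo "stored $sha1"
}
```

The size check could of course be replaced by whatever rule the
attribute system or the helper itself ends up using.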

>> Design
>> ~~~~~~
>>
>> * The "helpers" (registered commands)
>>
>> Each helper manages access to one external ODB.
>>
>> There are now 2 different modes for helper:
>>
>>    - When "odb.<odbname>.scriptMode" is set to "true", the helper is
>>      launched each time Git wants to communicate with the <odbname>
>>      external ODB.
>>
>>    - When "odb.<odbname>.scriptMode" is not set or set to "false", then
>>      the helper is launched once as a sub-process (using
>>      sub-process.h), and Git communicates with it using packet lines.
>
> Is it worth supporting two different modes long term?  It seems that this
> could be simplified (less code to write, debug, document, support) by only
> supporting the 2nd that uses the sub-process.  As far as I can tell, the
> capabilities are the same, it's just the second one is more performant when
> multiple calls are made.

Yeah, capabilities are the same, but I think the script mode has
value, because helpers are simpler to write and debug.
If for example one wants to use the external ODB system to implement
clone bundles, like what is done in t0430 at the end of the patch
series, then it's much simpler if the helper uses the script mode, and
there is no performance downside as the helper will be called once.

I think people might want to implement these kinds of helpers to just
setup a repo quickly and properly for their specific needs.
For example for companies using big monorepos, there could be
different helpers for different kinds of people working on different
parts of the code. For example one for front end developers and one
for back end developers.

Also my goal is to share as much of the implementation as possible
between the script and the sub-process mode. I think this can help us
be confident that the overall design is good.
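
To give an idea of how small a script mode helper can be, here is a
sketch that keeps objects in a directory. It assumes the instruction
and its arguments are passed as command line arguments and that object
content goes through stdin/stdout; the `odb_helper` function, the
ODB_DIR variable and the ".meta" file layout are hypothetical and not
taken from the test scripts:

```shell
# Hypothetical storage directory; each object is stored as
# $ODB_DIR/<sha1> with "<size> <type>" in $ODB_DIR/<sha1>.meta.
ODB_DIR=${ODB_DIR:-/tmp/example-odb}

odb_helper () {
	mkdir -p "$ODB_DIR" || return 1
	case "$1" in
	have)
		# One "<sha1> <size> <type>" line per stored object.
		for meta in "$ODB_DIR"/*.meta
		do
			test -f "$meta" || continue
			sha1=$(basename "$meta" .meta)
			echo "$sha1 $(cat "$meta")"
		done
		;;
	get)
		# get <sha1>: pass the object content to stdout.
		cat "$ODB_DIR/$2"
		;;
	put)
		# put <sha1> <size> <type>: read the content from stdin.
		cat >"$ODB_DIR/$2" &&
		echo "$3 $4" >"$ODB_DIR/$2.meta"
		;;
	*)
		echo >&2 "unknown instruction: $1"
		return 1
		;;
	esac
}
```

Something like `odb_helper have` can then be run by hand to debug the
helper, which is much harder with a long running sub-process.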

>> A helper can be given different instructions by Git. The instructions
>> that are supported are negociated at the beginning of the
>> communication using a capability mechanism.
>>
>> For now the following instructions are supported:
>>
>>    - "have": the helper should respond with the sha1, size and type of
>>      all the objects the external ODB contains, one object per line.
>>
>>    - "get <sha1>": the helper should then read from the external ODB
>>      the content of the object corresponding to <sha1> and pass it to Git.
>>
>>    - "put <sha1> <size> <type>": the helper should then read from from
>>      Git an object and store it in the external ODB.
>>
>> Currently "have" and "put" are optional.
>
> It's good the various verbs can be optional.  That way any particular ODB
> only has to handle those it needs to provide a different behavior for.

Yeah, my goal is to have only the "init" verb be required.

>> There are 3 different kinds of "get" instructions depending on how the
>> helper passes objects to Git:
>>
>>    - "fault_in": the helper will write the requested objects directly
>>      into the regular Git object database, and then Git will retry
>>      reading it from there.
>>
>
> I think the "fault_in" behavior can be implemented efficiently without the
> overhead of a 3rd special "get" instruction if we enable some of the other
> capabilities discussed.

The "fault_in" behavior will be just a special kind of "get"
capability. Git needs to know that the helper has this capability, but
there should be no overhead. So I am not sure if I understand you
properly.

> For example, assume an ODB is setup to handle missing objects (by
> registering itself as "last" in the prioritized list of ODB handlers). If it
> is ever asked to retrieve a missing object, it can retrieve the object and
> return it as a "git_object" or "plain_object" and also cache it locally as a
> loose object, pack file, or any other ODB handler supported mechanism.
> Future requests will then provide that object via the locally cached copy
> and its associated ODB handler.

Yeah, but if the ODB handler has written it the first time as a loose
file into the regular Git object database, then it won't be requested
again from the handler. Maybe you are saying that using the regular
Git object database is an overhead?
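
A shell model of the "fault in" flow may make this clearer. It is only
a sketch: `read_with_fault_in` and the helper command passed to it are
hypothetical, and the real logic lives in the C code, not in a script.
It only relies on `git cat-file -e` and `git cat-file -p`:

```shell
# Model of the "fault_in" flow. $2 is a hypothetical helper command
# that is expected to write the requested object directly into the
# regular Git object database.
read_with_fault_in () {
	sha1=$1 helper=$2
	if ! git cat-file -e "$sha1" 2>/dev/null
	then
		# Object is missing locally: ask the helper to fault it in.
		$helper get "$sha1" || return 1
	fi
	# Read (or retry reading) from the regular object database.
	git cat-file -p "$sha1"
}
```

Once the object has been faulted in, the first check succeeds and the
helper is not called again for that object.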

>>    - "git_object": the helper will send the object as a Git object.
>>
>>    - "plain_object": the helper will send the object (a blob) as a raw
>>      object. (The blob content will be sent as is.)
>>
>> For now the kind of "get" that is supported is read from the
>> "odb.<odbname>.fetchKind" configuration variable, but in the future it
>> should be decided as part of the capability negociation.
>
> I agree it makes sense to move this into the capability negotiation but I
> also wonder if we really need to support both.  Is there a reason we can't
> just choose one and force all ODBs to support it?

I think the different kinds of "get" can all be useful and can all be
natural in different situations.

For example, different Git repos may share a lot of quite big blobs
that are not often needed by the developers using these repos. They
might want to store them in only one repo and configure the other
repos to get these blobs from the first one only if they are needed.
In this case it is better if the repos can share the blobs as Git
objects.

If on the other hand one wants to store a few very big blobs on an HTTP
server, where they could also be retrieved using a browser, it is
simpler if they are stored as plain objects and passed to Git as such.

The "fault in" mode is interesting in case of a clone bundle, or
fetching bundles, for example.

Forcing only one mode means making it harder for some use cases and
probably having external ODB helpers duplicate a lot of code in
different programming languages.

>> * Transfering information
>
> This whole section on "odb ref" feels out of place to me.  Below you state
> it is optional in which case I think it should be discussed in the patches
> that implement the tests that use it rather than here.  It seems to be a
> test ODB specific implementation detail.

I think it is good to provide somewhere in the cover letter an idea
about how this part can already easily work, so that people can
understand the whole picture. But I agree it should probably move to
the "Implementation" section.

>> To tranfer information about the blobs stored in external ODB, some
>> special refs, called "odb ref", similar as replace refs, are used in
>> the tests of this series, but in general nothing forces the helper to
>> use that mechanism.
>>
>> The external odb helper is responsible for using and creating the refs
>> in refs/odbs/<odbname>/, if it wants to do that. It is free for example
>> to just create one ref, as it is also free to create many refs. Git
>> would just transmit the refs that have been created by this helper, if
>> Git is asked to do so.
>>
>> For now in the tests there is one odb ref per blob, as it is simple
>> and as it is similar to what git-lfs does. Each ref name is
>> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
>> in the external odb named <odbname>.
>>
>> These odb refs point to a blob that is stored in the Git
>> repository and contain information about the blob stored in the
>> external odb. This information can be specific to the external odb.
>> The repos can then share this information using commands like:
>>
>> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
>>
>> At the end of the current patch series, "git clone" is teached a
>> "--initial-refspec" option, that asks it to first fetch some specified
>> refs. This is used in the tests to fetch the odb refs first.
>>
>> This way only one "git clone" command can setup a repo using the
>> external ODB mechanism as long as the right helper is installed on the
>> machine and as long as the following options are used:
>>
>>    - "--initial-refspec <odbrefspec>" to fetch the odb refspec
>>    - "-c odb.<odbname>.command=<helper>" to configure the helper
>>
>> There is also a test script that shows that the "--initial-refspec"
>> option along with the external ODB mechanism can be used to implement
>> cloning using bundles.
>
> The fact that "git clone is taught a --initial-refspec" option" indicates
> this isn't just an ODB implementation detail.  Is there a general capability
> that is missing from the ODB interface that needs to be addressed here?

Technically you don't need to teach `git clone` the --initial-refspec
option to make it work.
It can work like this:

$ git init
$ git remote add origin <originurl>
$ git fetch origin <odbrefspec>
$ git config odb.<odbname>.command <helper>
$ git fetch origin

But it is much simpler for the user to instead just do:

$ git clone -c odb.<odbname>.command=<helper> --initial-refspec <odbrefspec> <originurl>

I also think that the --initial-refspec option could perhaps be useful
for other kinds of refs for example tags, notes or replace refs, to
make sure that those refs are fetched first and that hooks can use
them when fetching other refs like branches in the later part of the
clone.

I have put the --initial-refspec at the end of the patch series
because there could be other ways to do it. For example the helper
could perhaps be passed <originurl> in the "init" instruction and then
use this URL to get all the information it needs (for example by
fetching special refs from this URL or any other way it wants).

The problem with that is if the setup is done in separate steps (`git
init`, then `git remote add` ...) like I show above, then it is not
certain that the "init" would trigger when fetching the right remote/URL.
It could trigger when the user does a `git log` in between the above
steps, and this means that helpers receiving "init" should not rely on
being always passed a remote/URL.

So the "init" handling in the helpers would have to be more complex
than if they only need to rely on the <odbrefspec> having been fetched
before.

> I don't believe there is.  Instead, I think we should allow the various
> "partial clone" patch series already in progress solve the problem of how
> you do a partial clone of a repo.

I think this is not really related to partial clone, but maybe I
should take a closer look at these patch series.

> [1]
> https://public-inbox.org/git/1488994685-37403-1-git-send-email-jeffhost@microsoft.com/
> [2]
> https://public-inbox.org/git/20170309073117.g3br5btsfwntcdpe@sigill.intra.peff.net/
> [3]
> https://public-inbox.org/git/cover.1496361873.git.jonathantanmy@google.com/
> [4]
> https://public-inbox.org/git/20170602232508.GA21733@aiede.mtv.corp.google.com/

>> Highlevel view of the patches in the series
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>>      - Patch 1/49 is a small code cleanup that I already sent to the
>>        mailing list but will probably be removed in the end due to
>>        ongoing work on "git clone"
>>
>>      - Patches 02/49 to 08/49 create a Git/Packet.pm module by
>>        refactoring "t0021/rot13-filter.pl". Functions from this new
>>        module will be used later in test scripts.
>
> Nice!
>
>>      - Patches 09/49 to 16/49 create the external ODB infrastructure
>>        in external-odb.{c,h} and odb-helper.{c,h} for the script mode.
>>
>>      - Patches 17/49 to 23/49 improve lib-http to make it possible to
>>        use it as an external ODB to test storing blobs in an HTTP
>>        server.
>>
>>      - Patches 24/49 to 44/49 improve the external ODB infrastructure
>>        to support sub-processes and make everything work using them.
>
> I understand why it is this way historically but it seems these should just
> be combined with patches 9-16. Instead of writing the new odb specific
> routines and then patching them to support sub-process, just write them that
> way the "first" time.

Yeah, I could do that but I fear it would result in some bigger
patches that would be harder to review and understand than a longer
patch series where features are implemented and tested in small steps.

>>      - Patches 45/49 to 49/49 add the --initial-refspec to git clone
>>        along with tests.
>
> I'm hopeful the changes to git clone to add the --initial-refspec can
> eventually be dropped. It seems we should be able to test the ODB
> capabilities without having to add options to git clone.

I described above (and there are examples in the tests of) how
--initial-refspec can be avoided, but in the end I think we should
really have a robust and easy way for the user to set up everything.
And for now I think --initial-refspec is at least a good solution.

Thanks for your interest and your help in this!


* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-07-01 19:41   ` Christian Couder
@ 2017-07-01 20:12     ` Christian Couder
  2017-07-01 20:33     ` Junio C Hamano
  2017-07-06 17:36     ` Ben Peart
  2 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-07-01 20:12 UTC (permalink / raw)
  To: Ben Peart
  Cc: git, Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

On Sat, Jul 1, 2017 at 9:41 PM, Christian Couder
<christian.couder@gmail.com> wrote:
> On Fri, Jun 23, 2017 at 8:24 PM, Ben Peart <peartben@gmail.com> wrote:

>> The fact that "git clone" is taught a "--initial-refspec" option indicates
>> this isn't just an ODB implementation detail.  Is there a general capability
>> that is missing from the ODB interface that needs to be addressed here?
>
> Technically you don't need to teach `git clone` the --initial-refspec
> option to make it work.
> It can work like this:
>
> $ git init
> $ git remote add origin <originurl>
> $ git fetch origin <odbrefspec>
> $ git config odb.<odbname>.command <helper>
> $ git fetch origin
>
> But it is much simpler for the user to instead just do:
>
> $ git clone -c odb.<odbname>.command=<helper> --initial-refspec
> <odbrefspec> <originurl>
>
> I also think that the --initial-refspec option could perhaps be useful
> for other kinds of refs for example tags, notes or replace refs, to
> make sure that those refs are fetched first and that hooks can use
> them when fetching other refs like branches in the later part of the
> clone.

Actually I am not sure that it's possible to set up hooks per se before
or while cloning, but perhaps there are other kinds of scripts or git
commands that could trigger and use the refs that have been fetched
first.


* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-07-01 19:41   ` Christian Couder
  2017-07-01 20:12     ` Christian Couder
@ 2017-07-01 20:33     ` Junio C Hamano
  2017-07-02  4:25       ` Christian Couder
  2017-07-06 17:36     ` Ben Peart
  2 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2017-07-01 20:33 UTC (permalink / raw)
  To: Christian Couder
  Cc: Ben Peart, git, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

>> I think it would be good to ensure the
>> interface is robust and performant enough to actually replace the current
>> object store interface (even if we don't actually do that just yet).
>
> I agree that it should be robust and performant, but I don't think it
> needs to be as performant in all cases as the current object store
> right now.

That sounds like starting from a defeatist position.  Is there a
reason why you think using an external interface could never perform
well enough to be usable in everyday work?


* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-07-01 20:33     ` Junio C Hamano
@ 2017-07-02  4:25       ` Christian Couder
  2017-07-03 16:56         ` Junio C Hamano
  0 siblings, 1 reply; 64+ messages in thread
From: Christian Couder @ 2017-07-02  4:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Peart, git, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

On Sat, Jul 1, 2017 at 10:33 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>>> I think it would be good to ensure the
>>> interface is robust and performant enough to actually replace the current
>>> object store interface (even if we don't actually do that just yet).
>>
>> I agree that it should be robust and performant, but I don't think it
>> needs to be as performant in all cases as the current object store
>> right now.
>
> That sounds like starting from a defeatist position.  Is there a
> reason why you think using an external interface could never perform
> well enough to be usable in everyday work?

Perhaps in the future we will be able to make it as performant as, or
perhaps even more performant than, the current object store, but in
the current implementation the following issues mean that it will be
less performant:

- The external object stores are searched for an object after the
object has not been found in the current object store. This means that
searching for an object will be slower if the object is in an external
object store. To overcome this the "have" information (when the
external helper implements it) could be merged with information about
what objects are in the current object store, for example in a big
table or bitmap, so that only one lookup in this table or bitmap would
be needed to know if an object is available and in which object store
it is. But I really don't want to get into this right now.

- When an external odb helper retrieves an object and passes it to
Git, Git (or the helper itself in "fault in" mode) then stores the
object in the current object store. This is because we assume that it
will be faster to retrieve it again if it is cached in the current
object store. There could be a capability that asks Git to not cache
the objects that are retrieved from the external odb, but again I
don't think it is necessary at all to implement this right now.
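
As a rough illustration of the merged-table idea in the first point,
here is a minimal Python sketch. It is not code from the patch series:
build_index(), locate(), mark_cached() and the "local" label are
hypothetical names, and the "have" output is modeled as plain
(sha1, size, type) tuples.

```python
from typing import Dict, Iterable, Optional, Tuple

LOCAL = "local"  # assumed label for the regular Git object store

HaveEntry = Tuple[str, int, str]  # (sha1, size, type), as "have" reports


def build_index(local_ids: Iterable[str],
                have_by_odb: Dict[str, Iterable[HaveEntry]]) -> Dict[str, str]:
    """Merge all "have" answers into one sha1 -> store-name table."""
    index: Dict[str, str] = {}
    for odb_name, entries in have_by_odb.items():
        for sha1, _size, _type in entries:
            # The first external ODB that lists an object keeps it.
            index.setdefault(sha1, odb_name)
    for sha1 in local_ids:
        # The regular object store wins over any external ODB.
        index[sha1] = LOCAL
    return index


def locate(index: Dict[str, str], sha1: str) -> Optional[str]:
    """One lookup tells whether an object exists and where it lives."""
    return index.get(sha1)


def mark_cached(index: Dict[str, str], sha1: str) -> None:
    """Record that a fetched object is now cached in the local store."""
    index[sha1] = LOCAL
```

With such a table, a lookup for an object held only by an external ODB
would not have to fail in the regular object store first; mark_cached()
models the second point, where a retrieved object ends up stored
locally.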

I still think though that in some cases, like when the external odb is
used to implement a bundle clone, using the external odb mechanism can
already be more performant.


* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-07-02  4:25       ` Christian Couder
@ 2017-07-03 16:56         ` Junio C Hamano
  0 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2017-07-03 16:56 UTC (permalink / raw)
  To: Christian Couder
  Cc: Ben Peart, git, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> On Sat, Jul 1, 2017 at 10:33 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Christian Couder <christian.couder@gmail.com> writes:
>>
>>>> I think it would be good to ensure the
>>>> interface is robust and performant enough to actually replace the current
>>>> object store interface (even if we don't actually do that just yet).
>>>
>>> I agree that it should be robust and performant, but I don't think it
>>> needs to be as performant in all cases as the current object store
>>> right now.
>>
>> That sounds like starting from a defeatist position.  Is there a
>> reason why you think using an external interface could never perform
>> well enough to be usable in everyday work?
>
> Perhaps in the future we will be able to make it as performant as, or
> perhaps even more performant than, the current object store, but in
> the current implementation the following issues mean that it will be
> less performant

That might be an answer to a different question; I was hoping to
hear that it should be performant enough for everyday work, but
never thought it would perform as well as local disk.

I haven't used a network filesystem in quite a while, but a repository
on NFS may still be usable, and we know our own access pattern better
than NFS, which cannot anticipate which paths its client's next
operations will touch, so it is not inconceivable that a well designed
external object database interface would let us outperform the "repo
on NFS" scenario.


* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-07-01 19:41   ` Christian Couder
  2017-07-01 20:12     ` Christian Couder
  2017-07-01 20:33     ` Junio C Hamano
@ 2017-07-06 17:36     ` Ben Peart
  2017-09-15 12:56       ` Christian Couder
  2 siblings, 1 reply; 64+ messages in thread
From: Ben Peart @ 2017-07-06 17:36 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder



On 7/1/2017 3:41 PM, Christian Couder wrote:
> On Fri, Jun 23, 2017 at 8:24 PM, Ben Peart <peartben@gmail.com> wrote:
>>
>>
>> On 6/20/2017 3:54 AM, Christian Couder wrote:
> 
>>> To be able to better handle some kind of objects, for example big
>>> blobs, it would be nice if Git could store its objects in other object
>>> databases (ODB).
>>>
>>> To do that, this patch series makes it possible to register commands,
>>> also called "helpers", using "odb.<odbname>.command" config variables,
>>> to access external ODBs where objects can be stored and retrieved.
>>>
>>> External ODBs should be able to transfer information about the blobs
>>> they store. This patch series shows how this is possible using kind of
>>> replace refs.
>>
>> Great to see this making progress!
>>
>> My thoughts and questions are mostly about the overall design tradeoffs.
>>
>> Is your intention to enable the ODB to completely replace the regular object
>> store or just to supplement it?
> 
> It is to supplement it, as I think the regular object store works very
> well most of the time.
> 

I certainly understand the desire to restrict the scope of the patch 
series.  I know full replacement is a much larger problem as it would 
touch much more of the codebase.

I'd still like to see an object store that was thread safe, more robust 
(ie transactional) and hopefully faster so I am hoping we can design the 
ODB interface to eventually enable that.

For example: it seems the ODB helpers need to be able to be called 
before the regular object store in the "put" case (so they can intercept 
large objects for example) and after it in the "get" case to enable
"fault-in."  Something like this:

have/get
========
git object store
large object ODB helper

put
===
large object ODB helper
git object store

It would be nice if that order wasn't hard coded but that the order or 
level of the "git object store" could be specified using the same 
mechanism as used for the ODB helpers so that some day you could do 
something like this:

have/get
========
"LMDB" ODB helper
git object store

put
===
"LMDB" ODB helper
git object store

(and even further out, drop the current git object store completely :)).
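
To make the idea of a configurable stack a bit more concrete, here is
a minimal Python sketch. Everything in it is hypothetical (the Store
and Stack classes, the per-verb "order" mapping); it only illustrates
that the position of the "git object store" could be ordinary
configuration rather than being hard coded.

```python
from typing import Dict, List, Optional


class Store:
    """Stand-in for either the git object store or an ODB helper."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: Dict[str, bytes] = {}

    def get(self, sha1: str) -> Optional[bytes]:
        return self.objects.get(sha1)

    def put(self, sha1: str, data: bytes) -> bool:
        self.objects[sha1] = data
        return True


class Stack:
    """Try stores in a per-verb order taken from configuration."""

    def __init__(self, stores: List[Store], order: Dict[str, List[str]]) -> None:
        self.stores = {store.name: store for store in stores}
        self.order = order  # verb -> store names, first tried first

    def get(self, sha1: str) -> Optional[bytes]:
        for name in self.order["get"]:
            data = self.stores[name].get(sha1)
            if data is not None:
                return data
        return None

    def put(self, sha1: str, data: bytes) -> Optional[str]:
        for name in self.order["put"]:
            if self.stores[name].put(sha1, data):
                return name  # name of the store that accepted the object
        return None
```

With order = {"get": ["lmdb", "git"], "put": ["lmdb", "git"]} this
models the "LMDB" example above; swapping the lists gives the
large-object layout instead.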

>> I think it would be good to ensure the
>> interface is robust and performant enough to actually replace the current
>> object store interface (even if we don't actually do that just yet).
> 
> I agree that it should be robust and performant, but I don't think it
> needs to be as performant in all cases as the current object store
> right now.
> 
>> Another way of asking this is: do the 3 verbs (have, get, put) and the 3
>> types of "get" enable you to wrap the current loose object and pack file
>> code as ODBs and run completely via the external ODB interface?  If not,
>> what is missing and can it be added?
> 

One example of what I think is missing is a way to stream objects (ie 
get_stream, put_stream).  This isn't used often in git but it did exist 
last I checked.  I'm not saying this needs to be supported in the first 
version - more if we want to support total replacement.

I also wonder if we'd need an "optimize" verb (for "git gc") or a 
"validate" verb (for "git fsck").  Again, only if/when we are looking at 
total replacement.

> Right now the "put" verb only sends plain blobs, so the most logical
> way to run completely via the external ODB interface would be to use
> it to send and receive plain blobs. There are test scripts (t0420,
> t0470 and t0480) that use an http server as the external ODB and all
> the blobs are stored in it.
> 
> And yeah for now it works only for blobs. There is a temporary patch
> in the series that limits it to blobs. For the non RFC patch series, I
> think it should either use the attribute system to tell which objects
> should be run via the external ODB interface, or perhaps there should
> be a way to ask each external ODB helper which kind of objects and
> blobs it can handle. I should add that in the future work part.
> 

Sounds good.  For GVFS we handle all object types (including commits and 
trees) so would need this to be enabled so that we can switch to using it.

>> _Eventually_ it would be great to see the current object store(s) moved
>> behind the new ODB interface.
> 
> This is not one of my goals and I think it could be a problem if we
> want to keep the "fault in" mode.
> In this mode the helper writes or reads directly to or from the
> current object store, so it needs the current object store to be
> available.
>

I think implementing "fault in" should be an option that the ODB handler 
can implement but should not be required by the design/interface.  As 
you state above, this could be as simple as having the ODB handler write 
the object to the git object store on "get."

> Also I think compatibility with other git implementations is important
> and it is a good thing that they can all work on a common repository
> format.

I agree this should be an option but I don't want to say we'll _never_ 
move to a better object store.

> 
>> When there are multiple ODB providers, what is the order they are called?
> 
> The external_odb_config() function creates the helpers for the
> external ODBs in the order they are found in the config file, and then
> these helpers are called in turn in the same order.
> 
>> If one fails a request (get, have, put) are the others called to see if they
>> can fulfill the request?
> 
> Yes, but there are no tests to check that it works well. I will need
> to add some.
> 
>> Can the order they are called for various verb be configured explicitly?
> 
> Right now, you can configure the order by changing the config file,
> but the order will be the same for all the verbs.

Do you mean it will work like this?

have/get
========
git object store
first ODB helper
second ODB helper
third ODB helper

put
===
first ODB helper
second ODB helper
third ODB helper
git object store

If so, I'd prefer having more flexibility and be able to specify where 
the "git object store" fits in the stack along with the ODB helpers so 
that you could eventually support the "LMDB" ODB helper example above.

> 
>> For
>> example, it would be nice to have a "large object ODB handler" configured to
>> get first try at all "put" verbs.  Then if it meets its size requirements,
>> it will handle the verb, otherwise it fails and git will try the other ODBs.
> 
> This can work if the "large object ODB handler" is configured first.
> 
> Also this is linked with how you define which objects are handled by
> which helper. For example if the attribute system is used to describe
> which external ODB is used for which files, there could be a way to
> tell for example that blobs larger than 1MB are handled by the "large
> object ODB handler" while those that are smaller are handled by
> another helper.
> 

I can see that using an attribute system could get complex to implement
generically in git.  It seems defining the set of attributes and the 
rules for matching against them could get complex.

I wonder if it is sufficient to just have a hierarchy and then let each 
ODB handler determine which objects it wants to handle based on whatever 
criteria (size, object type, etc) it wants.

The downside of doing it this way is that calling the various handlers 
needs to be fast as there may be a lot of calls passed through to the 
next handler.  I think this can be accomplished by using persistent 
background processes instead of spawning a new one on each call.
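
A minimal Python sketch of that "let each handler decide" hierarchy
follows. The Handler class, its "wants" predicate and put_object() are
made-up names for illustration, and the 1024 byte threshold in the
usage note is arbitrary.

```python
from typing import Callable, Dict, List, Optional


class Handler:
    """An ODB handler that accepts or declines each object itself."""

    def __init__(self, name: str, wants: Callable[[str, int], bool]) -> None:
        self.name = name
        self.wants = wants  # predicate over (object type, size in bytes)
        self.objects: Dict[str, bytes] = {}

    def put(self, sha1: str, obj_type: str, data: bytes) -> bool:
        if not self.wants(obj_type, len(data)):
            return False  # decline, so the next handler gets a try
        self.objects[sha1] = data
        return True


def put_object(handlers: List[Handler], sha1: str, obj_type: str,
               data: bytes) -> Optional[str]:
    """Offer the object to each handler in order; return who took it."""
    for handler in handlers:
        if handler.put(sha1, obj_type, data):
            return handler.name
    return None
```

For example, a handler built with
Handler("large", lambda t, n: t == "blob" and n >= 1024) placed before
a catch-all Handler("git", lambda t, n: True) would take big blobs and
let everything else fall through. Every object still costs one call
per declining handler, which is the overhead mentioned above.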

>>> Design
>>> ~~~~~~
>>>
>>> * The "helpers" (registered commands)
>>>
>>> Each helper manages access to one external ODB.
>>>
>>> There are now 2 different modes for helper:
>>>
>>>     - When "odb.<odbname>.scriptMode" is set to "true", the helper is
>>>       launched each time Git wants to communicate with the <odbname>
>>>       external ODB.
>>>
>>>     - When "odb.<odbname>.scriptMode" is not set or set to "false", then
>>>       the helper is launched once as a sub-process (using
>>>       sub-process.h), and Git communicates with it using packet lines.
>>
>> Is it worth supporting two different modes long term?  It seems that this
>> could be simplified (less code to write, debug, document, support) by only
>> supporting the 2nd that uses the sub-process.  As far as I can tell, the
>> capabilities are the same, it's just the second one is more performant when
>> multiple calls are made.
> 
> Yeah, capabilities are the same, but I think the script mode has
> value, because helpers are simpler to write and debug.
> If for example one wants to use the external ODB system to implement
> clone bundles, like what is done in t0430 at the end of the patch
> series, then it's much simpler if the helper uses the script mode, and
> there is no performance downside as the helper will be called once.
> 
> I think people might want to implement these kinds of helpers to just
> setup a repo quickly and properly for their specific needs.
> For example for companies using big monorepos, there could be
> different helpers for different kinds of people working on different
> parts of the code. For example one for front end developers and one
> for back end developers.
> 

I think the ease of writing a script mode helper vs sub-process can be 
alleviated with a library of helper routines to help with the 
sub-process interface.  You have a start on that already and given it is 
also already used by the filter interface there should be a good body of 
code to copy/paste for future implementations.  I've written them in 
perl and in C/C++ and once you have some helper functions, it's really 
quite simple.
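
To give an idea of how small such helper routines can be, here is a
Python sketch of the packet line framing itself: four hex digits
giving the total length (header included), then the payload, with
"0000" as the flush packet. The function names are mine, and this is
only the framing, not the sub-process handshake or the ODB
instructions.

```python
import io
from typing import BinaryIO, Iterator, List, Optional

FLUSH = b"0000"        # flush packet: marks the end of a list
MAX_PKT_LEN = 65520    # largest total pkt-line length, header included


def pkt_line(payload: bytes) -> bytes:
    """Frame one payload as a packet line."""
    length = len(payload) + 4
    if length > MAX_PKT_LEN:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % length + payload


def read_pkt_lines(stream: BinaryIO) -> Iterator[Optional[bytes]]:
    """Yield payloads until EOF; a flush packet is yielded as None."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # EOF
        length = int(header.decode("ascii"), 16)
        if length == 0:
            yield None
            continue
        if length < 4:
            raise ValueError("invalid pkt-line length")
        yield stream.read(length - 4)


def decode_all(data: bytes) -> List[Optional[bytes]]:
    """Convenience wrapper to decode an in-memory buffer."""
    return list(read_pkt_lines(io.BytesIO(data)))
```

A real helper would read from stdin and write to stdout, flushing
after each response.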

> Also my goal is to share as much of the implementation as possible
> between the script and the sub process mode. I think this can help us
> be confident that the overall design is good.
> 

Ahh, but a single implementation is easier to code, test and maintain 
than two implementations - even with shared code. :)

>>> A helper can be given different instructions by Git. The instructions
>>> that are supported are negotiated at the beginning of the
>>> communication using a capability mechanism.
>>>
>>> For now the following instructions are supported:
>>>
>>>     - "have": the helper should respond with the sha1, size and type of
>>>       all the objects the external ODB contains, one object per line.
>>>
>>>     - "get <sha1>": the helper should then read from the external ODB
>>>       the content of the object corresponding to <sha1> and pass it to Git.
>>>
>>>     - "put <sha1> <size> <type>": the helper should then read from
>>>       Git an object and store it in the external ODB.
>>>
>>> Currently "have" and "put" are optional.
>>
>> It's good the various verbs can be optional.  That way any particular ODB
>> only has to handle those it needs to provide a different behavior for.
> 
> Yeah, my goal is to have only the "init" verb be required.
> 
>>> There are 3 different kinds of "get" instructions depending on how the
>>> helper passes objects to Git:
>>>
>>>     - "fault_in": the helper will write the requested objects directly
>>>       into the regular Git object database, and then Git will retry
>>>       reading it from there.
>>>
>>
>> I think the "fault_in" behavior can be implemented efficiently without the
>> overhead of a 3rd special "get" instruction if we enable some of the other
>> capabilities discussed.
> 
> The "fault_in" behavior will be just a special kind of "get"
> capability. Git needs to know that the helper has this capability, but
> there should be no overhead. So I am not sure if I understand you
> properly.
> 

Hopefully my sample diagrams above better illustrated what I was 
thinking and why I don't think git needs to be aware of "fault_in" 
behaviors.

>> For example, assume an ODB is setup to handle missing objects (by
>> registering itself as "last" in the prioritized list of ODB handlers). If it
>> is ever asked to retrieve a missing object, it can retrieve the object and
>> return it as a "git_object" or "plain_object" and also cache it locally as a
>> loose object, pack file, or any other ODB handler supported mechanism.
>> Future requests will then provide that object via the locally cached copy
>> and its associated ODB handler.
> 
> Yeah, but if the ODB handler has written it the first time as a loose
> file into the regular Git object database, then it won't be requested
> again from the handler. Maybe you are saying that using the regular
> Git object database is an overhead?
> 
>>>     - "git_object": the helper will send the object as a Git object.
>>>
>>>     - "plain_object": the helper will send the object (a blob) as a raw
>>>       object. (The blob content will be sent as is.)
>>>
>>> For now the kind of "get" that is supported is read from the
>>> "odb.<odbname>.fetchKind" configuration variable, but in the future it
>>> should be decided as part of the capability negotiation.
>>
>> I agree it makes sense to move this into the capability negotiation but I
>> also wonder if we really need to support both.  Is there a reason we can't
>> just choose one and force all ODBs to support it?
> 
> I think the different kinds of get can all be useful and can all be
> natural in different situations.
> 
> For example if there are different Git repos that share a lot of
> common quite big blobs that may not be often needed by the developers
> using these repos. They might want to store them in only one repo and
> configure the other repos to get these blobs from the first one only
> if they are needed. In this case it is better if the repos can share
> the blobs as Git objects.
> 
> If on the other hand one wants to store a few very big blobs on an HTTP
> server, where they could also be retrieved using a browser, it is
> simpler if they are stored as plain objects and passed to Git as such.
> 

I can see how this could be handy, I was just trying to simplify things 
as much as possible.

> The "fault in" mode is interesting in case of a clone bundle, or
> fetching bundles, for example.
> 
> Forcing only one mode means making it harder for some use cases and
> probably having external ODB helpers duplicate a lot of code in
> different programming languages.
> 
>>> * Transferring information
>>
>> This whole section on "odb ref" feels out of place to me.  Below you state
>> it is optional in which case I think it should be discussed in the patches
>> that implement the tests that use it rather than here.  It seems to be a
>> test ODB specific implementation detail.
> 
> I think it is good to provide somewhere in the cover letter an idea
> about how this part can already easily work, so that people can
> understand the whole picture. But I agree it should probably move to
> the "Implementation" section.
> 
>>> To transfer information about the blobs stored in external ODB, some
>>> special refs, called "odb ref", similar to replace refs, are used in
>>> the tests of this series, but in general nothing forces the helper to
>>> use that mechanism.
>>>
>>> The external odb helper is responsible for using and creating the refs
>>> in refs/odbs/<odbname>/, if it wants to do that. It is free for example
>>> to just create one ref, as it is also free to create many refs. Git
>>> would just transmit the refs that have been created by this helper, if
>>> Git is asked to do so.
>>>
>>> For now in the tests there is one odb ref per blob, as it is simple
>>> and as it is similar to what git-lfs does. Each ref name is
>>> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
>>> in the external odb named <odbname>.
>>>
>>> These odb refs point to a blob that is stored in the Git
>>> repository and contain information about the blob stored in the
>>> external odb. This information can be specific to the external odb.
>>> The repos can then share this information using commands like:
>>>
>>> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
>>>
>>> At the end of the current patch series, "git clone" is taught a
>>> "--initial-refspec" option, that asks it to first fetch some specified
>>> refs. This is used in the tests to fetch the odb refs first.
>>>
>>> This way only one "git clone" command can set up a repo using the
>>> external ODB mechanism as long as the right helper is installed on the
>>> machine and as long as the following options are used:
>>>
>>>     - "--initial-refspec <odbrefspec>" to fetch the odb refspec
>>>     - "-c odb.<odbname>.command=<helper>" to configure the helper
>>>
>>> There is also a test script that shows that the "--initial-refspec"
>>> option along with the external ODB mechanism can be used to implement
>>> cloning using bundles.
>>
>> The fact that "git clone" is taught a "--initial-refspec" option indicates
>> this isn't just an ODB implementation detail.  Is there a general capability
>> that is missing from the ODB interface that needs to be addressed here?
> 
> Technically you don't need to teach `git clone` the --initial-refspec
> option to make it work.
> It can work like this:
> 
> $ git init
> $ git remote add origin <originurl>
> $ git fetch origin <odbrefspec>
> $ git config odb.<odbname>.command <helper>
> $ git fetch origin
> 
> But it is much simpler for the user to instead just do:
> 
> $ git clone -c odb.<odbname>.command=<helper> --initial-refspec
> <odbrefspec> <originurl>
> 
> I also think that the --initial-refspec option could perhaps be useful
> for other kinds of refs for example tags, notes or replace refs, to
> make sure that those refs are fetched first and that hooks can use
> them when fetching other refs like branches in the later part of the
> clone.
> 
> I have put the --initial-refspec at the end of the patch series
> because there could be other ways to do it. For example the helper
> could perhaps be passed <originurl> in the "init" instruction and then
> use this URL to get all the information it needs (for example by
> fetching special refs from this URL or any other way it wants).
> 
> The problem with that is if the setup is done in separate steps (`git
> init`, then `git remote add` ...) like I show above, then it is not
> sure that the "init" would trigger when fetching the right remote/URL.
> It could trigger when the user does a `git log` in between the above
> steps, and this means that helpers receiving "init" should not rely on
> being always passed a remote/URL.
> 
> So the "init" handling in the helpers would have to be more complex
> than if they only need to rely on the <odbrefspec> having been fetched
> before.
> 
>> I don't believe there is.  Instead, I think we should allow the various
>> "partial clone" patch series already in progress solve the problem of how
>> you do a partial clone of a repo.
> 
> I think this is not really related to partial clone, but maybe I
> should take a closer look at these patch series.

My thought was that since a standard clone copies all objects into the 
local object store, there is no need for a way to retrieve "missing" 
objects as, by definition, none are missing.

The various partial clone patch series are discussing how to clone a 
repo and _not_ copy down all objects which creates the need for a way to 
retrieve "missing" objects.  They are also dealing with how to enable 
git to know the object is intentionally missing (ie "have" can succeed 
but "get" will have to go retrieve the missing object before it can 
return it).

This enables having missing objects without some local placeholder (ref) 
so that repos with large numbers of objects can still do a fast clone as 
they don't have to download millions of refs for the "missing" objects.

> 
>> [1]
>> https://public-inbox.org/git/1488994685-37403-1-git-send-email-jeffhost@microsoft.com/
>> [2]
>> https://public-inbox.org/git/20170309073117.g3br5btsfwntcdpe@sigill.intra.peff.net/
>> [3]
>> https://public-inbox.org/git/cover.1496361873.git.jonathantanmy@google.com/
>> [4]
>> https://public-inbox.org/git/20170602232508.GA21733@aiede.mtv.corp.google.com/
> 
>>> Highlevel view of the patches in the series
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>>       - Patch 1/49 is a small code cleanup that I already sent to the
>>>         mailing list but will probably be removed in the end due to
>>>         ongoing work on "git clone"
>>>
>>>       - Patches 02/49 to 08/49 create a Git/Packet.pm module by
>>>         refactoring "t0021/rot13-filter.pl". Functions from this new
>>>         module will be used later in test scripts.
>>
>> Nice!
>>
>>>       - Patches 09/49 to 16/49 create the external ODB infrastructure
>>>         in external-odb.{c,h} and odb-helper.{c,h} for the script mode.
>>>
>>>       - Patches 17/49 to 23/49 improve lib-http to make it possible to
>>>         use it as an external ODB to test storing blobs in an HTTP
>>>         server.
>>>
>>>       - Patches 24/49 to 44/49 improve the external ODB infrastructure
>>>         to support sub-processes and make everything work using them.
>>
>> I understand why it is this way historically but it seems these should just
>> be combined with patches 9-16. Instead of writing the new odb specific
>> routines and then patching them to support sub-process, just write them that
>> way the "first" time.
> 
> Yeah, I could do that but I fear it would result in some bigger
> patches that would be harder to review and understand than a longer
> patch series where features are implemented and tested in small steps.
> 
>>>       - Patches 45/49 to 49/49 add the --initial-refspec to git clone
>>>         along with tests.
>>
>> I'm hopeful the changes to git clone to add the --initial-refspec can
>> eventually be dropped. It seems we should be able to test the ODB
>> capabilities without having to add options to git clone.
> 
> I described above (and there are examples in the tests of) how
> --initial-refspec can be avoided, but in the end I think we should
> really have a robust and easy way for the user to set up everything.
> And for now I think --initial-refspec is at least a good solution.
> 
> Thanks for your interest and your help in this!
> 

* Re: [RFC/PATCH v4 30/49] odb-helper: add read_object_process()
  2017-06-20  7:55 ` [RFC/PATCH v4 30/49] odb-helper: add read_object_process() Christian Couder
@ 2017-07-10 15:57   ` Ben Peart
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Peart @ 2017-07-10 15:57 UTC (permalink / raw)
  To: Christian Couder, git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder,
	Ben Peart



On 6/20/2017 3:55 AM, Christian Couder wrote:
> From: Ben Peart <benpeart@microsoft.com>
> 
> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>   odb-helper.c | 202 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
>   odb-helper.h |   5 ++
>   sha1_file.c  |  33 +++++++++-
>   3 files changed, 227 insertions(+), 13 deletions(-)
> 
> diff --git a/odb-helper.c b/odb-helper.c
> index 5fb56c6135..20e83cb55a 100644
> --- a/odb-helper.c
> +++ b/odb-helper.c
> @@ -4,6 +4,187 @@
>   #include "odb-helper.h"
>   #include "run-command.h"
>   #include "sha1-lookup.h"
> +#include "sub-process.h"
> +#include "pkt-line.h"
> +#include "sigchain.h"
> +
> +struct read_object_process {
> +	struct subprocess_entry subprocess;
> +	unsigned int supported_capabilities;
> +};
> +
> +static int subprocess_map_initialized;
> +static struct hashmap subprocess_map;
> +
> +static void parse_capabilities(char *cap_buf,
> +			       unsigned int *supported_capabilities,
> +			       const char *process_name)
> +{
> +	struct string_list cap_list = STRING_LIST_INIT_NODUP;
> +
> +	string_list_split_in_place(&cap_list, cap_buf, '=', 1);
> +
> +	if (cap_list.nr == 2 && !strcmp(cap_list.items[0].string, "capability")) {
> +		const char *cap_name = cap_list.items[1].string;
> +
> +		if (!strcmp(cap_name, "get")) {
> +			*supported_capabilities |= ODB_HELPER_CAP_GET;
> +		} else if (!strcmp(cap_name, "put")) {
> +			*supported_capabilities |= ODB_HELPER_CAP_PUT;
> +		} else if (!strcmp(cap_name, "have")) {
> +			*supported_capabilities |= ODB_HELPER_CAP_HAVE;
> +		} else {
> +			warning("external process '%s' requested unsupported read-object capability '%s'",
> +				process_name, cap_name);
> +		}
> +	}
> +
> +	string_list_clear(&cap_list, 0);
> +}
> +
> +static int start_read_object_fn(struct subprocess_entry *subprocess)
> +{
> +	int err;
> +	struct read_object_process *entry = (struct read_object_process *)subprocess;
> +	struct child_process *process = &subprocess->process;
> +	char *cap_buf;
> +
> +	sigchain_push(SIGPIPE, SIG_IGN);
> +
> +	err = packet_writel(process->in, "git-read-object-client", "version=1", NULL);
> +	if (err)
> +		goto done;
> +
> +	err = strcmp(packet_read_line(process->out, NULL), "git-read-object-server");
> +	if (err) {
> +		error("external process '%s' does not support read-object protocol version 1", subprocess->cmd);
> +		goto done;
> +	}
> +	err = strcmp(packet_read_line(process->out, NULL), "version=1");
> +	if (err)
> +		goto done;
> +	err = packet_read_line(process->out, NULL) != NULL;
> +	if (err)
> +		goto done;
> +
> +	err = packet_writel(process->in, "capability=get", NULL);
> +	if (err)
> +		goto done;
> +
> +	while ((cap_buf = packet_read_line(process->out, NULL)))
> +		parse_capabilities(cap_buf, &entry->supported_capabilities, subprocess->cmd);
> +
> +done:
> +	sigchain_pop(SIGPIPE);
> +
> +	return err;
> +}
> +
> +static struct read_object_process *launch_read_object_process(const char *cmd)
> +{
> +	struct read_object_process *entry;
> +
> +	if (!subprocess_map_initialized) {
> +		subprocess_map_initialized = 1;
> +		hashmap_init(&subprocess_map, (hashmap_cmp_fn) cmd2process_cmp, 0);
> +		entry = NULL;
> +	} else {
> +		entry = (struct read_object_process *)subprocess_find_entry(&subprocess_map, cmd);
> +	}
> +
> +	fflush(NULL);
> +
> +	if (!entry) {
> +		entry = xmalloc(sizeof(*entry));
> +		entry->supported_capabilities = 0;
> +
> +		if (subprocess_start(&subprocess_map, &entry->subprocess, cmd, start_read_object_fn)) {
> +			free(entry);
> +			return 0;
> +		}
> +	}
> +
> +	return entry;
> +}
> +
> +static int check_object_process_error(int err,
> +				      const char *status,
> +				      struct read_object_process *entry,
> +				      const char *cmd,
> +				      unsigned int capability)
> +{
> +	if (!err)
> +		return err;
> +
> +	if (!strcmp(status, "error")) {
> +		/* The process signaled a problem with the file. */
> +	} else if (!strcmp(status, "notfound")) {
> +		/* Object was not found */
> +		err = -1;
> +	} else if (!strcmp(status, "abort")) {
> +		/*
> +		 * The process signaled a permanent problem. Don't try to read
> +		 * objects with the same command for the lifetime of the current
> +		 * Git process.
> +		 */
> +		if (capability)
> +			entry->supported_capabilities &= ~capability;
> +	} else {
> +		/*
> +		 * Something went wrong with the read-object process.
> +		 * Force shutdown and restart if needed.
> +		 */
> +		error("external object process '%s' failed", cmd);
> +		subprocess_stop(&subprocess_map, &entry->subprocess);
> +		free(entry);
> +	}
> +
> +	return err;
> +}
> +
> +static int read_object_process(const unsigned char *sha1)
> +{
> +	int err;
> +	struct read_object_process *entry;
> +	struct child_process *process;
> +	struct strbuf status = STRBUF_INIT;
> +	const char *cmd = "read-object";
> +	uint64_t start;
> +
> +	start = getnanotime();
> +
> +	entry = launch_read_object_process(cmd);
> +	process = &entry->subprocess.process;
> +
> +	if (!(ODB_HELPER_CAP_GET & entry->supported_capabilities))
> +		return -1;
> +
> +	sigchain_push(SIGPIPE, SIG_IGN);
> +
> +	err = packet_write_fmt_gently(process->in, "command=get\n");
> +	if (err)
> +		goto done;
> +
> +	err = packet_write_fmt_gently(process->in, "sha1=%s\n", sha1_to_hex(sha1));
> +	if (err)
> +		goto done;
> +
> +	err = packet_flush_gently(process->in);
> +	if (err)
> +		goto done;
> +
> +	subprocess_read_status(process->out, &status);
> +	err = strcmp(status.buf, "success");
> +
> +done:
> +	sigchain_pop(SIGPIPE);
> +
> +	err = check_object_process_error(err, status.buf, entry, cmd, ODB_HELPER_CAP_GET);
> +
> +	trace_performance_since(start, "read_object_process");
> +
> +	return err;
> +}
>   
>   struct odb_helper *odb_helper_new(const char *name, int namelen)
>   {
> @@ -350,20 +531,21 @@ static int odb_helper_fetch_git_object(struct odb_helper *o,
>   int odb_helper_fault_in_object(struct odb_helper *o,
>   			       const unsigned char *sha1)
>   {
> -	struct odb_helper_object *obj;
> -	struct odb_helper_cmd cmd;
> +	struct odb_helper_object *obj = odb_helper_lookup(o, sha1);
>   
> -	obj = odb_helper_lookup(o, sha1);
>   	if (!obj)
>   		return -1;
>   
> -	if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
> -		return -1;
> -
> -	if (odb_helper_finish(o, &cmd))
> -		return -1;
> -
> -	return 0;
> +	if (o->script_mode) {
> +		struct odb_helper_cmd cmd;
> +		if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
> +			return -1;
> +		if (odb_helper_finish(o, &cmd))
> +			return -1;
> +		return 0;
> +	} else {
> +		return read_object_process(sha1);
> +	}
>   }
>   
>   int odb_helper_fetch_object(struct odb_helper *o,
> diff --git a/odb-helper.h b/odb-helper.h
> index 44c98bbf56..b23544aa4a 100644
> --- a/odb-helper.h
> +++ b/odb-helper.h
> @@ -9,11 +9,16 @@ enum odb_helper_fetch_kind {
>   	ODB_FETCH_KIND_FAULT_IN
>   };
>   
> +#define ODB_HELPER_CAP_GET    (1u<<0)
> +#define ODB_HELPER_CAP_PUT    (1u<<1)
> +#define ODB_HELPER_CAP_HAVE   (1u<<2)
> +
>   struct odb_helper {
>   	const char *name;
>   	const char *cmd;
>   	enum odb_helper_fetch_kind fetch_kind;
>   	int script_mode;
> +	unsigned int supported_capabilities;
>   
>   	struct odb_helper_object {
>   		unsigned char sha1[20];
> diff --git a/sha1_file.c b/sha1_file.c
> index 9d8e37432e..38a0404506 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -698,7 +698,17 @@ int check_and_freshen_file(const char *fn, int freshen)
>   
>   static int check_and_freshen_local(const unsigned char *sha1, int freshen)
>   {
> -	return check_and_freshen_file(sha1_file_name(sha1), freshen);
> +	int ret;
> +	int tried_hook = 0;
> +
> +retry:
> +	ret = check_and_freshen_file(sha1_file_name(sha1), freshen);
> +	if (!ret && !tried_hook) {
> +		tried_hook = 1;
> +		if (!external_odb_fault_in_object(sha1))
> +			goto retry;
> +	}
> +	return ret;
>   }
>   
>   static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
> @@ -3000,7 +3010,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
>   	int rtype;
>   	enum object_type real_type;
>   	const unsigned char *real = lookup_replace_object_extended(sha1, flags);
> +	int tried_hook = 0;
>   
> +retry:
>   	co = find_cached_object(real);
>   	if (co) {
>   		if (oi->typep)
> @@ -3026,8 +3038,14 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
>   
>   		/* Not a loose object; someone else may have just packed it. */
>   		reprepare_packed_git();
> -		if (!find_pack_entry(real, &e))
> +		if (!find_pack_entry(real, &e)) {

Instead of adding the hook here, it needs to be moved out of 
check_and_freshen_local and into check_and_freshen. Otherwise, for any 
object written to the alternates location, check_and_freshen_local will 
fail to find it and try to download it before it tries the alternates 
location (where it would have found it).
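
A minimal sketch of the ordering described above, not the actual sha1_file.c code: look in the local store, then in the alternates, and only then ask the external ODB, once, before retrying. The `struct store` model and the `*_sketch` functions are hypothetical stand-ins for check_and_freshen_local(), check_and_freshen_nonlocal() and external_odb_fault_in_object().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the three places an object can live. */
struct store {
	const char *local;      /* id present in the local store, or NULL */
	const char *alternate;  /* id present in an alternate, or NULL */
	const char *external;   /* id the external ODB can provide, or NULL */
	int fault_in_calls;     /* how many times the external ODB was asked */
};

static int has(const char *slot, const char *id)
{
	return slot && !strcmp(slot, id);
}

/*
 * Simulates a "fault_in" helper: on success the object is written
 * into the local store. Returns 0 on success, -1 otherwise.
 */
static int fault_in_sketch(struct store *s, const char *id)
{
	s->fault_in_calls++;
	if (!has(s->external, id))
		return -1;
	s->local = s->external;
	return 0;
}

/* Returns 1 if the object was found, 0 otherwise. */
static int check_and_freshen_sketch(struct store *s, const char *id)
{
	int tried_hook = 0;

	assert(s && id);
retry:
	/* Both the local store and the alternates are tried first. */
	if (has(s->local, id) || has(s->alternate, id))
		return 1;
	/* Only then is the external ODB asked, and only once. */
	if (!tried_hook) {
		tried_hook = 1;
		if (!fault_in_sketch(s, id))
			goto retry;
	}
	return 0;
}
```

With this ordering an object that lives only in an alternate is found without any call to the helper, which is the behavior the comment above asks for.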

> +			if (!tried_hook) {
> +				tried_hook = 1;
> +				if (!external_odb_fault_in_object(sha1))
> +					goto retry;
> +			}
>   			return -1;
> +		}
>   	}
>   
>   	/*
> @@ -3121,7 +3139,9 @@ static void *read_object(const unsigned char *sha1, enum object_type *type,
>   	unsigned long mapsize;
>   	void *map, *buf;
>   	struct cached_object *co;
> +	int tried_hook = 0;
>   
> +retry:
>   	co = find_cached_object(sha1);
>   	if (co) {
>   		*type = co->type;
> @@ -3139,7 +3159,14 @@ static void *read_object(const unsigned char *sha1, enum object_type *type,
>   		return buf;
>   	}
>   	reprepare_packed_git();
> -	return read_packed_sha1(sha1, type, size);
> +	buf = read_packed_sha1(sha1, type, size);
> +	if (!buf && !tried_hook) {
> +		tried_hook = 1;
> +		if (!external_odb_fault_in_object(sha1))
> +			goto retry;
> +	}
> +
> +	return buf;
>   }
>   
>   /*
> 
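
For readers who want to try the capability negotiation outside of Git, here is a standalone sketch of the "capability=<name>" parsing done by parse_capabilities() in the quoted patch. It is an illustration only: it uses plain libc instead of string_list, the `CAP_*` names are shortened stand-ins for the ODB_HELPER_CAP_* bits, and unknown capabilities are silently ignored rather than reported with warning().

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for ODB_HELPER_CAP_GET/PUT/HAVE from the quoted patch. */
#define CAP_GET  (1u << 0)
#define CAP_PUT  (1u << 1)
#define CAP_HAVE (1u << 2)

/*
 * Parse one "capability=<name>" line (without trailing newline) and
 * return the matching bit, or 0 if the line is not a capability line
 * or names a capability this sketch does not know.
 */
static unsigned int parse_capability_line(const char *line)
{
	static const char prefix[] = "capability=";
	const char *name;

	assert(line);
	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return 0;
	name = line + sizeof(prefix) - 1;

	if (!strcmp(name, "get"))
		return CAP_GET;
	if (!strcmp(name, "put"))
		return CAP_PUT;
	if (!strcmp(name, "have"))
		return CAP_HAVE;
	return 0;
}
```

A caller would OR the returned bits into its supported_capabilities field for each line read until the flush packet.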

* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-06-20  7:54 [RFC/PATCH v4 00/49] Add initial experimental external ODB support Christian Couder
                   ` (50 preceding siblings ...)
  2017-06-23 18:24 ` Ben Peart
@ 2017-07-12 19:06 ` Jonathan Tan
  2017-09-15 13:16   ` Christian Couder
  51 siblings, 1 reply; 64+ messages in thread
From: Jonathan Tan @ 2017-07-12 19:06 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

On Tue, 20 Jun 2017 09:54:34 +0200
Christian Couder <christian.couder@gmail.com> wrote:

> Git can store its objects only in the form of loose objects in
> separate files or packed objects in a pack file.
> 
> To be able to better handle some kinds of objects, for example big
> blobs, it would be nice if Git could store its objects in other object
> databases (ODB).

Thanks for this, and sorry for the late reply. It's good to know that
others are thinking about "missing" objects in repos too.

>   - "have": the helper should respond with the sha1, size and type of
>     all the objects the external ODB contains, one object per line.

This should work well if we are not caching this "have" information
locally (that is, if the object store can be accessed with low latency),
but I am not sure if this will work otherwise. I see that you have
proposed a local cache-using method later in the e-mail - my comments on
that are below.
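
To make the caching question concrete, here is a rough sketch of what one cached "have" entry could look like. The wire format "<sha1> <size> <type>" separated by single spaces is an assumption on my part (the cover letter only says sha1, size and type, one object per line), and `struct have_entry` and `parse_have_line()` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One entry of a hypothetical local cache of "have" output. */
struct have_entry {
	char sha1_hex[41];   /* 40 hex digits plus NUL */
	unsigned long size;
	char type[16];       /* e.g. "blob" */
};

/*
 * Parse one line of the assumed form "<sha1> <size> <type>".
 * Returns 0 on success, -1 if the line is malformed.
 */
static int parse_have_line(const char *line, struct have_entry *out)
{
	assert(line && out);

	/* The sha1 must be exactly 40 characters followed by a space. */
	if (strlen(line) < 41 || line[40] != ' ')
		return -1;
	if (sscanf(line, "%40[0123456789abcdef] %lu %15s",
		   out->sha1_hex, &out->size, out->type) != 3)
		return -1;
	if (strlen(out->sha1_hex) != 40)
		return -1;
	return 0;
}
```

A cache built from such entries is only as fresh as the last "have" call, which is where the latency concern above comes in.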

>   - "get <sha1>": the helper should then read from the external ODB
>     the content of the object corresponding to <sha1> and pass it to
> Git.

This makes sense - I have some patches [1] that implement this with the
"fault_in" mechanism described in your e-mail.

[1] https://public-inbox.org/git/cover.1499800530.git.jonathantanmy@google.com/

> * Transferring information
> 
> To transfer information about the blobs stored in external ODB, some
> special refs, called "odb ref", similar to replace refs, are used in
> the tests of this series, but in general nothing forces the helper to
> use that mechanism.
> 
> The external odb helper is responsible for using and creating the refs
> in refs/odbs/<odbname>/, if it wants to do that. It is free for
> example to just create one ref, as it is also free to create many
> refs. Git would just transmit the refs that have been created by this
> helper, if Git is asked to do so.
> 
> For now in the tests there is one odb ref per blob, as it is simple
> and as it is similar to what git-lfs does. Each ref name is
> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
> in the external odb named <odbname>.
> 
> These odb refs point to a blob that is stored in the Git
> repository and contain information about the blob stored in the
> external odb. This information can be specific to the external odb.
> The repos can then share this information using commands like:
> 
> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
> 
> At the end of the current patch series, "git clone" is taught a
> "--initial-refspec" option, that asks it to first fetch some specified
> refs. This is used in the tests to fetch the odb refs first.
> 
> This way only one "git clone" command can set up a repo using the
> external ODB mechanism as long as the right helper is installed on the
> machine and as long as the following options are used:
> 
>   - "--initial-refspec <odbrefspec>" to fetch the odb refspec
>   - "-c odb.<odbname>.command=<helper>" to configure the helper

A method like this means that information about every object is
downloaded, regardless of which branches were actually cloned, and
regardless of what parameters (e.g. max blob size) were used to control
the objects that were actually cloned.

We could make, say, one "odb ref" per size and branch - for example,
"refs/odbs/master/0", "refs/odbs/master/1k", "refs/odbs/master/1m", etc.
- and have the client know which one to download. But this wouldn't
scale if we introduce different object filters in the clone and fetch
commands.

I think that it is best to have upload-pack send this information
together with the packfile, since it knows exactly what objects were
omitted, and therefore what information the client needs. As discussed
in a sibling e-mail, clone/fetch already needs to be modified to omit
objects anyway.

* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-07-06 17:36     ` Ben Peart
@ 2017-09-15 12:56       ` Christian Couder
  0 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-09-15 12:56 UTC (permalink / raw)
  To: Ben Peart
  Cc: git, Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

(It looks like I did not reply to this email yet, sorry about this late reply.)

On Thu, Jul 6, 2017 at 7:36 PM, Ben Peart <peartben@gmail.com> wrote:
>
> On 7/1/2017 3:41 PM, Christian Couder wrote:
>>
>> On Fri, Jun 23, 2017 at 8:24 PM, Ben Peart <peartben@gmail.com> wrote:
>>>
>>> Great to see this making progress!
>>>
>>> My thoughts and questions are mostly about the overall design tradeoffs.
>>>
>>> Is your intention to enable the ODB to completely replace the regular
>>> object
>>> store or just to supplement it?
>>
>> It is to supplement it, as I think the regular object store works very
>> well most of the time.
>
> I certainly understand the desire to restrict the scope of the patch series.
> I know full replacement is a much larger problem as it would touch much more
> of the codebase.
>
> I'd still like to see an object store that was thread safe, more robust (ie
> transactional) and hopefully faster so I am hoping we can design the ODB
> interface to eventually enable that.

I doubt that the way Git and the external odb helpers communicate in
process mode is good enough for multi-threading, so I think this would
require another communication mechanism altogether.

> For example: it seems the ODB helpers need to be able to be called before
> the regular object store in the "put" case (so they can intercept large
> objects for example) and after it in the "get" case to enable "fault-in."
> Something like this:
>
> have/get
> ========
> git object store
> large object ODB helper
>
> put
> ===
> large object ODB helper
> git object store
>
> It would be nice if that order wasn't hard coded but that the order or level
> of the "git object store" could be specified using the same mechanism as
> used for the ODB helpers so that some day you could do something like this:
>
> have/get
> ========
> "LMDB" ODB helper
> git object store
>
> put
> ===
> "LMDB" ODB helper
> git object store
>
> (and even further out, drop the current git object store completely :)).

Yeah, I understand that it could help.

>>> I think it would be good to ensure the
>>> interface is robust and performant enough to actually replace the current
>>> object store interface (even if we don't actually do that just yet).
>>
>>
>> I agree that it should be robust and performant, but I don't think it
>> needs to be as performant in all cases as the current object store
>> right now.
>>
>>> Another way of asking this is: do the 3 verbs (have, get, put) and the 3
>>> types of "get" enable you to wrap the current loose object and pack file
>>> code as ODBs and run completely via the external ODB interface?  If not,
>>> what is missing and can it be added?
>
> One example of what I think is missing is a way to stream objects (ie
> get_stream, put_stream).  This isn't used often in git but it did exist last
> I checked.  I'm not saying this needs to be supported in the first version -
> more if we want to support total replacement.

I agree and it seems to me that others have already pointed out that the
streaming API could be used.

> I also wonder if we'd need an "optimize" verb (for "git gc") or a "validate"
> verb (for "git fsck").  Again, only if/when we are looking at total
> replacement.

Yeah, I agree that something might be useful for these commands.

>> Right now the "put" verb only sends plain blobs, so the most logical
>> way to run completely via the external ODB interface would be to use
>> it to send and receive plain blobs. There are test scripts (t0420,
>> t0470 and t0480) that use an http server as the external ODB and all
>> the blobs are stored in it.
>>
>> And yeah for now it works only for blobs. There is a temporary patch
>> in the series that limits it to blobs. For the non RFC patch series, I
>> think it should either use the attribute system to tell which objects
>> should be run via the external ODB interface, or perhaps there should
>> be a way to ask each external ODB helper which kind of objects and
>> blobs it can handle. I should add that in the future work part.
>
> Sounds good.  For GVFS we handle all object types (including commits and
> trees) so would need this to be enabled so that we can switch to using it.

Ok.

>>> _Eventually_ it would be great to see the current object store(s) moved
>>> behind the new ODB interface.
>>
>> This is not one of my goals and I think it could be a problem if we
>> want to keep the "fault in" mode.
>
>> In this mode the helper writes or reads directly to or from the
>> current object store, so it needs the current object store to be
>> available.
>
> I think implementing "fault in" should be an option that the ODB handler can
> implement but should not be required by the design/interface.  As you state
> above, this could be as simple as having the ODB handler write the object to
> the git object store on "get."

This is 'get_direct' since v5 and yeah it is optional.

>> Also I think compatibility with other git implementations is important
>> and it is a good thing that they can all work on a common repository
>> format.
>
> I agree this should be an option but I don't want to say we'll _never_ move
> to a better object store.

I agree but that looks like a different problem with a lot of additional issues.

>>> When there are multiple ODB providers, what is the order they are called?
>>
>> The external_odb_config() function creates the helpers for the
>> external ODBs in the order they are found in the config file, and then
>> these helpers are called in turn in the same order.
>>
>>> If one fails a request (get, have, put) are the others called to see if
>>> they
>>> can fulfill the request?
>>
>> Yes, but there are no tests to check that it works well. I will need
>> to add some.
>>
>>> Can the order they are called for various verb be configured explicitly?
>>
>> Right now, you can configure the order by changing the config file,
>> but the order will be the same for all the verbs.
>
> Do you mean it will work like this?
>
> have/get
> ========
> git object store
> first ODB helper
> second ODB helper
> third ODB helper
>
> put
> ===
> first ODB helper
> second ODB helper
> third ODB helper
> git object store

Yes.

> If so, I'd prefer having more flexibility and be able to specify where the
> "git object store" fits in the stack along with the ODB helpers so that you
> could eventually support the "LMDB" ODB helper example above.

Maybe this could be supported later with an odb.<odbname>.priority or
odb.<odbname>.<instruction>_priority config option that could take a
possibly negative integer.
A negative integer would mean before the git object store and a
positive one after the object store.
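
As a rough sketch of that idea (nothing like this exists in the series, and the `priority` field and the function names are hypothetical), the Git object store could be pinned at priority 0 and all backends sorted by priority before an instruction is dispatched:

```c
#include <assert.h>
#include <stdlib.h>

struct odb_backend {
	const char *name;
	/*
	 * Hypothetical odb.<odbname>.priority value. The regular Git
	 * object store is modeled as a backend with priority 0.
	 */
	int priority;
};

/* qsort() comparator: ascending priority, so negatives come first. */
static int cmp_priority(const void *a, const void *b)
{
	const struct odb_backend *x = a, *y = b;

	return (x->priority > y->priority) - (x->priority < y->priority);
}

/* Sort the backends into the order in which they would be called. */
static void order_backends(struct odb_backend *backends, size_t nr)
{
	assert(backends || !nr);
	if (nr)
		qsort(backends, nr, sizeof(*backends), cmp_priority);
}
```

With an "LMDB" helper at -5 and a large object helper at 10, the resulting call order would be the "LMDB" helper, the git object store, then the large object helper. A per-instruction variant would just keep one such sorted list per instruction.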

>>> For
>>> example, it would be nice to have a "large object ODB handler" configured
>>> to
>>> get first try at all "put" verbs.  Then if it meets its size
>>> requirements,
>>> it will handle the verb, otherwise it fails and git will try the other
>>> ODBs.
>>
>> This can work if the "large object ODB handler" is configured first.
>>
>> Also this is linked with how you define which objects are handled by
>> which helper. For example if the attribute system is used to describe
>> which external ODB is used for which files, there could be a way to
>> tell for example that blobs larger than 1MB are handled by the "large
>> object ODB handler" while those that are smaller are handled by
>> another helper.
>
> I can see that using an attribute system could get complex to implement
> generically in git.  It seems defining the set of attributes and the rules
> for matching against them could get complex.
>
> I wonder if it is sufficient to just have a hierarchy and then let each ODB
> handler determine which objects it wants to handle based on whatever
> criteria (size, object type, etc) it wants.

Yeah, I think that could be interesting to have later.

> The downside of doing it this way is that calling the various handlers needs
> to be fast as there may be a lot of calls passed through to the next
> handler.  I think this can be accomplished by using persistent background
> processes instead of spawning a new one on each call.

Yeah, this kind of mechanism would not be so simple to design if we
want things to be fast. That's why I prefer to leave it for later.

>> Yeah, capabilities are the same, but I think the script mode has
>> value, because helpers are simpler to write and debug.
>> If for example one wants to use the external ODB system to implement
>> clone bundles, like what is done in t0430 at the end of the patch
>> series, then it's much simpler if the helper uses the script mode, and
>> there is no performance downside as the helper will be called once.
>>
>> I think people might want to implement these kinds of helpers to just
>> setup a repo quickly and properly for their specific needs.
>> For example for companies using big monorepos, there could be
>> different helpers for different kinds of people working on different
>> parts of the code. For example one for front end developers and one
>> for back end developers.
>
> I think the ease of writing a script mode helper vs sub-process can be
> alleviated with a library of helper routines to help with the sub-process
> interface.  You have a start on that already and given it is also already
> used by the filter interface there should be a good body of code to
> copy/paste for future implementations.  I've written them in perl and in
> C/C++ and once you have some helper functions, it's really quite simple.

It's quite simple only once you know how to use the libraries, and even
then the debugging is more complex.
In the following email I gave an example of how simple some things can
be with the script mode:

https://public-inbox.org/git/CAP8UFD3ZV4Ezucn+Tv-roY6vzDyk2j4ypRsNR1YbOqoQK_qr8A@mail.gmail.com/

>> Also my goal is to share as much of the implementation as possible
>> between the script and the sub process mode. I think this can help us
>> be confident that the overall design is good.
>
> Ahh, but a single implementation is easier to code, test and maintain than
> two implementations - even with shared code. :)

Sure.

>>>> There are 3 different kinds of "get" instructions depending on how the
>>>> helper passes objects to Git:
>>>>
>>>>     - "fault_in": the helper will write the requested objects directly
>>>>       into the regular Git object database, and then Git will retry
>>>>       reading it from there.
>>>>
>>>
>>> I think the "fault_in" behavior can be implemented efficiently without
>>> the
>>> overhead of a 3rd special "get" instruction if we enable some of the
>>> other
>>> capabilities discussed.
>>
>> The "fault_in" behavior will be just a special kind of "get"
>> capability. Git needs to know that the helper has this capability, but
>> there should be no overhead. So I am not sure if I understand you
>> properly.
>
> Hopefully my sample diagrams above better illustrated what I was thinking
> and why I don't think git needs to be aware of "fault_in" behaviors.

I am not sure I understand: if a helper has some 'get_*' capability,
Git needs to know about that anyway.

>>> I don't believe there is.  Instead, I think we should allow the various
>>> "partial clone" patch series already in progress solve the problem of how
>>> you do a partial clone of a repo.
>>
>> I think this is not really related to partial clone, but maybe I
>> should take a closer look at these patch series.
>
> My thought was that since a standard clone copies all objects into the local
> object store, there is no need for a way to retrieve "missing" objects as,
> by definition, none are missing.
>
> The various partial clone patch series are discussing how to clone a repo
> and _not_ copy down all objects which creates the need for a way to retrieve
> "missing" objects.  They are also dealing with how to enable git to know the
> object is intentionally missing (ie "have" can succeed but "get" will have
> to go retrieve the missing object before it can return it).
>
> This enables having missing objects without some local placeholder (ref) so
> that repos with large numbers of objects can still do a fast clone as they
> don't have to download millions of refs for the "missing" objects.

There could be just one ref pointing to a blob that contains a custom
service URL and the helper can just use the custom service URL to get
all the info it wants.
The one-ref-per-missing-object scheme is obviously only for when you
have a very small number of missing objects.

* Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
  2017-07-12 19:06 ` Jonathan Tan
@ 2017-09-15 13:16   ` Christian Couder
  0 siblings, 0 replies; 64+ messages in thread
From: Christian Couder @ 2017-09-15 13:16 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: git, Junio C Hamano, Jeff King, Ben Peart, Nguyen Thai Ngoc Duy,
	Mike Hommey, Lars Schneider, Eric Wong, Christian Couder

(It looks like I did not reply to this other email yet, sorry about
this late reply.)

On Wed, Jul 12, 2017 at 9:06 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> On Tue, 20 Jun 2017 09:54:34 +0200
> Christian Couder <christian.couder@gmail.com> wrote:
>
>> Git can store its objects only in the form of loose objects in
>> separate files or packed objects in a pack file.
>>
>> To be able to better handle some kinds of objects, for example big
>> blobs, it would be nice if Git could store its objects in other object
>> databases (ODB).
>
> Thanks for this, and sorry for the late reply. It's good to know that
> others are thinking about "missing" objects in repos too.
>
>>   - "have": the helper should respond with the sha1, size and type of
>>     all the objects the external ODB contains, one object per line.
>
> This should work well if we are not caching this "have" information
> locally (that is, if the object store can be accessed with low latency),
> but I am not sure if this will work otherwise.

Yeah, there could be problems related to caching or not caching the
"have" information.
As a repo should not send the blobs that are in an external odb, I
think it could be useful to cache the "have" information.
I plan to take a look and add related tests soon.

> I see that you have
> proposed a local cache-using method later in the e-mail - my comments on
> that are below.
>
>>   - "get <sha1>": the helper should then read from the external ODB
>>     the content of the object corresponding to <sha1> and pass it to
>> Git.
>
> This makes sense - I have some patches [1] that implement this with the
> "fault_in" mechanism described in your e-mail.
>
> [1] https://public-inbox.org/git/cover.1499800530.git.jonathantanmy@google.com/
>
>> * Transferring information
>>
>> To transfer information about the blobs stored in external ODB, some
>> special refs, called "odb refs", similar to replace refs, are used in
>> the tests of this series, but in general nothing forces the helper to
>> use that mechanism.
>>
>> The external odb helper is responsible for using and creating the refs
>> in refs/odbs/<odbname>/, if it wants to do that. It is free for
>> example to just create one ref, as it is also free to create many
>> refs. Git would just transmit the refs that have been created by this
>> helper, if Git is asked to do so.
>>
>> For now in the tests there is one odb ref per blob, as it is simple
>> and as it is similar to what git-lfs does. Each ref name is
>> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
>> in the external odb named <odbname>.
>>
>> Each of these odb refs points to a blob that is stored in the Git
>> repository and that contains information about the blob stored in
>> the external odb. This information can be specific to the external
>> odb. The repos can then share this information using commands like:
>>
>> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
>>
>> At the end of the current patch series, "git clone" is taught an
>> "--initial-refspec" option, that asks it to first fetch some specified
>> refs. This is used in the tests to fetch the odb refs first.
>>
>> This way a single "git clone" command can set up a repo using the
>> external ODB mechanism as long as the right helper is installed on the
>> machine and as long as the following options are used:
>>
>>   - "--initial-refspec <odbrefspec>" to fetch the odb refspec
>>   - "-c odb.<odbname>.command=<helper>" to configure the helper
>
> A method like this means that information about every object is
> downloaded, regardless of which branches were actually cloned, and
> regardless of what parameters (e.g. max blob size) were used to control
> the objects that were actually cloned.
>
> We could make, say, one "odb ref" per size and branch - for example,
> "refs/odbs/master/0", "refs/odbs/master/1k", "refs/odbs/master/1m", etc.
> - and have the client know which one to download. But this wouldn't
> scale if we introduce different object filters in the clone and fetch
> commands.

Yeah, there are multiple ways to do that.
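
For reference, the layout used in the tests of this series (one odb
ref per blob) and the resulting clone command line can be sketched as
follows. This is only an illustration in Python: the function names
are made up, and "--initial-refspec" exists only in this patch series,
not in released Git.

```python
from typing import List


def odb_ref_name(odbname: str, sha1: str) -> str:
    """Name of the odb ref for one blob stored in the external odb."""
    return f"refs/odbs/{odbname}/{sha1}"


def odb_refspec(odbname: str) -> str:
    """Refspec transferring all the odb refs of one external odb."""
    pattern = f"refs/odbs/{odbname}/*"
    return f"{pattern}:{pattern}"


def clone_args(url: str, odbname: str, helper: str) -> List[str]:
    """Argument list for a clone that fetches the odb refs first.

    "--initial-refspec" is the option proposed at the end of this
    series; "-c" sets the helper config for the new repo.
    """
    return [
        "git", "clone",
        "-c", f"odb.{odbname}.command={helper}",
        "--initial-refspec", odb_refspec(odbname),
        url,
    ]
```

A different layout, like one ref per size and branch as you suggest,
would only change odb_ref_name() and odb_refspec() in such a sketch;
the helper stays free to choose.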

> I think that it is best to have upload-pack send this information
> together with the packfile, since it knows exactly what objects were
> omitted, and therefore what information the client needs. As discussed
> in a sibling e-mail, clone/fetch already needs to be modified to omit
> objects anyway.

I am trying to avoid sending this information, as I don't think it is
necessary, and not having to change the communication protocol
simplifies things a lot.
