git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC/PATCH v3 00/16] Add initial experimental external ODB support
@ 2016-11-30 21:04 Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 01/16] Add initial external odb support Christian Couder
                   ` (17 more replies)
  0 siblings, 18 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

Goal
~~~~

Git can store its objects only in the form of loose objects in
separate files or packed objects in a pack file.

To be able to better handle some kind of objects, for example big
blobs, it would be nice if Git could store its objects in other object
databases (ODB).

To do that, this patch series makes it possible to register commands,
using "odb.<odbname>.command" config variables, to access external
ODBs where objects can be stored and retrieved.

External ODBs should be able to tranfer information about the blobs
they store. This patch series shows how this is possible using kind of
replace refs.

Design
~~~~~~

* Registered command

Each registered command manages access to one external ODB and will be
called the following ways:

  - "<command> have": the command should output the sha1, size and
type of all the objects the external ODB contains, one object per
line.

  - "<command> get <sha1>": the command should then read from the
external ODB the content of the object corresponding to <sha1> and
output it on stdout.

  - "<command> put <sha1> <size> <type>": the command should then read
from stdin an object and store it in the external ODB.

* Transfer

To tranfer information about the blobs stored in external ODB, some
special refs, called "odb ref", similar as replace refs, are used.

For now there should be one odb ref per blob. Each ref name should be
refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
in the external odb named <odbname>.

These odb refs should all point to a blob that should be stored in the
Git repository and contain information about the blob stored in the
external odb. This information can be specific to the external odb.
The repos can then share this information using commands like:

`git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`

* External object database

This RFC patch series shows in the tests:

  - how to use another git repository as an external ODB
  - how to use an http server as an external ODB

Design discussion about performance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yeah, it is not efficient to fork/exec a command to just read or write
one object to or from the external ODB. Batch calls and/or using a
daemon and/or RPC should be used instead to be able to store regular
objects in an external ODB. But for now the external ODB would be all
about really big files, where the cost of a fork+exec should not
matter much. If we later want to extend usage of external ODBs, yeah
we will probably need to design other mechanisms.

Here are some related explanations from Peff:

{{{
Because this "external odb" essentially acts as a git alternate, we
would hit it only when we couldn't find an object through regular means.
Git would then make the object available in the usual on-disk format
(probably as a loose object).

So in most processes, we would not need to consult the odb command at
all. And when we do, the first thing would be to get its "have" list,
which would at most run once per process.

So the per-object cost is really calling "get", and my assumption there
was that the cost of actually retrieving the object over the network
would dwarf the fork/exec cost.

I also waffled on having git cache the output of "<command> have" in
some fast-lookup format to save even the single fork/exec. But I figured
that was something that could be added later if needed.

You'll note that this is sort of a "fault-in" model. Another model would
be to treat external odb updates similar to fetches. I.e., we touch the
network only during a special update operation, and then try to work
locally with whatever the external odb has. IMHO this policy could
actually be up to the external odb itself (i.e., its "have" command
could serve from a local cache if it likes).
}}}

Implementation
~~~~~~~~~~~~~~

* Mechanism to call the registered commands

This series adds a set of function in external-odb.{c,h} that are
called by the rest of Git to manage all the external ODBs.

These functions use 'struct odb_helper' and its associated functions
defined in odb-helper.{c,h} to talk to the different external ODBs by
launching the configured "odb.<odbname>.command" commands and writing
to or reading from them.

The tests in this series creates an odb-helper script that is
registered using the "odb.magic.command" config variable, and then
called to read from and write to the external ODB.

* ODB refs

For now odb ref management is only implemented in a registered command
in t0410, but maybe this or some parts of it could be done by Git
itself.

When a new blob is added to an external odb, its sha1, size and type
are writen in another new blob and the odb ref is created.

When the list of existing blobs is requested from the external odb,
the content of the blobs pointed to by the odb refs can also be used
by the odb to claim that it can get the objects.

When a blob is actually requested from the external odb, it can use
the content stored in the blobs pointed to by the odb refs to get the
actual blobs and then pass them.

Highlevel view of the patches in the series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    - Patches 01/16 and 02/16 are Peff's initial work. They are not
      changed since v1.

    - Patches 03/16 is an optimization in the odb-helper script that
      is used for testing. I will probably squash it into 01/08, but
      didn't yet. So there is no change since v1.

    - Patches 04/16 and 05/16 are adding "put" support in the
      odb-helper script and testing that. They are not changed since
      v1.

    - Patches 06/16 and 08/16 are enhancing external-odb.{c,h} and
      odb-helper.{c,h}, so that Git can write into an external
      ODB. They are not changed since v1.

    - Patch 07/16 limits write support to "blobs" for now to
      simplify things. It did not change since v1.

    - Patch 09/16 adds a GIT_NO_EXTERNAL_ODB env variable to disable
      using the external database. It was new in v2.

    - Patch 10/16 adds test t0410 that shows how odb refs can be used
      to transfer information about blobs managed by an external
      odb. It was new in v2.

    - Patches 11/16 to 14/16 are preparing cgi and a apache config
      file so that an apache server can be used as an external object
      database. It is based on existing infrastructure in t/lib-http/.
      This is new in v3.

    - Patch 15/16 adds support for external ODBs that are storing
      files in their original format instead of as Git objects. It
      adds the odb.<helper>.plainObject config option to support these
      external ODBs. This is new in v3.

    - Patch 16/16 adds test t0420 that shows how an apache server can be used
      as an external ODB. This is new in v3.


Future work
~~~~~~~~~~~

I think that the odb refs don't prevent a regular fetch or push from
wanting to send the objects that are managed by an external odb. So I
am interested in suggestions about this problem. I will take a look at
previous discussions and how other mechanisms (shallow clone, bundle
v3, ...) handle this.

One interesting thing also would be to use the streaming api when
reading from or writing to the external ODB. (If it is not
automatically used already when the blob is bigger than
core.bigFileThreshold.)

Previous work and discussions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(Sorry for the old Gmane links, I will try to replace them with
public-inbox.org at one point.)

Peff started to work on this and discuss this some years ago:

http://thread.gmane.org/gmane.comp.version-control.git/206886/focus=207040
http://thread.gmane.org/gmane.comp.version-control.git/247171
http://thread.gmane.org/gmane.comp.version-control.git/202902/focus=203020

His work, which is not compile-tested any more, is still there:

https://github.com/peff/git/commits/jk/external-odb-wip

Initial discussions about this new series are there:

http://thread.gmane.org/gmane.comp.version-control.git/288151/focus=295160

Version 1 and 2 of this RFC/PATCH series are here:

https://public-inbox.org/git/20160613085546.11784-1-chriscool@tuxfamily.org/
https://public-inbox.org/git/20160628181933.24620-1-chriscool@tuxfamily.org/

Links
~~~~~

This patch series is available here:

https://github.com/chriscool/git/commits/external-odb

Version 1 and 2 are here:

https://github.com/chriscool/git/commits/gl-external-odb12
https://github.com/chriscool/git/commits/gl-external-odb22


Christian Couder (14):
  t0400: use --batch-all-objects to get all objects
  t0400: add 'put' command to odb-helper script
  t0400: add test for 'put' command
  external odb: add write support
  external-odb: accept only blobs for now
  t0400: add test for external odb write support
  Add GIT_NO_EXTERNAL_ODB env variable
  Add t0410 to test external ODB transfer
  lib-httpd: pass config file to start_httpd()
  lib-httpd: add upload.sh
  lib-httpd: add list.sh
  lib-httpd: add apache-e-odb.conf
  odb-helper: add 'store_plain_objects' to 'struct odb_helper'
  t0420: add test with HTTP external odb

Jeff King (2):
  Add initial external odb support
  external odb foreach

 Makefile                       |   2 +
 cache.h                        |  18 ++
 environment.c                  |   4 +
 external-odb.c                 | 156 ++++++++++++++++
 external-odb.h                 |  16 ++
 odb-helper.c                   | 396 +++++++++++++++++++++++++++++++++++++++++
 odb-helper.h                   |  33 ++++
 sha1_file.c                    |  69 +++++--
 t/lib-httpd.sh                 |   8 +-
 t/lib-httpd/apache-e-odb.conf  | 214 ++++++++++++++++++++++
 t/lib-httpd/list.sh            |  34 ++++
 t/lib-httpd/upload.sh          |  45 +++++
 t/t0400-external-odb.sh        |  77 ++++++++
 t/t0410-transfer-e-odb.sh      | 136 ++++++++++++++
 t/t0420-transfer-http-e-odb.sh | 118 ++++++++++++
 15 files changed, 1309 insertions(+), 17 deletions(-)
 create mode 100644 external-odb.c
 create mode 100644 external-odb.h
 create mode 100644 odb-helper.c
 create mode 100644 odb-helper.h
 create mode 100644 t/lib-httpd/apache-e-odb.conf
 create mode 100644 t/lib-httpd/list.sh
 create mode 100644 t/lib-httpd/upload.sh
 create mode 100755 t/t0400-external-odb.sh
 create mode 100755 t/t0410-transfer-e-odb.sh
 create mode 100755 t/t0420-transfer-http-e-odb.sh

-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 01/16] Add initial external odb support
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 23:30   ` Junio C Hamano
  2016-11-30 21:04 ` [RFC/PATCH v3 02/16] external odb foreach Christian Couder
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

From: Jeff King <peff@peff.net>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Makefile                |   2 +
 cache.h                 |   9 ++
 external-odb.c          | 115 +++++++++++++++++++++++
 external-odb.h          |   8 ++
 odb-helper.c            | 239 ++++++++++++++++++++++++++++++++++++++++++++++++
 odb-helper.h            |  25 +++++
 sha1_file.c             |  64 ++++++++++---
 t/t0400-external-odb.sh |  48 ++++++++++
 8 files changed, 495 insertions(+), 15 deletions(-)
 create mode 100644 external-odb.c
 create mode 100644 external-odb.h
 create mode 100644 odb-helper.c
 create mode 100644 odb-helper.h
 create mode 100755 t/t0400-external-odb.sh

diff --git a/Makefile b/Makefile
index f53fcc90d7..88e78da886 100644
--- a/Makefile
+++ b/Makefile
@@ -747,6 +747,7 @@ LIB_OBJS += ewah/ewah_bitmap.o
 LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
+LIB_OBJS += external-odb.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
 LIB_OBJS += gettext.o
@@ -779,6 +780,7 @@ LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
+LIB_OBJS += odb-helper.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-check.o
diff --git a/cache.h b/cache.h
index a50a61a197..b419b9b7ce 100644
--- a/cache.h
+++ b/cache.h
@@ -885,6 +885,12 @@ const char *git_path_shallow(void);
 extern const char *sha1_file_name(const unsigned char *sha1);
 
 /*
+ * Like sha1_file_name, but return the filename within a specific alternate
+ * object directory. Shares the same static buffer with sha1_file_name.
+ */
+extern const char *sha1_file_name_alt(const char *objdir, const unsigned char *sha1);
+
+/*
  * Return the name of the (local) packfile with the specified sha1 in
  * its name.  The return value is a pointer to memory that is
  * overwritten each time this function is called.
@@ -1135,6 +1141,8 @@ extern int do_check_packed_object_crc;
 
 extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type);
 
+extern int create_object_tmpfile(struct strbuf *tmp, const char *filename);
+extern void close_sha1_file(int fd);
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
 extern int has_sha1_pack(const unsigned char *sha1);
@@ -1409,6 +1417,7 @@ extern void read_info_alternates(const char * relative_base, int depth);
 extern char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
 extern int foreach_alt_odb(alt_odb_fn, void*);
+extern void prepare_external_alt_odb(void);
 
 /*
  * Allocate a "struct alternate_object_database" but do _not_ actually
diff --git a/external-odb.c b/external-odb.c
new file mode 100644
index 0000000000..1ccfa99a01
--- /dev/null
+++ b/external-odb.c
@@ -0,0 +1,115 @@
+#include "cache.h"
+#include "external-odb.h"
+#include "odb-helper.h"
+
+static struct odb_helper *helpers;
+static struct odb_helper **helpers_tail = &helpers;
+
+static struct odb_helper *find_or_create_helper(const char *name, int len)
+{
+	struct odb_helper *o;
+
+	for (o = helpers; o; o = o->next)
+		if (!strncmp(o->name, name, len) && !o->name[len])
+			return o;
+
+	o = odb_helper_new(name, len);
+	*helpers_tail = o;
+	helpers_tail = &o->next;
+
+	return o;
+}
+
+static int external_odb_config(const char *var, const char *value, void *data)
+{
+	struct odb_helper *o;
+	const char *key, *dot;
+
+	if (!skip_prefix(var, "odb.", &key))
+		return 0;
+	dot = strrchr(key, '.');
+	if (!dot)
+		return 0;
+
+	o = find_or_create_helper(key, dot - key);
+	key = dot + 1;
+
+	if (!strcmp(key, "command"))
+		return git_config_string(&o->cmd, var, value);
+
+	return 0;
+}
+
+static void external_odb_init(void)
+{
+	static int initialized;
+
+	if (initialized)
+		return;
+	initialized = 1;
+
+	git_config(external_odb_config, NULL);
+}
+
+const char *external_odb_root(void)
+{
+	static const char *root;
+	if (!root)
+		root = git_pathdup("objects/external");
+	return root;
+}
+
+int external_odb_has_object(const unsigned char *sha1)
+{
+	struct odb_helper *o;
+
+	external_odb_init();
+
+	for (o = helpers; o; o = o->next)
+		if (odb_helper_has_object(o, sha1))
+			return 1;
+	return 0;
+}
+
+int external_odb_fetch_object(const unsigned char *sha1)
+{
+	struct odb_helper *o;
+	const char *path;
+
+	if (!external_odb_has_object(sha1))
+		return -1;
+
+	path = sha1_file_name_alt(external_odb_root(), sha1);
+	safe_create_leading_directories_const(path);
+	prepare_external_alt_odb();
+
+	for (o = helpers; o; o = o->next) {
+		struct strbuf tmpfile = STRBUF_INIT;
+		int ret;
+		int fd;
+
+		if (!odb_helper_has_object(o, sha1))
+			continue;
+
+		fd = create_object_tmpfile(&tmpfile, path);
+		if (fd < 0) {
+			strbuf_release(&tmpfile);
+			return -1;
+		}
+
+		if (odb_helper_fetch_object(o, sha1, fd) < 0) {
+			close(fd);
+			unlink(tmpfile.buf);
+			strbuf_release(&tmpfile);
+			continue;
+		}
+
+		close_sha1_file(fd);
+		ret = finalize_object_file(tmpfile.buf, path);
+		strbuf_release(&tmpfile);
+		if (!ret)
+			return 0;
+	}
+
+	return -1;
+}
diff --git a/external-odb.h b/external-odb.h
new file mode 100644
index 0000000000..2397477684
--- /dev/null
+++ b/external-odb.h
@@ -0,0 +1,8 @@
+#ifndef EXTERNAL_ODB_H
+#define EXTERNAL_ODB_H
+
+const char *external_odb_root(void);
+int external_odb_has_object(const unsigned char *sha1);
+int external_odb_fetch_object(const unsigned char *sha1);
+
+#endif /* EXTERNAL_ODB_H */
diff --git a/odb-helper.c b/odb-helper.c
new file mode 100644
index 0000000000..244bc86792
--- /dev/null
+++ b/odb-helper.c
@@ -0,0 +1,239 @@
+#include "cache.h"
+#include "object.h"
+#include "argv-array.h"
+#include "odb-helper.h"
+#include "run-command.h"
+#include "sha1-lookup.h"
+
+struct odb_helper *odb_helper_new(const char *name, int namelen)
+{
+	struct odb_helper *o;
+
+	o = xcalloc(1, sizeof(*o));
+	o->name = xmemdupz(name, namelen);
+
+	return o;
+}
+
+struct odb_helper_cmd {
+	struct argv_array argv;
+	struct child_process child;
+};
+
+static void prepare_helper_command(struct argv_array *argv, const char *cmd,
+				   const char *fmt, va_list ap)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_addstr(&buf, cmd);
+	strbuf_addch(&buf, ' ');
+	strbuf_vaddf(&buf, fmt, ap);
+
+	argv_array_push(argv, buf.buf);
+	strbuf_release(&buf);
+}
+
+__attribute__((format (printf,3,4)))
+static int odb_helper_start(struct odb_helper *o,
+			    struct odb_helper_cmd *cmd,
+			    const char *fmt, ...)
+{
+	va_list ap;
+
+	memset(cmd, 0, sizeof(*cmd));
+	argv_array_init(&cmd->argv);
+
+	if (!o->cmd)
+		return -1;
+
+	va_start(ap, fmt);
+	prepare_helper_command(&cmd->argv, o->cmd, fmt, ap);
+	va_end(ap);
+
+	cmd->child.argv = cmd->argv.argv;
+	cmd->child.use_shell = 1;
+	cmd->child.no_stdin = 1;
+	cmd->child.out = -1;
+
+	if (start_command(&cmd->child) < 0) {
+		argv_array_clear(&cmd->argv);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int odb_helper_finish(struct odb_helper *o,
+			     struct odb_helper_cmd *cmd)
+{
+	int ret = finish_command(&cmd->child);
+	argv_array_clear(&cmd->argv);
+	if (ret) {
+		warning("odb helper '%s' reported failure", o->name);
+		return -1;
+	}
+	return 0;
+}
+
+static int parse_object_line(struct odb_helper_object *o, const char *line)
+{
+	char *end;
+	if (get_sha1_hex(line, o->sha1) < 0)
+		return -1;
+
+	line += 40;
+	if (*line++ != ' ')
+		return -1;
+
+	o->size = strtoul(line, &end, 10);
+	if (line == end || *end++ != ' ')
+		return -1;
+
+	o->type = type_from_string(end);
+	return 0;
+}
+
+static int odb_helper_object_cmp(const void *va, const void *vb)
+{
+	const struct odb_helper_object *a = va, *b = vb;
+	return hashcmp(a->sha1, b->sha1);
+}
+
+static void odb_helper_load_have(struct odb_helper *o)
+{
+	struct odb_helper_cmd cmd;
+	FILE *fh;
+	struct strbuf line = STRBUF_INIT;
+
+	if (o->have_valid)
+		return;
+	o->have_valid = 1;
+
+	if (odb_helper_start(o, &cmd, "have") < 0)
+		return;
+
+	fh = xfdopen(cmd.child.out, "r");
+	while (strbuf_getline(&line, fh) != EOF) {
+		ALLOC_GROW(o->have, o->have_nr+1, o->have_alloc);
+		if (parse_object_line(&o->have[o->have_nr], line.buf) < 0) {
+			warning("bad 'have' input from odb helper '%s': %s",
+				o->name, line.buf);
+			break;
+		}
+		o->have_nr++;
+	}
+
+	strbuf_release(&line);
+	fclose(fh);
+	odb_helper_finish(o, &cmd);
+
+	qsort(o->have, o->have_nr, sizeof(*o->have), odb_helper_object_cmp);
+}
+
+static struct odb_helper_object *odb_helper_lookup(struct odb_helper *o,
+						   const unsigned char *sha1)
+{
+	int idx;
+
+	odb_helper_load_have(o);
+	idx = sha1_entry_pos(o->have, sizeof(*o->have), 0,
+			     0, o->have_nr, o->have_nr,
+			     sha1);
+	if (idx < 0)
+		return NULL;
+	return &o->have[idx];
+}
+
+int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1)
+{
+	return !!odb_helper_lookup(o, sha1);
+}
+
+int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
+			    int fd)
+{
+	struct odb_helper_object *obj;
+	struct odb_helper_cmd cmd;
+	unsigned long total_got;
+	git_zstream stream;
+	int zret = Z_STREAM_END;
+	git_SHA_CTX hash;
+	unsigned char real_sha1[20];
+
+	obj = odb_helper_lookup(o, sha1);
+	if (!obj)
+		return -1;
+
+	if (odb_helper_start(o, &cmd, "get %s", sha1_to_hex(sha1)) < 0)
+		return -1;
+
+	memset(&stream, 0, sizeof(stream));
+	git_inflate_init(&stream);
+	git_SHA1_Init(&hash);
+	total_got = 0;
+
+	for (;;) {
+		unsigned char buf[4096];
+		int r;
+
+		r = xread(cmd.child.out, buf, sizeof(buf));
+		if (r < 0) {
+			error("unable to read from odb helper '%s': %s",
+			      o->name, strerror(errno));
+			close(cmd.child.out);
+			odb_helper_finish(o, &cmd);
+			git_inflate_end(&stream);
+			return -1;
+		}
+		if (r == 0)
+			break;
+
+		write_or_die(fd, buf, r);
+
+		stream.next_in = buf;
+		stream.avail_in = r;
+		do {
+			unsigned char inflated[4096];
+			unsigned long got;
+
+			stream.next_out = inflated;
+			stream.avail_out = sizeof(inflated);
+			zret = git_inflate(&stream, Z_SYNC_FLUSH);
+			got = sizeof(inflated) - stream.avail_out;
+
+			git_SHA1_Update(&hash, inflated, got);
+			/* skip header when counting size */
+			if (!total_got) {
+				const unsigned char *p = memchr(inflated, '\0', got);
+				if (p)
+					got -= p - inflated + 1;
+				else
+					got = 0;
+			}
+			total_got += got;
+		} while (stream.avail_in && zret == Z_OK);
+	}
+
+	close(cmd.child.out);
+	git_inflate_end(&stream);
+	git_SHA1_Final(real_sha1, &hash);
+	if (odb_helper_finish(o, &cmd))
+		return -1;
+	if (zret != Z_STREAM_END) {
+		warning("bad zlib data from odb helper '%s' for %s",
+			o->name, sha1_to_hex(sha1));
+		return -1;
+	}
+	if (total_got != obj->size) {
+		warning("size mismatch from odb helper '%s' for %s (%lu != %lu)",
+			o->name, sha1_to_hex(sha1), total_got, obj->size);
+		return -1;
+	}
+	if (hashcmp(real_sha1, sha1)) {
+		warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+			o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+		return -1;
+	}
+
+	return 0;
+}
diff --git a/odb-helper.h b/odb-helper.h
new file mode 100644
index 0000000000..0f704f9452
--- /dev/null
+++ b/odb-helper.h
@@ -0,0 +1,25 @@
+#ifndef ODB_HELPER_H
+#define ODB_HELPER_H
+
+struct odb_helper {
+	const char *name;
+	const char *cmd;
+
+	struct odb_helper_object {
+		unsigned char sha1[20];
+		unsigned long size;
+		enum object_type type;
+	} *have;
+	int have_nr;
+	int have_alloc;
+	int have_valid;
+
+	struct odb_helper *next;
+};
+
+struct odb_helper *odb_helper_new(const char *name, int namelen);
+int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
+int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
+			    int fd);
+
+#endif /* ODB_HELPER_H */
diff --git a/sha1_file.c b/sha1_file.c
index 9c86d1924a..6d68157e30 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -26,6 +26,7 @@
 #include "mru.h"
 #include "list.h"
 #include "mergesort.h"
+#include "external-odb.h"
 
 #ifndef O_NOATIME
 #if defined(__linux__) && (defined(__i386__) || defined(__PPC__))
@@ -185,12 +186,12 @@ static void fill_sha1_path(struct strbuf *buf, const unsigned char *sha1)
 	}
 }
 
-const char *sha1_file_name(const unsigned char *sha1)
+const char *sha1_file_name_alt(const char *objdir, const unsigned char *sha1)
 {
 	static struct strbuf buf = STRBUF_INIT;
 
 	strbuf_reset(&buf);
-	strbuf_addf(&buf, "%s/", get_object_directory());
+	strbuf_addf(&buf, "%s/", objdir);
 
 	fill_sha1_path(&buf, sha1);
 	return buf.buf;
@@ -210,6 +211,11 @@ static const char *alt_sha1_path(struct alternate_object_database *alt,
 	return buf->buf;
 }
 
+const char *sha1_file_name(const unsigned char *sha1)
+{
+	return sha1_file_name_alt(get_object_directory(), sha1);
+}
+
 /*
  * Return the name of the pack or index file with the specified sha1
  * in its filename.  *base and *name are scratch space that must be
@@ -545,6 +551,21 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
 	return r;
 }
 
+void prepare_external_alt_odb(void)
+{
+	static int linked_external;
+	const char *path;
+
+	if (linked_external)
+		return;
+
+	path = external_odb_root();
+	if (!access(path, F_OK)) {
+		link_alt_odb_entry(path, NULL, 0, "");
+		linked_external = 1;
+	}
+}
+
 void prepare_alt_odb(void)
 {
 	const char *alt;
@@ -559,6 +580,7 @@ void prepare_alt_odb(void)
 	link_alt_odb_entries(alt, strlen(alt), PATH_SEP, NULL, 0);
 
 	read_info_alternates(get_object_directory(), 0);
+	prepare_external_alt_odb();
 }
 
 /* Returns 1 if we have successfully freshened the file, 0 otherwise. */
@@ -599,7 +621,7 @@ static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
 		if (check_and_freshen_file(path, freshen))
 			return 1;
 	}
-	return 0;
+	return external_odb_has_object(sha1);
 }
 
 static int check_and_freshen(const unsigned char *sha1, int freshen)
@@ -1631,21 +1653,15 @@ static int stat_sha1_file(const unsigned char *sha1, struct stat *st)
 	return -1;
 }
 
-static int open_sha1_file(const unsigned char *sha1)
+static int open_sha1_file_alt(const unsigned char *sha1)
 {
-	int fd;
 	struct alternate_object_database *alt;
-	int most_interesting_errno;
-
-	fd = git_open(sha1_file_name(sha1));
-	if (fd >= 0)
-		return fd;
-	most_interesting_errno = errno;
+	int most_interesting_errno = errno;
 
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
 		const char *path = alt_sha1_path(alt, sha1);
-		fd = git_open(path);
+		int fd = git_open(path);
 		if (fd >= 0)
 			return fd;
 		if (most_interesting_errno == ENOENT)
@@ -1655,6 +1671,24 @@ static int open_sha1_file(const unsigned char *sha1)
 	return -1;
 }
 
+static int open_sha1_file(const unsigned char *sha1)
+{
+	int fd;
+
+	fd = git_open(sha1_file_name(sha1));
+	if (fd >= 0)
+		return fd;
+
+	fd = open_sha1_file_alt(sha1);
+	if (fd >= 0)
+		return fd;
+
+	if (!external_odb_fetch_object(sha1))
+		fd = open_sha1_file_alt(sha1);
+
+	return fd;
+}
+
 void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
 {
 	void *map;
@@ -3139,7 +3173,7 @@ int hash_sha1_file(const void *buf, unsigned long len, const char *type,
 }
 
 /* Finalize a file on disk, and close it. */
-static void close_sha1_file(int fd)
+void close_sha1_file(int fd)
 {
 	if (fsync_object_files)
 		fsync_or_die(fd, "sha1 file");
@@ -3163,7 +3197,7 @@ static inline int directory_size(const char *filename)
  * We want to avoid cross-directory filename renames, because those
  * can have problems on various filesystems (FAT, NFS, Coda).
  */
-static int create_tmpfile(struct strbuf *tmp, const char *filename)
+int create_object_tmpfile(struct strbuf *tmp, const char *filename)
 {
 	int fd, dirlen = directory_size(filename);
 
@@ -3203,7 +3237,7 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 	static struct strbuf tmp_file = STRBUF_INIT;
 	const char *filename = sha1_file_name(sha1);
 
-	fd = create_tmpfile(&tmp_file, filename);
+	fd = create_object_tmpfile(&tmp_file, filename);
 	if (fd < 0) {
 		if (errno == EACCES)
 			return error("insufficient permission for adding an object to repository database %s", get_object_directory());
diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
new file mode 100755
index 0000000000..2b016173a0
--- /dev/null
+++ b/t/t0400-external-odb.sh
@@ -0,0 +1,48 @@
+#!/bin/sh
+
+test_description='basic tests for external object databases'
+
+. ./test-lib.sh
+
+ALT_SOURCE="$PWD/alt-repo/.git"
+export ALT_SOURCE
+write_script odb-helper <<\EOF
+GIT_DIR=$ALT_SOURCE; export GIT_DIR
+case "$1" in
+have)
+	git rev-list --all --objects |
+	cut -d' ' -f1 |
+	git cat-file --batch-check |
+	awk '{print $1 " " $3 " " $2}'
+	;;
+get)
+	cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+	;;
+esac
+EOF
+HELPER="\"$PWD\"/odb-helper"
+
+test_expect_success 'setup alternate repo' '
+	git init alt-repo &&
+	(cd alt-repo &&
+	 test_commit one &&
+	 test_commit two
+	) &&
+	alt_head=`cd alt-repo && git rev-parse HEAD`
+'
+
+test_expect_success 'alt objects are missing' '
+	test_must_fail git log --format=%s $alt_head
+'
+
+test_expect_success 'helper can retrieve alt objects' '
+	test_config odb.magic.command "$HELPER" &&
+	cat >expect <<-\EOF &&
+	two
+	one
+	EOF
+	git log --format=%s $alt_head >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 02/16] external odb foreach
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 01/16] Add initial external odb support Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 03/16] t0400: use --batch-all-objects to get all objects Christian Couder
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

From: Jeff King <peff@peff.net>

---
 external-odb.c | 14 ++++++++++++++
 external-odb.h |  6 ++++++
 odb-helper.c   | 15 +++++++++++++++
 odb-helper.h   |  4 ++++
 4 files changed, 39 insertions(+)

diff --git a/external-odb.c b/external-odb.c
index 1ccfa99a01..42978a3298 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -113,3 +113,17 @@ int external_odb_fetch_object(const unsigned char *sha1)
 
 	return -1;
 }
+
+int external_odb_for_each_object(each_external_object_fn fn, void *data)
+{
+	struct odb_helper *o;
+
+	external_odb_init();
+
+	for (o = helpers; o; o = o->next) {
+		int r = odb_helper_for_each_object(o, fn, data);
+		if (r)
+			return r;
+	}
+	return 0;
+}
diff --git a/external-odb.h b/external-odb.h
index 2397477684..cea8570a49 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -5,4 +5,10 @@ const char *external_odb_root(void);
 int external_odb_has_object(const unsigned char *sha1);
 int external_odb_fetch_object(const unsigned char *sha1);
 
+typedef int (*each_external_object_fn)(const unsigned char *sha1,
+				       enum object_type type,
+				       unsigned long size,
+				       void *data);
+int external_odb_for_each_object(each_external_object_fn, void *);
+
 #endif /* EXTERNAL_ODB_H */
diff --git a/odb-helper.c b/odb-helper.c
index 244bc86792..2db59caa53 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -237,3 +237,18 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 
 	return 0;
 }
+
+int odb_helper_for_each_object(struct odb_helper *o,
+			       each_external_object_fn fn,
+			       void *data)
+{
+	int i;
+	for (i = 0; i < o->have_nr; i++) {
+		struct odb_helper_object *obj = &o->have[i];
+		int r = fn(obj->sha1, obj->type, obj->size, data);
+		if (r)
+			return r;
+	}
+
+	return 0;
+}
diff --git a/odb-helper.h b/odb-helper.h
index 0f704f9452..8c3916d215 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -1,6 +1,8 @@
 #ifndef ODB_HELPER_H
 #define ODB_HELPER_H
 
+#include "external-odb.h"
+
 struct odb_helper {
 	const char *name;
 	const char *cmd;
@@ -21,5 +23,7 @@ struct odb_helper *odb_helper_new(const char *name, int namelen);
 int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
 int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 			    int fd);
+int odb_helper_for_each_object(struct odb_helper *o,
+			       each_external_object_fn, void *);
 
 #endif /* ODB_HELPER_H */
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 03/16] t0400: use --batch-all-objects to get all objects
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 01/16] Add initial external odb support Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 02/16] external odb foreach Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 04/16] t0400: add 'put' command to odb-helper script Christian Couder
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0400-external-odb.sh | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index 2b016173a0..fe85413725 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -10,9 +10,7 @@ write_script odb-helper <<\EOF
 GIT_DIR=$ALT_SOURCE; export GIT_DIR
 case "$1" in
 have)
-	git rev-list --all --objects |
-	cut -d' ' -f1 |
-	git cat-file --batch-check |
+	git cat-file --batch-check --batch-all-objects |
 	awk '{print $1 " " $3 " " $2}'
 	;;
 get)
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 04/16] t0400: add 'put' command to odb-helper script
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (2 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 03/16] t0400: use --batch-all-objects to get all objects Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 05/16] t0400: add test for 'put' command Christian Couder
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0400-external-odb.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index fe85413725..0f1bb97f38 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -7,6 +7,10 @@ test_description='basic tests for external object databases'
 ALT_SOURCE="$PWD/alt-repo/.git"
 export ALT_SOURCE
 write_script odb-helper <<\EOF
+die() {
+	printf >&2 "%s\n" "$@"
+	exit 1
+}
 GIT_DIR=$ALT_SOURCE; export GIT_DIR
 case "$1" in
 have)
@@ -16,6 +20,16 @@ have)
 get)
 	cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
 	;;
+put)
+	sha1="$2"
+	size="$3"
+	kind="$4"
+	writen=$(git hash-object -w -t "$kind" --stdin)
+	test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'"
+	;;
+*)
+	die "unknown command '$1'"
+	;;
 esac
 EOF
 HELPER="\"$PWD\"/odb-helper"
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 05/16] t0400: add test for 'put' command
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (3 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 04/16] t0400: add 'put' command to odb-helper script Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 06/16] external odb: add write support Christian Couder
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0400-external-odb.sh | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index 0f1bb97f38..6c6da5cf4f 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -57,4 +57,13 @@ test_expect_success 'helper can retrieve alt objects' '
 	test_cmp expect actual
 '
 
+test_expect_success 'helper can add objects to alt repo' '
+	hash=$(echo "Hello odb!" | git hash-object -w -t blob --stdin) &&
+	test -f .git/objects/$(echo $hash | sed "s#..#&/#") &&
+	size=$(git cat-file -s "$hash") &&
+	git cat-file blob "$hash" | ./odb-helper put "$hash" "$size" blob &&
+	alt_size=$(cd alt-repo && git cat-file -s "$hash") &&
+	test "$size" -eq "$alt_size"
+'
+
 test_done
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 06/16] external odb: add write support
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (4 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 05/16] t0400: add test for 'put' command Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 07/16] external-odb: accept only blobs for now Christian Couder
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 external-odb.c | 15 +++++++++++++++
 external-odb.h |  2 ++
 odb-helper.c   | 41 +++++++++++++++++++++++++++++++++++++----
 odb-helper.h   |  3 +++
 sha1_file.c    |  2 ++
 5 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 42978a3298..bb70fe3298 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -127,3 +127,18 @@ int external_odb_for_each_object(each_external_object_fn fn, void *data)
 	}
 	return 0;
 }
+
+int external_odb_write_object(const void *buf, unsigned long len,
+			      const char *type, unsigned char *sha1)
+{
+	struct odb_helper *o;
+
+	external_odb_init();
+
+	for (o = helpers; o; o = o->next) {
+		int r = odb_helper_write_object(o, buf, len, type, sha1);
+		if (r <= 0)
+			return r;
+	}
+	return 1;
+}
diff --git a/external-odb.h b/external-odb.h
index cea8570a49..55d291d1cf 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -10,5 +10,7 @@ typedef int (*each_external_object_fn)(const unsigned char *sha1,
 				       unsigned long size,
 				       void *data);
 int external_odb_for_each_object(each_external_object_fn, void *);
+int external_odb_write_object(const void *buf, unsigned long len,
+			      const char *type, unsigned char *sha1);
 
 #endif /* EXTERNAL_ODB_H */
diff --git a/odb-helper.c b/odb-helper.c
index 2db59caa53..7b7de7380f 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -33,9 +33,10 @@ static void prepare_helper_command(struct argv_array *argv, const char *cmd,
 	strbuf_release(&buf);
 }
 
-__attribute__((format (printf,3,4)))
+__attribute__((format (printf,4,5)))
 static int odb_helper_start(struct odb_helper *o,
 			    struct odb_helper_cmd *cmd,
+			    int use_stdin,
 			    const char *fmt, ...)
 {
 	va_list ap;
@@ -52,7 +53,10 @@ static int odb_helper_start(struct odb_helper *o,
 
 	cmd->child.argv = cmd->argv.argv;
 	cmd->child.use_shell = 1;
-	cmd->child.no_stdin = 1;
+	if (use_stdin)
+		cmd->child.in = -1;
+	else
+		cmd->child.no_stdin = 1;
 	cmd->child.out = -1;
 
 	if (start_command(&cmd->child) < 0) {
@@ -109,7 +113,7 @@ static void odb_helper_load_have(struct odb_helper *o)
 		return;
 	o->have_valid = 1;
 
-	if (odb_helper_start(o, &cmd, "have") < 0)
+	if (odb_helper_start(o, &cmd, 0, "have") < 0)
 		return;
 
 	fh = xfdopen(cmd.child.out, "r");
@@ -164,7 +168,7 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 	if (!obj)
 		return -1;
 
-	if (odb_helper_start(o, &cmd, "get %s", sha1_to_hex(sha1)) < 0)
+	if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
 		return -1;
 
 	memset(&stream, 0, sizeof(stream));
@@ -252,3 +256,32 @@ int odb_helper_for_each_object(struct odb_helper *o,
 
 	return 0;
 }
+
+int odb_helper_write_object(struct odb_helper *o,
+			    const void *buf, unsigned long len,
+			    const char *type, unsigned char *sha1)
+{
+	struct odb_helper_cmd cmd;
+
+	if (odb_helper_start(o, &cmd, 1, "put %s %lu %s",
+			     sha1_to_hex(sha1), len, type) < 0)
+		return -1;
+
+	do {
+		int w = xwrite(cmd.child.in, buf, len);
+		if (w < 0) {
+			error("unable to write to odb helper '%s': %s",
+			      o->name, strerror(errno));
+			close(cmd.child.in);
+			close(cmd.child.out);
+			odb_helper_finish(o, &cmd);
+			return -1;
+		}
+		len -= w;
+	} while (len > 0);
+
+	close(cmd.child.in);
+	close(cmd.child.out);
+	odb_helper_finish(o, &cmd);
+	return 0;
+}
diff --git a/odb-helper.h b/odb-helper.h
index 8c3916d215..af31cc27d5 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -25,5 +25,8 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 			    int fd);
 int odb_helper_for_each_object(struct odb_helper *o,
 			       each_external_object_fn, void *);
+int odb_helper_write_object(struct odb_helper *o,
+			    const void *buf, unsigned long len,
+			    const char *type, unsigned char *sha1);
 
 #endif /* ODB_HELPER_H */
diff --git a/sha1_file.c b/sha1_file.c
index 6d68157e30..3532c1c598 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3320,6 +3320,8 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign
 	 * it out into .git/objects/??/?{38} file.
 	 */
 	write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen);
+	if (!external_odb_write_object(buf, len, type, sha1))
+		return 0;
 	if (freshen_packed_object(sha1) || freshen_loose_object(sha1))
 		return 0;
 	return write_loose_object(sha1, hdr, hdrlen, buf, len, 0);
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 07/16] external-odb: accept only blobs for now
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (5 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 06/16] external odb: add write support Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 08/16] t0400: add test for external odb write support Christian Couder
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 external-odb.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/external-odb.c b/external-odb.c
index bb70fe3298..6dd7b2548b 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -133,6 +133,10 @@ int external_odb_write_object(const void *buf, unsigned long len,
 {
 	struct odb_helper *o;
 
+	/* For now accept only blobs */
+	if (strcmp(type, "blob"))
+		return 1;
+
 	external_odb_init();
 
 	for (o = helpers; o; o = o->next) {
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 08/16] t0400: add test for external odb write support
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (6 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 07/16] external-odb: accept only blobs for now Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 09/16] Add GIT_NO_EXTERNAL_ODB env variable Christian Couder
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0400-external-odb.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index 6c6da5cf4f..3c868cad4c 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -66,4 +66,12 @@ test_expect_success 'helper can add objects to alt repo' '
 	test "$size" -eq "$alt_size"
 '
 
+test_expect_success 'commit adds objects to alt repo' '
+	test_config odb.magic.command "$HELPER" &&
+	test_commit three &&
+	hash3=$(git ls-tree HEAD | grep three.t | cut -f1 | cut -d\  -f3) &&
+	content=$(cd alt-repo && git show "$hash3") &&
+	test "$content" = "three"
+'
+
 test_done
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 09/16] Add GIT_NO_EXTERNAL_ODB env variable
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (7 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 08/16] t0400: add test for external odb write support Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 10/16] Add t0410 to test external ODB transfer Christian Couder
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h        | 9 +++++++++
 environment.c  | 4 ++++
 external-odb.c | 6 ++++++
 sha1_file.c    | 3 +++
 4 files changed, 22 insertions(+)

diff --git a/cache.h b/cache.h
index b419b9b7ce..503b618a1f 100644
--- a/cache.h
+++ b/cache.h
@@ -422,6 +422,7 @@ static inline enum object_type object_type(unsigned int mode)
 #define CEILING_DIRECTORIES_ENVIRONMENT "GIT_CEILING_DIRECTORIES"
 #define NO_REPLACE_OBJECTS_ENVIRONMENT "GIT_NO_REPLACE_OBJECTS"
 #define GIT_REPLACE_REF_BASE_ENVIRONMENT "GIT_REPLACE_REF_BASE"
+#define NO_EXTERNAL_ODB_ENVIRONMENT "GIT_NO_EXTERNAL_ODB"
 #define GITATTRIBUTES_FILE ".gitattributes"
 #define INFOATTRIBUTES_FILE "info/attributes"
 #define ATTRIBUTE_MACRO_PREFIX "[attr]"
@@ -698,6 +699,14 @@ void reset_shared_repository(void);
 extern int check_replace_refs;
 extern char *git_replace_ref_base;
 
+/*
+ * Do external odbs need to be used this run?  This variable is
+ * initialized to true unless $GIT_NO_EXTERNAL_ODB is set, but it
+ * maybe set to false by some commands that do not want external
+ * odbs to be active.
+ */
+extern int use_external_odb;
+
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
diff --git a/environment.c b/environment.c
index 0935ec696e..8aecdd0544 100644
--- a/environment.c
+++ b/environment.c
@@ -47,6 +47,7 @@ const char *excludes_file;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
 int check_replace_refs = 1;
 char *git_replace_ref_base;
+int use_external_odb = 1;
 enum eol core_eol = EOL_UNSET;
 enum safe_crlf safe_crlf = SAFE_CRLF_WARN;
 unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;
@@ -120,6 +121,7 @@ const char * const local_repo_env[] = {
 	INDEX_ENVIRONMENT,
 	NO_REPLACE_OBJECTS_ENVIRONMENT,
 	GIT_REPLACE_REF_BASE_ENVIRONMENT,
+	NO_EXTERNAL_ODB_ENVIRONMENT,
 	GIT_PREFIX_ENVIRONMENT,
 	GIT_SUPER_PREFIX_ENVIRONMENT,
 	GIT_SHALLOW_FILE_ENVIRONMENT,
@@ -185,6 +187,8 @@ static void setup_git_env(void)
 	replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT);
 	git_replace_ref_base = xstrdup(replace_ref_base ? replace_ref_base
 							  : "refs/replace/");
+	if (getenv(NO_EXTERNAL_ODB_ENVIRONMENT))
+		use_external_odb = 0;
 	namespace = expand_namespace(getenv(GIT_NAMESPACE_ENVIRONMENT));
 	namespace_len = strlen(namespace);
 	shallow_file = getenv(GIT_SHALLOW_FILE_ENVIRONMENT);
diff --git a/external-odb.c b/external-odb.c
index 6dd7b2548b..a980fbfbf2 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -63,6 +63,9 @@ int external_odb_has_object(const unsigned char *sha1)
 {
 	struct odb_helper *o;
 
+	if (!use_external_odb)
+		return 0;
+
 	external_odb_init();
 
 	for (o = helpers; o; o = o->next)
@@ -133,6 +136,9 @@ int external_odb_write_object(const void *buf, unsigned long len,
 {
 	struct odb_helper *o;
 
+	if (!use_external_odb)
+		return 1;
+
 	/* For now accept only blobs */
 	if (strcmp(type, "blob"))
 		return 1;
diff --git a/sha1_file.c b/sha1_file.c
index 3532c1c598..92f1244205 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -556,6 +556,9 @@ void prepare_external_alt_odb(void)
 	static int linked_external;
 	const char *path;
 
+	if (!use_external_odb)
+		return;
+
 	if (linked_external)
 		return;
 
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 10/16] Add t0410 to test external ODB transfer
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (8 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 09/16] Add GIT_NO_EXTERNAL_ODB env variable Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 11/16] lib-httpd: pass config file to start_httpd() Christian Couder
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0410-transfer-e-odb.sh | 136 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)
 create mode 100755 t/t0410-transfer-e-odb.sh

diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh
new file mode 100755
index 0000000000..868b55db94
--- /dev/null
+++ b/t/t0410-transfer-e-odb.sh
@@ -0,0 +1,136 @@
+#!/bin/sh
+
+test_description='basic tests for transfering external ODBs'
+
+. ./test-lib.sh
+
+ORIG_SOURCE="$PWD/.git"
+export ORIG_SOURCE
+
+ALT_SOURCE1="$PWD/alt-repo1/.git"
+export ALT_SOURCE1
+write_script odb-helper1 <<\EOF
+die() {
+	printf >&2 "%s\n" "$@"
+	exit 1
+}
+GIT_DIR=$ALT_SOURCE1; export GIT_DIR
+case "$1" in
+have)
+	git cat-file --batch-check --batch-all-objects |
+	awk '{print $1 " " $3 " " $2}'
+	;;
+get)
+	cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+	;;
+put)
+	sha1="$2"
+	size="$3"
+	kind="$4"
+	writen=$(git hash-object -w -t "$kind" --stdin)
+	test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'"
+	ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$ORIG_SOURCE GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+	GIT_DIR=$ORIG_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+	;;
+*)
+	die "unknown command '$1'"
+	;;
+esac
+EOF
+HELPER1="\"$PWD\"/odb-helper1"
+
+OTHER_SOURCE="$PWD/.git"
+export OTHER_SOURCE
+
+ALT_SOURCE2="$PWD/alt-repo2/.git"
+export ALT_SOURCE2
+write_script odb-helper2 <<\EOF
+die() {
+	printf >&2 "%s\n" "$@"
+	exit 1
+}
+GIT_DIR=$ALT_SOURCE2; export GIT_DIR
+case "$1" in
+have)
+	GIT_DIR=$OTHER_SOURCE git for-each-ref --format='%(objectname)' refs/odbs/magic/ | GIT_DIR=$OTHER_SOURCE xargs git show
+	;;
+get)
+	OBJ_FILE="$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+	if ! test -f "$OBJ_FILE"
+	then
+		# "Download" the missing object by copying it from alt-repo1
+		OBJ_DIR=$(echo $2 | sed 's/\(..\).*/\1/')
+		OBJ_BASE=$(basename "$OBJ_FILE")
+		ALT_OBJ_DIR1="$ALT_SOURCE1/objects/$OBJ_DIR"
+		ALT_OBJ_DIR2="$ALT_SOURCE2/objects/$OBJ_DIR"
+		mkdir -p "$ALT_OBJ_DIR2" || die "Could not mkdir '$ALT_OBJ_DIR2'"
+		OBJ_SRC="$ALT_OBJ_DIR1/$OBJ_BASE"
+		cp "$OBJ_SRC" "$ALT_OBJ_DIR2" ||
+		die "Could not cp '$OBJ_SRC' into '$ALT_OBJ_DIR2'"
+	fi
+	cat "$OBJ_FILE" || die "Could not cat '$OBJ_FILE'"
+	;;
+put)
+	sha1="$2"
+	size="$3"
+	kind="$4"
+	writen=$(git hash-object -w -t "$kind" --stdin)
+	test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'"
+	ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$OTHER_SOURCE GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+	GIT_DIR=$OTHER_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+	;;
+*)
+	die "unknown command '$1'"
+	;;
+esac
+EOF
+HELPER2="\"$PWD\"/odb-helper2"
+
+test_expect_success 'setup first alternate repo' '
+	git init alt-repo1 &&
+	test_commit zero &&
+	git config odb.magic.command "$HELPER1"
+'
+
+test_expect_success 'setup other repo and its alternate repo' '
+	git init other-repo &&
+	git init alt-repo2 &&
+	(cd other-repo &&
+	 git remote add origin .. &&
+	 git pull origin master &&
+	 git checkout master &&
+	 git log)
+'
+
+test_expect_success 'new blobs are put in first object store' '
+	test_commit one &&
+	hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+	content=$(cd alt-repo1 && git show "$hash1") &&
+	test "$content" = "one" &&
+	test_commit two &&
+	hash2=$(git ls-tree HEAD | grep two.t | cut -f1 | cut -d\  -f3) &&
+	content=$(cd alt-repo1 && git show "$hash2") &&
+	test "$content" = "two"
+'
+
+test_expect_success 'other repo gets the blobs from object store' '
+	(cd other-repo &&
+	 git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+	 test_must_fail git cat-file blob "$hash1" &&
+	 test_must_fail git cat-file blob "$hash2" &&
+	 git config odb.magic.command "$HELPER2" &&
+	 git cat-file blob "$hash1" &&
+	 git cat-file blob "$hash2"
+	)
+'
+
+test_expect_success 'other repo gets everything else' '
+	(cd other-repo &&
+	 git fetch origin &&
+	 content=$(git show "$hash1") &&
+	 test "$content" = "one" &&
+	 content=$(git show "$hash2") &&
+	 test "$content" = "two")
+'
+
+test_done
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 11/16] lib-httpd: pass config file to start_httpd()
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (9 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 10/16] Add t0410 to test external ODB transfer Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 12/16] lib-httpd: add upload.sh Christian Couder
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

This makes it possible to start an apache web server with different
config files.

This will be used in a later patch to pass a config file that makes
apache store external objects.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/lib-httpd.sh | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 435a37465a..2e659a8ee2 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -171,12 +171,14 @@ prepare_httpd() {
 }
 
 start_httpd() {
+	APACHE_CONF_FILE=${1-apache.conf}
+
 	prepare_httpd >&3 2>&4
 
 	trap 'code=$?; stop_httpd; (exit $code); die' EXIT
 
 	"$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \
-		-f "$TEST_PATH/apache.conf" $HTTPD_PARA \
+		-f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA \
 		-c "Listen 127.0.0.1:$LIB_HTTPD_PORT" -k start \
 		>&3 2>&4
 	if test $? -ne 0
@@ -191,7 +193,7 @@ stop_httpd() {
 	trap 'die' EXIT
 
 	"$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \
-		-f "$TEST_PATH/apache.conf" $HTTPD_PARA -k stop
+		-f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA -k stop
 }
 
 test_http_push_nonff () {
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 12/16] lib-httpd: add upload.sh
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (10 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 11/16] lib-httpd: pass config file to start_httpd() Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 13/16] lib-httpd: add list.sh Christian Couder
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

This cgi will be used to upload objects to, or to delete
objects from, an apache web server.

This way the apache server can work as an external object
database.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/lib-httpd.sh        |  1 +
 t/lib-httpd/upload.sh | 45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)
 create mode 100644 t/lib-httpd/upload.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 2e659a8ee2..d80b004549 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -132,6 +132,7 @@ prepare_httpd() {
 	cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
 	install_script broken-smart-http.sh
 	install_script error.sh
+	install_script upload.sh
 
 	ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/upload.sh b/t/lib-httpd/upload.sh
new file mode 100644
index 0000000000..172be0f73f
--- /dev/null
+++ b/t/lib-httpd/upload.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+# In part from http://codereview.stackexchange.com/questions/79549/bash-cgi-upload-file
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+    key=${1%=*}
+    val=${1#*=}
+
+    case "$key" in
+	"sha1") sha1="$val" ;;
+	"type") type="$val" ;;
+	"size") size="$val" ;;
+	"delete") delete=1 ;;
+	*) echo >&2 "unknown key '$key'" ;;
+    esac
+
+    shift
+done
+
+case "$REQUEST_METHOD" in
+  POST)
+    if test "$delete" = "1"
+    then
+	rm -f "$FILES_DIR/$sha1-$size-$type"
+    else
+	mkdir -p "$FILES_DIR"
+	cat >"$FILES_DIR/$sha1-$size-$type"
+    fi
+
+    echo 'Status: 204 No Content'
+    echo
+    ;;
+
+  *)
+    echo 'Status: 405 Method Not Allowed'
+    echo
+esac
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 13/16] lib-httpd: add list.sh
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (11 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 12/16] lib-httpd: add upload.sh Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 14/16] lib-httpd: add apache-e-odb.conf Christian Couder
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

This cgi script can list Git objects that have been uploaded as
files to an apache web server. This script can also retrieve
the content of each of these files.

This will help make apache work as an external object database.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/lib-httpd.sh      |  1 +
 t/lib-httpd/list.sh | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)
 create mode 100644 t/lib-httpd/list.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index d80b004549..f31ea261f5 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -133,6 +133,7 @@ prepare_httpd() {
 	install_script broken-smart-http.sh
 	install_script error.sh
 	install_script upload.sh
+	install_script list.sh
 
 	ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/list.sh b/t/lib-httpd/list.sh
new file mode 100644
index 0000000000..a54402558f
--- /dev/null
+++ b/t/lib-httpd/list.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+    key=${1%=*}
+    val=${1#*=}
+
+    case "$key" in
+	"sha1") sha1="$val" ;;
+	*) echo >&2 "unknown key '$key'" ;;
+    esac
+
+    shift
+done
+
+echo 'Status: 200 OK'
+echo
+
+if test -d "$FILES_DIR"
+then
+    if test -n "$sha1"
+    then
+	cat "$FILES_DIR/$sha1"-*
+    else
+	ls "$FILES_DIR" | tr '-' ' '
+    fi
+fi
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 14/16] lib-httpd: add apache-e-odb.conf
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (12 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 13/16] lib-httpd: add list.sh Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 15/16] odb-helper: add 'store_plain_objects' to 'struct odb_helper' Christian Couder
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

This is an apache config file to test external object databases.
It uses the upload.sh and list.sh cgi that have been added
previously to make apache store external objects.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/lib-httpd/apache-e-odb.conf | 214 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 214 insertions(+)
 create mode 100644 t/lib-httpd/apache-e-odb.conf

diff --git a/t/lib-httpd/apache-e-odb.conf b/t/lib-httpd/apache-e-odb.conf
new file mode 100644
index 0000000000..19a1540c82
--- /dev/null
+++ b/t/lib-httpd/apache-e-odb.conf
@@ -0,0 +1,214 @@
+ServerName dummy
+PidFile httpd.pid
+DocumentRoot www
+LogFormat "%h %l %u %t \"%r\" %>s %b" common
+CustomLog access.log common
+ErrorLog error.log
+<IfModule !mod_log_config.c>
+	LoadModule log_config_module modules/mod_log_config.so
+</IfModule>
+<IfModule !mod_alias.c>
+	LoadModule alias_module modules/mod_alias.so
+</IfModule>
+<IfModule !mod_cgi.c>
+	LoadModule cgi_module modules/mod_cgi.so
+</IfModule>
+<IfModule !mod_env.c>
+	LoadModule env_module modules/mod_env.so
+</IfModule>
+<IfModule !mod_rewrite.c>
+	LoadModule rewrite_module modules/mod_rewrite.so
+</IFModule>
+<IfModule !mod_version.c>
+	LoadModule version_module modules/mod_version.so
+</IfModule>
+<IfModule !mod_headers.c>
+	LoadModule headers_module modules/mod_headers.so
+</IfModule>
+
+<IfVersion < 2.4>
+LockFile accept.lock
+</IfVersion>
+
+<IfVersion < 2.1>
+<IfModule !mod_auth.c>
+	LoadModule auth_module modules/mod_auth.so
+</IfModule>
+</IfVersion>
+
+<IfVersion >= 2.1>
+<IfModule !mod_auth_basic.c>
+	LoadModule auth_basic_module modules/mod_auth_basic.so
+</IfModule>
+<IfModule !mod_authn_file.c>
+	LoadModule authn_file_module modules/mod_authn_file.so
+</IfModule>
+<IfModule !mod_authz_user.c>
+	LoadModule authz_user_module modules/mod_authz_user.so
+</IfModule>
+<IfModule !mod_authz_host.c>
+	LoadModule authz_host_module modules/mod_authz_host.so
+</IfModule>
+</IfVersion>
+
+<IfVersion >= 2.4>
+<IfModule !mod_authn_core.c>
+	LoadModule authn_core_module modules/mod_authn_core.so
+</IfModule>
+<IfModule !mod_authz_core.c>
+	LoadModule authz_core_module modules/mod_authz_core.so
+</IfModule>
+<IfModule !mod_access_compat.c>
+	LoadModule access_compat_module modules/mod_access_compat.so
+</IfModule>
+<IfModule !mod_mpm_prefork.c>
+	LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
+</IfModule>
+<IfModule !mod_unixd.c>
+	LoadModule unixd_module modules/mod_unixd.so
+</IfModule>
+</IfVersion>
+
+PassEnv GIT_VALGRIND
+PassEnv GIT_VALGRIND_OPTIONS
+PassEnv GNUPGHOME
+PassEnv ASAN_OPTIONS
+PassEnv GIT_TRACE
+PassEnv GIT_CONFIG_NOSYSTEM
+
+Alias /dumb/ www/
+Alias /auth/dumb/ www/auth/dumb/
+
+<LocationMatch /smart/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+</LocationMatch>
+<LocationMatch /smart_noexport/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+</LocationMatch>
+<LocationMatch /smart_custom_env/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+	SetEnv GIT_COMMITTER_NAME "Custom User"
+	SetEnv GIT_COMMITTER_EMAIL custom@example.com
+</LocationMatch>
+<LocationMatch /smart_namespace/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+	SetEnv GIT_NAMESPACE ns
+</LocationMatch>
+<LocationMatch /smart_cookies/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+	Header set Set-Cookie name=value
+</LocationMatch>
+<LocationMatch /smart_headers/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+</LocationMatch>
+ScriptAlias /upload/ upload.sh/
+ScriptAlias /list/ list.sh/
+<Directory ${GIT_EXEC_PATH}>
+	Options FollowSymlinks
+</Directory>
+<Files upload.sh>
+  Options ExecCGI
+</Files>
+<Files list.sh>
+  Options ExecCGI
+</Files>
+<Files ${GIT_EXEC_PATH}/git-http-backend>
+	Options ExecCGI
+</Files>
+
+RewriteEngine on
+RewriteRule ^/smart-redir-perm/(.*)$ /smart/$1 [R=301]
+RewriteRule ^/smart-redir-temp/(.*)$ /smart/$1 [R=302]
+RewriteRule ^/smart-redir-auth/(.*)$ /auth/smart/$1 [R=301]
+RewriteRule ^/smart-redir-limited/(.*)/info/refs$ /smart/$1/info/refs [R=301]
+RewriteRule ^/ftp-redir/(.*)$ ftp://localhost:1000/$1 [R=302]
+
+RewriteRule ^/loop-redir/x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-(.*) /$1 [R=302]
+RewriteRule ^/loop-redir/(.*)$ /loop-redir/x-$1 [R=302]
+
+# Apache 2.2 does not understand <RequireAll>, so we use RewriteCond.
+# And as RewriteCond does not allow testing for non-matches, we match
+# the desired case first (one has abra, two has cadabra), and let it
+# pass by marking the RewriteRule as [L], "last rule, do not process
+# any other matching RewriteRules after this"), and then have another
+# RewriteRule that matches all other cases and lets them fail via '[F]',
+# "fail the request".
+RewriteCond %{HTTP:x-magic-one} =abra
+RewriteCond %{HTTP:x-magic-two} =cadabra
+RewriteRule ^/smart_headers/.* - [L]
+RewriteRule ^/smart_headers/.* - [F]
+
+<IfDefine SSL>
+LoadModule ssl_module modules/mod_ssl.so
+
+SSLCertificateFile httpd.pem
+SSLCertificateKeyFile httpd.pem
+SSLRandomSeed startup file:/dev/urandom 512
+SSLRandomSeed connect file:/dev/urandom 512
+SSLSessionCache none
+SSLMutex file:ssl_mutex
+SSLEngine On
+</IfDefine>
+
+<Location /auth/>
+	AuthType Basic
+	AuthName "git-auth"
+	AuthUserFile passwd
+	Require valid-user
+</Location>
+
+<LocationMatch "^/auth-push/.*/git-receive-pack$">
+	AuthType Basic
+	AuthName "git-auth"
+	AuthUserFile passwd
+	Require valid-user
+</LocationMatch>
+
+<LocationMatch "^/auth-fetch/.*/git-upload-pack$">
+	AuthType Basic
+	AuthName "git-auth"
+	AuthUserFile passwd
+	Require valid-user
+</LocationMatch>
+
+RewriteCond %{QUERY_STRING} service=git-receive-pack [OR]
+RewriteCond %{REQUEST_URI} /git-receive-pack$
+RewriteRule ^/half-auth-complete/ - [E=AUTHREQUIRED:yes]
+
+<Location /half-auth-complete/>
+  Order Deny,Allow
+  Deny from env=AUTHREQUIRED
+
+  AuthType Basic
+  AuthName "Git Access"
+  AuthUserFile passwd
+  Require valid-user
+  Satisfy Any
+</Location>
+
+<IfDefine DAV>
+	LoadModule dav_module modules/mod_dav.so
+	LoadModule dav_fs_module modules/mod_dav_fs.so
+
+	DAVLockDB DAVLock
+	<Location /dumb/>
+		Dav on
+	</Location>
+	<Location /auth/dumb>
+		Dav on
+	</Location>
+</IfDefine>
+
+<IfDefine SVN>
+	LoadModule dav_svn_module modules/mod_dav_svn.so
+
+	<Location /${LIB_HTTPD_SVN}>
+		DAV svn
+		SVNPath "${LIB_HTTPD_SVNPATH}"
+	</Location>
+</IfDefine>
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 15/16] odb-helper: add 'store_plain_objects' to 'struct odb_helper'
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (13 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 14/16] lib-httpd: add apache-e-odb.conf Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 21:04 ` [RFC/PATCH v3 16/16] t0420: add test with HTTP external odb Christian Couder
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

This adds a configuration option odb.<helper>.plainObjects and the
corresponding boolean variable called 'store_plain_objects' in
'struct odb_helper' to make it possible for external object
databases to store object as plain objects instead of Git objects.

The existing odb_helper_fetch_object() is renamed
odb_helper_fetch_git_object() and a new odb_helper_fetch_plain_object()
is introduce to deal with external objects that are not in Git format.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 external-odb.c |   2 +
 odb-helper.c   | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 odb-helper.h   |   1 +
 3 files changed, 114 insertions(+), 2 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index a980fbfbf2..af55377281 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -36,6 +36,8 @@ static int external_odb_config(const char *var, const char *value, void *data)
 
 	if (!strcmp(key, "command"))
 		return git_config_string(&o->cmd, var, value);
+	if (!strcmp(key, "plainobjects"))
+		o->store_plain_objects = git_config_bool(var, value);
 
 	return 0;
 }
diff --git a/odb-helper.c b/odb-helper.c
index 7b7de7380f..6b9fb7927a 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -153,8 +153,107 @@ int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1)
 	return !!odb_helper_lookup(o, sha1);
 }
 
-int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
-			    int fd)
+static int odb_helper_fetch_plain_object(struct odb_helper *o,
+					 const unsigned char *sha1,
+					 int fd)
+{
+	struct odb_helper_object *obj;
+	struct odb_helper_cmd cmd;
+	unsigned long total_got = 0;
+
+	char hdr[32];
+	int hdrlen;
+
+	int ret = Z_STREAM_END;
+	unsigned char compressed[4096];
+	git_zstream stream;
+	git_SHA_CTX hash;
+	unsigned char real_sha1[20];
+
+	obj = odb_helper_lookup(o, sha1);
+	if (!obj)
+		return -1;
+
+	if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
+		return -1;
+
+	/* Set it up */
+	git_deflate_init(&stream, zlib_compression_level);
+	stream.next_out = compressed;
+	stream.avail_out = sizeof(compressed);
+	git_SHA1_Init(&hash);
+
+	/* First header.. */
+	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(obj->type), obj->size) + 1;
+	stream.next_in = (unsigned char *)hdr;
+	stream.avail_in = hdrlen;
+	while (git_deflate(&stream, 0) == Z_OK)
+		; /* nothing */
+	git_SHA1_Update(&hash, hdr, hdrlen);
+
+	for (;;) {
+		unsigned char buf[4096];
+		int r;
+
+		r = xread(cmd.child.out, buf, sizeof(buf));
+		if (r < 0) {
+			error("unable to read from odb helper '%s': %s",
+			      o->name, strerror(errno));
+			close(cmd.child.out);
+			odb_helper_finish(o, &cmd);
+			git_deflate_end(&stream);
+			return -1;
+		}
+		if (r == 0)
+			break;
+
+		total_got += r;
+
+		/* Then the data itself.. */
+		stream.next_in = (void *)buf;
+		stream.avail_in = r;
+		do {
+			unsigned char *in0 = stream.next_in;
+			ret = git_deflate(&stream, Z_FINISH);
+			git_SHA1_Update(&hash, in0, stream.next_in - in0);
+			write_or_die(fd, compressed, stream.next_out - compressed);
+			stream.next_out = compressed;
+			stream.avail_out = sizeof(compressed);
+		} while (ret == Z_OK);
+	}
+
+	close(cmd.child.out);
+	if (ret != Z_STREAM_END) {
+		warning("bad zlib data from odb helper '%s' for %s",
+			o->name, sha1_to_hex(sha1));
+		return -1;
+	}
+	ret = git_deflate_end_gently(&stream);
+	if (ret != Z_OK) {
+		warning("deflateEnd on object %s from odb helper '%s' failed (%d)",
+			sha1_to_hex(sha1), o->name, ret);
+		return -1;
+	}
+	git_SHA1_Final(real_sha1, &hash);
+	if (hashcmp(sha1, real_sha1)) {
+		warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+			o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+		return -1;
+	}
+	if (odb_helper_finish(o, &cmd))
+		return -1;
+	if (total_got != obj->size) {
+		warning("size mismatch from odb helper '%s' for %s (%lu != %lu)",
+			o->name, sha1_to_hex(sha1), total_got, obj->size);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int odb_helper_fetch_git_object(struct odb_helper *o,
+				       const unsigned char *sha1,
+				       int fd)
 {
 	struct odb_helper_object *obj;
 	struct odb_helper_cmd cmd;
@@ -242,6 +341,16 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 	return 0;
 }
 
+int odb_helper_fetch_object(struct odb_helper *o,
+			    const unsigned char *sha1,
+			    int fd)
+{
+	if (o->store_plain_objects)
+		return odb_helper_fetch_plain_object(o, sha1, fd);
+	else
+		return odb_helper_fetch_git_object(o, sha1, fd);
+}
+
 int odb_helper_for_each_object(struct odb_helper *o,
 			       each_external_object_fn fn,
 			       void *data)
diff --git a/odb-helper.h b/odb-helper.h
index af31cc27d5..80d332139d 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -6,6 +6,7 @@
 struct odb_helper {
 	const char *name;
 	const char *cmd;
+	int store_plain_objects;
 
 	struct odb_helper_object {
 		unsigned char sha1[20];
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC/PATCH v3 16/16] t0420: add test with HTTP external odb
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (14 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 15/16] odb-helper: add 'store_plain_objects' to 'struct odb_helper' Christian Couder
@ 2016-11-30 21:04 ` Christian Couder
  2016-11-30 22:36 ` [RFC/PATCH v3 00/16] Add initial experimental external ODB support Junio C Hamano
  2016-12-03 18:47 ` Lars Schneider
  17 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-11-30 21:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

This tests that an apache web server can be used as an
external object database and store files in their native
format instead of converting them to a Git object.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0420-transfer-http-e-odb.sh | 118 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 118 insertions(+)
 create mode 100755 t/t0420-transfer-http-e-odb.sh

diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
new file mode 100755
index 0000000000..9fb84877b5
--- /dev/null
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -0,0 +1,118 @@
+#!/bin/sh
+
+test_description='tests for transfering external objects to an HTTPD server'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+# odb helper script must see this
+export HTTPD_URL
+
+write_script odb-http-helper <<\EOF
+die() {
+	printf >&2 "%s\n" "$@"
+	exit 1
+}
+echo >&2 "odb-http-helper args:" "$@"
+case "$1" in
+have)
+	list_url="$HTTPD_URL/list/"
+	curl "$list_url" ||
+	die "curl '$list_url' failed"
+	;;
+get)
+	get_url="$HTTPD_URL/list/?sha1=$2"
+	curl "$get_url" ||
+	die "curl '$get_url' failed"
+	;;
+put)
+	sha1="$2"
+	size="$3"
+	kind="$4"
+	upload_url="$HTTPD_URL/upload/?sha1=$sha1&size=$size&type=$kind"
+	curl --data-binary @- --include "$upload_url" >out ||
+	die "curl '$upload_url' failed"
+	ref_hash=$(echo "$sha1 $size $kind" | GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+	git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+	;;
+*)
+	die "unknown command '$1'"
+	;;
+esac
+EOF
+HELPER="\"$PWD\"/odb-http-helper"
+
+
+test_expect_success 'setup repo with a root commit and the helper' '
+	test_commit zero &&
+	git config odb.magic.command "$HELPER" &&
+	git config odb.magic.plainObjects "true"
+'
+
+test_expect_success 'setup another repo from the first one' '
+	git init other-repo &&
+	(cd other-repo &&
+	 git remote add origin .. &&
+	 git pull origin master &&
+	 git checkout master &&
+	 git log)
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME&size=123&type=blob"
+
+test_expect_success 'can upload a file' '
+	echo "Hello Apache World!" >hello_to_send.txt &&
+	echo "How are you?" >>hello_to_send.txt &&
+	curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" >out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+	curl --include "$LIST_URL" >out_list &&
+	grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+	curl --data "delete" --include "$UPLOAD_URL&delete=1" >out_delete &&
+	curl --include "$LIST_URL" >out_list2 &&
+	! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transfered to the http server' '
+	test_commit one &&
+	hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+	echo "$hash1-4-blob" >expected &&
+	ls "$FILES_DIR" >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+	git cat-file blob "$hash1" &&
+	git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+	(cd other-repo &&
+	 git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+	 test_must_fail git cat-file blob "$hash1" &&
+	 git config odb.magic.command "$HELPER" &&
+	 git config odb.magic.plainObjects "true" &&
+	 git cat-file blob "$hash1" &&
+	 git pull origin master)
+'
+
+stop_httpd
+
+test_done
-- 
2.11.0.rc2.37.geb49ca6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (15 preceding siblings ...)
  2016-11-30 21:04 ` [RFC/PATCH v3 16/16] t0420: add test with HTTP external odb Christian Couder
@ 2016-11-30 22:36 ` Junio C Hamano
  2016-12-13 16:40   ` Christian Couder
  2016-12-03 18:47 ` Lars Schneider
  17 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2016-11-30 22:36 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> For now there should be one odb ref per blob. Each ref name should be
> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
> in the external odb named <odbname>.
>
> These odb refs should all point to a blob that should be stored in the
> Git repository and contain information about the blob stored in the
> external odb. This information can be specific to the external odb.
> The repos can then share this information using commands like:
>
> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`

Unless this is designed to serve only a handful of blobs, I cannot
see how this design would scale successfully.  I notice you wrote
"For now" at the beginning, but what is the envisioned way this will
evolve in the future?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 01/16] Add initial external odb support
  2016-11-30 21:04 ` [RFC/PATCH v3 01/16] Add initial external odb support Christian Couder
@ 2016-11-30 23:30   ` Junio C Hamano
  2016-11-30 23:37     ` Jeff King
  2017-08-03  7:46     ` Christian Couder
  0 siblings, 2 replies; 30+ messages in thread
From: Junio C Hamano @ 2016-11-30 23:30 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> From: Jeff King <peff@peff.net>
>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---

By the time the series loses RFC prefix, we'd need to see the above
three lines straightened out a bit more, e.g. a real message and a
more proper sign-off sequence.

> +static struct odb_helper *find_or_create_helper(const char *name, int len)
> +{
> +	struct odb_helper *o;
> +
> +	for (o = helpers; o; o = o->next)
> +		if (!strncmp(o->name, name, len) && !o->name[len])
> +			return o;
> +
> +	o = odb_helper_new(name, len);
> +	*helpers_tail = o;
> +	helpers_tail = &o->next;
> +
> +	return o;
> +}
> +
> +static int external_odb_config(const char *var, const char *value, void *data)
> +{
> +	struct odb_helper *o;
> +	const char *key, *dot;
> +
> +	if (!skip_prefix(var, "odb.", &key))
> +		return 0;
> +	dot = strrchr(key, '.');
> +	if (!dot)
> +		return 0;

Is this something Peff wrote long time ago?  I find it surprising
that he would write this without using parse_config_key().

> +struct odb_helper_cmd {
> +	struct argv_array argv;
> +	struct child_process child;
> +};
> +
> +static void prepare_helper_command(struct argv_array *argv, const char *cmd,
> +				   const char *fmt, va_list ap)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +
> +	strbuf_addstr(&buf, cmd);
> +	strbuf_addch(&buf, ' ');
> +	strbuf_vaddf(&buf, fmt, ap);
> +
> +	argv_array_push(argv, buf.buf);
> +	strbuf_release(&buf);

This concatenates the cmdname (like "get") and its parameters into a
single string (e.g. "get 454cb6bd52a4de614a3633e4f547af03d5c3b640")
and then places the resulting string as the sole element in argv[]
array, which I find somewhat unusual.  It at least deserves a
comment in front of the function that the callers are responsible to
ensure that the result of vaddf(fmt, ap) is properly shell-quoted.
Otherwise it is inviting quoting errors in the future (even if there
is no such errors in the current code and command set, i.e. a 40-hex
object name that "get" subcommand takes happens to require no
special shell-quoting consideration).

> +int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
> +			    int fd)
> +{
> +	struct odb_helper_object *obj;
> +	struct odb_helper_cmd cmd;
> +	unsigned long total_got;
> +	git_zstream stream;
> +	int zret = Z_STREAM_END;
> +	git_SHA_CTX hash;
> +	unsigned char real_sha1[20];
> +
> +	obj = odb_helper_lookup(o, sha1);
> +	if (!obj)
> +		return -1;
> +
> +	if (odb_helper_start(o, &cmd, "get %s", sha1_to_hex(sha1)) < 0)
> +		return -1;
> +
> +	memset(&stream, 0, sizeof(stream));
> +	git_inflate_init(&stream);
> +	git_SHA1_Init(&hash);
> +	total_got = 0;
> +
> +	for (;;) {
> +		unsigned char buf[4096];
> +		int r;
> +
> +		r = xread(cmd.child.out, buf, sizeof(buf));
> +		if (r < 0) {
> +			error("unable to read from odb helper '%s': %s",
> +			      o->name, strerror(errno));
> +			close(cmd.child.out);
> +			odb_helper_finish(o, &cmd);
> +			git_inflate_end(&stream);
> +			return -1;
> +		}
> +		if (r == 0)
> +			break;
> +
> +		write_or_die(fd, buf, r);
> +
> +		stream.next_in = buf;
> +		stream.avail_in = r;
> +		do {
> +			unsigned char inflated[4096];
> +			unsigned long got;
> +
> +			stream.next_out = inflated;
> +			stream.avail_out = sizeof(inflated);
> +			zret = git_inflate(&stream, Z_SYNC_FLUSH);
> +			got = sizeof(inflated) - stream.avail_out;
> +
> +			git_SHA1_Update(&hash, inflated, got);
> +			/* skip header when counting size */
> +			if (!total_got) {
> +				const unsigned char *p = memchr(inflated, '\0', got);

Does this assume that a single xread() that can result in a
short-read would not return in the middle of "header", and if so, is
that a safe assumption to make?

> +				if (p)
> +					got -= p - inflated + 1;
> +				else
> +					got = 0;
> +			}
> +			total_got += got;
> +		} while (stream.avail_in && zret == Z_OK);
> +	}
> +
> +	close(cmd.child.out);
> +	git_inflate_end(&stream);
> +	git_SHA1_Final(real_sha1, &hash);
> +	if (odb_helper_finish(o, &cmd))
> +		return -1;
> +	if (zret != Z_STREAM_END) {
> +		warning("bad zlib data from odb helper '%s' for %s",
> +			o->name, sha1_to_hex(sha1));
> +		return -1;
> +	}
> +	if (total_got != obj->size) {
> +		warning("size mismatch from odb helper '%s' for %s (%lu != %lu)",
> +			o->name, sha1_to_hex(sha1), total_got, obj->size);
> +		return -1;
> +	}
> +	if (hashcmp(real_sha1, sha1)) {
> +		warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
> +			o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
> +		return -1;
> +	}
> +
> +	return 0;
> +}

OK.  So we call the external helper with "get" command, expecting
that the helper returns the data as a zlib deflated stream, and
validate that the data matches the expected hash and the expected
size, while also saving the contents of the deflated stream to fd.
Presumably the caller opened a fd to write into a loose object?  I
do not see this code actually validating that the loose object
header, i.e. "<type> <len>\0", matches what the caller wanted to see
in obj->size and obj->type.  Shouldn't there be a check for that,
too?

I am tempted to debate myself if it is a sensible design to require
"get" to return a loose object representation, but cannot decide
without seeing the remainder of the series.  An obvious alternative
is to define the "get" request to interface with us via the raw
contents (not even deflated) and leave the deflating to us, i.e. Git
sitting on the receiving end, which would allow us to choose to
store it differently (e.g. we may want to try streaming it into its
own pack using the streaming.h API, for example).


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 01/16] Add initial external odb support
  2016-11-30 23:30   ` Junio C Hamano
@ 2016-11-30 23:37     ` Jeff King
  2017-08-03  7:48       ` Christian Couder
  2017-08-03  7:46     ` Christian Couder
  1 sibling, 1 reply; 30+ messages in thread
From: Jeff King @ 2016-11-30 23:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, git, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

On Wed, Nov 30, 2016 at 03:30:09PM -0800, Junio C Hamano wrote:

> Christian Couder <christian.couder@gmail.com> writes:
> 
> > From: Jeff King <peff@peff.net>
> >
> > Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> > ---
> 
> By the time the series loses RFC prefix, we'd need to see the above
> three lines straightened out a bit more, e.g. a real message and a
> more proper sign-off sequence.

Actually, I would assume that this patch would go away altogether and
get folded into commits that get us closer to the end state. But I
haven't read the series carefully yet, so maybe it really does work
better with this as a base.

I am perfectly OK dropping my commit count by one and getting "based on
a patch by Peff" in a cover letter or elsewhere.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
  2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
                   ` (16 preceding siblings ...)
  2016-11-30 22:36 ` [RFC/PATCH v3 00/16] Add initial experimental external ODB support Junio C Hamano
@ 2016-12-03 18:47 ` Lars Schneider
  2016-12-05 13:23   ` Jeff King
  2016-12-13 17:20   ` Christian Couder
  17 siblings, 2 replies; 30+ messages in thread
From: Lars Schneider @ 2016-12-03 18:47 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Eric Wong, Christian Couder


> On 30 Nov 2016, at 22:04, Christian Couder <christian.couder@gmail.com> wrote:
> 
> Goal
> ~~~~
> 
> Git can store its objects only in the form of loose objects in
> separate files or packed objects in a pack file.
> 
> To be able to better handle some kind of objects, for example big
> blobs, it would be nice if Git could store its objects in other object
> databases (ODB).

This is a great goal. I really hope we can use that to solve the
pain points in the current Git <--> GitLFS integration!
Thanks for working on this!

Minor nit: I feel the term "other" could be more expressive. Plus
"database" might confuse people. What do you think about
"External Object Storage" or something?


> Design
> ~~~~~~
> 
>  - "<command> have": the command should output the sha1, size and
> type of all the objects the external ODB contains, one object per
> line.

This looks impractical. If a repo has 10k external files with
100 versions each then you need to read/transfer 1m hashes (this is
not made up - I am working with Git repos than contain >>10k files
in GitLFS).

Wouldn't it be better if Git collects all hashes that it currently 
needs and then asks the external ODBs if they have them?


>  - "<command> get <sha1>": the command should then read from the
> external ODB the content of the object corresponding to <sha1> and
> output it on stdout.
> 
>  - "<command> put <sha1> <size> <type>": the command should then read
> from stdin an object and store it in the external ODB.

Based on my experience with Git clean/smudge filters I think this kind 
of single shot protocol will be a performance bottleneck as soon as 
people store more than >1000 files in the external ODB.
Maybe you can reuse my "filter process protocol" (edcc858) here?


> * Transfer
> 
> To tranfer information about the blobs stored in external ODB, some
> special refs, called "odb ref", similar as replace refs, are used.
> 
> For now there should be one odb ref per blob. Each ref name should be
> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
> in the external odb named <odbname>.
> 
> These odb refs should all point to a blob that should be stored in the
> Git repository and contain information about the blob stored in the
> external odb. This information can be specific to the external odb.
> The repos can then share this information using commands like:
> 
> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`

The "odbref" would point to a blob and the blob could contain anything,
right? E.g. it could contain an existing GitLFS pointer, right?

version https://git-lfs.github.com/spec/v1
oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
size 12345


> Design discussion about performance
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Yeah, it is not efficient to fork/exec a command to just read or write
> one object to or from the external ODB. Batch calls and/or using a
> daemon and/or RPC should be used instead to be able to store regular
> objects in an external ODB. But for now the external ODB would be all
> about really big files, where the cost of a fork+exec should not
> matter much. If we later want to extend usage of external ODBs, yeah
> we will probably need to design other mechanisms.

I think we should leverage the learnings from GitLFS as much as possible.
My learnings are:

(1) Fork/exec per object won't work. People have lots and lots of content
    that is not suited for Git (e.g. integration test data, images, ...).

(2) We need a good UI. I think it would be great if the average user would 
    not even need to know about ODB. Moving files explicitly with a "put"
    command seems unpractical to me. GitLFS tracks files via filename and
    that has a number of drawbacks, too. Do you see a way to define a 
    customizable metric such as "move all files to ODB X that are gzip 
    compressed larger than Y"?


> Future work
> ~~~~~~~~~~~
> 
> I think that the odb refs don't prevent a regular fetch or push from
> wanting to send the objects that are managed by an external odb. So I
> am interested in suggestions about this problem. I will take a look at
> previous discussions and how other mechanisms (shallow clone, bundle
> v3, ...) handle this.

If the ODB configuration is stored in the Git repo similar to
.gitmodules then every client that clones ODB references would be able
to resolve them, right?

Cheers,
Lars


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
  2016-12-03 18:47 ` Lars Schneider
@ 2016-12-05 13:23   ` Jeff King
  2016-12-13 17:20   ` Christian Couder
  1 sibling, 0 replies; 30+ messages in thread
From: Jeff King @ 2016-12-05 13:23 UTC (permalink / raw)
  To: Lars Schneider
  Cc: Christian Couder, git, Junio C Hamano, Nguyen Thai Ngoc Duy,
	Mike Hommey, Eric Wong, Christian Couder

On Sat, Dec 03, 2016 at 07:47:51PM +0100, Lars Schneider wrote:

> >  - "<command> have": the command should output the sha1, size and
> > type of all the objects the external ODB contains, one object per
> > line.
> 
> This looks impractical. If a repo has 10k external files with
> 100 versions each then you need to read/transfer 1m hashes (this is
> not made up - I am working with Git repos than contain >>10k files
> in GitLFS).

Are you worried about the client-to-server communication, or the pipe
between git and the helper? I had imagined that the client-to-server
communication happen infrequently and be cached.

But 1m hashes is 20MB, which is still a lot to dump over the pipe.
Another option is that Git defines a simple on-disk data structure
(e.g., a flat file of sorted 20-byte binary sha1s), and occasionally
asks the filter "please update your on-disk index".

That still leaves open the question of how the external odb script
efficiently gets updates from the server. It can use an ETag or similar
to avoid downloading an identical copy, but if one hash is added, we'd
want to know that efficiently. That is technically outside the scope of
the git<->external-odb interface, but obviously it's related. The design
of the on-disk format might be make that problem easier or harder on the
external-odb script.

> Wouldn't it be better if Git collects all hashes that it currently 
> needs and then asks the external ODBs if they have them?

I think you're going to run into latency problems when git wants to ask
"do we have this object" and expects the answer to be no. You wouldn't
want a network request for each.

And I think it would be quite complex to teach all operations to work on
a promise-like system where the answer to "do we have it" might be
"maybe; check back after you've figured out the whole batch of hashes
you're interested in".

> >  - "<command> get <sha1>": the command should then read from the
> > external ODB the content of the object corresponding to <sha1> and
> > output it on stdout.
> > 
> >  - "<command> put <sha1> <size> <type>": the command should then read
> > from stdin an object and store it in the external ODB.
> 
> Based on my experience with Git clean/smudge filters I think this kind 
> of single shot protocol will be a performance bottleneck as soon as 
> people store more than >1000 files in the external ODB.
> Maybe you can reuse my "filter process protocol" (edcc858) here?

Yeah. This interface comes from my original proposal, which used the
rationale "well, the files are big, so process startup shouldn't be a
big deal". And I don't think I wrote it down, but an implicit rationale
was "it seems to work for LFS, so it should work here too". But of
course LFS hit scaling problems, and so would this. It was one of the
reasons I was interested in making sure your filter protocol could be
used as a generic template, and I think we would want to do something
similar here.

> > * Transfer
> > 
> > To tranfer information about the blobs stored in external ODB, some
> > special refs, called "odb ref", similar as replace refs, are used.
> > 
> > For now there should be one odb ref per blob. Each ref name should be
> > refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
> > in the external odb named <odbname>.
> > 
> > These odb refs should all point to a blob that should be stored in the
> > Git repository and contain information about the blob stored in the
> > external odb. This information can be specific to the external odb.
> > The repos can then share this information using commands like:
> > 
> > `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`

I'd worry about scaling this part. Traditionally our refs storage does
not scale very well.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
  2016-11-30 22:36 ` [RFC/PATCH v3 00/16] Add initial experimental external ODB support Junio C Hamano
@ 2016-12-13 16:40   ` Christian Couder
  2016-12-13 20:05     ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Christian Couder @ 2016-12-13 16:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder

On Wed, Nov 30, 2016 at 11:36 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> For now there should be one odb ref per blob. Each ref name should be
>> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
>> in the external odb named <odbname>.
>>
>> These odb refs should all point to a blob that should be stored in the
>> Git repository and contain information about the blob stored in the
>> external odb. This information can be specific to the external odb.
>> The repos can then share this information using commands like:
>>
>> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
>
> Unless this is designed to serve only a handful of blobs, I cannot
> see how this design would scale successfully.  I notice you wrote
> "For now" at the beginning, but what is the envisioned way this will
> evolve in the future?

In general I think that having a lot of refs is really a big problem
right now in Git as many big organizations using Git are facing this
problem in one form or another.
So I think that support for a big number of refs is a separate and
important problem that should and hopefully will be solved.

My preferred way to solve it would be with something like Shawn's
RefTree. I think it would also help regarding other problems like
speeding up git protocol, tracking patch series (see git-series
discussions), tools like https://www.softwareheritage.org/, ...

If the "big number of refs" problem is not solved and many refs in
refs/odbs/<odbname>/ is a problem, it's possible to have just one ref
in refs/odbs/<odbname>/ that points to a blob that contains a list
(maybe a json list with information attached to each item) of the
blobs stored in the external odb. Though I think it would be more
complex to edit/parse this list than to deal with many refs in
refs/odbs/<odbname>/.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
  2016-12-03 18:47 ` Lars Schneider
  2016-12-05 13:23   ` Jeff King
@ 2016-12-13 17:20   ` Christian Couder
  2016-12-18 13:13     ` Lars Schneider
  1 sibling, 1 reply; 30+ messages in thread
From: Christian Couder @ 2016-12-13 17:20 UTC (permalink / raw)
  To: Lars Schneider
  Cc: git, Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Eric Wong, Christian Couder

On Sat, Dec 3, 2016 at 7:47 PM, Lars Schneider <larsxschneider@gmail.com> wrote:
>
>> On 30 Nov 2016, at 22:04, Christian Couder <christian.couder@gmail.com> wrote:
>>
>> Goal
>> ~~~~
>>
>> Git can store its objects only in the form of loose objects in
>> separate files or packed objects in a pack file.
>>
>> To be able to better handle some kind of objects, for example big
>> blobs, it would be nice if Git could store its objects in other object
>> databases (ODB).
>
> This is a great goal. I really hope we can use that to solve the
> pain points in the current Git <--> GitLFS integration!

Yeah, I hope it will help too.

> Thanks for working on this!
>
> Minor nit: I feel the term "other" could be more expressive. Plus
> "database" might confuse people. What do you think about
> "External Object Storage" or something?

In the current Git code, "DB" is already used a lot. For example in
cache.h there is:

#define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"

#define ALTERNATE_DB_ENVIRONMENT "GIT_ALTERNATE_OBJECT_DIRECTORIES"

#define INIT_DB_QUIET 0x0001
#define INIT_DB_EXIST_OK 0x0002

extern int init_db(const char *git_dir, const char *real_git_dir,
           const char *template_dir, unsigned int flags);

[...]

>>  - "<command> get <sha1>": the command should then read from the
>> external ODB the content of the object corresponding to <sha1> and
>> output it on stdout.
>>
>>  - "<command> put <sha1> <size> <type>": the command should then read
>> from stdin an object and store it in the external ODB.
>
> Based on my experience with Git clean/smudge filters I think this kind
> of single shot protocol will be a performance bottleneck as soon as
> people store more than >1000 files in the external ODB.
> Maybe you can reuse my "filter process protocol" (edcc858) here?

Yeah, I would like to do reuse your "filter process protocol" as much
as possible to improve this in the future.

>> * Transfer
>>
>> To tranfer information about the blobs stored in external ODB, some
>> special refs, called "odb ref", similar as replace refs, are used.
>>
>> For now there should be one odb ref per blob. Each ref name should be
>> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
>> in the external odb named <odbname>.
>>
>> These odb refs should all point to a blob that should be stored in the
>> Git repository and contain information about the blob stored in the
>> external odb. This information can be specific to the external odb.
>> The repos can then share this information using commands like:
>>
>> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
>
> The "odbref" would point to a blob and the blob could contain anything,
> right? E.g. it could contain an existing GitLFS pointer, right?
>
> version https://git-lfs.github.com/spec/v1
> oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
> size 12345

Yes, but I think that the sha1 should be added. So yes, it could
easily be made compatible with git LFS.

>> Design discussion about performance
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> Yeah, it is not efficient to fork/exec a command to just read or write
>> one object to or from the external ODB. Batch calls and/or using a
>> daemon and/or RPC should be used instead to be able to store regular
>> objects in an external ODB. But for now the external ODB would be all
>> about really big files, where the cost of a fork+exec should not
>> matter much. If we later want to extend usage of external ODBs, yeah
>> we will probably need to design other mechanisms.
>
> I think we should leverage the learnings from GitLFS as much as possible.
> My learnings are:
>
> (1) Fork/exec per object won't work. People have lots and lots of content
>     that is not suited for Git (e.g. integration test data, images, ...).

I agree that it will not work for many people, but look at how git LFS
evolved. It first started without a good solution for those people,
and then you provided a much better solution to them.
So I am a bit reluctant to work on a complex solution reusing your
"filter process protocol" work right away.

> (2) We need a good UI. I think it would be great if the average user would
>     not even need to know about ODB. Moving files explicitly with a "put"
>     command seems unpractical to me. GitLFS tracks files via filename and
>     that has a number of drawbacks, too. Do you see a way to define a
>     customizable metric such as "move all files to ODB X that are gzip
>     compressed larger than Y"?

I think these should be defined in the config and attributes files. It
could also be possible to implement a "want" command (in the same way
as the "get", "put" and "have" commands) to ask the e-odb helper if it
wants to store a specific blob.

>> Future work
>> ~~~~~~~~~~~
>>
>> I think that the odb refs don't prevent a regular fetch or push from
>> wanting to send the objects that are managed by an external odb. So I
>> am interested in suggestions about this problem. I will take a look at
>> previous discussions and how other mechanisms (shallow clone, bundle
>> v3, ...) handle this.
>
> If the ODB configuration is stored in the Git repo similar to
> .gitmodules then every client that clones ODB references would be able
> to resolve them, right?

Yeah, but I am not sure that being able to resolve the odb refs will
prevent the big blobs from being sent.
With Git LFS, git doesn't know about the big blobs, only about the
substituted files, but that is not the case in what I am doing.

Thanks,
Christian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
  2016-12-13 16:40   ` Christian Couder
@ 2016-12-13 20:05     ` Junio C Hamano
  2016-12-15  9:56       ` Christian Couder
  0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2016-12-13 20:05 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> In general I think that having a lot of refs is really a big problem
> right now in Git as many big organizations using Git are facing this
> problem in one form or another.
> So I think that support for a big number of refs is a separate and
> important problem that should and hopefully will be solved.

But you do not have to make it worse.

Is "refs" a good match for the problem you are solving?  Or is it
merely an expedient thing to use?  I think it is the latter, judging
by your mentioning RefTree.  Whatever mechanism we choose, that will
be carved into stone in users' repositories and you'd end up having
to support it, and devise the migration path out of it if the initial
selection is too problematic.

That is why people (not just me) pointed out upfront that using refs
for this purose would not scale.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
  2016-12-13 20:05     ` Junio C Hamano
@ 2016-12-15  9:56       ` Christian Couder
  0 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2016-12-15  9:56 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder

On Tue, Dec 13, 2016 at 9:05 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> In general I think that having a lot of refs is really a big problem
>> right now in Git as many big organizations using Git are facing this
>> problem in one form or another.
>> So I think that support for a big number of refs is a separate and
>> important problem that should and hopefully will be solved.
>
> But you do not have to make it worse.
>
> Is "refs" a good match for the problem you are solving?  Or is it
> merely an expedient thing to use?  I think it is the latter, judging
> by your mentioning RefTree.  Whatever mechanism we choose, that will
> be carved into stone in users' repositories and you'd end up having
> to support it, and devise the migration path out of it if the initial
> selection is too problematic.
>
> That is why people (not just me) pointed out upfront that using refs
> for this purose would not scale.

What I should perhaps have clarified in my previous answer, and also
in the documentation of the patch series, is that in what I have done
and what I propose, the external odb helper is responsible for using
and creating the refs in refs/odbs/<odbname>/.

So this helper is free to just create one ref, as it is also free to
create many refs. Git is just transmitting the refs that have been
created by this helper.

Right now people are already free to use whatever external script or
software to create whatever refs/stuff/* they want, pointing to
whatever objects they want, and have Git transmit that. And indeed I
know that it is already a problem out there, as then people often get
into trouble related to having many refs. But it is a different
problem that is not going to be solved anyway in this patch series.

So if some people want to use a specific external odb, it's their
responsibility to use an helper that will not create too many refs.
If they know that they just need their external odb to handle around
10 big files, why wouldn't they use a simple helper that creates one
odb ref per big file/blob?

On the contrary if they know that they will need to handle thousands
of big files, then, yeah, they should find or implement a helper that
will, as I suggested in my previous email, just create one ref
in refs/odbs/<odbname>/ that points to a blob that contains a list
(maybe a json list with information attached to each item) of the
blobs stored in the external odb.

For testing purposes in what I have done in the patch series, I use
only simple helpers that create one odb ref per big file/blob. So yes,
it gives a bad example, because, if people just copy this design while
they need the e-odb to handle a big number of files, then they will be
in trouble. But this does not by itself carve anything into stone.

One thing that could help is perhaps to put big warnings into the
simple helpers saying "Be careful!!! This will not scale if you want
to handle more than a small number of large files!!! You'd better use
an helper that does <this and that> if you want to handle many large
files!!! You have been warned!!!".

So I am reluctant at this point to write a complex helper just for the
purpose of showing a good example to people who want to use e-odb to
store a big number of files, as these people anyway would probably
need something like Lars' "filter process protocol" too.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
  2016-12-13 17:20   ` Christian Couder
@ 2016-12-18 13:13     ` Lars Schneider
  0 siblings, 0 replies; 30+ messages in thread
From: Lars Schneider @ 2016-12-18 13:13 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Eric Wong, Christian Couder


> On 13 Dec 2016, at 18:20, Christian Couder <christian.couder@gmail.com> wrote:
> 
> On Sat, Dec 3, 2016 at 7:47 PM, Lars Schneider <larsxschneider@gmail.com> wrote:
>> 
>>> On 30 Nov 2016, at 22:04, Christian Couder <christian.couder@gmail.com> wrote:
>>> 
>>> Goal
>>> ~~~~
>>> 
>>> Git can store its objects only in the form of loose objects in
>>> separate files or packed objects in a pack file.
>>> 
>>> To be able to better handle some kind of objects, for example big
>>> blobs, it would be nice if Git could store its objects in other object
>>> databases (ODB).
>> ...
>> 
>> Minor nit: I feel the term "other" could be more expressive. Plus
>> "database" might confuse people. What do you think about
>> "External Object Storage" or something?
> 
> In the current Git code, "DB" is already used a lot. For example in
> cache.h there is:
> ...

I am not worried about Git core developers as I don't think they would
be confused by the term "DB". I wonder if it would make sense to have
a clearer "external name" for the average Git user (== non Git devs) 
or if this would create just more confusion. 


>>> * Transfer
>>> 
>>> To tranfer information about the blobs stored in external ODB, some
>>> special refs, called "odb ref", similar as replace refs, are used.
>>> 
>>> For now there should be one odb ref per blob. Each ref name should be
>>> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
>>> in the external odb named <odbname>.
>>> 
>>> These odb refs should all point to a blob that should be stored in the
>>> Git repository and contain information about the blob stored in the
>>> external odb. This information can be specific to the external odb.
>>> The repos can then share this information using commands like:
>>> 
>>> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
>> 
>> The "odbref" would point to a blob and the blob could contain anything,
>> right? E.g. it could contain an existing GitLFS pointer, right?
>> 
>> version https://git-lfs.github.com/spec/v1
>> oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
>> size 12345
> 
> Yes, but I think that the sha1 should be added. So yes, it could
> easily be made compatible with git LFS.

What do you mean with "the sha1 should be added"? Do you suggest to add
sha1 to GitLFS?


>>> Future work
>>> ~~~~~~~~~~~
>>> 
>>> I think that the odb refs don't prevent a regular fetch or push from
>>> wanting to send the objects that are managed by an external odb. So I
>>> am interested in suggestions about this problem. I will take a look at
>>> previous discussions and how other mechanisms (shallow clone, bundle
>>> v3, ...) handle this.
>> 
>> If the ODB configuration is stored in the Git repo similar to
>> .gitmodules then every client that clones ODB references would be able
>> to resolve them, right?
> 
> Yeah, but I am not sure that being able to resolve the odb refs will
> prevent the big blobs from being sent.
> With Git LFS, git doesn't know about the big blobs, only about the
> substituted files, but that is not the case in what I am doing.

I think the biggest problem in Git are huge blobs that are not in the
head revision. In the great majority of cases you don't need these blobs
but you always have to transfer them during clone. That's what GitLFS
is solving today and what I hope your protocol could solve better in
the future!

Cheers,
Lars

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 01/16] Add initial external odb support
  2016-11-30 23:30   ` Junio C Hamano
  2016-11-30 23:37     ` Jeff King
@ 2017-08-03  7:46     ` Christian Couder
  2017-08-03  8:06       ` Jeff King
  1 sibling, 1 reply; 30+ messages in thread
From: Christian Couder @ 2017-08-03  7:46 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder

(I realized that I didn't answer this email about the v3 version.
Sorry about this late answer.)

On Thu, Dec 1, 2016 at 12:30 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> From: Jeff King <peff@peff.net>
>>
>> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
>> ---
>
> By the time the series loses RFC prefix, we'd need to see the above
> three lines straightened out a bit more, e.g. a real message and a
> more proper sign-off sequence.

Yeah, it is better in the v5 series I will send really soon now.

>> +static struct odb_helper *find_or_create_helper(const char *name, int len)
>> +{
>> +     struct odb_helper *o;
>> +
>> +     for (o = helpers; o; o = o->next)
>> +             if (!strncmp(o->name, name, len) && !o->name[len])
>> +                     return o;
>> +
>> +     o = odb_helper_new(name, len);
>> +     *helpers_tail = o;
>> +     helpers_tail = &o->next;
>> +
>> +     return o;
>> +}
>> +
>> +static int external_odb_config(const char *var, const char *value, void *data)
>> +{
>> +     struct odb_helper *o;
>> +     const char *key, *dot;
>> +
>> +     if (!skip_prefix(var, "odb.", &key))
>> +             return 0;
>> +     dot = strrchr(key, '.');
>> +     if (!dot)
>> +             return 0;
>
> Is this something Peff wrote long time ago?  I find it surprising
> that he would write this without using parse_config_key().

parse_config_key() is used now.

>> +struct odb_helper_cmd {
>> +     struct argv_array argv;
>> +     struct child_process child;
>> +};
>> +
>> +static void prepare_helper_command(struct argv_array *argv, const char *cmd,
>> +                                const char *fmt, va_list ap)
>> +{
>> +     struct strbuf buf = STRBUF_INIT;
>> +
>> +     strbuf_addstr(&buf, cmd);
>> +     strbuf_addch(&buf, ' ');
>> +     strbuf_vaddf(&buf, fmt, ap);
>> +
>> +     argv_array_push(argv, buf.buf);
>> +     strbuf_release(&buf);
>
> This concatenates the cmdname (like "get") and its parameters into a
> single string (e.g. "get 454cb6bd52a4de614a3633e4f547af03d5c3b640")
> and then places the resulting string as the sole element in argv[]
> array, which I find somewhat unusual.  It at least deserves a
> comment in front of the function that the callers are responsible to
> ensure that the result of vaddf(fmt, ap) is properly shell-quoted.

I added a comment.

> Otherwise it is inviting quoting errors in the future (even if there
> is no such errors in the current code and command set, i.e. a 40-hex
> object name that "get" subcommand takes happens to require no
> special shell-quoting consideration).

Yeah, but I am not sure how this could be done better.

>> +int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
>> +                         int fd)
>> +{
>> +     struct odb_helper_object *obj;
>> +     struct odb_helper_cmd cmd;
>> +     unsigned long total_got;
>> +     git_zstream stream;
>> +     int zret = Z_STREAM_END;
>> +     git_SHA_CTX hash;
>> +     unsigned char real_sha1[20];
>> +
>> +     obj = odb_helper_lookup(o, sha1);
>> +     if (!obj)
>> +             return -1;
>> +
>> +     if (odb_helper_start(o, &cmd, "get %s", sha1_to_hex(sha1)) < 0)
>> +             return -1;
>> +
>> +     memset(&stream, 0, sizeof(stream));
>> +     git_inflate_init(&stream);
>> +     git_SHA1_Init(&hash);
>> +     total_got = 0;
>> +
>> +     for (;;) {
>> +             unsigned char buf[4096];
>> +             int r;
>> +
>> +             r = xread(cmd.child.out, buf, sizeof(buf));
>> +             if (r < 0) {
>> +                     error("unable to read from odb helper '%s': %s",
>> +                           o->name, strerror(errno));
>> +                     close(cmd.child.out);
>> +                     odb_helper_finish(o, &cmd);
>> +                     git_inflate_end(&stream);
>> +                     return -1;
>> +             }
>> +             if (r == 0)
>> +                     break;
>> +
>> +             write_or_die(fd, buf, r);
>> +
>> +             stream.next_in = buf;
>> +             stream.avail_in = r;
>> +             do {
>> +                     unsigned char inflated[4096];
>> +                     unsigned long got;
>> +
>> +                     stream.next_out = inflated;
>> +                     stream.avail_out = sizeof(inflated);
>> +                     zret = git_inflate(&stream, Z_SYNC_FLUSH);
>> +                     got = sizeof(inflated) - stream.avail_out;
>> +
>> +                     git_SHA1_Update(&hash, inflated, got);
>> +                     /* skip header when counting size */
>> +                     if (!total_got) {
>> +                             const unsigned char *p = memchr(inflated, '\0', got);
>
> Does this assume that a single xread() that can result in a
> short-read would not return in the middle of "header", and if so, is
> that a safe assumption to make?

I am not sure what would go wrong in case of a short read.
My guess is that as long as we test that p is not NULL below we should be fine.
As Peff wrote this code, he could probably answer much better than me.

>> +                             if (p)
>> +                                     got -= p - inflated + 1;
>> +                             else
>> +                                     got = 0;
>> +                     }
>> +                     total_got += got;
>> +             } while (stream.avail_in && zret == Z_OK);
>> +     }
>> +
>> +     close(cmd.child.out);
>> +     git_inflate_end(&stream);
>> +     git_SHA1_Final(real_sha1, &hash);
>> +     if (odb_helper_finish(o, &cmd))
>> +             return -1;
>> +     if (zret != Z_STREAM_END) {
>> +             warning("bad zlib data from odb helper '%s' for %s",
>> +                     o->name, sha1_to_hex(sha1));
>> +             return -1;
>> +     }
>> +     if (total_got != obj->size) {
>> +             warning("size mismatch from odb helper '%s' for %s (%lu != %lu)",
>> +                     o->name, sha1_to_hex(sha1), total_got, obj->size);
>> +             return -1;
>> +     }
>> +     if (hashcmp(real_sha1, sha1)) {
>> +             warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
>> +                     o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
>> +             return -1;
>> +     }
>> +
>> +     return 0;
>> +}
>
> OK.  So we call the external helper with "get" command, expecting
> that the helper returns the data as a zlib deflated stream, and
> validate that the data matches the expected hash and the expected
> size, while also saving the contents of the deflated stream to fd.
> Presumably the caller opened a fd to write into a loose object?  I
> do not see this code actually validating that the loose object
> header, i.e. "<type> <len>\0", matches what the caller wanted to see
> in obj->size and obj->type.  Shouldn't there be a check for that,
> too?

Yeah, it would be better with a check. It also looks like checking
that must take into account possible short-read that could happen (as
discussed above), so for now I will see if we need to fix the header
reading, before taking a look at this.

> I am tempted to debate myself if it is a sensible design to require
> "get" to return a loose object representation, but cannot decide
> without seeing the remainder of the series.  An obvious alternative
> is to define the "get" request to interface with us via the raw
> contents (not even deflated) and leave the deflating to us, i.e. Git
> sitting on the receiving end, which would allow us to choose to
> store it differently (e.g. we may want to try streaming it into its
> own pack using the streaming.h API, for example).

There is now both a get_raw_obj and a get_git_obj to handle both cases.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 01/16] Add initial external odb support
  2016-11-30 23:37     ` Jeff King
@ 2017-08-03  7:48       ` Christian Couder
  0 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2017-08-03  7:48 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, git, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

On Thu, Dec 1, 2016 at 12:37 AM, Jeff King <peff@peff.net> wrote:
> On Wed, Nov 30, 2016 at 03:30:09PM -0800, Junio C Hamano wrote:
>
>> Christian Couder <christian.couder@gmail.com> writes:
>>
>> > From: Jeff King <peff@peff.net>
>> >
>> > Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
>> > ---
>>
>> By the time the series loses RFC prefix, we'd need to see the above
>> three lines straightened out a bit more, e.g. a real message and a
>> more proper sign-off sequence.
>
> Actually, I would assume that this patch would go away altogether and
> get folded into commits that get us closer to the end state. But I
> haven't read the series carefully yet, so maybe it really does work
> better with this as a base.
>
> I am perfectly OK dropping my commit count by one and getting "based on
> a patch by Peff" in a cover letter or elsewhere.

Ok I have dropped your commit count and added "Helped-by: Jeff King
<peff@peff.net>" in the patches that come from your work.

Thanks!

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC/PATCH v3 01/16] Add initial external odb support
  2017-08-03  7:46     ` Christian Couder
@ 2017-08-03  8:06       ` Jeff King
  0 siblings, 0 replies; 30+ messages in thread
From: Jeff King @ 2017-08-03  8:06 UTC (permalink / raw)
  To: Christian Couder
  Cc: Junio C Hamano, git, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder

On Thu, Aug 03, 2017 at 09:46:38AM +0200, Christian Couder wrote:

> >> +static int external_odb_config(const char *var, const char *value, void *data)
> >> +{
> >> +     struct odb_helper *o;
> >> +     const char *key, *dot;
> >> +
> >> +     if (!skip_prefix(var, "odb.", &key))
> >> +             return 0;
> >> +     dot = strrchr(key, '.');
> >> +     if (!dot)
> >> +             return 0;
> >
> > Is this something Peff wrote long time ago?  I find it surprising
> > that he would write this without using parse_config_key().
> 
> parse_config_key() is used now.

Yeah, I think the original was from 2012. We didn't add
parse_config_key() until 2013. Definitely worth using.

> >> +     for (;;) {
> >> +             unsigned char buf[4096];
> >> +             int r;
> >> +
> >> +             r = xread(cmd.child.out, buf, sizeof(buf));
> >> +             if (r < 0) {
> >> +                     error("unable to read from odb helper '%s': %s",
> >> +                           o->name, strerror(errno));
> >> +                     close(cmd.child.out);
> >> +                     odb_helper_finish(o, &cmd);
> >> +                     git_inflate_end(&stream);
> >> +                     return -1;
> >> +             }
> >> +             if (r == 0)
> >> +                     break;
> >> +
> >> +             write_or_die(fd, buf, r);
> >> +
> >> +             stream.next_in = buf;
> >> +             stream.avail_in = r;
> >> +             do {
> >> +                     unsigned char inflated[4096];
> >> +                     unsigned long got;
> >> +
> >> +                     stream.next_out = inflated;
> >> +                     stream.avail_out = sizeof(inflated);
> >> +                     zret = git_inflate(&stream, Z_SYNC_FLUSH);
> >> +                     got = sizeof(inflated) - stream.avail_out;
> >> +
> >> +                     git_SHA1_Update(&hash, inflated, got);
> >> +                     /* skip header when counting size */
> >> +                     if (!total_got) {
> >> +                             const unsigned char *p = memchr(inflated, '\0', got);
> [reconstructed quoted function so we can see the whole thing]
> >> +                             if (p)
> >> +                                     got -= p - inflated + 1;
> >> +                             else
> >> +                                     got = 0;
> >> +                     }
> >> +                     total_got += got;
> >> +             } while (stream.avail_in && zret == Z_OK);
> >> +     }
> >
> > Does this assume that a single xread() that can result in a
> > short-read would not return in the middle of "header", and if so, is
> > that a safe assumption to make?
> 
> I am not sure what would go wrong in case of a short read.
> My guess is that as long as we test that p is not NULL below we should be fine.
> As Peff wrote this code, he could probably answer much better than me.

I think it's OK. The idea is to suck up characters until we hit the
end-of-header NUL. So "total_got" only becomes non-zero once we've seen
that NUL. If we get a short read then that memchr() returns NULL, and we
set "got" to 0, and don't advance total_got at all. When we finally do
hit a partial read that contains the NUL, then "p" will be non-NULL, and
we'll reduce "got" as appropriate.

All that said, I agree with the bits you both said later that we should
probably be checking the actual content of the header (if we're indeed
going to keep this on-the-wire format -- see below).

> > I am tempted to debate myself if it is a sensible design to require
> > "get" to return a loose object representation, but cannot decide
> > without seeing the remainder of the series.  An obvious alternative
> > is to define the "get" request to interface with us via the raw
> > contents (not even deflated) and leave the deflating to us, i.e. Git
> > sitting on the receiving end, which would allow us to choose to
> > store it differently (e.g. we may want to try streaming it into its
> > own pack using the streaming.h API, for example).
> 
> There is now both a get_raw_obj and a get_git_obj to handle both cases.

Yeah, I don't recall why I picked the loose format as the transfer
mechanism. I guess I figured that the objects would be large, so we'd
want them in their own loose objects (not part of a pack that might have
to get rewritten). So storing and sending as a loose-object zlib stream
means we can avoid any recompression and just send it out to disk.

But that was a long time ago. I think these days we prefer to put even
large objects into packfiles. If we wanted something the client could
stream directly into storage, the pack format would probably make more
sense. But it also probably isn't the end of the world to just get raw
contents and then re-zlib them. That's a little extra overhead on the
receiving side, but it makes things nice and simple. And we're already
uncompressing and computing the sha1 to verify the incoming content
anyway.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2017-08-03  8:06 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-30 21:04 [RFC/PATCH v3 00/16] Add initial experimental external ODB support Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 01/16] Add initial external odb support Christian Couder
2016-11-30 23:30   ` Junio C Hamano
2016-11-30 23:37     ` Jeff King
2017-08-03  7:48       ` Christian Couder
2017-08-03  7:46     ` Christian Couder
2017-08-03  8:06       ` Jeff King
2016-11-30 21:04 ` [RFC/PATCH v3 02/16] external odb foreach Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 03/16] t0400: use --batch-all-objects to get all objects Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 04/16] t0400: add 'put' command to odb-helper script Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 05/16] t0400: add test for 'put' command Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 06/16] external odb: add write support Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 07/16] external-odb: accept only blobs for now Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 08/16] t0400: add test for external odb write support Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 09/16] Add GIT_NO_EXTERNAL_ODB env variable Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 10/16] Add t0410 to test external ODB transfer Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 11/16] lib-httpd: pass config file to start_httpd() Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 12/16] lib-httpd: add upload.sh Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 13/16] lib-httpd: add list.sh Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 14/16] lib-httpd: add apache-e-odb.conf Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 15/16] odb-helper: add 'store_plain_objects' to 'struct odb_helper' Christian Couder
2016-11-30 21:04 ` [RFC/PATCH v3 16/16] t0420: add test with HTTP external odb Christian Couder
2016-11-30 22:36 ` [RFC/PATCH v3 00/16] Add initial experimental external ODB support Junio C Hamano
2016-12-13 16:40   ` Christian Couder
2016-12-13 20:05     ` Junio C Hamano
2016-12-15  9:56       ` Christian Couder
2016-12-03 18:47 ` Lars Schneider
2016-12-05 13:23   ` Jeff King
2016-12-13 17:20   ` Christian Couder
2016-12-18 13:13     ` Lars Schneider

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).