git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [WIP 0/2] Submodule ref backend that mirrors superproject
@ 2017-12-01 22:50 Jonathan Tan
  2017-12-01 22:50 ` [WIP 1/2] submodule: refactor acquisition of superproject info Jonathan Tan
  2017-12-01 22:50 ` [WIP 2/2] submodule: read-only super-backed ref backend Jonathan Tan
  0 siblings, 2 replies; 5+ messages in thread
From: Jonathan Tan @ 2017-12-01 22:50 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, jacob.keller, Jonathan Tan

I sent out an earlier email [1] that discusses the idea of a submodule
ref backend that mirrors the superproject. Basically, if this backend is
used (for example, through a configuration option), the submodule itself
will not store any "refs/..." refs, but will check the gitlink of the
commit of the ref of the same name in the superproject. For example, if
the superproject has at refs/heads/foo:

    superproject/
      sub [gitlink -> 1234...]

and at refs/heads/bar:

    superproject/
      sub [gitlink -> 5678...]

Inside sub, "git rev-parse foo" will output "1234...", and "git rev-parse
bar" will output "5678...", even though "foo" and "bar" are not defined
anywhere inside sub.

(The submodule handles refs that do not start with "refs/" - for
example, HEAD and FETCH_HEAD - like usual.)

[1] also describes what happens when the submodule attempts to write to
any "refs/..." ref.

For those interested, here's what such an implementation might look
like, and a test to demonstrate such functionality. I have partial
read-only functionality - a lot of it still remains to be done.

[1] https://public-inbox.org/git/20171108172945.33c42a0e91b4ac494217b788@google.com/

Jonathan Tan (2):
  submodule: refactor acquisition of superproject info
  submodule: read-only super-backed ref backend

 Makefile                       |   1 +
 refs.c                         |  11 +-
 refs/refs-internal.h           |   1 +
 refs/sp-backend.c              | 261 +++++++++++++++++++++++++++++++++++++++++
 submodule.c                    | 107 +++++++++++------
 submodule.h                    |   5 +
 t/t1406-submodule-ref-store.sh |  26 ++++
 7 files changed, 374 insertions(+), 38 deletions(-)
 create mode 100644 refs/sp-backend.c

-- 
2.15.0.531.g2ccb3012c9-goog


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [WIP 1/2] submodule: refactor acquisition of superproject info
  2017-12-01 22:50 [WIP 0/2] Submodule ref backend that mirrors superproject Jonathan Tan
@ 2017-12-01 22:50 ` Jonathan Tan
  2017-12-05 19:57   ` Stefan Beller
  2017-12-01 22:50 ` [WIP 2/2] submodule: read-only super-backed ref backend Jonathan Tan
  1 sibling, 1 reply; 5+ messages in thread
From: Jonathan Tan @ 2017-12-01 22:50 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, jacob.keller, Jonathan Tan

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 submodule.c | 76 +++++++++++++++++++++++++++++++++++++------------------------
 submodule.h |  3 +++
 2 files changed, 49 insertions(+), 30 deletions(-)

diff --git a/submodule.c b/submodule.c
index bb531e0e5..ce511180e 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1977,16 +1977,24 @@ void absorb_git_dir_into_superproject(const char *prefix,
 	}
 }
 
-const char *get_superproject_working_tree(void)
+/*
+ * Get the full filename of the gitlink entry corresponding to the
+ * superproject having this repo as a submodule.
+ *
+ * full_name will point to an internal buffer upon success.
+ *
+ * Return 0 if successful, 1 if not (for example, if the parent
+ * directory is not a repo or an unrelated one).
+ */
+int get_superproject_entry(const char **full_name)
 {
+	static struct strbuf sb = STRBUF_INIT;
+
 	struct child_process cp = CHILD_PROCESS_INIT;
-	struct strbuf sb = STRBUF_INIT;
 	const char *one_up = real_path_if_valid("../");
 	const char *cwd = xgetcwd();
-	const char *ret = NULL;
 	const char *subpath;
 	int code;
-	ssize_t len;
 
 	if (!is_inside_work_tree())
 		/*
@@ -1994,11 +2002,15 @@ const char *get_superproject_working_tree(void)
 		 * We might have a superproject, but it is harder
 		 * to determine.
 		 */
-		return NULL;
+		return 1;
 
 	if (!one_up)
-		return NULL;
+		return 1;
 
+	/*
+	 * No need to reset sb, since relative_path() will do it if
+	 * necessary.
+	 */
 	subpath = relative_path(cwd, one_up, &sb);
 
 	prepare_submodule_repo_env(&cp.env_array);
@@ -2017,45 +2029,49 @@ const char *get_superproject_working_tree(void)
 	if (start_command(&cp))
 		die(_("could not start ls-files in .."));
 
-	len = strbuf_read(&sb, cp.out, PATH_MAX);
+	strbuf_read(&sb, cp.out, PATH_MAX);
 	close(cp.out);
 
-	if (starts_with(sb.buf, "160000")) {
-		int super_sub_len;
-		int cwd_len = strlen(cwd);
-		char *super_sub, *super_wt;
+	code = finish_command(&cp);
 
+	if (starts_with(sb.buf, "160000")) {
 		/*
 		 * There is a superproject having this repo as a submodule.
 		 * The format is <mode> SP <hash> SP <stage> TAB <full name> \0,
 		 * We're only interested in the name after the tab.
 		 */
-		super_sub = strchr(sb.buf, '\t') + 1;
-		super_sub_len = sb.buf + sb.len - super_sub - 1;
-
-		if (super_sub_len > cwd_len ||
-		    strcmp(&cwd[cwd_len - super_sub_len], super_sub))
-			die (_("BUG: returned path string doesn't match cwd?"));
-
-		super_wt = xstrdup(cwd);
-		super_wt[cwd_len - super_sub_len] = '\0';
-
-		ret = real_path(super_wt);
-		free(super_wt);
+		char *tab = strchr(sb.buf, '\t');
+		if (!tab)
+			die("BUG: ls-files returned line with no tab");
+		*full_name = tab + 1;
+		return 0;
 	}
-	strbuf_release(&sb);
-
-	code = finish_command(&cp);
 
 	if (code == 128)
 		/* '../' is not a git repository */
-		return NULL;
-	if (code == 0 && len == 0)
+		return 1;
+	if (code == 0)
 		/* There is an unrelated git repository at '../' */
+		return 1;
+	die(_("ls-tree returned unexpected return code %d"), code);
+}
+
+const char *get_superproject_working_tree(void)
+{
+	const char *full_name;
+	char *super_wt;
+	size_t len;
+	const char *ret;
+
+	if (get_superproject_entry(&full_name))
 		return NULL;
-	if (code)
-		die(_("ls-tree returned unexpected return code %d"), code);
 
+	super_wt = xstrdup(xgetcwd());
+	if (!strip_suffix(super_wt, full_name, &len))
+		die("BUG: returned path string doesn't match cwd?");
+	super_wt[len] = '\0';
+	ret = real_path(super_wt);
+	free(super_wt);
 	return ret;
 }
 
diff --git a/submodule.h b/submodule.h
index f0da0277a..29ab302cc 100644
--- a/submodule.h
+++ b/submodule.h
@@ -134,6 +134,9 @@ extern void absorb_git_dir_into_superproject(const char *prefix,
  * Return the absolute path of the working tree of the superproject, which this
  * project is a submodule of. If this repository is not a submodule of
  * another repository, return NULL.
+ *
+ * The return value, if not NULL, points to the same shared buffer as the one
+ * returned by real_path().
  */
 extern const char *get_superproject_working_tree(void);
 
-- 
2.15.0.531.g2ccb3012c9-goog


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [WIP 2/2] submodule: read-only super-backed ref backend
  2017-12-01 22:50 [WIP 0/2] Submodule ref backend that mirrors superproject Jonathan Tan
  2017-12-01 22:50 ` [WIP 1/2] submodule: refactor acquisition of superproject info Jonathan Tan
@ 2017-12-01 22:50 ` Jonathan Tan
  2017-12-05 22:28   ` Stefan Beller
  1 sibling, 1 reply; 5+ messages in thread
From: Jonathan Tan @ 2017-12-01 22:50 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, jacob.keller, Jonathan Tan

Note that a few major parts are still missing:
 - special handling of the current branch of the superproject
 - writing (whether "refs/..." to the superproject as an index change or
   a commit, or non-"refs/..." directly to the subproject like usual)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Makefile                       |   1 +
 refs.c                         |  11 +-
 refs/refs-internal.h           |   1 +
 refs/sp-backend.c              | 261 +++++++++++++++++++++++++++++++++++++++++
 submodule.c                    |  43 +++++--
 submodule.h                    |   2 +
 t/t1406-submodule-ref-store.sh |  26 ++++
 7 files changed, 331 insertions(+), 14 deletions(-)
 create mode 100644 refs/sp-backend.c

diff --git a/Makefile b/Makefile
index e53750ca0..74120b5d7 100644
--- a/Makefile
+++ b/Makefile
@@ -858,6 +858,7 @@ LIB_OBJS += refs/files-backend.o
 LIB_OBJS += refs/iterator.o
 LIB_OBJS += refs/packed-backend.o
 LIB_OBJS += refs/ref-cache.o
+LIB_OBJS += refs/sp-backend.o
 LIB_OBJS += ref-filter.o
 LIB_OBJS += remote.o
 LIB_OBJS += replace_object.o
diff --git a/refs.c b/refs.c
index 339d4318e..1f7922733 100644
--- a/refs.c
+++ b/refs.c
@@ -1575,12 +1575,17 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
 static struct ref_store *ref_store_init(const char *gitdir,
 					unsigned int flags)
 {
-	const char *be_name = "files";
-	struct ref_storage_be *be = find_ref_storage_backend(be_name);
+	struct ref_storage_be *be;
 	struct ref_store *refs;
 
+	if (getenv("USE_SP")) {
+		be = &refs_be_sp;
+	} else {
+		be = &refs_be_files;
+	}
+
 	if (!be)
-		die("BUG: reference backend %s is unknown", be_name);
+		die("BUG: reference backend %s is unknown", "files");
 
 	refs = be->init(gitdir, flags);
 	return refs;
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index dd834314b..a8ec03d90 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -651,6 +651,7 @@ struct ref_storage_be {
 
 extern struct ref_storage_be refs_be_files;
 extern struct ref_storage_be refs_be_packed;
+extern struct ref_storage_be refs_be_sp;
 
 /*
  * A representation of the reference store for the main repository or
diff --git a/refs/sp-backend.c b/refs/sp-backend.c
new file mode 100644
index 000000000..31e8cec4b
--- /dev/null
+++ b/refs/sp-backend.c
@@ -0,0 +1,261 @@
+#include "../cache.h"
+#include "../config.h"
+#include "../refs.h"
+#include "refs-internal.h"
+#include "ref-cache.h"
+#include "packed-backend.h"
+#include "../iterator.h"
+#include "../dir-iterator.h"
+#include "../lockfile.h"
+#include "../object.h"
+#include "../dir.h"
+#include "../submodule.h"
+
+/*
+ * Future: need to be in "struct repository"
+ * when doing a full libification.
+ */
+struct sp_ref_store {
+	struct ref_store base;
+	unsigned int store_flags;
+
+	/*
+	 * Ref store of this repository (the submodule), used only for the
+	 * reflog.
+	 */
+	struct ref_store *files;
+
+	/*
+	 * Ref store of the superproject, for refs.
+	 */
+	struct ref_store *files_superproject;
+};
+
+/*
+ * Create a new submodule ref cache and add it to the internal
+ * set of caches.
+ */
+static struct ref_store *sp_init(const char *gitdir, unsigned int flags)
+{
+	struct sp_ref_store *refs = xcalloc(1, sizeof(*refs));
+	struct ref_store *ref_store = (struct ref_store *)refs;
+
+	base_ref_store_init(ref_store, &refs_be_sp);
+	refs->store_flags = flags;
+	refs->files = refs_be_files.init(gitdir, flags);
+
+	return ref_store;
+}
+
+/*
+ * Downcast ref_store to sp_ref_store. Die if ref_store is not a
+ * sp_ref_store. required_flags is compared with ref_store's
+ * store_flags to ensure the ref_store has all required capabilities.
+ * "caller" is used in any necessary error messages.
+ */
+static struct sp_ref_store *sp_downcast(struct ref_store *ref_store,
+					      unsigned int required_flags,
+					      const char *caller)
+{
+	struct sp_ref_store *refs;
+
+	if (ref_store->be != &refs_be_sp)
+		die("BUG: ref_store is type \"%s\" not \"sp\" in %s",
+		    ref_store->be->name, caller);
+
+	refs = (struct sp_ref_store *)ref_store;
+
+	if ((refs->store_flags & required_flags) != required_flags)
+		die("BUG: operation %s requires abilities 0x%x, but only have 0x%x",
+		    caller, required_flags, refs->store_flags);
+
+	return refs;
+}
+
+static int sp_read_raw_ref(struct ref_store *ref_store,
+			      const char *refname, struct object_id *oid,
+			      struct strbuf *referent, unsigned int *type)
+{
+	struct sp_ref_store *refs;
+
+	refs = sp_downcast(ref_store, REF_STORE_READ, "read_raw_ref");
+
+	if (!starts_with(refname, "refs/")) {
+		return refs_read_raw_ref(refs->files, refname, oid, referent, type);
+	}
+
+	/* read from the superproject instead */
+	return get_superproject_gitlink_oid(refname, oid);
+}
+
+static struct ref_iterator *sp_ref_iterator_begin(
+		struct ref_store *ref_store,
+		const char *prefix, unsigned int flags)
+{
+	return empty_ref_iterator_begin();
+}
+
+static int sp_pack_refs(struct ref_store *ref_store, unsigned int flags)
+{
+	/* no op */
+	return 0;
+}
+
+static int sp_delete_refs(struct ref_store *ref_store, const char *msg,
+			     struct string_list *refnames, unsigned int flags)
+{
+	/* unsupported */
+	return -1;
+}
+
+static int sp_rename_ref(struct ref_store *ref_store,
+			    const char *oldrefname, const char *newrefname,
+			    const char *logmsg)
+{
+	/* unsupported */
+	return -1;
+}
+
+static int sp_copy_ref(struct ref_store *ref_store,
+			    const char *oldrefname, const char *newrefname,
+			    const char *logmsg)
+{
+	/* unsupported */
+	return -1;
+}
+
+static int sp_create_reflog(struct ref_store *ref_store,
+			       const char *refname, int force_create,
+			       struct strbuf *err)
+{
+	struct sp_ref_store *refs =
+		sp_downcast(ref_store, REF_STORE_WRITE, "create_reflog");
+	return refs_create_reflog(refs->files, refname, force_create, err);
+}
+
+static int sp_create_symref(struct ref_store *ref_store,
+			       const char *refname, const char *target,
+			       const char *logmsg)
+{
+	/* unsupported */
+	return -1;
+}
+
+static int sp_reflog_exists(struct ref_store *ref_store,
+			       const char *refname)
+{
+	struct sp_ref_store *refs =
+		sp_downcast(ref_store, REF_STORE_READ, "reflog_exists");
+	return refs_reflog_exists(refs->files, refname);
+}
+
+static int sp_delete_reflog(struct ref_store *ref_store,
+			       const char *refname)
+{
+	struct sp_ref_store *refs =
+		sp_downcast(ref_store, REF_STORE_WRITE, "delete_reflog");
+	return refs_delete_reflog(refs->files, refname);
+}
+
+static int sp_for_each_reflog_ent_reverse(struct ref_store *ref_store,
+					     const char *refname,
+					     each_reflog_ent_fn fn,
+					     void *cb_data)
+{
+	struct sp_ref_store *refs =
+		sp_downcast(ref_store, REF_STORE_READ,
+			       "for_each_reflog_ent_reverse");
+	return refs_for_each_reflog_ent_reverse(refs->files, refname, fn, cb_data);
+}
+
+static int sp_for_each_reflog_ent(struct ref_store *ref_store,
+				     const char *refname,
+				     each_reflog_ent_fn fn, void *cb_data)
+{
+	struct sp_ref_store *refs =
+		sp_downcast(ref_store, REF_STORE_READ,
+			       "for_each_reflog_ent");
+	return refs_for_each_reflog_ent(refs->files, refname, fn, cb_data);
+}
+
+static struct ref_iterator *sp_reflog_iterator_begin(struct ref_store *ref_store)
+{
+	struct sp_ref_store *refs =
+		sp_downcast(ref_store, REF_STORE_READ,
+			       "reflog_iterator_begin");
+	return refs->files->be->reflog_iterator_begin(refs->files);
+}
+
+static int sp_transaction_prepare(struct ref_store *ref_store,
+				     struct ref_transaction *transaction,
+				     struct strbuf *err)
+{
+	return -1;
+}
+
+static int sp_transaction_finish(struct ref_store *ref_store,
+				    struct ref_transaction *transaction,
+				    struct strbuf *err)
+{
+	return -1;
+}
+
+static int sp_transaction_abort(struct ref_store *ref_store,
+				   struct ref_transaction *transaction,
+				   struct strbuf *err)
+{
+	return -1;
+}
+
+static int sp_initial_transaction_commit(struct ref_store *ref_store,
+					    struct ref_transaction *transaction,
+					    struct strbuf *err)
+{
+	return -1;
+}
+
+static int sp_reflog_expire(struct ref_store *ref_store,
+			       const char *refname, const struct object_id *oid,
+			       unsigned int flags,
+			       reflog_expiry_prepare_fn prepare_fn,
+			       reflog_expiry_should_prune_fn should_prune_fn,
+			       reflog_expiry_cleanup_fn cleanup_fn,
+			       void *policy_cb_data)
+{
+	struct sp_ref_store *refs =
+		sp_downcast(ref_store, REF_STORE_WRITE, "reflog_expire");
+	return refs_reflog_expire(refs->files, refname, oid, flags, prepare_fn, should_prune_fn, cleanup_fn, policy_cb_data);
+}
+
+static int sp_init_db(struct ref_store *ref_store, struct strbuf *err)
+{
+	return 0;
+}
+
+struct ref_storage_be refs_be_sp = {
+	NULL,
+	"sp",
+	sp_init,
+	sp_init_db,
+	sp_transaction_prepare,
+	sp_transaction_finish,
+	sp_transaction_abort,
+	sp_initial_transaction_commit,
+
+	sp_pack_refs,
+	sp_create_symref,
+	sp_delete_refs,
+	sp_rename_ref,
+	sp_copy_ref,
+
+	sp_ref_iterator_begin,
+	sp_read_raw_ref,
+
+	sp_reflog_iterator_begin,
+	sp_for_each_reflog_ent,
+	sp_for_each_reflog_ent_reverse,
+	sp_reflog_exists,
+	sp_create_reflog,
+	sp_delete_reflog,
+	sp_reflog_expire
+};
diff --git a/submodule.c b/submodule.c
index ce511180e..1ffaeec82 100644
--- a/submodule.c
+++ b/submodule.c
@@ -471,6 +471,7 @@ static void prepare_submodule_repo_env_no_git_dir(struct argv_array *out)
 void prepare_submodule_repo_env(struct argv_array *out)
 {
 	prepare_submodule_repo_env_no_git_dir(out);
+	argv_array_pushf(out, "USE_SP");
 	argv_array_pushf(out, "%s=%s", GIT_DIR_ENVIRONMENT,
 			 DEFAULT_GIT_DIR_ENVIRONMENT);
 }
@@ -1986,7 +1987,7 @@ void absorb_git_dir_into_superproject(const char *prefix,
  * Return 0 if successful, 1 if not (for example, if the parent
  * directory is not a repo or an unrelated one).
  */
-int get_superproject_entry(const char **full_name)
+static int get_superproject_entry(const char *ref, const char **full_name, struct object_id *oid)
 {
 	static struct strbuf sb = STRBUF_INIT;
 
@@ -2016,9 +2017,11 @@ int get_superproject_entry(const char **full_name)
 	prepare_submodule_repo_env(&cp.env_array);
 	argv_array_pop(&cp.env_array);
 
-	argv_array_pushl(&cp.args, "--literal-pathspecs", "-C", "..",
-			"ls-files", "-z", "--stage", "--full-name", "--",
-			subpath, NULL);
+	argv_array_pushl(&cp.args, "--literal-pathspecs", "-C", "..", NULL);
+	if (ref)
+		argv_array_pushl(&cp.args, "ls-tree", "-z", "--full-name", "-r", ref, subpath, NULL);
+	else
+		argv_array_pushl(&cp.args, "ls-files", "-z", "--full-name", "--stage", "--", subpath, NULL);
 	strbuf_reset(&sb);
 
 	cp.no_stdin = 1;
@@ -2037,13 +2040,24 @@ int get_superproject_entry(const char **full_name)
 	if (starts_with(sb.buf, "160000")) {
 		/*
 		 * There is a superproject having this repo as a submodule.
-		 * The format is <mode> SP <hash> SP <stage> TAB <full name> \0,
-		 * We're only interested in the name after the tab.
+		 * The format is:
+		 * [ls-tree] <mode> SP <type> SP <hash> TAB <full name> \0
+		 * [ls-files] <mode> SP <hash> SP <stage> TAB <full name> \0
 		 */
-		char *tab = strchr(sb.buf, '\t');
-		if (!tab)
-			die("BUG: ls-files returned line with no tab");
-		*full_name = tab + 1;
+		if (full_name) {
+			char *tab = strchr(sb.buf, '\t');
+			if (!tab)
+				die("BUG: ls-files returned line with no tab");
+			*full_name = tab + 1;
+		}
+		if (oid) {
+			char *space = strchr(sb.buf, ' ');
+			const char *p;
+			if (ref)
+				space = strchr(space + 1, ' ');
+			if (parse_oid_hex(space + 1, oid, &p))
+				die("BUG: Could not read OID: %s", space + 1);
+		}
 		return 0;
 	}
 
@@ -2063,7 +2077,7 @@ const char *get_superproject_working_tree(void)
 	size_t len;
 	const char *ret;
 
-	if (get_superproject_entry(&full_name))
+	if (get_superproject_entry(NULL, &full_name, NULL))
 		return NULL;
 
 	super_wt = xstrdup(xgetcwd());
@@ -2075,6 +2089,13 @@ const char *get_superproject_working_tree(void)
 	return ret;
 }
 
+int get_superproject_gitlink_oid(const char *ref, struct object_id *oid)
+{
+	if (get_superproject_entry(ref, NULL, oid))
+		return 1;
+	return 0;
+}
+
 /*
  * Put the gitdir for a submodule (given relative to the main
  * repository worktree) into `buf`, or return -1 on error.
diff --git a/submodule.h b/submodule.h
index 29ab302cc..3a836beab 100644
--- a/submodule.h
+++ b/submodule.h
@@ -140,4 +140,6 @@ extern void absorb_git_dir_into_superproject(const char *prefix,
  */
 extern const char *get_superproject_working_tree(void);
 
+extern int get_superproject_gitlink_oid(const char *ref, struct object_id *oid);
+
 #endif
diff --git a/t/t1406-submodule-ref-store.sh b/t/t1406-submodule-ref-store.sh
index c32d4cc46..46ba7a272 100755
--- a/t/t1406-submodule-ref-store.sh
+++ b/t/t1406-submodule-ref-store.sh
@@ -98,4 +98,30 @@ test_expect_success 'create-reflog() not allowed' '
 	test_must_fail $RUN create-reflog HEAD 1
 '
 
+test_expect_success 'ref backend based on superproject data' '
+	rm -rf sub super &&
+
+	git init sub &&
+	test_commit -C sub first &&
+	test_commit -C sub second &&
+
+	git init super &&
+
+	# master branch in superproject, submodule at second
+	git -C super submodule add ../sub sub &&
+	git -C super commit -m x &&
+
+	# anotherbranch in superproject, submodule at first
+	git -C super checkout -b anotherbranch &&
+	git -C super/sub checkout first &&
+	git -C super commit -a -m x &&
+
+	# Notice that rev-parse can parse "anotherbranch" and see that it
+	# points to first, even though a branch with the name "anotherbranch"
+	# is only defined in the superproject
+	git -C sub rev-parse first >expect &&
+	USE_SP=1 git -C super/sub rev-parse anotherbranch >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.15.0.531.g2ccb3012c9-goog


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [WIP 1/2] submodule: refactor acquisition of superproject info
  2017-12-01 22:50 ` [WIP 1/2] submodule: refactor acquisition of superproject info Jonathan Tan
@ 2017-12-05 19:57   ` Stefan Beller
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Beller @ 2017-12-05 19:57 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Junio C Hamano, Jacob Keller

On Fri, Dec 1, 2017 at 2:50 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  submodule.c | 76 +++++++++++++++++++++++++++++++++++++------------------------
>  submodule.h |  3 +++


This patch reads very similar to [1], which was a preparation part of the series
"[RFC PATCH 0/4] git-status reports relation to superproject"; there
we also need
the same information about the superproject. I'll take a closer look
and compare these
two patches if it turns out we want to go this way long term.

[1] https://public-inbox.org/git/20171108195509.7839-3-sbeller@google.com/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [WIP 2/2] submodule: read-only super-backed ref backend
  2017-12-01 22:50 ` [WIP 2/2] submodule: read-only super-backed ref backend Jonathan Tan
@ 2017-12-05 22:28   ` Stefan Beller
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Beller @ 2017-12-05 22:28 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Junio C Hamano, Jacob Keller

On Fri, Dec 1, 2017 at 2:50 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> Note that a few major parts are still missing:
>  - special handling of the current branch of the superproject
>  - writing (whether "refs/..." to the superproject as an index change or
>    a commit, or non-"refs/..." directly to the subproject like usual)
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>


> --- a/refs.c
> +++ b/refs.c
> @@ -1575,12 +1575,17 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
>  static struct ref_store *ref_store_init(const char *gitdir,
>                                         unsigned int flags)
>  {
> -       const char *be_name = "files";
> -       struct ref_storage_be *be = find_ref_storage_backend(be_name);
> +       struct ref_storage_be *be;
>         struct ref_store *refs;
>
> +       if (getenv("USE_SP")) {
> +               be = &refs_be_sp;
> +       } else {
> +               be = &refs_be_files;
> +       }
> +
>         if (!be)
> -               die("BUG: reference backend %s is unknown", be_name);
> +               die("BUG: reference backend %s is unknown", "files");

If we pursue this approach more than just RFC-ish we would probably not use
an environment variable but have some detection logic for the
different ref backends.

For example Shawns refstable[1] proposal uses the configuration
`extensions.refStorage == reftable`, which this proposal could reuse and
set to 'indexed_superproject' or 'commit_in_superproject', depending on the
write semantics that we'd need to disucss (or do we want to offer 2 different
superproject backends having these different semantics?)

More to the code here, we could still use `find_ref_storage_backend()`
depending on the env:

    const char *be_name = "files";
    ...
+    if (getenv("USE_SP"))
+        be_name = "sp-backend";
    ...

and then we'd have to change the 'next ' pointer in refs_be_files
(in refs/files-backend.c line ~3100) to point at the superproject backend
instead of NULL.

[1] https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md#Repository-format

> +static int sp_read_raw_ref(struct ref_store *ref_store,
> +                             const char *refname, struct object_id *oid,
> +                             struct strbuf *referent, unsigned int *type)
> +{
> +       struct sp_ref_store *refs;
> +
> +       refs = sp_downcast(ref_store, REF_STORE_READ, "read_raw_ref");
> +
> +       if (!starts_with(refname, "refs/")) {
> +               return refs_read_raw_ref(refs->files, refname, oid, referent, type);
> +       }
> +
> +       /* read from the superproject instead */
> +       return get_superproject_gitlink_oid(refname, oid);
> +}

This function would need to get more sophisticated logic I would assume;
the current RFC is a read-only thing, such that it may be sufficient
to differentiate
between refs/* and specials such as [FETCH_]HEAD, but as seen in the
sp_ref_store struct definition we're already keeping a local reflog. I would
assume that remotes are also local to the submodule instead of reusing the
superprojects remote, for the sake of pulling in upstream changes of the
submodule instead of being fully integrated.



> diff --git a/submodule.c b/submodule.c
> index ce511180e..1ffaeec82 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -471,6 +471,7 @@ static void prepare_submodule_repo_env_no_git_dir(struct argv_array *out)
>  void prepare_submodule_repo_env(struct argv_array *out)
>  {
>         prepare_submodule_repo_env_no_git_dir(out);
> +       argv_array_pushf(out, "USE_SP");
>         argv_array_pushf(out, "%s=%s", GIT_DIR_ENVIRONMENT,
>                          DEFAULT_GIT_DIR_ENVIRONMENT);
>  }

This is an easy way to enforce all submodules use the superproject refs mode,
which makes it easy to make recursive commands (e.g. `git checkout --recursive`
would not need to pass down the exact gitlink sha1, but could pass
down the branch
name or even commit-ish "HEAD^^" and the submodule stays in perfect
sync (according
to superproject commit-ish). Once we take this further I would imagine
there is an option
in the submodule that specifies what mode it is using. Not sure if
we'd want to keep the
environment variable to force that mode from the superproject, though.

> @@ -1986,7 +1987,7 @@ void absorb_git_dir_into_superproject(const char *prefix,
>   * Return 0 if successful, 1 if not (for example, if the parent
>   * directory is not a repo or an unrelated one).
>   */
> -int get_superproject_entry(const char **full_name)
> +static int get_superproject_entry(const char *ref, const char **full_name, struct object_id *oid)

Making it static could be part of the prior patch?

When the ref is given we use `ls-tree -r` instead of `ls-files --stage`
which give similar outputs to have the parsing logic overlap below,
the difference
is whether the index is looked up or the given ref.

> @@ -2016,9 +2017,11 @@ int get_superproject_entry(const char **full_name)
>         prepare_submodule_repo_env(&cp.env_array);
>         argv_array_pop(&cp.env_array);
>
> -       argv_array_pushl(&cp.args, "--literal-pathspecs", "-C", "..",
> -                       "ls-files", "-z", "--stage", "--full-name", "--",
> -                       subpath, NULL);
> +       argv_array_pushl(&cp.args, "--literal-pathspecs", "-C", "..", NULL);
> +       if (ref)
> +               argv_array_pushl(&cp.args, "ls-tree", "-z", "--full-name", "-r", ref, subpath, NULL);

-- between ref and path?

> @@ -2063,7 +2077,7 @@ const char *get_superproject_working_tree(void)
>         size_t len;
>         const char *ret;
>
> -       if (get_superproject_entry(&full_name))
> +       if (get_superproject_entry(NULL, &full_name, NULL))
>                 return NULL;
>
>         super_wt = xstrdup(xgetcwd());
> @@ -2075,6 +2089,13 @@ const char *get_superproject_working_tree(void)
>         return ret;
>  }
>
> +int get_superproject_gitlink_oid(const char *ref, struct object_id *oid)
> +{
> +       if (get_superproject_entry(ref, NULL, oid))

I see; the get_superproject_entry function is more powerful, than what
I proposed
in the status series. It takes vastly different arguments, for
inspection, so the caller
is responsible instead of having dumber callers and two different functions

>
> +test_expect_success 'ref backend based on superproject data' '
> +       rm -rf sub super &&
> +
> +       git init sub &&
> +       test_commit -C sub first &&
> +       test_commit -C sub second &&
> +
> +       git init super &&
> +
> +       # master branch in superproject, submodule at second
> +       git -C super submodule add ../sub sub &&
> +       git -C super commit -m x &&
> +
> +       # anotherbranch in superproject, submodule at first
> +       git -C super checkout -b anotherbranch &&
> +       git -C super/sub checkout first &&
> +       git -C super commit -a -m x &&
> +
> +       # Notice that rev-parse can parse "anotherbranch" and see that it
> +       # points to first, even though a branch with the name "anotherbranch"
> +       # is only defined in the superproject
> +       git -C sub rev-parse first >expect &&
> +       USE_SP=1 git -C super/sub rev-parse anotherbranch >actual &&
> +       test_cmp expect actual
> +'

Thanks for writing this proof of concept.
As Jonathan Nieder mentioned another similar direction of this idea could be
used to decorate the branches in the submodule to indicate where the
superproject
points to.

I think we'd need to hash out the details of the write operations,
before going further here,
but this is the best long term solution so far IMHO. (Short term we
might carry local
patches that make submodules more repo like).

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-12-05 22:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-01 22:50 [WIP 0/2] Submodule ref backend that mirrors superproject Jonathan Tan
2017-12-01 22:50 ` [WIP 1/2] submodule: refactor acquisition of superproject info Jonathan Tan
2017-12-05 19:57   ` Stefan Beller
2017-12-01 22:50 ` [WIP 2/2] submodule: read-only super-backed ref backend Jonathan Tan
2017-12-05 22:28   ` Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).