git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/9] No more adding submodule ODB as alternate
@ 2021-09-21 16:51 Jonathan Tan
  2021-09-21 16:51 ` [PATCH 1/9] refs: make _advance() check struct repo, not flag Jonathan Tan
                   ` (12 more replies)
  0 siblings, 13 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

This series is on jt/add-submodule-odb-clean-up.

After this series, the entire test suite runs without ever adding a
submodule ODB as an alternate (checked by running with
GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1). The code to lazily add
submodule ODBs as alternates still remains (with a trace message printed
if it happens) just in case there is a rare interaction that the test
suite doesn't cover.

This is part of my effort to support partial clone in submodules, but
the results here are also beneficial for non-partial-clone submodule
users in that access to submodule objects are now quicker (because Git
no longer needs to linearly search through alternates when accessing
these objects). It also improves code health in that it is clearer at
the call site when a submodule object is being accessed.

This patch series contains the 2 patches from my previous work on
iterating over submodule refs [1], and 7 new patches.

[1] https://lore.kernel.org/git/cover.1629933380.git.jonathantanmy@google.com/

Jonathan Tan (9):
  refs: make _advance() check struct repo, not flag
  refs: add repo paramater to _iterator_peel()
  refs iterator: support non-the_repository advance
  refs: teach refs_for_each_ref() arbitrary repos
  merge-{ort,recursive}: remove add_submodule_odb()
  object-file: only register submodule ODB if needed
  submodule: pass repo to check_has_commit()
  refs: change refs_for_each_ref_in() to take repo
  submodule: trace adding submodule ODB as alternate

 builtin/submodule--helper.c            | 16 +++--
 merge-ort.c                            | 18 ++----
 merge-recursive.c                      | 41 ++++++------
 object-file.c                          |  3 +-
 object-name.c                          |  4 +-
 refs.c                                 | 87 ++++++++++++++------------
 refs.h                                 | 12 ++--
 refs/debug.c                           |  9 +--
 refs/files-backend.c                   | 28 ++++-----
 refs/iterator.c                        | 51 ++++++++++++---
 refs/packed-backend.c                  | 24 +++----
 refs/ref-cache.c                       |  7 ++-
 refs/refs-internal.h                   | 55 +++++++++++-----
 revision.c                             | 12 ++--
 strbuf.c                               | 12 +++-
 strbuf.h                               |  6 +-
 submodule.c                            | 28 +++++++--
 t/helper/test-ref-store.c              | 20 +++---
 t/t5526-fetch-submodules.sh            |  3 +
 t/t5531-deep-submodule-push.sh         |  3 +
 t/t5545-push-options.sh                |  3 +
 t/t5572-pull-submodule.sh              |  3 +
 t/t6437-submodule-merge.sh             |  3 +
 t/t7418-submodule-sparse-gitmodules.sh |  3 +
 24 files changed, 271 insertions(+), 180 deletions(-)

-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 1/9] refs: make _advance() check struct repo, not flag
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
@ 2021-09-21 16:51 ` Jonathan Tan
  2021-09-23  1:00   ` Junio C Hamano
  2021-09-24 18:13   ` Jeff King
  2021-09-21 16:51 ` [PATCH 2/9] refs: add repo paramater to _iterator_peel() Jonathan Tan
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Currently, ref iterators access the object store each time they advance
if and only if the boolean flag DO_FOR_EACH_INCLUDE_BROKEN is unset.
(The iterators access the object store because, if
DO_FOR_EACH_INCLUDE_BROKEN is unset, they need to attempt to resolve
each ref to determine that it is not broken.)

Also, the object store accessed is always that of the_repository, making
it impossible to iterate over a submodule's refs without
DO_FOR_EACH_INCLUDE_BROKEN (unless add_submodule_odb() is used).

As a first step in resolving both these problems, replace the
DO_FOR_EACH_INCLUDE_BROKEN flag with a struct repository pointer. This
commit is a mechanical conversion - whenever DO_FOR_EACH_INCLUDE_BROKEN
is set, a NULL repository (representing access to no object store) is
used instead, and whenever DO_FOR_EACH_INCLUDE_BROKEN is unset, a
non-NULL repository (representing access to that repository's object
store) is used instead. Right now, the locations in which
non-the_repository support needs to be added are marked with BUG()
statements - in a future patch, these will be replaced. (NEEDSWORK: in
this RFC patch set, this has not been done)

I have considered and rejected the following design alternative:

- Change the _advance() callback to also have a repository object
  parameter, and either skip or not skip depending on whether that
  parameter is NULL. This burdens callers to have to carry this
  information along with the iterator, and such calling code may be
  unclear as to why that parameter can be NULL in some cases and cannot
  in others.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c                | 46 +++++++++++++++++++++++--------------------
 refs/files-backend.c  | 14 ++++---------
 refs/iterator.c       | 18 ++++++++++++++++-
 refs/packed-backend.c | 10 +---------
 refs/refs-internal.h  | 27 +++++++++++++++----------
 5 files changed, 64 insertions(+), 51 deletions(-)

diff --git a/refs.c b/refs.c
index 8b9f7c3a80..49ddcdac53 100644
--- a/refs.c
+++ b/refs.c
@@ -1413,16 +1413,16 @@ int head_ref(each_ref_fn fn, void *cb_data)
 
 struct ref_iterator *refs_ref_iterator_begin(
 		struct ref_store *refs,
-		const char *prefix, int trim, int flags)
+		const char *prefix, int trim, struct repository *repo,
+		int flags)
 {
 	struct ref_iterator *iter;
 
 	if (ref_paranoia < 0)
 		ref_paranoia = git_env_bool("GIT_REF_PARANOIA", 0);
-	if (ref_paranoia)
-		flags |= DO_FOR_EACH_INCLUDE_BROKEN;
 
 	iter = refs->be->iterator_begin(refs, prefix, flags);
+	iter->repo = ref_paranoia ? NULL : repo;
 
 	/*
 	 * `iterator_begin()` already takes care of prefix, but we
@@ -1442,13 +1442,16 @@ struct ref_iterator *refs_ref_iterator_begin(
  * Call fn for each reference in the specified submodule for which the
  * refname begins with prefix. If trim is non-zero, then trim that
  * many characters off the beginning of each refname before passing
- * the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to
- * include broken references in the iteration. If fn ever returns a
+ * the refname to fn. If fn ever returns a
  * non-zero value, stop the iteration and return that value;
  * otherwise, return 0.
+ *
+ * See the documentation of refs_ref_iterator_begin() for more information on
+ * the repo parameter.
  */
 static int do_for_each_repo_ref(struct repository *r, const char *prefix,
-				each_repo_ref_fn fn, int trim, int flags,
+				each_repo_ref_fn fn, int trim,
+				struct repository *repo, int flags,
 				void *cb_data)
 {
 	struct ref_iterator *iter;
@@ -1457,7 +1460,7 @@ static int do_for_each_repo_ref(struct repository *r, const char *prefix,
 	if (!refs)
 		return 0;
 
-	iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
+	iter = refs_ref_iterator_begin(refs, prefix, trim, repo, flags);
 
 	return do_for_each_repo_ref_iterator(r, iter, fn, cb_data);
 }
@@ -1479,7 +1482,8 @@ static int do_for_each_ref_helper(struct repository *r,
 }
 
 static int do_for_each_ref(struct ref_store *refs, const char *prefix,
-			   each_ref_fn fn, int trim, int flags, void *cb_data)
+			   each_ref_fn fn, int trim, struct repository *repo,
+			   int flags, void *cb_data)
 {
 	struct ref_iterator *iter;
 	struct do_for_each_ref_help hp = { fn, cb_data };
@@ -1487,7 +1491,7 @@ static int do_for_each_ref(struct ref_store *refs, const char *prefix,
 	if (!refs)
 		return 0;
 
-	iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
+	iter = refs_ref_iterator_begin(refs, prefix, trim, repo, flags);
 
 	return do_for_each_repo_ref_iterator(the_repository, iter,
 					do_for_each_ref_helper, &hp);
@@ -1495,7 +1499,7 @@ static int do_for_each_ref(struct ref_store *refs, const char *prefix,
 
 int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, "", fn, 0, 0, cb_data);
+	return do_for_each_ref(refs, "", fn, 0, the_repository, 0, cb_data);
 }
 
 int for_each_ref(each_ref_fn fn, void *cb_data)
@@ -1506,7 +1510,7 @@ int for_each_ref(each_ref_fn fn, void *cb_data)
 int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
 			 each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, prefix, fn, strlen(prefix), 0, cb_data);
+	return do_for_each_ref(refs, prefix, fn, strlen(prefix), the_repository, 0, cb_data);
 }
 
 int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data)
@@ -1518,10 +1522,10 @@ int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data, unsig
 {
 	unsigned int flag = 0;
 
-	if (broken)
-		flag = DO_FOR_EACH_INCLUDE_BROKEN;
 	return do_for_each_ref(get_main_ref_store(the_repository),
-			       prefix, fn, 0, flag, cb_data);
+			       prefix, fn, 0,
+			       broken ? NULL : the_repository,
+			       flag, cb_data);
 }
 
 int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
@@ -1530,16 +1534,16 @@ int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
 {
 	unsigned int flag = 0;
 
-	if (broken)
-		flag = DO_FOR_EACH_INCLUDE_BROKEN;
-	return do_for_each_ref(refs, prefix, fn, 0, flag, cb_data);
+	return do_for_each_ref(refs, prefix, fn, 0,
+			       broken ? NULL : the_repository,
+			       flag, cb_data);
 }
 
 int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data)
 {
 	return do_for_each_repo_ref(r, git_replace_ref_base, fn,
 				    strlen(git_replace_ref_base),
-				    DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
+				    NULL, 0, cb_data);
 }
 
 int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
@@ -1548,7 +1552,7 @@ int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
 	int ret;
 	strbuf_addf(&buf, "%srefs/", get_git_namespace());
 	ret = do_for_each_ref(get_main_ref_store(the_repository),
-			      buf.buf, fn, 0, 0, cb_data);
+			      buf.buf, fn, 0, the_repository, 0, cb_data);
 	strbuf_release(&buf);
 	return ret;
 }
@@ -1556,7 +1560,7 @@ int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
 int refs_for_each_rawref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 {
 	return do_for_each_ref(refs, "", fn, 0,
-			       DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
+			       NULL, 0, cb_data);
 }
 
 int for_each_rawref(each_ref_fn fn, void *cb_data)
@@ -2263,7 +2267,7 @@ int refs_verify_refname_available(struct ref_store *refs,
 	strbuf_addch(&dirname, '/');
 
 	iter = refs_ref_iterator_begin(refs, dirname.buf, 0,
-				       DO_FOR_EACH_INCLUDE_BROKEN);
+				       NULL, 0);
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 		if (skip &&
 		    string_list_has_string(skip, iter->refname))
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 677b7e4cdd..cd145301d0 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -744,12 +744,6 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		    ref_type(iter->iter0->refname) != REF_TYPE_PER_WORKTREE)
 			continue;
 
-		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
-		    !ref_resolves_to_object(iter->iter0->refname,
-					    iter->iter0->oid,
-					    iter->iter0->flags))
-			continue;
-
 		iter->base.refname = iter->iter0->refname;
 		iter->base.oid = iter->iter0->oid;
 		iter->base.flags = iter->iter0->flags;
@@ -801,9 +795,6 @@ static struct ref_iterator *files_ref_iterator_begin(
 	struct ref_iterator *ref_iterator;
 	unsigned int required_flags = REF_STORE_READ;
 
-	if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN))
-		required_flags |= REF_STORE_ODB;
-
 	refs = files_downcast(ref_store, required_flags, "ref_iterator_begin");
 
 	/*
@@ -836,10 +827,13 @@ static struct ref_iterator *files_ref_iterator_begin(
 	 * references, and (if needed) do our own check for broken
 	 * ones in files_ref_iterator_advance(), after we have merged
 	 * the packed and loose references.
+	 *
+	 * Do this by not supplying any repo, regardless of whether a repo was
+	 * supplied to files_ref_iterator_begin().
 	 */
 	packed_iter = refs_ref_iterator_begin(
 			refs->packed_ref_store, prefix, 0,
-			DO_FOR_EACH_INCLUDE_BROKEN);
+			NULL, 0);
 
 	overlay_iter = overlay_ref_iterator_begin(loose_iter, packed_iter);
 
diff --git a/refs/iterator.c b/refs/iterator.c
index a89d132d4f..5af6554887 100644
--- a/refs/iterator.c
+++ b/refs/iterator.c
@@ -10,7 +10,23 @@
 
 int ref_iterator_advance(struct ref_iterator *ref_iterator)
 {
-	return ref_iterator->vtable->advance(ref_iterator);
+	int ok;
+
+	if (ref_iterator->repo && ref_iterator->repo != the_repository)
+		/*
+		 * NEEDSWORK: make ref_resolves_to_object() support
+		 * arbitrary repositories
+		 */
+		BUG("ref_iterator->repo must be NULL or the_repository");
+	while ((ok = ref_iterator->vtable->advance(ref_iterator)) == ITER_OK) {
+		if (ref_iterator->repo &&
+		    !ref_resolves_to_object(ref_iterator->refname,
+					    ref_iterator->oid,
+					    ref_iterator->flags))
+			continue;
+		return ITER_OK;
+	}
+	return ok;
 }
 
 int ref_iterator_peel(struct ref_iterator *ref_iterator,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f8aa97d799..f52d5488b8 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -863,11 +863,6 @@ static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		    ref_type(iter->base.refname) != REF_TYPE_PER_WORKTREE)
 			continue;
 
-		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
-		    !ref_resolves_to_object(iter->base.refname, &iter->oid,
-					    iter->flags))
-			continue;
-
 		return ITER_OK;
 	}
 
@@ -922,8 +917,6 @@ static struct ref_iterator *packed_ref_iterator_begin(
 	struct ref_iterator *ref_iterator;
 	unsigned int required_flags = REF_STORE_READ;
 
-	if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN))
-		required_flags |= REF_STORE_ODB;
 	refs = packed_downcast(ref_store, required_flags, "ref_iterator_begin");
 
 	/*
@@ -1136,8 +1129,7 @@ static int write_with_updates(struct packed_ref_store *refs,
 	 * list of refs is exhausted, set iter to NULL. When the list
 	 * of updates is exhausted, leave i set to updates->nr.
 	 */
-	iter = packed_ref_iterator_begin(&refs->base, "",
-					 DO_FOR_EACH_INCLUDE_BROKEN);
+	iter = refs_ref_iterator_begin(&refs->base, "", 0, NULL, 0);
 	if ((ok = ref_iterator_advance(iter)) != ITER_OK)
 		iter = NULL;
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 3155708345..dc0ed65686 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -245,9 +245,6 @@ int refs_rename_ref_available(struct ref_store *refs,
 /* We allow "recursive" symbolic refs. Only within reason, though */
 #define SYMREF_MAXDEPTH 5
 
-/* Include broken references in a do_for_each_ref*() iteration: */
-#define DO_FOR_EACH_INCLUDE_BROKEN 0x01
-
 /*
  * Reference iterators
  *
@@ -305,6 +302,12 @@ struct ref_iterator {
 	 */
 	unsigned int ordered : 1;
 
+	/*
+	 * See the documentation of refs_ref_iterator_begin() for more
+	 * information.
+	 */
+	struct repository *repo;
+
 	const char *refname;
 	const struct object_id *oid;
 	unsigned int flags;
@@ -349,16 +352,19 @@ int is_empty_ref_iterator(struct ref_iterator *ref_iterator);
  * Return an iterator that goes over each reference in `refs` for
  * which the refname begins with prefix. If trim is non-zero, then
  * trim that many characters off the beginning of each refname.
- * The output is ordered by refname. The following flags are supported:
+ * The output is ordered by refname.
+ *
+ * Pass NULL as repo to include broken references in the iteration, or non-NULL
+ * to skip references that do not resolve to an object in the given repo.
  *
- * DO_FOR_EACH_INCLUDE_BROKEN: include broken references in
- *         the iteration.
+ * The following flags are supported:
  *
  * DO_FOR_EACH_PER_WORKTREE_ONLY: only produce REF_TYPE_PER_WORKTREE refs.
  */
 struct ref_iterator *refs_ref_iterator_begin(
 		struct ref_store *refs,
-		const char *prefix, int trim, int flags);
+		const char *prefix, int trim, struct repository *repo,
+		int flags);
 
 /*
  * A callback function used to instruct merge_ref_iterator how to
@@ -446,8 +452,9 @@ void base_ref_iterator_free(struct ref_iterator *iter);
 /*
  * backend-specific implementation of ref_iterator_advance. For symrefs, the
  * function should set REF_ISSYMREF, and it should also dereference the symref
- * to provide the OID referent. If DO_FOR_EACH_INCLUDE_BROKEN is set, symrefs
- * with non-existent referents and refs pointing to non-existent object names
+ * to provide the OID referent. If a NULL repo was passed to the _begin()
+ * function that created this iterator, symrefs with non-existent referents and
+ * refs pointing to non-existent object names
  * should also be returned. If DO_FOR_EACH_PER_WORKTREE_ONLY, only
  * REF_TYPE_PER_WORKTREE refs should be returned.
  */
@@ -504,7 +511,7 @@ int do_for_each_repo_ref_iterator(struct repository *r,
  * where all reference backends will presumably store their
  * per-worktree refs.
  */
-#define DO_FOR_EACH_PER_WORKTREE_ONLY 0x02
+#define DO_FOR_EACH_PER_WORKTREE_ONLY 0x01
 
 struct ref_store;
 
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 2/9] refs: add repo paramater to _iterator_peel()
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
  2021-09-21 16:51 ` [PATCH 1/9] refs: make _advance() check struct repo, not flag Jonathan Tan
@ 2021-09-21 16:51 ` Jonathan Tan
  2021-09-21 16:51 ` [PATCH 3/9] refs iterator: support non-the_repository advance Jonathan Tan
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Split the ref_iterator_peel() function into two functions: one that
returns information solely based on what the ref store contains
(success, failure, inconclusive), and one that takes a repo parameter
and accesses the object store if need be. Update the ref store's
callbacks to not access the object store, and to return
success/failure/inconclusive instead of a binary success/failure.

This makes it explicit whether a peel attempt may access the object
store of a repository.

The approach taken in this commit for peeling is different from the
approach taken in the parent commit for advancing:

- It is complicated to reuse the repo field (which determines if an
  object store is ever accessed during advancing, and if yes, which
  object store) added to ref stores in the parent commit; the files ref
  store wraps the packed ref store, and it does not want the packed ref
  store to access any object store during advancing (as described in
  files_ref_iterator_begin()) - thus repo is NULL - but it wants packed
  ref store peeling.

- Having the repo handy when peeling is not as cumbersome as it is when
  advancing. Firstly, the repo in this case is always non-NULL, and
  secondly, peeling is typically followed by reading the object, which
  requires the repo anyway.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c                |  2 +-
 refs/debug.c          |  9 +++++----
 refs/files-backend.c  | 12 +++++++-----
 refs/iterator.c       | 38 +++++++++++++++++++++++++++++---------
 refs/packed-backend.c | 14 ++++++++------
 refs/ref-cache.c      |  7 ++++---
 refs/refs-internal.h  | 27 ++++++++++++++++++++++-----
 7 files changed, 76 insertions(+), 33 deletions(-)

diff --git a/refs.c b/refs.c
index 49ddcdac53..3a57893032 100644
--- a/refs.c
+++ b/refs.c
@@ -2012,7 +2012,7 @@ int peel_iterated_oid(const struct object_id *base, struct object_id *peeled)
 	if (current_ref_iter &&
 	    (current_ref_iter->oid == base ||
 	     oideq(current_ref_iter->oid, base)))
-		return ref_iterator_peel(current_ref_iter, peeled);
+		return ref_iterator_peel(current_ref_iter, the_repository, peeled);
 
 	return peel_object(base, peeled) ? -1 : 0;
 }
diff --git a/refs/debug.c b/refs/debug.c
index 1a7a9e11cf..7592de0a4a 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -198,13 +198,14 @@ static int debug_ref_iterator_advance(struct ref_iterator *ref_iterator)
 	return res;
 }
 
-static int debug_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
+static enum ref_iterator_peel_result debug_ref_iterator_peel(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled)
 {
 	struct debug_ref_iterator *diter =
 		(struct debug_ref_iterator *)ref_iterator;
-	int res = diter->iter->vtable->peel(diter->iter, peeled);
-	trace_printf_key(&trace_refs, "iterator_peel: %s: %d\n", diter->iter->refname, res);
+	enum ref_iterator_peel_result res = diter->iter->vtable->peel(diter->iter, peeled);
+	trace_printf_key(&trace_refs, "iterator_peel: %s: %d\n", diter->iter->refname, (int) res);
 	return res;
 }
 
diff --git a/refs/files-backend.c b/refs/files-backend.c
index cd145301d0..1faab1cf66 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -757,13 +757,14 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
 	return ok;
 }
 
-static int files_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
+static enum ref_iterator_peel_result files_ref_iterator_peel(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled)
 {
 	struct files_ref_iterator *iter =
 		(struct files_ref_iterator *)ref_iterator;
 
-	return ref_iterator_peel(iter->iter0, peeled);
+	return ref_iterator_peel_raw(iter->iter0, peeled);
 }
 
 static int files_ref_iterator_abort(struct ref_iterator *ref_iterator)
@@ -2105,8 +2106,9 @@ static int files_reflog_iterator_advance(struct ref_iterator *ref_iterator)
 	return ok;
 }
 
-static int files_reflog_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
+static enum ref_iterator_peel_result files_reflog_iterator_peel(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled)
 {
 	BUG("ref_iterator_peel() called for reflog_iterator");
 }
diff --git a/refs/iterator.c b/refs/iterator.c
index 5af6554887..ee6b00a7be 100644
--- a/refs/iterator.c
+++ b/refs/iterator.c
@@ -29,10 +29,27 @@ int ref_iterator_advance(struct ref_iterator *ref_iterator)
 	return ok;
 }
 
+enum ref_iterator_peel_result ref_iterator_peel_raw(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled)
+{
+	return ref_iterator->vtable->peel(ref_iterator, peeled);
+}
+
 int ref_iterator_peel(struct ref_iterator *ref_iterator,
+		      struct repository *repo,
 		      struct object_id *peeled)
 {
-	return ref_iterator->vtable->peel(ref_iterator, peeled);
+	enum ref_iterator_peel_result result =
+		ref_iterator_peel_raw(ref_iterator, peeled);
+
+	if (repo != the_repository)
+		/* NEEDSWORK: make peel_object() work with all repositories */
+		BUG("ref_iterator_peel() can only be used with the_repository");
+	if (result == REF_ITERATOR_PEEL_INCONCLUSIVE)
+		return peel_object(ref_iterator->oid, peeled) == PEEL_PEELED ?
+			0 : -1;
+	return result == REF_ITERATOR_PEEL_SUCCESS ? 0 : -1;
 }
 
 int ref_iterator_abort(struct ref_iterator *ref_iterator)
@@ -67,8 +84,9 @@ static int empty_ref_iterator_advance(struct ref_iterator *ref_iterator)
 	return ref_iterator_abort(ref_iterator);
 }
 
-static int empty_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
+static enum ref_iterator_peel_result empty_ref_iterator_peel(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled)
 {
 	BUG("peel called for empty iterator");
 }
@@ -186,8 +204,9 @@ static int merge_ref_iterator_advance(struct ref_iterator *ref_iterator)
 	return ITER_ERROR;
 }
 
-static int merge_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
+static enum ref_iterator_peel_result merge_ref_iterator_peel(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled)
 {
 	struct merge_ref_iterator *iter =
 		(struct merge_ref_iterator *)ref_iterator;
@@ -195,7 +214,7 @@ static int merge_ref_iterator_peel(struct ref_iterator *ref_iterator,
 	if (!iter->current) {
 		BUG("peel called before advance for merge iterator");
 	}
-	return ref_iterator_peel(*iter->current, peeled);
+	return ref_iterator_peel_raw(*iter->current, peeled);
 }
 
 static int merge_ref_iterator_abort(struct ref_iterator *ref_iterator)
@@ -371,13 +390,14 @@ static int prefix_ref_iterator_advance(struct ref_iterator *ref_iterator)
 	return ok;
 }
 
-static int prefix_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				    struct object_id *peeled)
+static enum ref_iterator_peel_result prefix_ref_iterator_peel(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled)
 {
 	struct prefix_ref_iterator *iter =
 		(struct prefix_ref_iterator *)ref_iterator;
 
-	return ref_iterator_peel(iter->iter0, peeled);
+	return ref_iterator_peel_raw(iter->iter0, peeled);
 }
 
 static int prefix_ref_iterator_abort(struct ref_iterator *ref_iterator)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f52d5488b8..d258303696 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -872,19 +872,21 @@ static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
 	return ok;
 }
 
-static int packed_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
+static enum ref_iterator_peel_result packed_ref_iterator_peel(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled)
 {
 	struct packed_ref_iterator *iter =
 		(struct packed_ref_iterator *)ref_iterator;
 
 	if ((iter->base.flags & REF_KNOWS_PEELED)) {
 		oidcpy(peeled, &iter->peeled);
-		return is_null_oid(&iter->peeled) ? -1 : 0;
+		return is_null_oid(&iter->peeled) ?
+			REF_ITERATOR_PEEL_FAILURE : REF_ITERATOR_PEEL_SUCCESS;
 	} else if ((iter->base.flags & (REF_ISBROKEN | REF_ISSYMREF))) {
-		return -1;
+		return REF_ITERATOR_PEEL_FAILURE;
 	} else {
-		return peel_object(&iter->oid, peeled) ? -1 : 0;
+		return REF_ITERATOR_PEEL_INCONCLUSIVE;
 	}
 }
 
@@ -1210,7 +1212,7 @@ static int write_with_updates(struct packed_ref_store *refs,
 			/* Pass the old reference through. */
 
 			struct object_id peeled;
-			int peel_error = ref_iterator_peel(iter, &peeled);
+			int peel_error = ref_iterator_peel(iter, the_repository, &peeled);
 
 			if (write_packed_entry(out, iter->refname,
 					       iter->oid,
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 49d732f6db..031b613bb2 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -488,10 +488,11 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
 	}
 }
 
-static int cache_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
+static enum ref_iterator_peel_result cache_ref_iterator_peel(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled)
 {
-	return peel_object(ref_iterator->oid, peeled) ? -1 : 0;
+	return REF_ITERATOR_PEEL_INCONCLUSIVE;
 }
 
 static int cache_ref_iterator_abort(struct ref_iterator *ref_iterator)
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index dc0ed65686..4656ef83a3 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -323,11 +323,26 @@ struct ref_iterator {
  */
 int ref_iterator_advance(struct ref_iterator *ref_iterator);
 
+enum ref_iterator_peel_result {
+	REF_ITERATOR_PEEL_SUCCESS,
+	REF_ITERATOR_PEEL_FAILURE,
+	REF_ITERATOR_PEEL_INCONCLUSIVE
+};
+
+/*
+ * Peel the reference currently being viewed by the iterator without
+ * using any information from any object store.
+ */
+enum ref_iterator_peel_result ref_iterator_peel_raw(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled);
+
 /*
- * If possible, peel the reference currently being viewed by the
- * iterator. Return 0 on success.
+ * Peel the reference currently being viewed by the iterator, using the object
+ * store if the ref store has insufficient information. Returns 0 upon success.
  */
 int ref_iterator_peel(struct ref_iterator *ref_iterator,
+		      struct repository *repo,
 		      struct object_id *peeled);
 
 /*
@@ -461,10 +476,12 @@ void base_ref_iterator_free(struct ref_iterator *iter);
 typedef int ref_iterator_advance_fn(struct ref_iterator *ref_iterator);
 
 /*
- * Peels the current ref, returning 0 for success or -1 for failure.
+ * Peels the current ref using only information from the ref store. If there is
+ * not enough information, returns REF_ITERATOR_PEEL_INCONCLUSIVE.
  */
-typedef int ref_iterator_peel_fn(struct ref_iterator *ref_iterator,
-				 struct object_id *peeled);
+typedef enum ref_iterator_peel_result ref_iterator_peel_fn(
+		struct ref_iterator *ref_iterator,
+		struct object_id *peeled);
 
 /*
  * Implementations of this function should free any resources specific
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 3/9] refs iterator: support non-the_repository advance
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
  2021-09-21 16:51 ` [PATCH 1/9] refs: make _advance() check struct repo, not flag Jonathan Tan
  2021-09-21 16:51 ` [PATCH 2/9] refs: add repo paramater to _iterator_peel() Jonathan Tan
@ 2021-09-21 16:51 ` Jonathan Tan
  2021-09-21 16:51 ` [PATCH 4/9] refs: teach refs_for_each_ref() arbitrary repos Jonathan Tan
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Support repositories other than the_repository when advancing through an
iterator.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c               | 3 ++-
 refs/files-backend.c | 2 +-
 refs/iterator.c      | 7 +------
 refs/refs-internal.h | 1 +
 4 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/refs.c b/refs.c
index 3a57893032..6ed64bee1b 100644
--- a/refs.c
+++ b/refs.c
@@ -255,12 +255,13 @@ int refname_is_safe(const char *refname)
  * does not exist, emit a warning and return false.
  */
 int ref_resolves_to_object(const char *refname,
+			   struct repository *repo,
 			   const struct object_id *oid,
 			   unsigned int flags)
 {
 	if (flags & REF_ISBROKEN)
 		return 0;
-	if (!has_object_file(oid)) {
+	if (!repo_has_object_file(repo, oid)) {
 		error(_("%s does not point to a valid object!"), refname);
 		return 0;
 	}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 1faab1cf66..24e5668d6c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1127,7 +1127,7 @@ static int should_pack_ref(const char *refname,
 		return 0;
 
 	/* Do not pack broken refs: */
-	if (!ref_resolves_to_object(refname, oid, ref_flags))
+	if (!ref_resolves_to_object(refname, the_repository, oid, ref_flags))
 		return 0;
 
 	return 1;
diff --git a/refs/iterator.c b/refs/iterator.c
index ee6b00a7be..59048523b8 100644
--- a/refs/iterator.c
+++ b/refs/iterator.c
@@ -12,15 +12,10 @@ int ref_iterator_advance(struct ref_iterator *ref_iterator)
 {
 	int ok;
 
-	if (ref_iterator->repo && ref_iterator->repo != the_repository)
-		/*
-		 * NEEDSWORK: make ref_resolves_to_object() support
-		 * arbitrary repositories
-		 */
-		BUG("ref_iterator->repo must be NULL or the_repository");
 	while ((ok = ref_iterator->vtable->advance(ref_iterator)) == ITER_OK) {
 		if (ref_iterator->repo &&
 		    !ref_resolves_to_object(ref_iterator->refname,
+					    ref_iterator->repo,
 					    ref_iterator->oid,
 					    ref_iterator->flags))
 			continue;
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 4656ef83a3..57ad1262ab 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -66,6 +66,7 @@ int refname_is_safe(const char *refname);
  * referred-to object does not exist, emit a warning and return false.
  */
 int ref_resolves_to_object(const char *refname,
+			   struct repository *repo,
 			   const struct object_id *oid,
 			   unsigned int flags);
 
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 4/9] refs: teach refs_for_each_ref() arbitrary repos
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (2 preceding siblings ...)
  2021-09-21 16:51 ` [PATCH 3/9] refs iterator: support non-the_repository advance Jonathan Tan
@ 2021-09-21 16:51 ` Jonathan Tan
  2021-09-21 16:51 ` [PATCH 5/9] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

A subsequent patch needs to perform a revision walk with --all. As seen
from handle_revision_pseudo_opt() in revision.c, refs_for_each_ref()
needs to be updated to take a repository struct and pass it to the
underlying ref iterator mechanism. This is so that refs can be checked
if they resolve to an existing object and in doing so, non-resolving
refs can be skipped over. (refs_head_ref() doesn't seem to read any
objects and doesn't need this treatment.) Update refs_for_each_ref()
accordingly.

Now that get_main_ref_store() can take repositories other than
the_repository, ensure that it sets the correct flags according to the
repository passed as an argument.

The signatures of some other functions need to be changed too for
consistency (because of handle_refs() in revision.c), so do that in this
patch too.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/submodule--helper.c | 16 ++++++++++------
 object-name.c               |  4 ++--
 refs.c                      | 34 ++++++++++++++++++----------------
 refs.h                      | 10 +++++-----
 revision.c                  | 12 ++++++------
 submodule.c                 | 10 ++++++++--
 6 files changed, 49 insertions(+), 37 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6718f202db..1cc43adfd1 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -827,15 +827,16 @@ static void status_submodule(const char *path, const struct object_id *ce_oid,
 			     displaypath);
 	} else if (!(flags & OPT_CACHED)) {
 		struct object_id oid;
-		struct ref_store *refs = get_submodule_ref_store(path);
+		struct repository subrepo;
 
-		if (!refs) {
+		if (repo_submodule_init(&subrepo, the_repository, path, null_oid())) {
 			print_status(flags, '-', path, ce_oid, displaypath);
 			goto cleanup;
 		}
-		if (refs_head_ref(refs, handle_submodule_head_ref, &oid))
+		if (refs_head_ref(&subrepo, handle_submodule_head_ref, &oid))
 			die(_("could not resolve HEAD ref inside the "
 			      "submodule '%s'"), path);
+		repo_clear(&subrepo);
 
 		print_status(flags, '+', path, &oid, displaypath);
 	} else {
@@ -1044,9 +1045,12 @@ static void generate_submodule_summary(struct summary_cb *info,
 
 	if (!info->cached && oideq(&p->oid_dst, null_oid())) {
 		if (S_ISGITLINK(p->mod_dst)) {
-			struct ref_store *refs = get_submodule_ref_store(p->sm_path);
-			if (refs)
-				refs_head_ref(refs, handle_submodule_head_ref, &p->oid_dst);
+			struct repository subrepo;
+
+			if (!repo_submodule_init(&subrepo, the_repository, p->sm_path, null_oid())) {
+				refs_head_ref(&subrepo, handle_submodule_head_ref, &p->oid_dst);
+				repo_clear(&subrepo);
+			}
 		} else if (S_ISLNK(p->mod_dst) || S_ISREG(p->mod_dst)) {
 			struct stat st;
 			int fd = open(p->sm_path, O_RDONLY);
diff --git a/object-name.c b/object-name.c
index 3263c19457..00df1c8ddb 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1822,8 +1822,8 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo,
 
 			cb.repo = repo;
 			cb.list = &list;
-			refs_for_each_ref(get_main_ref_store(repo), handle_one_ref, &cb);
-			refs_head_ref(get_main_ref_store(repo), handle_one_ref, &cb);
+			refs_for_each_ref(repo, handle_one_ref, &cb);
+			refs_head_ref(repo, handle_one_ref, &cb);
 			commit_list_sort_by_date(&list);
 			return get_oid_oneline(repo, name + 2, oid, list);
 		}
diff --git a/refs.c b/refs.c
index 6ed64bee1b..c04b2c1462 100644
--- a/refs.c
+++ b/refs.c
@@ -408,34 +408,34 @@ void warn_dangling_symrefs(FILE *fp, const char *msg_fmt, const struct string_li
 	for_each_rawref(warn_if_dangling_symref, &data);
 }
 
-int refs_for_each_tag_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_for_each_tag_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(refs, "refs/tags/", fn, cb_data);
+	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/tags/", fn, cb_data);
 }
 
 int for_each_tag_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_tag_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_for_each_tag_ref(the_repository, fn, cb_data);
 }
 
-int refs_for_each_branch_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_for_each_branch_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(refs, "refs/heads/", fn, cb_data);
+	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/heads/", fn, cb_data);
 }
 
 int for_each_branch_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_branch_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_for_each_branch_ref(the_repository, fn, cb_data);
 }
 
-int refs_for_each_remote_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_for_each_remote_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(refs, "refs/remotes/", fn, cb_data);
+	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/remotes/", fn, cb_data);
 }
 
 int for_each_remote_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_remote_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_for_each_remote_ref(the_repository, fn, cb_data);
 }
 
 int head_ref_namespaced(each_ref_fn fn, void *cb_data)
@@ -1395,12 +1395,12 @@ int refs_rename_ref_available(struct ref_store *refs,
 	return ok;
 }
 
-int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_head_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
 	struct object_id oid;
 	int flag;
 
-	if (!refs_read_ref_full(refs, "HEAD", RESOLVE_REF_READING,
+	if (!refs_read_ref_full(get_main_ref_store(repo), "HEAD", RESOLVE_REF_READING,
 				&oid, &flag))
 		return fn("HEAD", &oid, flag, cb_data);
 
@@ -1409,7 +1409,7 @@ int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 
 int head_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_head_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_head_ref(the_repository, fn, cb_data);
 }
 
 struct ref_iterator *refs_ref_iterator_begin(
@@ -1498,14 +1498,14 @@ static int do_for_each_ref(struct ref_store *refs, const char *prefix,
 					do_for_each_ref_helper, &hp);
 }
 
-int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_for_each_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, "", fn, 0, the_repository, 0, cb_data);
+	return do_for_each_ref(get_main_ref_store(repo), "", fn, 0, repo, 0, cb_data);
 }
 
 int for_each_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_for_each_ref(the_repository, fn, cb_data);
 }
 
 int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
@@ -1896,13 +1896,15 @@ static struct ref_store *ref_store_init(const char *gitdir,
 
 struct ref_store *get_main_ref_store(struct repository *r)
 {
+	unsigned flags = r == the_repository ?
+		REF_STORE_ALL_CAPS : REF_STORE_READ | REF_STORE_ODB;
 	if (r->refs_private)
 		return r->refs_private;
 
 	if (!r->gitdir)
 		BUG("attempting to get main_ref_store outside of repository");
 
-	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
+	r->refs_private = ref_store_init(r->gitdir, flags);
 	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
 	return r->refs_private;
 }
diff --git a/refs.h b/refs.h
index 48970dfc7e..b53cae717d 100644
--- a/refs.h
+++ b/refs.h
@@ -316,17 +316,17 @@ typedef int each_repo_ref_fn(struct repository *r,
  * modifies the reference also returns a nonzero value to immediately
  * stop the iteration. Returned references are sorted.
  */
-int refs_head_ref(struct ref_store *refs,
+int refs_head_ref(struct repository *repo,
 		  each_ref_fn fn, void *cb_data);
-int refs_for_each_ref(struct ref_store *refs,
+int refs_for_each_ref(struct repository *repo,
 		      each_ref_fn fn, void *cb_data);
 int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
 			 each_ref_fn fn, void *cb_data);
-int refs_for_each_tag_ref(struct ref_store *refs,
+int refs_for_each_tag_ref(struct repository *repo,
 			  each_ref_fn fn, void *cb_data);
-int refs_for_each_branch_ref(struct ref_store *refs,
+int refs_for_each_branch_ref(struct repository *repo,
 			     each_ref_fn fn, void *cb_data);
-int refs_for_each_remote_ref(struct ref_store *refs,
+int refs_for_each_remote_ref(struct repository *repo,
 			     each_ref_fn fn, void *cb_data);
 
 /* just iterates the head ref. */
diff --git a/revision.c b/revision.c
index 31fc1884d2..ec9baf9508 100644
--- a/revision.c
+++ b/revision.c
@@ -1567,7 +1567,7 @@ void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
 
 static void handle_refs(struct ref_store *refs,
 			struct rev_info *revs, unsigned flags,
-			int (*for_each)(struct ref_store *, each_ref_fn, void *))
+			int (*for_each)(struct repository *, each_ref_fn, void *))
 {
 	struct all_refs_cb cb;
 
@@ -1577,7 +1577,7 @@ static void handle_refs(struct ref_store *refs,
 	}
 
 	init_all_refs_cb(&cb, revs, flags);
-	for_each(refs, handle_one_ref, &cb);
+	for_each(revs->repo, handle_one_ref, &cb);
 }
 
 static void handle_one_reflog_commit(struct object_id *oid, void *cb_data)
@@ -2551,14 +2551,14 @@ static int for_each_bisect_ref(struct ref_store *refs, each_ref_fn fn,
 	return status;
 }
 
-static int for_each_bad_bisect_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+static int for_each_bad_bisect_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return for_each_bisect_ref(refs, fn, cb_data, term_bad);
+	return for_each_bisect_ref(get_main_ref_store(repo), fn, cb_data, term_bad);
 }
 
-static int for_each_good_bisect_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+static int for_each_good_bisect_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return for_each_bisect_ref(refs, fn, cb_data, term_good);
+	return for_each_bisect_ref(get_main_ref_store(repo), fn, cb_data, term_good);
 }
 
 static int handle_revision_pseudo_opt(struct rev_info *revs,
diff --git a/submodule.c b/submodule.c
index ecda0229af..bdaeb72e08 100644
--- a/submodule.c
+++ b/submodule.c
@@ -92,8 +92,14 @@ int is_staging_gitmodules_ok(struct index_state *istate)
 static int for_each_remote_ref_submodule(const char *submodule,
 					 each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_remote_ref(get_submodule_ref_store(submodule),
-					fn, cb_data);
+	struct repository subrepo;
+	int ret;
+
+	if (repo_submodule_init(&subrepo, the_repository, submodule, null_oid()))
+		return 0;
+	ret = refs_for_each_remote_ref(&subrepo, fn, cb_data);
+	repo_clear(&subrepo);
+	return ret;
 }
 
 /*
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 5/9] merge-{ort,recursive}: remove add_submodule_odb()
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (3 preceding siblings ...)
  2021-09-21 16:51 ` [PATCH 4/9] refs: teach refs_for_each_ref() arbitrary repos Jonathan Tan
@ 2021-09-21 16:51 ` Jonathan Tan
  2021-09-28  0:29   ` Elijah Newren
  2021-09-21 16:51 ` [PATCH 6/9] object-file: only register submodule ODB if needed Jonathan Tan
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

After the parent commit and some of its ancestors, the only place
commits are being accessed through alternates are in the user-facing
message formatting code. Fix those, and remove the add_submodule_odb()
calls.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 merge-ort.c                | 18 ++++-------------
 merge-recursive.c          | 41 +++++++++++++++++++-------------------
 strbuf.c                   | 12 ++++++++---
 strbuf.h                   |  6 ++++--
 t/t6437-submodule-merge.sh |  3 +++
 5 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index b8efaee8e0..a4aad8f33f 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -628,6 +628,7 @@ static int err(struct merge_options *opt, const char *err, ...)
 
 static void format_commit(struct strbuf *sb,
 			  int indent,
+			  struct repository *repo,
 			  struct commit *commit)
 {
 	struct merge_remote_desc *desc;
@@ -641,7 +642,7 @@ static void format_commit(struct strbuf *sb,
 		return;
 	}
 
-	format_commit_message(commit, "%h %s", sb, &ctx);
+	repo_format_commit_message(repo, commit, "%h %s", sb, &ctx);
 	strbuf_addch(sb, '\n');
 }
 
@@ -1566,17 +1567,6 @@ static int merge_submodule(struct merge_options *opt,
 	if (is_null_oid(b))
 		return 0;
 
-	/*
-	 * NEEDSWORK: Remove this when all submodule object accesses are
-	 * through explicitly specified repositores.
-	 */
-	if (add_submodule_odb(path)) {
-		path_msg(opt, path, 0,
-			 _("Failed to merge submodule %s (not checked out)"),
-			 path);
-		return 0;
-	}
-
 	if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
 		path_msg(opt, path, 0,
 				_("Failed to merge submodule %s (not checked out)"),
@@ -1641,7 +1631,7 @@ static int merge_submodule(struct merge_options *opt,
 		break;
 
 	case 1:
-		format_commit(&sb, 4,
+		format_commit(&sb, 4, &subrepo,
 			      (struct commit *)merges.objects[0].item);
 		path_msg(opt, path, 0,
 			 _("Failed to merge submodule %s, but a possible merge "
@@ -1658,7 +1648,7 @@ static int merge_submodule(struct merge_options *opt,
 		break;
 	default:
 		for (i = 0; i < merges.nr; i++)
-			format_commit(&sb, 4,
+			format_commit(&sb, 4, &subrepo,
 				      (struct commit *)merges.objects[i].item);
 		path_msg(opt, path, 0,
 			 _("Failed to merge submodule %s, but multiple "
diff --git a/merge-recursive.c b/merge-recursive.c
index fc8ac39d8c..6e8fb39315 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -337,7 +337,9 @@ static void output(struct merge_options *opt, int v, const char *fmt, ...)
 		flush_output(opt);
 }
 
-static void output_commit_title(struct merge_options *opt, struct commit *commit)
+static void repo_output_commit_title(struct merge_options *opt,
+				     struct repository *repo,
+				     struct commit *commit)
 {
 	struct merge_remote_desc *desc;
 
@@ -346,23 +348,29 @@ static void output_commit_title(struct merge_options *opt, struct commit *commit
 	if (desc)
 		strbuf_addf(&opt->obuf, "virtual %s\n", desc->name);
 	else {
-		strbuf_add_unique_abbrev(&opt->obuf, &commit->object.oid,
-					 DEFAULT_ABBREV);
+		strbuf_repo_add_unique_abbrev(&opt->obuf, repo,
+					      &commit->object.oid,
+					      DEFAULT_ABBREV);
 		strbuf_addch(&opt->obuf, ' ');
-		if (parse_commit(commit) != 0)
+		if (repo_parse_commit(repo, commit) != 0)
 			strbuf_addstr(&opt->obuf, _("(bad commit)\n"));
 		else {
 			const char *title;
-			const char *msg = get_commit_buffer(commit, NULL);
+			const char *msg = repo_get_commit_buffer(repo, commit, NULL);
 			int len = find_commit_subject(msg, &title);
 			if (len)
 				strbuf_addf(&opt->obuf, "%.*s\n", len, title);
-			unuse_commit_buffer(commit, msg);
+			repo_unuse_commit_buffer(repo, commit, msg);
 		}
 	}
 	flush_output(opt);
 }
 
+static void output_commit_title(struct merge_options *opt, struct commit *commit)
+{
+	repo_output_commit_title(opt, the_repository, commit);
+}
+
 static int add_cacheinfo(struct merge_options *opt,
 			 const struct diff_filespec *blob,
 			 const char *path, int stage, int refresh, int options)
@@ -1152,14 +1160,14 @@ static int find_first_merges(struct repository *repo,
 	return result->nr;
 }
 
-static void print_commit(struct commit *commit)
+static void print_commit(struct repository *repo, struct commit *commit)
 {
 	struct strbuf sb = STRBUF_INIT;
 	struct pretty_print_context ctx = {0};
 	ctx.date_mode.type = DATE_NORMAL;
 	/* FIXME: Merge this with output_commit_title() */
 	assert(!merge_remote_util(commit));
-	format_commit_message(commit, " %h: %m %s", &sb, &ctx);
+	repo_format_commit_message(repo, commit, " %h: %m %s", &sb, &ctx);
 	fprintf(stderr, "%s\n", sb.buf);
 	strbuf_release(&sb);
 }
@@ -1199,15 +1207,6 @@ static int merge_submodule(struct merge_options *opt,
 	if (is_null_oid(b))
 		return 0;
 
-	/*
-	 * NEEDSWORK: Remove this when all submodule object accesses are
-	 * through explicitly specified repositores.
-	 */
-	if (add_submodule_odb(path)) {
-		output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
-		return 0;
-	}
-
 	if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
 		output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
 		return 0;
@@ -1232,7 +1231,7 @@ static int merge_submodule(struct merge_options *opt,
 		oidcpy(result, b);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
-			output_commit_title(opt, commit_b);
+			repo_output_commit_title(opt, &subrepo, commit_b);
 		} else if (show(opt, 2))
 			output(opt, 2, _("Fast-forwarding submodule %s"), path);
 		else
@@ -1245,7 +1244,7 @@ static int merge_submodule(struct merge_options *opt,
 		oidcpy(result, a);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
-			output_commit_title(opt, commit_a);
+			repo_output_commit_title(opt, &subrepo, commit_a);
 		} else if (show(opt, 2))
 			output(opt, 2, _("Fast-forwarding submodule %s"), path);
 		else
@@ -1277,7 +1276,7 @@ static int merge_submodule(struct merge_options *opt,
 	case 1:
 		output(opt, 1, _("Failed to merge submodule %s (not fast-forward)"), path);
 		output(opt, 2, _("Found a possible merge resolution for the submodule:\n"));
-		print_commit((struct commit *) merges.objects[0].item);
+		print_commit(&subrepo, (struct commit *) merges.objects[0].item);
 		output(opt, 2, _(
 		       "If this is correct simply add it to the index "
 		       "for example\n"
@@ -1290,7 +1289,7 @@ static int merge_submodule(struct merge_options *opt,
 	default:
 		output(opt, 1, _("Failed to merge submodule %s (multiple merges found)"), path);
 		for (i = 0; i < merges.nr; i++)
-			print_commit((struct commit *) merges.objects[i].item);
+			print_commit(&subrepo, (struct commit *) merges.objects[i].item);
 	}
 
 	object_array_clear(&merges);
diff --git a/strbuf.c b/strbuf.c
index c8a5789694..b22e981655 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1059,15 +1059,21 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
 	strbuf_setlen(sb, sb->len + len);
 }
 
-void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
-			      int abbrev_len)
+void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
+				   const struct object_id *oid, int abbrev_len)
 {
 	int r;
 	strbuf_grow(sb, GIT_MAX_HEXSZ + 1);
-	r = find_unique_abbrev_r(sb->buf + sb->len, oid, abbrev_len);
+	r = repo_find_unique_abbrev_r(repo, sb->buf + sb->len, oid, abbrev_len);
 	strbuf_setlen(sb, sb->len + r);
 }
 
+void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
+			      int abbrev_len)
+{
+	strbuf_repo_add_unique_abbrev(sb, the_repository, oid, abbrev_len);
+}
+
 /*
  * Returns the length of a line, without trailing spaces.
  *
diff --git a/strbuf.h b/strbuf.h
index 5b1113abf8..2d9e01c16f 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -634,8 +634,10 @@ void strbuf_list_free(struct strbuf **list);
  * Add the abbreviation, as generated by find_unique_abbrev, of `sha1` to
  * the strbuf `sb`.
  */
-void strbuf_add_unique_abbrev(struct strbuf *sb,
-			      const struct object_id *oid,
+struct repository;
+void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
+				   const struct object_id *oid, int abbrev_len);
+void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
 			      int abbrev_len);
 
 /**
diff --git a/t/t6437-submodule-merge.sh b/t/t6437-submodule-merge.sh
index e5e89c2045..178413c22f 100755
--- a/t/t6437-submodule-merge.sh
+++ b/t/t6437-submodule-merge.sh
@@ -5,6 +5,9 @@ test_description='merging with submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 6/9] object-file: only register submodule ODB if needed
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (4 preceding siblings ...)
  2021-09-21 16:51 ` [PATCH 5/9] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
@ 2021-09-21 16:51 ` Jonathan Tan
  2021-09-21 16:51 ` [PATCH 7/9] submodule: pass repo to check_has_commit() Jonathan Tan
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

In a35e03dee0 ("submodule: lazily add submodule ODBs as alternates",
2021-09-08), Git was taught to add all known submodule ODBs as
alternates when attempting to read an object that doesn't exist, as a
fallback for when a submodule object is read as if it were in
the_repository. However, this behavior wasn't restricted to happen only
when reading from the_repository. Fix this.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 object-file.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index 621b121bcb..ab861dfdeb 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1593,7 +1593,8 @@ static int do_oid_object_info_extended(struct repository *r,
 				break;
 		}
 
-		if (register_all_submodule_odb_as_alternates())
+		if (r == the_repository &&
+		    register_all_submodule_odb_as_alternates())
 			/* We added some alternates; retry */
 			continue;
 
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 7/9] submodule: pass repo to check_has_commit()
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (5 preceding siblings ...)
  2021-09-21 16:51 ` [PATCH 6/9] object-file: only register submodule ODB if needed Jonathan Tan
@ 2021-09-21 16:51 ` Jonathan Tan
  2021-09-21 16:51 ` [PATCH 8/9] refs: change refs_for_each_ref_in() to take repo Jonathan Tan
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Pass the repo explicitly when calling check_has_commit() to avoid
relying on add_submodule_odb(). With this commit and the parent commit,
several tests no longer rely on add_submodule_odb(), so mark these tests
accordingly.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 submodule.c                            | 16 +++++++++++++---
 t/t5526-fetch-submodules.sh            |  3 +++
 t/t5572-pull-submodule.sh              |  3 +++
 t/t7418-submodule-sparse-gitmodules.sh |  3 +++
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index bdaeb72e08..e9757376c2 100644
--- a/submodule.c
+++ b/submodule.c
@@ -917,23 +917,33 @@ struct has_commit_data {
 static int check_has_commit(const struct object_id *oid, void *data)
 {
 	struct has_commit_data *cb = data;
+	struct repository subrepo;
+	enum object_type type;
 
-	enum object_type type = oid_object_info(cb->repo, oid, NULL);
+	if (repo_submodule_init(&subrepo, cb->repo, cb->path, null_oid())) {
+		cb->result = 0;
+		goto cleanup;
+	}
+
+	type = oid_object_info(&subrepo, oid, NULL);
 
 	switch (type) {
 	case OBJ_COMMIT:
-		return 0;
+		goto cleanup;
 	case OBJ_BAD:
 		/*
 		 * Object is missing or invalid. If invalid, an error message
 		 * has already been printed.
 		 */
 		cb->result = 0;
-		return 0;
+		goto cleanup;
 	default:
 		die(_("submodule entry '%s' (%s) is a %s, not a commit"),
 		    cb->path, oid_to_hex(oid), type_name(type));
 	}
+cleanup:
+	repo_clear(&subrepo);
+	return 0;
 }
 
 static int submodule_has_commits(struct repository *r,
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index ed11569d8d..2dc75b80db 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -6,6 +6,9 @@ test_description='Recursive "git fetch" for submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 pwd=$(pwd)
diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
index 4f92a116e1..fa6b4cca65 100755
--- a/t/t5572-pull-submodule.sh
+++ b/t/t5572-pull-submodule.sh
@@ -2,6 +2,9 @@
 
 test_description='pull can handle submodules'
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-submodule-update.sh
 
diff --git a/t/t7418-submodule-sparse-gitmodules.sh b/t/t7418-submodule-sparse-gitmodules.sh
index 3f7f271883..f87e524d6d 100755
--- a/t/t7418-submodule-sparse-gitmodules.sh
+++ b/t/t7418-submodule-sparse-gitmodules.sh
@@ -12,6 +12,9 @@ The test setup uses a sparse checkout, however the same scenario can be set up
 also by committing .gitmodules and then just removing it from the filesystem.
 '
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 test_expect_success 'sparse checkout setup which hides .gitmodules' '
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 8/9] refs: change refs_for_each_ref_in() to take repo
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (6 preceding siblings ...)
  2021-09-21 16:51 ` [PATCH 7/9] submodule: pass repo to check_has_commit() Jonathan Tan
@ 2021-09-21 16:51 ` Jonathan Tan
  2021-09-21 16:51 ` [PATCH 9/9] submodule: trace adding submodule ODB as alternate Jonathan Tan
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Pass a repository to refs_for_each_ref_in() so that object accesses
during iteration (done to skip over invalid refs) are made with the
correct repository instead of relying on add_submodule_odb(). With this,
the last remaining tests no longer rely on add_submodule_odb(), so mark
them accordingly.

The test-ref-store test helper needed to be changed to reflect the new
API. For now, just pass the repository through a global variable.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c                         | 12 ++++++------
 refs.h                         |  2 +-
 t/helper/test-ref-store.c      | 20 +++++++++-----------
 t/t5531-deep-submodule-push.sh |  3 +++
 t/t5545-push-options.sh        |  3 +++
 5 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/refs.c b/refs.c
index c04b2c1462..b011953e32 100644
--- a/refs.c
+++ b/refs.c
@@ -410,7 +410,7 @@ void warn_dangling_symrefs(FILE *fp, const char *msg_fmt, const struct string_li
 
 int refs_for_each_tag_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/tags/", fn, cb_data);
+	return refs_for_each_ref_in(repo, "refs/tags/", fn, cb_data);
 }
 
 int for_each_tag_ref(each_ref_fn fn, void *cb_data)
@@ -420,7 +420,7 @@ int for_each_tag_ref(each_ref_fn fn, void *cb_data)
 
 int refs_for_each_branch_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/heads/", fn, cb_data);
+	return refs_for_each_ref_in(repo, "refs/heads/", fn, cb_data);
 }
 
 int for_each_branch_ref(each_ref_fn fn, void *cb_data)
@@ -430,7 +430,7 @@ int for_each_branch_ref(each_ref_fn fn, void *cb_data)
 
 int refs_for_each_remote_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/remotes/", fn, cb_data);
+	return refs_for_each_ref_in(repo, "refs/remotes/", fn, cb_data);
 }
 
 int for_each_remote_ref(each_ref_fn fn, void *cb_data)
@@ -1508,15 +1508,15 @@ int for_each_ref(each_ref_fn fn, void *cb_data)
 	return refs_for_each_ref(the_repository, fn, cb_data);
 }
 
-int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
+int refs_for_each_ref_in(struct repository *repo, const char *prefix,
 			 each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, prefix, fn, strlen(prefix), the_repository, 0, cb_data);
+	return do_for_each_ref(get_main_ref_store(repo), prefix, fn, strlen(prefix), repo, 0, cb_data);
 }
 
 int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(get_main_ref_store(the_repository), prefix, fn, cb_data);
+	return refs_for_each_ref_in(the_repository, prefix, fn, cb_data);
 }
 
 int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data, unsigned int broken)
diff --git a/refs.h b/refs.h
index b53cae717d..fe290317ae 100644
--- a/refs.h
+++ b/refs.h
@@ -320,7 +320,7 @@ int refs_head_ref(struct repository *repo,
 		  each_ref_fn fn, void *cb_data);
 int refs_for_each_ref(struct repository *repo,
 		      each_ref_fn fn, void *cb_data);
-int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
+int refs_for_each_ref_in(struct repository *repo, const char *prefix,
 			 each_ref_fn fn, void *cb_data);
 int refs_for_each_tag_ref(struct repository *repo,
 			  each_ref_fn fn, void *cb_data);
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index b314b81a45..1964cb349e 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -5,6 +5,8 @@
 #include "object-store.h"
 #include "repository.h"
 
+static struct repository *repo;
+
 static const char *notnull(const char *arg, const char *name)
 {
 	if (!arg)
@@ -24,18 +26,13 @@ static const char **get_store(const char **argv, struct ref_store **refs)
 	if (!argv[0]) {
 		die("ref store required");
 	} else if (!strcmp(argv[0], "main")) {
+		repo = the_repository;
 		*refs = get_main_ref_store(the_repository);
 	} else if (skip_prefix(argv[0], "submodule:", &gitdir)) {
-		struct strbuf sb = STRBUF_INIT;
-		int ret;
-
-		ret = strbuf_git_path_submodule(&sb, gitdir, "objects/");
-		if (ret)
-			die("strbuf_git_path_submodule failed: %d", ret);
-		add_to_alternates_memory(sb.buf);
-		strbuf_release(&sb);
-
-		*refs = get_submodule_ref_store(gitdir);
+		repo = xmalloc(sizeof(*repo));
+		if (repo_submodule_init(repo, the_repository, gitdir, null_oid()))
+			die("repo_submodule_init failed");
+		*refs = get_main_ref_store(repo);
 	} else if (skip_prefix(argv[0], "worktree:", &gitdir)) {
 		struct worktree **p, **worktrees = get_worktrees();
 
@@ -52,6 +49,7 @@ static const char **get_store(const char **argv, struct ref_store **refs)
 		if (!*p)
 			die("no such worktree: %s", gitdir);
 
+		repo = the_repository;
 		*refs = get_worktree_ref_store(*p);
 	} else
 		die("unknown backend %s", argv[0]);
@@ -113,7 +111,7 @@ static int cmd_for_each_ref(struct ref_store *refs, const char **argv)
 {
 	const char *prefix = notnull(*argv++, "prefix");
 
-	return refs_for_each_ref_in(refs, prefix, each_ref, NULL);
+	return refs_for_each_ref_in(repo, prefix, each_ref, NULL);
 }
 
 static int cmd_resolve_ref(struct ref_store *refs, const char **argv)
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index d573ca496a..3f58b515ce 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -5,6 +5,9 @@ test_description='test push with submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 58c7add7ee..214228349a 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -5,6 +5,9 @@ test_description='pushing to a repository using push options'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 mk_repo_pair () {
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 9/9] submodule: trace adding submodule ODB as alternate
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (7 preceding siblings ...)
  2021-09-21 16:51 ` [PATCH 8/9] refs: change refs_for_each_ref_in() to take repo Jonathan Tan
@ 2021-09-21 16:51 ` Jonathan Tan
  2021-09-23 18:05 ` [PATCH 0/9] No more " Junio C Hamano
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-21 16:51 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Submodule ODBs are never added as alternates during the execution of the
test suite, but there may be a rare interaction that the test suite does
not have coverage of. Add a trace message when this happens, so that
users who trace their commands can notice such occurrences.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 submodule.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/submodule.c b/submodule.c
index e9757376c2..77e76d3f9c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -207,6 +207,8 @@ int register_all_submodule_odb_as_alternates(void)
 		add_to_alternates_memory(added_submodule_odb_paths.items[i].string);
 	if (ret) {
 		string_list_clear(&added_submodule_odb_paths, 0);
+		trace2_data_intmax("submodule", the_repository,
+				   "register_all_submodule_odb_as_alternates/registered", ret);
 		if (git_env_bool("GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB", 0))
 			BUG("register_all_submodule_odb_as_alternates() called");
 	}
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/9] refs: make _advance() check struct repo, not flag
  2021-09-21 16:51 ` [PATCH 1/9] refs: make _advance() check struct repo, not flag Jonathan Tan
@ 2021-09-23  1:00   ` Junio C Hamano
  2021-09-24 17:56     ` Jonathan Tan
  2021-09-24 18:13   ` Jeff King
  1 sibling, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2021-09-23  1:00 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> As a first step in resolving both these problems, replace the
> DO_FOR_EACH_INCLUDE_BROKEN flag with a struct repository pointer. This
> commit is a mechanical conversion - whenever DO_FOR_EACH_INCLUDE_BROKEN
> is set, a NULL repository (representing access to no object store) is
> used instead, and whenever DO_FOR_EACH_INCLUDE_BROKEN is unset, a
> non-NULL repository (representing access to that repository's object
> store) is used instead.

Hmph, so the lack of "include broken" is a signal to validate the
object the ref points at, and the new parameter is "if this pointer
is not NULL, then expect to find the object in this repository and
validate it" that replaces the original "validate it" with a bit
more detailed instruction (i.e. "how to validate--use the object
store associated to this repository")?

> Right now, the locations in which
> non-the_repository support needs to be added are marked with BUG()
> statements - in a future patch, these will be replaced. (NEEDSWORK: in
> this RFC patch set, this has not been done)

> - Change the _advance() callback to also have a repository object
>   parameter, and either skip or not skip depending on whether that
>   parameter is NULL. This burdens callers to have to carry this
>   information along with the iterator, and such calling code may be
>   unclear as to why that parameter can be NULL in some cases and cannot
>   in others.

Hmph.  

> diff --git a/refs.c b/refs.c
> index 8b9f7c3a80..49ddcdac53 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1413,16 +1413,16 @@ int head_ref(each_ref_fn fn, void *cb_data)
>  
>  struct ref_iterator *refs_ref_iterator_begin(
>  		struct ref_store *refs,
> -		const char *prefix, int trim, int flags)
> +		const char *prefix, int trim, struct repository *repo,
> +		int flags)
>  {
>  	struct ref_iterator *iter;
>  
>  	if (ref_paranoia < 0)
>  		ref_paranoia = git_env_bool("GIT_REF_PARANOIA", 0);
> -	if (ref_paranoia)
> -		flags |= DO_FOR_EACH_INCLUDE_BROKEN;
>  
>  	iter = refs->be->iterator_begin(refs, prefix, flags);
> +	iter->repo = ref_paranoia ? NULL : repo;

OK.  "flags" is still kept because there are bits other than
"include broken" that need to be propagated.

> @@ -1442,13 +1442,16 @@ struct ref_iterator *refs_ref_iterator_begin(
>   * Call fn for each reference in the specified submodule for which the
>   * refname begins with prefix. If trim is non-zero, then trim that
>   * many characters off the beginning of each refname before passing
> - * the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to
> - * include broken references in the iteration. If fn ever returns a
> + * the refname to fn. If fn ever returns a
>   * non-zero value, stop the iteration and return that value;
>   * otherwise, return 0.
> + *
> + * See the documentation of refs_ref_iterator_begin() for more information on
> + * the repo parameter.
>   */
>  static int do_for_each_repo_ref(struct repository *r, const char *prefix,
> -				each_repo_ref_fn fn, int trim, int flags,
> +				each_repo_ref_fn fn, int trim,
> +				struct repository *repo, int flags,
>  				void *cb_data)

Confusing.  We are iterating refs that exists in the repository "r",
right?  Why do we need to have an extra "repo" parameter?  Can they
ever diverge (beyond repo could be NULL to signal now-lost "include
broken" bit wanted to convey)?  It's not like a valid caller can
pass the superproject in 'r' and a submodule in 'repo', right?

Enhancing an interface this way, and allowing an arbitrary
repository instance to be passed only to convey one bit of
information, by adding a "repo" smells like inviting bugs in the
future.

I have a feeling that the function signature for this one should
stay as before, and "repo" should be a local variable that is
initialized as

	struct repository *repo = (flags & DO_FOR_EACH_INCLUDE_BROKEN)
				? r
				: NULL;

to avoid such a future bug, but given that there is only one caller
to this helper, I do not mind

	if (repo && r != repo)
		BUG(...);

to catch any such mistake.

>  int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data)
>  {
>  	return do_for_each_repo_ref(r, git_replace_ref_base, fn,
>  				    strlen(git_replace_ref_base),
> -				    DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
> +				    NULL, 0, cb_data);
>  }

And this is the only such caller, if I am reading the code right.

Do we ever pass non-NULL "repo" to do_for_each_repo_ref() in future
steps?

If not, perhaps we do not even have to add "repo" as a new parameter
to do_for_each_repo_ref(), and instead always pass NULL down to
refs_ref_iterator_begin() from do_for_each_repo_ref()?

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 677b7e4cdd..cd145301d0 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -744,12 +744,6 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
>  		    ref_type(iter->iter0->refname) != REF_TYPE_PER_WORKTREE)
>  			continue;
>  
> -		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
> -		    !ref_resolves_to_object(iter->iter0->refname,
> -					    iter->iter0->oid,
> -					    iter->iter0->flags))
> -			continue;
> -
>  		iter->base.refname = iter->iter0->refname;
>  		iter->base.oid = iter->iter0->oid;
>  		iter->base.flags = iter->iter0->flags;
> @@ -801,9 +795,6 @@ static struct ref_iterator *files_ref_iterator_begin(
>  	struct ref_iterator *ref_iterator;
>  	unsigned int required_flags = REF_STORE_READ;
>  
> -	if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN))
> -		required_flags |= REF_STORE_ODB;
> -
>  	refs = files_downcast(ref_store, required_flags, "ref_iterator_begin");
>  
>  	/*

Hmph, I am not sure where the lossage in these two hunks are
compensated.  Perhaps in the backend independent layer in
refs/iterator.c?  Let's read on.

> @@ -836,10 +827,13 @@ static struct ref_iterator *files_ref_iterator_begin(
>  	 * references, and (if needed) do our own check for broken
>  	 * ones in files_ref_iterator_advance(), after we have merged
>  	 * the packed and loose references.
> +	 *
> +	 * Do this by not supplying any repo, regardless of whether a repo was
> +	 * supplied to files_ref_iterator_begin().
>  	 */
>  	packed_iter = refs_ref_iterator_begin(
>  			refs->packed_ref_store, prefix, 0,
> -			DO_FOR_EACH_INCLUDE_BROKEN);
> +			NULL, 0);

OK.

> diff --git a/refs/iterator.c b/refs/iterator.c
> index a89d132d4f..5af6554887 100644
> --- a/refs/iterator.c
> +++ b/refs/iterator.c
> @@ -10,7 +10,23 @@
>  
>  int ref_iterator_advance(struct ref_iterator *ref_iterator)
>  {
> -	return ref_iterator->vtable->advance(ref_iterator);
> +	int ok;
> +
> +	if (ref_iterator->repo && ref_iterator->repo != the_repository)

OK. refs_ref_interator_begin() assigned the "repo" parameter that
tells which repository to consult to validate the objects at the tip
of refs to the .repo member of the iterator object, and we check it
here.

It is a bit surprising that ref_iterator does not know which
repository it is working in (regardless of "include broken" bit).
Do you think it will stay that way?  I have this nagging feeling
that it won't, and having "struct repository *repository" pointer
that always points at the repository the ref-store belongs to in a
ref_iterator instance would become necessary in the longer run.

In which case, this .repo member this patch adds would become a big
problem, no?  If we were to validate objects at the tip of the refs
against object store, we will always use the object store that
belongs to the iterator->repository, so the only valid states for
iterator->repo are either NULL or iterator->repository.  That again
is the same problem I pointed out already about the parameter the
do_for_each_repo_ref() helper that is inviting future bugs, it seems
to me.  Wouldn't it make more sense to add

 * iterator->repository that points at the repository in which we
   are iterating the refs

 * a bit in iterator that chooses between "do not bother checking"
   and "do check the tip of refs against the object store of
   iterator->repository

to avoid such a mess?  Perhaps we already have such a bit in the
flags word in the ref_iterator but I didn't check.

Thanks.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 0/9] No more adding submodule ODB as alternate
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (8 preceding siblings ...)
  2021-09-21 16:51 ` [PATCH 9/9] submodule: trace adding submodule ODB as alternate Jonathan Tan
@ 2021-09-23 18:05 ` Junio C Hamano
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-09-23 18:05 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> This series is on jt/add-submodule-odb-clean-up.
>
> After this series, the entire test suite runs without ever adding a
> submodule ODB as an alternate (checked by running with
> GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1). The code to lazily add
> submodule ODBs as alternates still remains (with a trace message printed
> if it happens) just in case there is a rare interaction that the test
> suite doesn't cover.
>
> This is part of my effort to support partial clone in submodules, but
> the results here are also beneficial for non-partial-clone submodule
> users in that access to submodule objects are now quicker (because Git
> no longer needs to linearly search through alternates when accessing
> these objects). It also improves code health in that it is clearer at
> the call site when a submodule object is being accessed.

Nice.  One specially bad thing about the alternate odb abuse is that
it is very hard to undo once you add an odb that is not really an
alternate as if it were an alternate.  Accessing the objects from
the repository the objects should be found, keeping clear separation
between the superproject and its submodule, which is done here, is
the right thing to do.

Unconfortably excited.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/9] refs: make _advance() check struct repo, not flag
  2021-09-23  1:00   ` Junio C Hamano
@ 2021-09-24 17:56     ` Jonathan Tan
  2021-09-24 19:55       ` Junio C Hamano
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Tan @ 2021-09-24 17:56 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > As a first step in resolving both these problems, replace the
> > DO_FOR_EACH_INCLUDE_BROKEN flag with a struct repository pointer. This
> > commit is a mechanical conversion - whenever DO_FOR_EACH_INCLUDE_BROKEN
> > is set, a NULL repository (representing access to no object store) is
> > used instead, and whenever DO_FOR_EACH_INCLUDE_BROKEN is unset, a
> > non-NULL repository (representing access to that repository's object
> > store) is used instead.
> 
> Hmph, so the lack of "include broken" is a signal to validate the
> object the ref points at, and the new parameter is "if this pointer
> is not NULL, then expect to find the object in this repository and
> validate it" that replaces the original "validate it" with a bit
> more detailed instruction (i.e. "how to validate--use the object
> store associated to this repository")?

Yes.

> > @@ -1442,13 +1442,16 @@ struct ref_iterator *refs_ref_iterator_begin(
> >   * Call fn for each reference in the specified submodule for which the
> >   * refname begins with prefix. If trim is non-zero, then trim that
> >   * many characters off the beginning of each refname before passing
> > - * the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to
> > - * include broken references in the iteration. If fn ever returns a
> > + * the refname to fn. If fn ever returns a
> >   * non-zero value, stop the iteration and return that value;
> >   * otherwise, return 0.
> > + *
> > + * See the documentation of refs_ref_iterator_begin() for more information on
> > + * the repo parameter.
> >   */
> >  static int do_for_each_repo_ref(struct repository *r, const char *prefix,
> > -				each_repo_ref_fn fn, int trim, int flags,
> > +				each_repo_ref_fn fn, int trim,
> > +				struct repository *repo, int flags,
> >  				void *cb_data)
> 
> Confusing.  We are iterating refs that exists in the repository "r",
> right?  Why do we need to have an extra "repo" parameter?  Can they
> ever diverge (beyond repo could be NULL to signal now-lost "include
> broken" bit wanted to convey)?  It's not like a valid caller can
> pass the superproject in 'r' and a submodule in 'repo', right?
> 
> Enhancing an interface this way, and allowing an arbitrary
> repository instance to be passed only to convey one bit of
> information, by adding a "repo" smells like inviting bugs in the
> future.
> 
> I have a feeling that the function signature for this one should
> stay as before, and "repo" should be a local variable that is
> initialized as
> 
> 	struct repository *repo = (flags & DO_FOR_EACH_INCLUDE_BROKEN)
> 				? r
> 				: NULL;
> 
> to avoid such a future bug, but given that there is only one caller
> to this helper, I do not mind
> 
> 	if (repo && r != repo)
> 		BUG(...);
> 
> to catch any such mistake.

(see next answer)

> >  int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data)
> >  {
> >  	return do_for_each_repo_ref(r, git_replace_ref_base, fn,
> >  				    strlen(git_replace_ref_base),
> > -				    DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
> > +				    NULL, 0, cb_data);
> >  }
> 
> And this is the only such caller, if I am reading the code right.
> 
> Do we ever pass non-NULL "repo" to do_for_each_repo_ref() in future
> steps?
> 
> If not, perhaps we do not even have to add "repo" as a new parameter
> to do_for_each_repo_ref(), and instead always pass NULL down to
> refs_ref_iterator_begin() from do_for_each_repo_ref()?

do_for_each_repo_ref() does not gain future callers in future steps (so
we never pass non-NULL "repo"). I'll do this (and add a comment to
do_for_each_repo_ref() explaining that we do not check for broken refs).

> > diff --git a/refs/files-backend.c b/refs/files-backend.c
> > index 677b7e4cdd..cd145301d0 100644
> > --- a/refs/files-backend.c
> > +++ b/refs/files-backend.c
> > @@ -744,12 +744,6 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
> >  		    ref_type(iter->iter0->refname) != REF_TYPE_PER_WORKTREE)
> >  			continue;
> >  
> > -		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
> > -		    !ref_resolves_to_object(iter->iter0->refname,
> > -					    iter->iter0->oid,
> > -					    iter->iter0->flags))
> > -			continue;
> > -
> >  		iter->base.refname = iter->iter0->refname;
> >  		iter->base.oid = iter->iter0->oid;
> >  		iter->base.flags = iter->iter0->flags;
> > @@ -801,9 +795,6 @@ static struct ref_iterator *files_ref_iterator_begin(
> >  	struct ref_iterator *ref_iterator;
> >  	unsigned int required_flags = REF_STORE_READ;
> >  
> > -	if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN))
> > -		required_flags |= REF_STORE_ODB;
> > -
> >  	refs = files_downcast(ref_store, required_flags, "ref_iterator_begin");
> >  
> >  	/*
> 
> Hmph, I am not sure where the lossage in these two hunks are
> compensated.  Perhaps in the backend independent layer in
> refs/iterator.c?  Let's read on.

Yes - the first hunk is compensated in the backend independent layer,
and the second hunk was there to ensure that the first hunk would work
(and since the first hunk is removed, the second hunk no longer needs to
be there). I'll add a note in the commit message.

> > @@ -10,7 +10,23 @@
> >  
> >  int ref_iterator_advance(struct ref_iterator *ref_iterator)
> >  {
> > -	return ref_iterator->vtable->advance(ref_iterator);
> > +	int ok;
> > +
> > +	if (ref_iterator->repo && ref_iterator->repo != the_repository)
> 
> OK. refs_ref_interator_begin() assigned the "repo" parameter that
> tells which repository to consult to validate the objects at the tip
> of refs to the .repo member of the iterator object, and we check it
> here.
> 
> It is a bit surprising that ref_iterator does not know which
> repository it is working in (regardless of "include broken" bit).
> Do you think it will stay that way?  I have this nagging feeling
> that it won't, and having "struct repository *repository" pointer
> that always points at the repository the ref-store belongs to in a
> ref_iterator instance would become necessary in the longer run.

I think it's better if it stays that way, so that callers that don't
want to access the ODB (all the callers that currently use
DO_FOR_EACH_INCLUDE_BROKEN) can be assured that the iterator won't do
that. If we had a non-NULL "struct repository *repository" parameter, a
future code change might inadvertently use it, thus causing a bug.

Right now I think that this is possible, since the only other thing that
accesses the ODB is peeling, and that is handled by the next patch
(patch 2/9). If we think that it won't stay that way in the future,
though, then I agree with you.

> In which case, this .repo member this patch adds would become a big
> problem, no?  If we were to validate objects at the tip of the refs
> against object store, we will always use the object store that
> belongs to the iterator->repository, so the only valid states for
> iterator->repo are either NULL or iterator->repository.  That again
> is the same problem I pointed out already about the parameter the
> do_for_each_repo_ref() helper that is inviting future bugs, it seems
> to me.  Wouldn't it make more sense to add
> 
>  * iterator->repository that points at the repository in which we
>    are iterating the refs
> 
>  * a bit in iterator that chooses between "do not bother checking"
>    and "do check the tip of refs against the object store of
>    iterator->repository
> 
> to avoid such a mess?  Perhaps we already have such a bit in the
> flags word in the ref_iterator but I didn't check.

If we need iterator->repository, then I agree with you. The bit could
then be DO_FOR_EACH_INCLUDE_BROKEN (which currently exists, and which I
am removing in this patch, but if we think we should keep it, then we
should keep it).

Having said all that, it may be better to have a non-NULL repository
reference in the iterator and retain DO_FOR_EACH_INCLUDE_BROKEN - at the
very least, this is a more gradual change and still leaves open the
possibility of turning the repository reference into a nullable one.
Callers that use DO_FOR_EACH_INCLUDE_BROKEN will have to deal with an
API that is unclear about what is being done with the repository object
being passed in, but that is the same as the status quo. I'll try it and
see how it goes (it will probably simplify patch 2 too).

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/9] refs: make _advance() check struct repo, not flag
  2021-09-21 16:51 ` [PATCH 1/9] refs: make _advance() check struct repo, not flag Jonathan Tan
  2021-09-23  1:00   ` Junio C Hamano
@ 2021-09-24 18:13   ` Jeff King
  2021-09-24 18:28     ` Jonathan Tan
  1 sibling, 1 reply; 65+ messages in thread
From: Jeff King @ 2021-09-24 18:13 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Tue, Sep 21, 2021 at 09:51:03AM -0700, Jonathan Tan wrote:

> Currently, ref iterators access the object store each time they advance
> if and only if the boolean flag DO_FOR_EACH_INCLUDE_BROKEN is unset.
> (The iterators access the object store because, if
> DO_FOR_EACH_INCLUDE_BROKEN is unset, they need to attempt to resolve
> each ref to determine that it is not broken.)
> 
> Also, the object store accessed is always that of the_repository, making
> it impossible to iterate over a submodule's refs without
> DO_FOR_EACH_INCLUDE_BROKEN (unless add_submodule_odb() is used).
>
> As a first step in resolving both these problems, replace the
> DO_FOR_EACH_INCLUDE_BROKEN flag with a struct repository pointer. This
> commit is a mechanical conversion - whenever DO_FOR_EACH_INCLUDE_BROKEN
> is set, a NULL repository (representing access to no object store) is
> used instead, and whenever DO_FOR_EACH_INCLUDE_BROKEN is unset, a
> non-NULL repository (representing access to that repository's object
> store) is used instead. Right now, the locations in which
> non-the_repository support needs to be added are marked with BUG()
> statements - in a future patch, these will be replaced. (NEEDSWORK: in
> this RFC patch set, this has not been done)

I think your goal here of passing around a repository object is good.
But rolling the meaning of DO_FOR_EACH_INCLUDE_BROKEN into an implicit
"do we have a non-NULL repository" makes things awkward, I think.

As you noticed, we can't get rid of the flags parameter entirely. We
still have DO_FOR_EACH_PER_WORKTREE_ONLY. But I also have a series which
adds another flag which pairs with INCLUDE_BROKEN. Having half of the
logic implicit in the repository pointer and half in a flag would be
weird.

I'll post that series in a moment, but what I'm wondering here is: would
it be that big a deal to just pass the repository object around, and it
is simply not used if INCLUDE_BROKEN is passed?

-Peff

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/9] refs: make _advance() check struct repo, not flag
  2021-09-24 18:13   ` Jeff King
@ 2021-09-24 18:28     ` Jonathan Tan
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-24 18:28 UTC (permalink / raw)
  To: peff; +Cc: jonathantanmy, git

> I think your goal here of passing around a repository object is good.
> But rolling the meaning of DO_FOR_EACH_INCLUDE_BROKEN into an implicit
> "do we have a non-NULL repository" makes things awkward, I think.
> 
> As you noticed, we can't get rid of the flags parameter entirely. We
> still have DO_FOR_EACH_PER_WORKTREE_ONLY. But I also have a series which
> adds another flag which pairs with INCLUDE_BROKEN. Having half of the
> logic implicit in the repository pointer and half in a flag would be
> weird.
> 
> I'll post that series in a moment, but what I'm wondering here is: would
> it be that big a deal to just pass the repository object around, and it
> is simply not used if INCLUDE_BROKEN is passed?

Quoting my response to Junio (sent a few minutes ago, so you might have
not seen it) [1]:

 so that callers that don't
 want to access the ODB (all the callers that currently use
 DO_FOR_EACH_INCLUDE_BROKEN) can be assured that the iterator won't do
 that. If we had a non-NULL "struct repository *repository" parameter, a
 future code change might inadvertently use it, thus causing a bug.

I'll take a look at your series when it comes out, but from what you
say, it looks like we should pass a non-nullable repository and keep the
DO_FOR_EACH_INCLUDE_BROKEN flag. I'll update this patch set to do that.

[1] https://lore.kernel.org/git/20210924175651.2918488-1-jonathantanmy@google.com/

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/9] refs: make _advance() check struct repo, not flag
  2021-09-24 17:56     ` Jonathan Tan
@ 2021-09-24 19:55       ` Junio C Hamano
  0 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-09-24 19:55 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

>> It is a bit surprising that ref_iterator does not know which
>> repository it is working in (regardless of "include broken" bit).
>> Do you think it will stay that way?  I have this nagging feeling
>> that it won't, and having "struct repository *repository" pointer
>> that always points at the repository the ref-store belongs to in a
>> ref_iterator instance would become necessary in the longer run.
>
> I think it's better if it stays that way, so that callers that don't
> want to access the ODB (all the callers that currently use
> DO_FOR_EACH_INCLUDE_BROKEN) can be assured that the iterator won't do
> that. If we had a non-NULL "struct repository *repository" parameter, a
> future code change might inadvertently use it, thus causing a bug.

Hmph, I am unlikely to be the one who is doing such a future code
change, but anybody who equates having a pointer to the repository
structure and the need to always validate the tips of refs with the
object store associated to the repository would be poor future
developers we wouldn't want their hands on our codebase.

An in-core repository has a lot more than the object store.

Besides, if we really want to have choice between "do check them"
and "ignore broken" expressed cleanly in the code, it would be much
better to be explicit about it, and a member in the context struct
whose name is ".repo" is not it[*].  A reader would say "ok, I see a
repo.  What is that thing for?  Can I reliably use it to figure out
other things about the repository this reference enumeration is
going on from it?  Ah, no, it sometimes is NULL---what crazy thing
is going on here???".

	Side note.  If it were called
	.check_ref_tips_against_the_object_store_of_this_repository,
	it would be a different story.

>> In which case, this .repo member this patch adds would become a big
>> problem, no?  If we were to validate objects at the tip of the refs
>> against object store, we will always use the object store that
>> belongs to the iterator->repository, so the only valid states for
>> iterator->repo are either NULL or iterator->repository.  That again
>> is the same problem I pointed out already about the parameter the
>> do_for_each_repo_ref() helper that is inviting future bugs, it seems
>> to me.  Wouldn't it make more sense to add
>> 
>>  * iterator->repository that points at the repository in which we
>>    are iterating the refs
>> 
>>  * a bit in iterator that chooses between "do not bother checking"
>>    and "do check the tip of refs against the object store of
>>    iterator->repository
>> 
>> to avoid such a mess?  Perhaps we already have such a bit in the
>> flags word in the ref_iterator but I didn't check.
>
> If we need iterator->repository, then I agree with you. The bit could
> then be DO_FOR_EACH_INCLUDE_BROKEN (which currently exists, and which I
> am removing in this patch, but if we think we should keep it, then we
> should keep it).

I do not care too much about the bit itself.  I have huge trouble
with the idea that representing a single bit with an entire pointer
to a repository, which can cause confusion down the line.  Those who
want to have an access to the repository does not have a reliable
way to get to it, those who do set it can mistakenly set to point at
an unrelated repository.

> Having said all that, it may be better to have a non-NULL repository
> reference in the iterator and retain DO_FOR_EACH_INCLUDE_BROKEN - at the

Yes.

> very least, this is a more gradual change and still leaves open the
> possibility of turning the repository reference into a nullable one.
> Callers that use DO_FOR_EACH_INCLUDE_BROKEN will have to deal with an
> API that is unclear about what is being done with the repository object
> being passed in, but that is the same as the status quo. I'll try it and
> see how it goes (it will probably simplify patch 2 too).

I do not think a structure member of type "struct repository" that
signals anything but "we are not operating in a repository" by being
NULL is a sane interface, so I do not see any positive value in
leaving opent he possibility at all.  The next person who would want
to _optionally_ use a repository for some other purpose (other than
"checking the validity of the object name") may be tempted to add
another member .repo2 next to your .repo---and that would be insane,
given that ref iterator will be iterating over a ref store of a
single repo at any given time.  It is much saner to have a single
"this is the repository the refstore we are iterating over is
attached to" instance, with separate bits "please do validate" and
"do whatever check the second develoepr wanted to signal the need
for with the .repo2 member".

Thanks.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 5/9] merge-{ort,recursive}: remove add_submodule_odb()
  2021-09-21 16:51 ` [PATCH 5/9] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
@ 2021-09-28  0:29   ` Elijah Newren
  0 siblings, 0 replies; 65+ messages in thread
From: Elijah Newren @ 2021-09-28  0:29 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Git Mailing List

On Tue, Sep 21, 2021 at 9:52 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> After the parent commit and some of its ancestors, the only place
> commits are being accessed through alternates are in the user-facing

s/are in/is in/, since "place" is singular? ("the only place...is in
the user-facing")

> message formatting code. Fix those, and remove the add_submodule_odb()
> calls.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  merge-ort.c                | 18 ++++-------------
>  merge-recursive.c          | 41 +++++++++++++++++++-------------------
>  strbuf.c                   | 12 ++++++++---
>  strbuf.h                   |  6 ++++--
>  t/t6437-submodule-merge.sh |  3 +++
>  5 files changed, 40 insertions(+), 40 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index b8efaee8e0..a4aad8f33f 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -628,6 +628,7 @@ static int err(struct merge_options *opt, const char *err, ...)
>
>  static void format_commit(struct strbuf *sb,
>                           int indent,
> +                         struct repository *repo,
>                           struct commit *commit)
>  {
>         struct merge_remote_desc *desc;
> @@ -641,7 +642,7 @@ static void format_commit(struct strbuf *sb,
>                 return;
>         }
>
> -       format_commit_message(commit, "%h %s", sb, &ctx);
> +       repo_format_commit_message(repo, commit, "%h %s", sb, &ctx);
>         strbuf_addch(sb, '\n');
>  }
>
> @@ -1566,17 +1567,6 @@ static int merge_submodule(struct merge_options *opt,
>         if (is_null_oid(b))
>                 return 0;
>
> -       /*
> -        * NEEDSWORK: Remove this when all submodule object accesses are
> -        * through explicitly specified repositores.

This removes a typo too.  :-)

> -        */
> -       if (add_submodule_odb(path)) {
> -               path_msg(opt, path, 0,
> -                        _("Failed to merge submodule %s (not checked out)"),
> -                        path);
> -               return 0;
> -       }
> -
>         if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
>                 path_msg(opt, path, 0,
>                                 _("Failed to merge submodule %s (not checked out)"),
> @@ -1641,7 +1631,7 @@ static int merge_submodule(struct merge_options *opt,
>                 break;
>
>         case 1:
> -               format_commit(&sb, 4,
> +               format_commit(&sb, 4, &subrepo,
>                               (struct commit *)merges.objects[0].item);
>                 path_msg(opt, path, 0,
>                          _("Failed to merge submodule %s, but a possible merge "
> @@ -1658,7 +1648,7 @@ static int merge_submodule(struct merge_options *opt,
>                 break;
>         default:
>                 for (i = 0; i < merges.nr; i++)
> -                       format_commit(&sb, 4,
> +                       format_commit(&sb, 4, &subrepo,
>                                       (struct commit *)merges.objects[i].item);
>                 path_msg(opt, path, 0,
>                          _("Failed to merge submodule %s, but multiple "
> diff --git a/merge-recursive.c b/merge-recursive.c
> index fc8ac39d8c..6e8fb39315 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -337,7 +337,9 @@ static void output(struct merge_options *opt, int v, const char *fmt, ...)
>                 flush_output(opt);
>  }
>
> -static void output_commit_title(struct merge_options *opt, struct commit *commit)
> +static void repo_output_commit_title(struct merge_options *opt,
> +                                    struct repository *repo,
> +                                    struct commit *commit)
>  {
>         struct merge_remote_desc *desc;
>
> @@ -346,23 +348,29 @@ static void output_commit_title(struct merge_options *opt, struct commit *commit
>         if (desc)
>                 strbuf_addf(&opt->obuf, "virtual %s\n", desc->name);
>         else {
> -               strbuf_add_unique_abbrev(&opt->obuf, &commit->object.oid,
> -                                        DEFAULT_ABBREV);
> +               strbuf_repo_add_unique_abbrev(&opt->obuf, repo,
> +                                             &commit->object.oid,
> +                                             DEFAULT_ABBREV);
>                 strbuf_addch(&opt->obuf, ' ');
> -               if (parse_commit(commit) != 0)
> +               if (repo_parse_commit(repo, commit) != 0)
>                         strbuf_addstr(&opt->obuf, _("(bad commit)\n"));
>                 else {
>                         const char *title;
> -                       const char *msg = get_commit_buffer(commit, NULL);
> +                       const char *msg = repo_get_commit_buffer(repo, commit, NULL);
>                         int len = find_commit_subject(msg, &title);
>                         if (len)
>                                 strbuf_addf(&opt->obuf, "%.*s\n", len, title);
> -                       unuse_commit_buffer(commit, msg);
> +                       repo_unuse_commit_buffer(repo, commit, msg);
>                 }
>         }
>         flush_output(opt);
>  }
>
> +static void output_commit_title(struct merge_options *opt, struct commit *commit)
> +{
> +       repo_output_commit_title(opt, the_repository, commit);
> +}
> +
>  static int add_cacheinfo(struct merge_options *opt,
>                          const struct diff_filespec *blob,
>                          const char *path, int stage, int refresh, int options)
> @@ -1152,14 +1160,14 @@ static int find_first_merges(struct repository *repo,
>         return result->nr;
>  }
>
> -static void print_commit(struct commit *commit)
> +static void print_commit(struct repository *repo, struct commit *commit)
>  {
>         struct strbuf sb = STRBUF_INIT;
>         struct pretty_print_context ctx = {0};
>         ctx.date_mode.type = DATE_NORMAL;
>         /* FIXME: Merge this with output_commit_title() */
>         assert(!merge_remote_util(commit));
> -       format_commit_message(commit, " %h: %m %s", &sb, &ctx);
> +       repo_format_commit_message(repo, commit, " %h: %m %s", &sb, &ctx);
>         fprintf(stderr, "%s\n", sb.buf);
>         strbuf_release(&sb);
>  }
> @@ -1199,15 +1207,6 @@ static int merge_submodule(struct merge_options *opt,
>         if (is_null_oid(b))
>                 return 0;
>
> -       /*
> -        * NEEDSWORK: Remove this when all submodule object accesses are
> -        * through explicitly specified repositores.
> -        */
> -       if (add_submodule_odb(path)) {
> -               output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
> -               return 0;
> -       }
> -
>         if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
>                 output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
>                 return 0;
> @@ -1232,7 +1231,7 @@ static int merge_submodule(struct merge_options *opt,
>                 oidcpy(result, b);
>                 if (show(opt, 3)) {
>                         output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
> -                       output_commit_title(opt, commit_b);
> +                       repo_output_commit_title(opt, &subrepo, commit_b);
>                 } else if (show(opt, 2))
>                         output(opt, 2, _("Fast-forwarding submodule %s"), path);
>                 else
> @@ -1245,7 +1244,7 @@ static int merge_submodule(struct merge_options *opt,
>                 oidcpy(result, a);
>                 if (show(opt, 3)) {
>                         output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
> -                       output_commit_title(opt, commit_a);
> +                       repo_output_commit_title(opt, &subrepo, commit_a);
>                 } else if (show(opt, 2))
>                         output(opt, 2, _("Fast-forwarding submodule %s"), path);
>                 else
> @@ -1277,7 +1276,7 @@ static int merge_submodule(struct merge_options *opt,
>         case 1:
>                 output(opt, 1, _("Failed to merge submodule %s (not fast-forward)"), path);
>                 output(opt, 2, _("Found a possible merge resolution for the submodule:\n"));
> -               print_commit((struct commit *) merges.objects[0].item);
> +               print_commit(&subrepo, (struct commit *) merges.objects[0].item);
>                 output(opt, 2, _(
>                        "If this is correct simply add it to the index "
>                        "for example\n"
> @@ -1290,7 +1289,7 @@ static int merge_submodule(struct merge_options *opt,
>         default:
>                 output(opt, 1, _("Failed to merge submodule %s (multiple merges found)"), path);
>                 for (i = 0; i < merges.nr; i++)
> -                       print_commit((struct commit *) merges.objects[i].item);
> +                       print_commit(&subrepo, (struct commit *) merges.objects[i].item);
>         }
>
>         object_array_clear(&merges);
> diff --git a/strbuf.c b/strbuf.c
> index c8a5789694..b22e981655 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -1059,15 +1059,21 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
>         strbuf_setlen(sb, sb->len + len);
>  }
>
> -void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
> -                             int abbrev_len)
> +void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
> +                                  const struct object_id *oid, int abbrev_len)
>  {
>         int r;
>         strbuf_grow(sb, GIT_MAX_HEXSZ + 1);
> -       r = find_unique_abbrev_r(sb->buf + sb->len, oid, abbrev_len);
> +       r = repo_find_unique_abbrev_r(repo, sb->buf + sb->len, oid, abbrev_len);
>         strbuf_setlen(sb, sb->len + r);
>  }
>
> +void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
> +                             int abbrev_len)
> +{
> +       strbuf_repo_add_unique_abbrev(sb, the_repository, oid, abbrev_len);
> +}
> +
>  /*
>   * Returns the length of a line, without trailing spaces.
>   *
> diff --git a/strbuf.h b/strbuf.h
> index 5b1113abf8..2d9e01c16f 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -634,8 +634,10 @@ void strbuf_list_free(struct strbuf **list);
>   * Add the abbreviation, as generated by find_unique_abbrev, of `sha1` to
>   * the strbuf `sb`.
>   */
> -void strbuf_add_unique_abbrev(struct strbuf *sb,
> -                             const struct object_id *oid,
> +struct repository;
> +void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
> +                                  const struct object_id *oid, int abbrev_len);
> +void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
>                               int abbrev_len);
>
>  /**
> diff --git a/t/t6437-submodule-merge.sh b/t/t6437-submodule-merge.sh
> index e5e89c2045..178413c22f 100755
> --- a/t/t6437-submodule-merge.sh
> +++ b/t/t6437-submodule-merge.sh
> @@ -5,6 +5,9 @@ test_description='merging with submodules'
>  GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> +GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
> +export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
> +
>  . ./test-lib.sh
>  . "$TEST_DIRECTORY"/lib-merge.sh
>
> --
> 2.33.0.464.g1972c5931b-goog

Modulo the minor grammar error in the commit message; this looks good to me:

Reviewed-by: Elijah Newren <newren@gmail.com>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 0/9] No more adding submodule ODB as alternate
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (9 preceding siblings ...)
  2021-09-23 18:05 ` [PATCH 0/9] No more " Junio C Hamano
@ 2021-09-28 20:10 ` Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 1/9] refs: plumb repo param in begin-iterator functions Jonathan Tan
                     ` (8 more replies)
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
  12 siblings, 9 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

This is on a merge of jk/ref-paranoia and jt/add-submodule-odb-clean-up.

As requested, I've rebased this on jk/ref-paranoia and updated the ref
iterator code to no longer remove the DO_FOR_EACH_INCLUDE_BROKEN flag.
I've also changed how I handled the new repository field - instead of
storing it in the backend-independent struct ref_iterator, I now have
each backend handling it. This is a smaller change from the status quo
(each backend having implicit dependence on the_repository -> each
backend having explicit dependence on a repo).

The first 3 patches are rewritten, and the last 5 patches are the same
as before. Patch 4 is also the same as before, except that a change to
do_for_each_ref() to add a repo parameter was previously done in patch 1
of the v1 patchset and is no longer done in the corresponding patch of
this v2 patchset, so that needed to be done there.

Jonathan Tan (9):
  refs: plumb repo param in begin-iterator functions
  refs: teach arbitrary repo support to iterators
  refs: peeling non-the_repository iterators is BUG
  refs: teach refs_for_each_ref() arbitrary repos
  merge-{ort,recursive}: remove add_submodule_odb()
  object-file: only register submodule ODB if needed
  submodule: pass repo to check_has_commit()
  refs: change refs_for_each_ref_in() to take repo
  submodule: trace adding submodule ODB as alternate

 builtin/submodule--helper.c            | 16 ++++---
 merge-ort.c                            | 18 ++------
 merge-recursive.c                      | 41 ++++++++---------
 object-file.c                          |  3 +-
 object-name.c                          |  4 +-
 refs.c                                 | 63 ++++++++++++++------------
 refs.h                                 | 12 ++---
 refs/debug.c                           |  4 +-
 refs/files-backend.c                   | 13 ++++--
 refs/packed-backend.c                  | 17 +++++--
 refs/ref-cache.c                       | 10 ++++
 refs/ref-cache.h                       |  1 +
 refs/refs-internal.h                   |  4 +-
 revision.c                             | 12 ++---
 strbuf.c                               | 12 +++--
 strbuf.h                               |  6 ++-
 submodule.c                            | 28 ++++++++++--
 t/helper/test-ref-store.c              | 20 ++++----
 t/t5526-fetch-submodules.sh            |  3 ++
 t/t5531-deep-submodule-push.sh         |  3 ++
 t/t5545-push-options.sh                |  3 ++
 t/t5572-pull-submodule.sh              |  3 ++
 t/t6437-submodule-merge.sh             |  3 ++
 t/t7418-submodule-sparse-gitmodules.sh |  3 ++
 24 files changed, 186 insertions(+), 116 deletions(-)

Range-diff against v1:
 1:  493fff7f47 <  -:  ---------- refs: make _advance() check struct repo, not flag
 2:  e404b5eb1a <  -:  ---------- refs: add repo paramater to _iterator_peel()
 -:  ---------- >  1:  e364b13a37 refs: plumb repo param in begin-iterator functions
 3:  3ed77eedb8 !  2:  ec153eff7b refs iterator: support non-the_repository advance
    @@ Metadata
     Author: Jonathan Tan <jonathantanmy@google.com>
     
      ## Commit message ##
    -    refs iterator: support non-the_repository advance
    +    refs: teach arbitrary repo support to iterators
     
    -    Support repositories other than the_repository when advancing through an
    -    iterator.
    +    Note that should_pack_ref() is called when writing refs, which is only
    +    supported for the_repository, hence the_repository is hardcoded there.
     
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
    @@ refs.c: int refname_is_safe(const char *refname)
      	}
     
      ## refs/files-backend.c ##
    +@@ refs/files-backend.c: struct files_ref_iterator {
    + 	struct ref_iterator base;
    + 
    + 	struct ref_iterator *iter0;
    ++	struct repository *repo;
    + 	unsigned int flags;
    + };
    + 
    +@@ refs/files-backend.c: static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
    + 
    + 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
    + 		    !ref_resolves_to_object(iter->iter0->refname,
    ++					    iter->repo,
    + 					    iter->iter0->oid,
    + 					    iter->iter0->flags))
    + 			continue;
    +@@ refs/files-backend.c: static struct ref_iterator *files_ref_iterator_begin(
    + 	base_ref_iterator_init(ref_iterator, &files_ref_iterator_vtable,
    + 			       overlay_iter->ordered);
    + 	iter->iter0 = overlay_iter;
    ++	iter->repo = repo;
    + 	iter->flags = flags;
    + 
    + 	return ref_iterator;
     @@ refs/files-backend.c: static int should_pack_ref(const char *refname,
      		return 0;
      
    @@ refs/files-backend.c: static int should_pack_ref(const char *refname,
      
      	return 1;
     
    - ## refs/iterator.c ##
    -@@ refs/iterator.c: int ref_iterator_advance(struct ref_iterator *ref_iterator)
    - {
    - 	int ok;
    + ## refs/packed-backend.c ##
    +@@ refs/packed-backend.c: struct packed_ref_iterator {
    + 	struct object_id oid, peeled;
    + 	struct strbuf refname_buf;
    + 
    ++	struct repository *repo;
    + 	unsigned int flags;
    + };
      
    --	if (ref_iterator->repo && ref_iterator->repo != the_repository)
    --		/*
    --		 * NEEDSWORK: make ref_resolves_to_object() support
    --		 * arbitrary repositories
    --		 */
    --		BUG("ref_iterator->repo must be NULL or the_repository");
    - 	while ((ok = ref_iterator->vtable->advance(ref_iterator)) == ITER_OK) {
    - 		if (ref_iterator->repo &&
    - 		    !ref_resolves_to_object(ref_iterator->refname,
    -+					    ref_iterator->repo,
    - 					    ref_iterator->oid,
    - 					    ref_iterator->flags))
    +@@ refs/packed-backend.c: static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
      			continue;
    + 
    + 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
    +-		    !ref_resolves_to_object(iter->base.refname, &iter->oid,
    +-					    iter->flags))
    ++		    !ref_resolves_to_object(iter->base.refname, iter->repo,
    ++					    &iter->oid, iter->flags))
    + 			continue;
    + 
    + 		return ITER_OK;
    +@@ refs/packed-backend.c: static struct ref_iterator *packed_ref_iterator_begin(
    + 
    + 	iter->base.oid = &iter->oid;
    + 
    ++	iter->repo = repo;
    + 	iter->flags = flags;
    + 
    + 	if (prefix && *prefix)
     
      ## refs/refs-internal.h ##
     @@ refs/refs-internal.h: int refname_is_safe(const char *refname);
 -:  ---------- >  3:  dd1a8871f4 refs: peeling non-the_repository iterators is BUG
 4:  f3a45fba84 !  4:  da0c9c2d44 refs: teach refs_for_each_ref() arbitrary repos
    @@ refs.c: int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
      }
      
      struct ref_iterator *refs_ref_iterator_begin(
    +@@ refs.c: static int do_for_each_ref_helper(struct repository *r,
    + 
    + static int do_for_each_ref(struct ref_store *refs, const char *prefix,
    + 			   each_ref_fn fn, int trim,
    ++			   struct repository *repo,
    + 			   enum do_for_each_ref_flags flags, void *cb_data)
    + {
    + 	struct ref_iterator *iter;
     @@ refs.c: static int do_for_each_ref(struct ref_store *refs, const char *prefix,
    + 	if (!refs)
    + 		return 0;
    + 
    +-	iter = refs_ref_iterator_begin(refs, prefix, trim, the_repository, flags);
    ++	iter = refs_ref_iterator_begin(refs, prefix, trim, repo, flags);
    + 
    +-	return do_for_each_repo_ref_iterator(the_repository, iter,
    ++	return do_for_each_repo_ref_iterator(repo, iter,
      					do_for_each_ref_helper, &hp);
      }
      
     -int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
     +int refs_for_each_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
      {
    --	return do_for_each_ref(refs, "", fn, 0, the_repository, 0, cb_data);
    +-	return do_for_each_ref(refs, "", fn, 0, 0, cb_data);
     +	return do_for_each_ref(get_main_ref_store(repo), "", fn, 0, repo, 0, cb_data);
      }
      
    @@ refs.c: static int do_for_each_ref(struct ref_store *refs, const char *prefix,
      }
      
      int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
    + 			 each_ref_fn fn, void *cb_data)
    + {
    +-	return do_for_each_ref(refs, prefix, fn, strlen(prefix), 0, cb_data);
    ++	return do_for_each_ref(refs, prefix, fn, strlen(prefix), the_repository, 0, cb_data);
    + }
    + 
    + int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data)
    +@@ refs.c: int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data)
    + int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data)
    + {
    + 	return do_for_each_ref(get_main_ref_store(the_repository),
    +-			       prefix, fn, 0, 0, cb_data);
    ++			       prefix, fn, 0, the_repository, 0, cb_data);
    + }
    + 
    + int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
    + 			     each_ref_fn fn, void *cb_data)
    + {
    +-	return do_for_each_ref(refs, prefix, fn, 0, 0, cb_data);
    ++	return do_for_each_ref(refs, prefix, fn, 0, the_repository, 0, cb_data);
    + }
    + 
    + int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data)
    +@@ refs.c: int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
    + 	int ret;
    + 	strbuf_addf(&buf, "%srefs/", get_git_namespace());
    + 	ret = do_for_each_ref(get_main_ref_store(the_repository),
    +-			      buf.buf, fn, 0, 0, cb_data);
    ++			      buf.buf, fn, 0, the_repository, 0, cb_data);
    + 	strbuf_release(&buf);
    + 	return ret;
    + }
    + 
    + int refs_for_each_rawref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
    + {
    +-	return do_for_each_ref(refs, "", fn, 0,
    ++	return do_for_each_ref(refs, "", fn, 0, the_repository,
    + 			       DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
    + }
    + 
     @@ refs.c: static struct ref_store *ref_store_init(const char *gitdir,
      
      struct ref_store *get_main_ref_store(struct repository *r)
 5:  0655a321bd !  5:  dd70820d66 merge-{ort,recursive}: remove add_submodule_odb()
    @@ Commit message
         merge-{ort,recursive}: remove add_submodule_odb()
     
         After the parent commit and some of its ancestors, the only place
    -    commits are being accessed through alternates are in the user-facing
    +    commits are being accessed through alternates is in the user-facing
         message formatting code. Fix those, and remove the add_submodule_odb()
         calls.
     
 6:  a62741e779 =  6:  9c5ce004b2 object-file: only register submodule ODB if needed
 7:  20adc937b7 =  7:  1fca3b1a25 submodule: pass repo to check_has_commit()
 8:  efebc4e97d !  8:  7b5087a14d refs: change refs_for_each_ref_in() to take repo
    @@ refs.c: int for_each_ref(each_ref_fn fn, void *cb_data)
     +	return refs_for_each_ref_in(the_repository, prefix, fn, cb_data);
      }
      
    - int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data, unsigned int broken)
    + int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data)
     
      ## refs.h ##
     @@ refs.h: int refs_head_ref(struct repository *repo,
 9:  933c505de8 =  9:  cef2a97840 submodule: trace adding submodule ODB as alternate
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 1/9] refs: plumb repo param in begin-iterator functions
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
@ 2021-09-28 20:10   ` Jonathan Tan
  2021-09-28 22:24     ` Junio C Hamano
  2021-09-28 20:10   ` [PATCH v2 2/9] refs: teach arbitrary repo support to iterators Jonathan Tan
                     ` (7 subsequent siblings)
  8 siblings, 1 reply; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

In preparation for the next 2 patches that adds (partial) support for
arbitrary repositories, plumb a repository parameter in all functions
that create iterators. There are no changes to program logic.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c                | 9 +++++----
 refs/debug.c          | 4 ++--
 refs/files-backend.c  | 3 ++-
 refs/packed-backend.c | 8 +++++++-
 refs/refs-internal.h  | 3 ++-
 5 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/refs.c b/refs.c
index 2be0d0f057..6f7b3447a7 100644
--- a/refs.c
+++ b/refs.c
@@ -1414,6 +1414,7 @@ int head_ref(each_ref_fn fn, void *cb_data)
 struct ref_iterator *refs_ref_iterator_begin(
 		struct ref_store *refs,
 		const char *prefix, int trim,
+		struct repository *repo,
 		enum do_for_each_ref_flags flags)
 {
 	struct ref_iterator *iter;
@@ -1429,7 +1430,7 @@ struct ref_iterator *refs_ref_iterator_begin(
 		}
 	}
 
-	iter = refs->be->iterator_begin(refs, prefix, flags);
+	iter = refs->be->iterator_begin(refs, prefix, repo, flags);
 
 	/*
 	 * `iterator_begin()` already takes care of prefix, but we
@@ -1464,7 +1465,7 @@ static int do_for_each_repo_ref(struct repository *r, const char *prefix,
 	if (!refs)
 		return 0;
 
-	iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
+	iter = refs_ref_iterator_begin(refs, prefix, trim, r, flags);
 
 	return do_for_each_repo_ref_iterator(r, iter, fn, cb_data);
 }
@@ -1495,7 +1496,7 @@ static int do_for_each_ref(struct ref_store *refs, const char *prefix,
 	if (!refs)
 		return 0;
 
-	iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
+	iter = refs_ref_iterator_begin(refs, prefix, trim, the_repository, flags);
 
 	return do_for_each_repo_ref_iterator(the_repository, iter,
 					do_for_each_ref_helper, &hp);
@@ -2260,7 +2261,7 @@ int refs_verify_refname_available(struct ref_store *refs,
 	strbuf_addstr(&dirname, refname + dirname.len);
 	strbuf_addch(&dirname, '/');
 
-	iter = refs_ref_iterator_begin(refs, dirname.buf, 0,
+	iter = refs_ref_iterator_begin(refs, dirname.buf, 0, the_repository,
 				       DO_FOR_EACH_INCLUDE_BROKEN);
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 		if (skip &&
diff --git a/refs/debug.c b/refs/debug.c
index 1a7a9e11cf..753d5da893 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -224,11 +224,11 @@ static struct ref_iterator_vtable debug_ref_iterator_vtable = {
 
 static struct ref_iterator *
 debug_ref_iterator_begin(struct ref_store *ref_store, const char *prefix,
-			 unsigned int flags)
+			 struct repository *repo, unsigned int flags)
 {
 	struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
 	struct ref_iterator *res =
-		drefs->refs->be->iterator_begin(drefs->refs, prefix, flags);
+		drefs->refs->be->iterator_begin(drefs->refs, prefix, repo, flags);
 	struct debug_ref_iterator *diter = xcalloc(1, sizeof(*diter));
 	base_ref_iterator_init(&diter->base, &debug_ref_iterator_vtable, 1);
 	diter->iter = res;
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 1148c0cf09..f0cbea41c9 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -798,7 +798,7 @@ static struct ref_iterator_vtable files_ref_iterator_vtable = {
 
 static struct ref_iterator *files_ref_iterator_begin(
 		struct ref_store *ref_store,
-		const char *prefix, unsigned int flags)
+		const char *prefix, struct repository *repo, unsigned int flags)
 {
 	struct files_ref_store *refs;
 	struct ref_iterator *loose_iter, *packed_iter, *overlay_iter;
@@ -844,6 +844,7 @@ static struct ref_iterator *files_ref_iterator_begin(
 	 */
 	packed_iter = refs_ref_iterator_begin(
 			refs->packed_ref_store, prefix, 0,
+			repo,
 			DO_FOR_EACH_INCLUDE_BROKEN);
 
 	overlay_iter = overlay_ref_iterator_begin(loose_iter, packed_iter);
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f8aa97d799..94fb1042a2 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -913,7 +913,7 @@ static struct ref_iterator_vtable packed_ref_iterator_vtable = {
 
 static struct ref_iterator *packed_ref_iterator_begin(
 		struct ref_store *ref_store,
-		const char *prefix, unsigned int flags)
+		const char *prefix, struct repository *repo, unsigned int flags)
 {
 	struct packed_ref_store *refs;
 	struct snapshot *snapshot;
@@ -1135,8 +1135,14 @@ static int write_with_updates(struct packed_ref_store *refs,
 	 * of the lists each time through the loop. When the current
 	 * list of refs is exhausted, set iter to NULL. When the list
 	 * of updates is exhausted, leave i set to updates->nr.
+	 *
+	 * Note that the repository does not matter since
+	 * DO_FOR_EACH_INCLUDE_BROKEN means that we do not access any objects,
+	 * but the_repository here makes the most sense because we only support
+	 * writing refs to the main repository.
 	 */
 	iter = packed_ref_iterator_begin(&refs->base, "",
+					 the_repository,
 					 DO_FOR_EACH_INCLUDE_BROKEN);
 	if ((ok = ref_iterator_advance(iter)) != ITER_OK)
 		iter = NULL;
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 96911fb26e..9440be51da 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -382,6 +382,7 @@ int is_empty_ref_iterator(struct ref_iterator *ref_iterator);
 struct ref_iterator *refs_ref_iterator_begin(
 		struct ref_store *refs,
 		const char *prefix, int trim,
+		struct repository *repo,
 		enum do_for_each_ref_flags flags);
 
 /*
@@ -583,7 +584,7 @@ typedef int copy_ref_fn(struct ref_store *ref_store,
  */
 typedef struct ref_iterator *ref_iterator_begin_fn(
 		struct ref_store *ref_store,
-		const char *prefix, unsigned int flags);
+		const char *prefix, struct repository *repo, unsigned int flags);
 
 /* reflog functions */
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 2/9] refs: teach arbitrary repo support to iterators
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 1/9] refs: plumb repo param in begin-iterator functions Jonathan Tan
@ 2021-09-28 20:10   ` Jonathan Tan
  2021-09-28 22:35     ` Junio C Hamano
  2021-09-28 20:10   ` [PATCH v2 3/9] refs: peeling non-the_repository iterators is BUG Jonathan Tan
                     ` (6 subsequent siblings)
  8 siblings, 1 reply; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

Note that should_pack_ref() is called when writing refs, which is only
supported for the_repository, hence the_repository is hardcoded there.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c                | 3 ++-
 refs/files-backend.c  | 5 ++++-
 refs/packed-backend.c | 6 ++++--
 refs/refs-internal.h  | 1 +
 4 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 6f7b3447a7..5163e064ae 100644
--- a/refs.c
+++ b/refs.c
@@ -255,12 +255,13 @@ int refname_is_safe(const char *refname)
  * does not exist, emit a warning and return false.
  */
 int ref_resolves_to_object(const char *refname,
+			   struct repository *repo,
 			   const struct object_id *oid,
 			   unsigned int flags)
 {
 	if (flags & REF_ISBROKEN)
 		return 0;
-	if (!has_object_file(oid)) {
+	if (!repo_has_object_file(repo, oid)) {
 		error(_("%s does not point to a valid object!"), refname);
 		return 0;
 	}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index f0cbea41c9..4d883d9a89 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -730,6 +730,7 @@ struct files_ref_iterator {
 	struct ref_iterator base;
 
 	struct ref_iterator *iter0;
+	struct repository *repo;
 	unsigned int flags;
 };
 
@@ -751,6 +752,7 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
 
 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
 		    !ref_resolves_to_object(iter->iter0->refname,
+					    iter->repo,
 					    iter->iter0->oid,
 					    iter->iter0->flags))
 			continue;
@@ -854,6 +856,7 @@ static struct ref_iterator *files_ref_iterator_begin(
 	base_ref_iterator_init(ref_iterator, &files_ref_iterator_vtable,
 			       overlay_iter->ordered);
 	iter->iter0 = overlay_iter;
+	iter->repo = repo;
 	iter->flags = flags;
 
 	return ref_iterator;
@@ -1138,7 +1141,7 @@ static int should_pack_ref(const char *refname,
 		return 0;
 
 	/* Do not pack broken refs: */
-	if (!ref_resolves_to_object(refname, oid, ref_flags))
+	if (!ref_resolves_to_object(refname, the_repository, oid, ref_flags))
 		return 0;
 
 	return 1;
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 94fb1042a2..55c8bd3081 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -776,6 +776,7 @@ struct packed_ref_iterator {
 	struct object_id oid, peeled;
 	struct strbuf refname_buf;
 
+	struct repository *repo;
 	unsigned int flags;
 };
 
@@ -864,8 +865,8 @@ static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
 			continue;
 
 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
-		    !ref_resolves_to_object(iter->base.refname, &iter->oid,
-					    iter->flags))
+		    !ref_resolves_to_object(iter->base.refname, iter->repo,
+					    &iter->oid, iter->flags))
 			continue;
 
 		return ITER_OK;
@@ -954,6 +955,7 @@ static struct ref_iterator *packed_ref_iterator_begin(
 
 	iter->base.oid = &iter->oid;
 
+	iter->repo = repo;
 	iter->flags = flags;
 
 	if (prefix && *prefix)
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 9440be51da..e7b0a0a658 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -66,6 +66,7 @@ int refname_is_safe(const char *refname);
  * referred-to object does not exist, emit a warning and return false.
  */
 int ref_resolves_to_object(const char *refname,
+			   struct repository *repo,
 			   const struct object_id *oid,
 			   unsigned int flags);
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 3/9] refs: peeling non-the_repository iterators is BUG
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 1/9] refs: plumb repo param in begin-iterator functions Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 2/9] refs: teach arbitrary repo support to iterators Jonathan Tan
@ 2021-09-28 20:10   ` Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 4/9] refs: teach refs_for_each_ref() arbitrary repos Jonathan Tan
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

There is currently no support for peeling the current ref of an iterator
iterating over a non-the_repository ref store, and none is needed. Thus,
for now, BUG() if that happens.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs/files-backend.c  |  5 +++--
 refs/packed-backend.c |  3 +++
 refs/ref-cache.c      | 10 ++++++++++
 refs/ref-cache.h      |  1 +
 4 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4d883d9a89..3289beb03e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -831,7 +831,7 @@ static struct ref_iterator *files_ref_iterator_begin(
 	 */
 
 	loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs),
-					      prefix, 1);
+					      prefix, repo, 1);
 
 	/*
 	 * The packed-refs file might contain broken references, for
@@ -1164,7 +1164,8 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
 
 	packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR, &err);
 
-	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
+	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL,
+					the_repository, 0);
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 		/*
 		 * If the loose reference can be packed, add an entry
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 55c8bd3081..e2f57a013e 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -884,6 +884,9 @@ static int packed_ref_iterator_peel(struct ref_iterator *ref_iterator,
 	struct packed_ref_iterator *iter =
 		(struct packed_ref_iterator *)ref_iterator;
 
+	if (iter->repo != the_repository)
+		BUG("peeling for non-the_repository is not supported");
+
 	if ((iter->base.flags & REF_KNOWS_PEELED)) {
 		oidcpy(peeled, &iter->peeled);
 		return is_null_oid(&iter->peeled) ? -1 : 0;
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 49d732f6db..97a6ac349e 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -435,6 +435,8 @@ struct cache_ref_iterator {
 	 * on from there.)
 	 */
 	struct cache_ref_iterator_level *levels;
+
+	struct repository *repo;
 };
 
 static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
@@ -491,6 +493,11 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
 static int cache_ref_iterator_peel(struct ref_iterator *ref_iterator,
 				   struct object_id *peeled)
 {
+	struct cache_ref_iterator *iter =
+		(struct cache_ref_iterator *)ref_iterator;
+
+	if (iter->repo != the_repository)
+		BUG("peeling for non-the_repository is not supported");
 	return peel_object(ref_iterator->oid, peeled) ? -1 : 0;
 }
 
@@ -513,6 +520,7 @@ static struct ref_iterator_vtable cache_ref_iterator_vtable = {
 
 struct ref_iterator *cache_ref_iterator_begin(struct ref_cache *cache,
 					      const char *prefix,
+					      struct repository *repo,
 					      int prime_dir)
 {
 	struct ref_dir *dir;
@@ -547,5 +555,7 @@ struct ref_iterator *cache_ref_iterator_begin(struct ref_cache *cache,
 		level->prefix_state = PREFIX_CONTAINS_DIR;
 	}
 
+	iter->repo = repo;
+
 	return ref_iterator;
 }
diff --git a/refs/ref-cache.h b/refs/ref-cache.h
index 3bfb89d2b3..7877bf86ed 100644
--- a/refs/ref-cache.h
+++ b/refs/ref-cache.h
@@ -238,6 +238,7 @@ struct ref_entry *find_ref_entry(struct ref_dir *dir, const char *refname);
  */
 struct ref_iterator *cache_ref_iterator_begin(struct ref_cache *cache,
 					      const char *prefix,
+					      struct repository *repo,
 					      int prime_dir);
 
 #endif /* REFS_REF_CACHE_H */
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 4/9] refs: teach refs_for_each_ref() arbitrary repos
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
                     ` (2 preceding siblings ...)
  2021-09-28 20:10   ` [PATCH v2 3/9] refs: peeling non-the_repository iterators is BUG Jonathan Tan
@ 2021-09-28 20:10   ` Jonathan Tan
  2021-09-28 22:49     ` Junio C Hamano
  2021-09-28 20:10   ` [PATCH v2 5/9] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
                     ` (4 subsequent siblings)
  8 siblings, 1 reply; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

A subsequent patch needs to perform a revision walk with --all. As seen
from handle_revision_pseudo_opt() in revision.c, refs_for_each_ref()
needs to be updated to take a repository struct and pass it to the
underlying ref iterator mechanism. This is so that refs can be checked
if they resolve to an existing object and in doing so, non-resolving
refs can be skipped over. (refs_head_ref() doesn't seem to read any
objects and doesn't need this treatment.) Update refs_for_each_ref()
accordingly.

Now that get_main_ref_store() can take repositories other than
the_repository, ensure that it sets the correct flags according to the
repository passed as an argument.

The signatures of some other functions need to be changed too for
consistency (because of handle_refs() in revision.c), so do that in this
patch too.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/submodule--helper.c | 16 +++++++-----
 object-name.c               |  4 +--
 refs.c                      | 49 ++++++++++++++++++++-----------------
 refs.h                      | 10 ++++----
 revision.c                  | 12 ++++-----
 submodule.c                 | 10 ++++++--
 6 files changed, 57 insertions(+), 44 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 88ce6be69c..d951b7acc5 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -801,15 +801,16 @@ static void status_submodule(const char *path, const struct object_id *ce_oid,
 			     displaypath);
 	} else if (!(flags & OPT_CACHED)) {
 		struct object_id oid;
-		struct ref_store *refs = get_submodule_ref_store(path);
+		struct repository subrepo;
 
-		if (!refs) {
+		if (repo_submodule_init(&subrepo, the_repository, path, null_oid())) {
 			print_status(flags, '-', path, ce_oid, displaypath);
 			goto cleanup;
 		}
-		if (refs_head_ref(refs, handle_submodule_head_ref, &oid))
+		if (refs_head_ref(&subrepo, handle_submodule_head_ref, &oid))
 			die(_("could not resolve HEAD ref inside the "
 			      "submodule '%s'"), path);
+		repo_clear(&subrepo);
 
 		print_status(flags, '+', path, &oid, displaypath);
 	} else {
@@ -1018,9 +1019,12 @@ static void generate_submodule_summary(struct summary_cb *info,
 
 	if (!info->cached && oideq(&p->oid_dst, null_oid())) {
 		if (S_ISGITLINK(p->mod_dst)) {
-			struct ref_store *refs = get_submodule_ref_store(p->sm_path);
-			if (refs)
-				refs_head_ref(refs, handle_submodule_head_ref, &p->oid_dst);
+			struct repository subrepo;
+
+			if (!repo_submodule_init(&subrepo, the_repository, p->sm_path, null_oid())) {
+				refs_head_ref(&subrepo, handle_submodule_head_ref, &p->oid_dst);
+				repo_clear(&subrepo);
+			}
 		} else if (S_ISLNK(p->mod_dst) || S_ISREG(p->mod_dst)) {
 			struct stat st;
 			int fd = open(p->sm_path, O_RDONLY);
diff --git a/object-name.c b/object-name.c
index fdff4601b2..f3012b5ec3 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1822,8 +1822,8 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo,
 
 			cb.repo = repo;
 			cb.list = &list;
-			refs_for_each_ref(get_main_ref_store(repo), handle_one_ref, &cb);
-			refs_head_ref(get_main_ref_store(repo), handle_one_ref, &cb);
+			refs_for_each_ref(repo, handle_one_ref, &cb);
+			refs_head_ref(repo, handle_one_ref, &cb);
 			commit_list_sort_by_date(&list);
 			return get_oid_oneline(repo, name + 2, oid, list);
 		}
diff --git a/refs.c b/refs.c
index 5163e064ae..15a3aa47cf 100644
--- a/refs.c
+++ b/refs.c
@@ -408,34 +408,34 @@ void warn_dangling_symrefs(FILE *fp, const char *msg_fmt, const struct string_li
 	for_each_rawref(warn_if_dangling_symref, &data);
 }
 
-int refs_for_each_tag_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_for_each_tag_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(refs, "refs/tags/", fn, cb_data);
+	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/tags/", fn, cb_data);
 }
 
 int for_each_tag_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_tag_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_for_each_tag_ref(the_repository, fn, cb_data);
 }
 
-int refs_for_each_branch_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_for_each_branch_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(refs, "refs/heads/", fn, cb_data);
+	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/heads/", fn, cb_data);
 }
 
 int for_each_branch_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_branch_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_for_each_branch_ref(the_repository, fn, cb_data);
 }
 
-int refs_for_each_remote_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_for_each_remote_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(refs, "refs/remotes/", fn, cb_data);
+	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/remotes/", fn, cb_data);
 }
 
 int for_each_remote_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_remote_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_for_each_remote_ref(the_repository, fn, cb_data);
 }
 
 int head_ref_namespaced(each_ref_fn fn, void *cb_data)
@@ -1395,12 +1395,12 @@ int refs_rename_ref_available(struct ref_store *refs,
 	return ok;
 }
 
-int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_head_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
 	struct object_id oid;
 	int flag;
 
-	if (!refs_read_ref_full(refs, "HEAD", RESOLVE_REF_READING,
+	if (!refs_read_ref_full(get_main_ref_store(repo), "HEAD", RESOLVE_REF_READING,
 				&oid, &flag))
 		return fn("HEAD", &oid, flag, cb_data);
 
@@ -1409,7 +1409,7 @@ int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 
 int head_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_head_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_head_ref(the_repository, fn, cb_data);
 }
 
 struct ref_iterator *refs_ref_iterator_begin(
@@ -1489,6 +1489,7 @@ static int do_for_each_ref_helper(struct repository *r,
 
 static int do_for_each_ref(struct ref_store *refs, const char *prefix,
 			   each_ref_fn fn, int trim,
+			   struct repository *repo,
 			   enum do_for_each_ref_flags flags, void *cb_data)
 {
 	struct ref_iterator *iter;
@@ -1497,26 +1498,26 @@ static int do_for_each_ref(struct ref_store *refs, const char *prefix,
 	if (!refs)
 		return 0;
 
-	iter = refs_ref_iterator_begin(refs, prefix, trim, the_repository, flags);
+	iter = refs_ref_iterator_begin(refs, prefix, trim, repo, flags);
 
-	return do_for_each_repo_ref_iterator(the_repository, iter,
+	return do_for_each_repo_ref_iterator(repo, iter,
 					do_for_each_ref_helper, &hp);
 }
 
-int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+int refs_for_each_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, "", fn, 0, 0, cb_data);
+	return do_for_each_ref(get_main_ref_store(repo), "", fn, 0, repo, 0, cb_data);
 }
 
 int for_each_ref(each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref(get_main_ref_store(the_repository), fn, cb_data);
+	return refs_for_each_ref(the_repository, fn, cb_data);
 }
 
 int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
 			 each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, prefix, fn, strlen(prefix), 0, cb_data);
+	return do_for_each_ref(refs, prefix, fn, strlen(prefix), the_repository, 0, cb_data);
 }
 
 int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data)
@@ -1527,13 +1528,13 @@ int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data)
 int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data)
 {
 	return do_for_each_ref(get_main_ref_store(the_repository),
-			       prefix, fn, 0, 0, cb_data);
+			       prefix, fn, 0, the_repository, 0, cb_data);
 }
 
 int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
 			     each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, prefix, fn, 0, 0, cb_data);
+	return do_for_each_ref(refs, prefix, fn, 0, the_repository, 0, cb_data);
 }
 
 int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data)
@@ -1549,14 +1550,14 @@ int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
 	int ret;
 	strbuf_addf(&buf, "%srefs/", get_git_namespace());
 	ret = do_for_each_ref(get_main_ref_store(the_repository),
-			      buf.buf, fn, 0, 0, cb_data);
+			      buf.buf, fn, 0, the_repository, 0, cb_data);
 	strbuf_release(&buf);
 	return ret;
 }
 
 int refs_for_each_rawref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, "", fn, 0,
+	return do_for_each_ref(refs, "", fn, 0, the_repository,
 			       DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
 }
 
@@ -1891,13 +1892,15 @@ static struct ref_store *ref_store_init(const char *gitdir,
 
 struct ref_store *get_main_ref_store(struct repository *r)
 {
+	unsigned flags = r == the_repository ?
+		REF_STORE_ALL_CAPS : REF_STORE_READ | REF_STORE_ODB;
 	if (r->refs_private)
 		return r->refs_private;
 
 	if (!r->gitdir)
 		BUG("attempting to get main_ref_store outside of repository");
 
-	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
+	r->refs_private = ref_store_init(r->gitdir, flags);
 	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
 	return r->refs_private;
 }
diff --git a/refs.h b/refs.h
index 10e7696a64..e9ecb5e54e 100644
--- a/refs.h
+++ b/refs.h
@@ -316,17 +316,17 @@ typedef int each_repo_ref_fn(struct repository *r,
  * modifies the reference also returns a nonzero value to immediately
  * stop the iteration. Returned references are sorted.
  */
-int refs_head_ref(struct ref_store *refs,
+int refs_head_ref(struct repository *repo,
 		  each_ref_fn fn, void *cb_data);
-int refs_for_each_ref(struct ref_store *refs,
+int refs_for_each_ref(struct repository *repo,
 		      each_ref_fn fn, void *cb_data);
 int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
 			 each_ref_fn fn, void *cb_data);
-int refs_for_each_tag_ref(struct ref_store *refs,
+int refs_for_each_tag_ref(struct repository *repo,
 			  each_ref_fn fn, void *cb_data);
-int refs_for_each_branch_ref(struct ref_store *refs,
+int refs_for_each_branch_ref(struct repository *repo,
 			     each_ref_fn fn, void *cb_data);
-int refs_for_each_remote_ref(struct ref_store *refs,
+int refs_for_each_remote_ref(struct repository *repo,
 			     each_ref_fn fn, void *cb_data);
 
 /* just iterates the head ref. */
diff --git a/revision.c b/revision.c
index 3ad217f2ff..cd34e12b2e 100644
--- a/revision.c
+++ b/revision.c
@@ -1565,7 +1565,7 @@ void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
 
 static void handle_refs(struct ref_store *refs,
 			struct rev_info *revs, unsigned flags,
-			int (*for_each)(struct ref_store *, each_ref_fn, void *))
+			int (*for_each)(struct repository *, each_ref_fn, void *))
 {
 	struct all_refs_cb cb;
 
@@ -1575,7 +1575,7 @@ static void handle_refs(struct ref_store *refs,
 	}
 
 	init_all_refs_cb(&cb, revs, flags);
-	for_each(refs, handle_one_ref, &cb);
+	for_each(revs->repo, handle_one_ref, &cb);
 }
 
 static void handle_one_reflog_commit(struct object_id *oid, void *cb_data)
@@ -2553,14 +2553,14 @@ static int for_each_bisect_ref(struct ref_store *refs, each_ref_fn fn,
 	return status;
 }
 
-static int for_each_bad_bisect_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+static int for_each_bad_bisect_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return for_each_bisect_ref(refs, fn, cb_data, term_bad);
+	return for_each_bisect_ref(get_main_ref_store(repo), fn, cb_data, term_bad);
 }
 
-static int for_each_good_bisect_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
+static int for_each_good_bisect_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return for_each_bisect_ref(refs, fn, cb_data, term_good);
+	return for_each_bisect_ref(get_main_ref_store(repo), fn, cb_data, term_good);
 }
 
 static int handle_revision_pseudo_opt(struct rev_info *revs,
diff --git a/submodule.c b/submodule.c
index 62beb8fd5f..bc3ec4a242 100644
--- a/submodule.c
+++ b/submodule.c
@@ -92,8 +92,14 @@ int is_staging_gitmodules_ok(struct index_state *istate)
 static int for_each_remote_ref_submodule(const char *submodule,
 					 each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_remote_ref(get_submodule_ref_store(submodule),
-					fn, cb_data);
+	struct repository subrepo;
+	int ret;
+
+	if (repo_submodule_init(&subrepo, the_repository, submodule, null_oid()))
+		return 0;
+	ret = refs_for_each_remote_ref(&subrepo, fn, cb_data);
+	repo_clear(&subrepo);
+	return ret;
 }
 
 /*
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 5/9] merge-{ort,recursive}: remove add_submodule_odb()
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
                     ` (3 preceding siblings ...)
  2021-09-28 20:10   ` [PATCH v2 4/9] refs: teach refs_for_each_ref() arbitrary repos Jonathan Tan
@ 2021-09-28 20:10   ` Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 6/9] object-file: only register submodule ODB if needed Jonathan Tan
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

After the parent commit and some of its ancestors, the only place
commits are being accessed through alternates is in the user-facing
message formatting code. Fix those, and remove the add_submodule_odb()
calls.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 merge-ort.c                | 18 ++++-------------
 merge-recursive.c          | 41 +++++++++++++++++++-------------------
 strbuf.c                   | 12 ++++++++---
 strbuf.h                   |  6 ++++--
 t/t6437-submodule-merge.sh |  3 +++
 5 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index b88475475d..fbc5c204c1 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -609,6 +609,7 @@ static int err(struct merge_options *opt, const char *err, ...)
 
 static void format_commit(struct strbuf *sb,
 			  int indent,
+			  struct repository *repo,
 			  struct commit *commit)
 {
 	struct merge_remote_desc *desc;
@@ -622,7 +623,7 @@ static void format_commit(struct strbuf *sb,
 		return;
 	}
 
-	format_commit_message(commit, "%h %s", sb, &ctx);
+	repo_format_commit_message(repo, commit, "%h %s", sb, &ctx);
 	strbuf_addch(sb, '\n');
 }
 
@@ -1578,17 +1579,6 @@ static int merge_submodule(struct merge_options *opt,
 	if (is_null_oid(b))
 		return 0;
 
-	/*
-	 * NEEDSWORK: Remove this when all submodule object accesses are
-	 * through explicitly specified repositores.
-	 */
-	if (add_submodule_odb(path)) {
-		path_msg(opt, path, 0,
-			 _("Failed to merge submodule %s (not checked out)"),
-			 path);
-		return 0;
-	}
-
 	if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
 		path_msg(opt, path, 0,
 				_("Failed to merge submodule %s (not checked out)"),
@@ -1653,7 +1643,7 @@ static int merge_submodule(struct merge_options *opt,
 		break;
 
 	case 1:
-		format_commit(&sb, 4,
+		format_commit(&sb, 4, &subrepo,
 			      (struct commit *)merges.objects[0].item);
 		path_msg(opt, path, 0,
 			 _("Failed to merge submodule %s, but a possible merge "
@@ -1670,7 +1660,7 @@ static int merge_submodule(struct merge_options *opt,
 		break;
 	default:
 		for (i = 0; i < merges.nr; i++)
-			format_commit(&sb, 4,
+			format_commit(&sb, 4, &subrepo,
 				      (struct commit *)merges.objects[i].item);
 		path_msg(opt, path, 0,
 			 _("Failed to merge submodule %s, but multiple "
diff --git a/merge-recursive.c b/merge-recursive.c
index 5a2d8a60c0..80594153f1 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -334,7 +334,9 @@ static void output(struct merge_options *opt, int v, const char *fmt, ...)
 		flush_output(opt);
 }
 
-static void output_commit_title(struct merge_options *opt, struct commit *commit)
+static void repo_output_commit_title(struct merge_options *opt,
+				     struct repository *repo,
+				     struct commit *commit)
 {
 	struct merge_remote_desc *desc;
 
@@ -343,23 +345,29 @@ static void output_commit_title(struct merge_options *opt, struct commit *commit
 	if (desc)
 		strbuf_addf(&opt->obuf, "virtual %s\n", desc->name);
 	else {
-		strbuf_add_unique_abbrev(&opt->obuf, &commit->object.oid,
-					 DEFAULT_ABBREV);
+		strbuf_repo_add_unique_abbrev(&opt->obuf, repo,
+					      &commit->object.oid,
+					      DEFAULT_ABBREV);
 		strbuf_addch(&opt->obuf, ' ');
-		if (parse_commit(commit) != 0)
+		if (repo_parse_commit(repo, commit) != 0)
 			strbuf_addstr(&opt->obuf, _("(bad commit)\n"));
 		else {
 			const char *title;
-			const char *msg = get_commit_buffer(commit, NULL);
+			const char *msg = repo_get_commit_buffer(repo, commit, NULL);
 			int len = find_commit_subject(msg, &title);
 			if (len)
 				strbuf_addf(&opt->obuf, "%.*s\n", len, title);
-			unuse_commit_buffer(commit, msg);
+			repo_unuse_commit_buffer(repo, commit, msg);
 		}
 	}
 	flush_output(opt);
 }
 
+static void output_commit_title(struct merge_options *opt, struct commit *commit)
+{
+	repo_output_commit_title(opt, the_repository, commit);
+}
+
 static int add_cacheinfo(struct merge_options *opt,
 			 const struct diff_filespec *blob,
 			 const char *path, int stage, int refresh, int options)
@@ -1149,14 +1157,14 @@ static int find_first_merges(struct repository *repo,
 	return result->nr;
 }
 
-static void print_commit(struct commit *commit)
+static void print_commit(struct repository *repo, struct commit *commit)
 {
 	struct strbuf sb = STRBUF_INIT;
 	struct pretty_print_context ctx = {0};
 	ctx.date_mode.type = DATE_NORMAL;
 	/* FIXME: Merge this with output_commit_title() */
 	assert(!merge_remote_util(commit));
-	format_commit_message(commit, " %h: %m %s", &sb, &ctx);
+	repo_format_commit_message(repo, commit, " %h: %m %s", &sb, &ctx);
 	fprintf(stderr, "%s\n", sb.buf);
 	strbuf_release(&sb);
 }
@@ -1196,15 +1204,6 @@ static int merge_submodule(struct merge_options *opt,
 	if (is_null_oid(b))
 		return 0;
 
-	/*
-	 * NEEDSWORK: Remove this when all submodule object accesses are
-	 * through explicitly specified repositores.
-	 */
-	if (add_submodule_odb(path)) {
-		output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
-		return 0;
-	}
-
 	if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
 		output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
 		return 0;
@@ -1229,7 +1228,7 @@ static int merge_submodule(struct merge_options *opt,
 		oidcpy(result, b);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
-			output_commit_title(opt, commit_b);
+			repo_output_commit_title(opt, &subrepo, commit_b);
 		} else if (show(opt, 2))
 			output(opt, 2, _("Fast-forwarding submodule %s"), path);
 		else
@@ -1242,7 +1241,7 @@ static int merge_submodule(struct merge_options *opt,
 		oidcpy(result, a);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
-			output_commit_title(opt, commit_a);
+			repo_output_commit_title(opt, &subrepo, commit_a);
 		} else if (show(opt, 2))
 			output(opt, 2, _("Fast-forwarding submodule %s"), path);
 		else
@@ -1274,7 +1273,7 @@ static int merge_submodule(struct merge_options *opt,
 	case 1:
 		output(opt, 1, _("Failed to merge submodule %s (not fast-forward)"), path);
 		output(opt, 2, _("Found a possible merge resolution for the submodule:\n"));
-		print_commit((struct commit *) merges.objects[0].item);
+		print_commit(&subrepo, (struct commit *) merges.objects[0].item);
 		output(opt, 2, _(
 		       "If this is correct simply add it to the index "
 		       "for example\n"
@@ -1287,7 +1286,7 @@ static int merge_submodule(struct merge_options *opt,
 	default:
 		output(opt, 1, _("Failed to merge submodule %s (multiple merges found)"), path);
 		for (i = 0; i < merges.nr; i++)
-			print_commit((struct commit *) merges.objects[i].item);
+			print_commit(&subrepo, (struct commit *) merges.objects[i].item);
 	}
 
 	object_array_clear(&merges);
diff --git a/strbuf.c b/strbuf.c
index c8a5789694..b22e981655 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1059,15 +1059,21 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
 	strbuf_setlen(sb, sb->len + len);
 }
 
-void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
-			      int abbrev_len)
+void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
+				   const struct object_id *oid, int abbrev_len)
 {
 	int r;
 	strbuf_grow(sb, GIT_MAX_HEXSZ + 1);
-	r = find_unique_abbrev_r(sb->buf + sb->len, oid, abbrev_len);
+	r = repo_find_unique_abbrev_r(repo, sb->buf + sb->len, oid, abbrev_len);
 	strbuf_setlen(sb, sb->len + r);
 }
 
+void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
+			      int abbrev_len)
+{
+	strbuf_repo_add_unique_abbrev(sb, the_repository, oid, abbrev_len);
+}
+
 /*
  * Returns the length of a line, without trailing spaces.
  *
diff --git a/strbuf.h b/strbuf.h
index 5b1113abf8..2d9e01c16f 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -634,8 +634,10 @@ void strbuf_list_free(struct strbuf **list);
  * Add the abbreviation, as generated by find_unique_abbrev, of `sha1` to
  * the strbuf `sb`.
  */
-void strbuf_add_unique_abbrev(struct strbuf *sb,
-			      const struct object_id *oid,
+struct repository;
+void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
+				   const struct object_id *oid, int abbrev_len);
+void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
 			      int abbrev_len);
 
 /**
diff --git a/t/t6437-submodule-merge.sh b/t/t6437-submodule-merge.sh
index e5e89c2045..178413c22f 100755
--- a/t/t6437-submodule-merge.sh
+++ b/t/t6437-submodule-merge.sh
@@ -5,6 +5,9 @@ test_description='merging with submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 6/9] object-file: only register submodule ODB if needed
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
                     ` (4 preceding siblings ...)
  2021-09-28 20:10   ` [PATCH v2 5/9] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
@ 2021-09-28 20:10   ` Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 7/9] submodule: pass repo to check_has_commit() Jonathan Tan
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

In a35e03dee0 ("submodule: lazily add submodule ODBs as alternates",
2021-09-08), Git was taught to add all known submodule ODBs as
alternates when attempting to read an object that doesn't exist, as a
fallback for when a submodule object is read as if it were in
the_repository. However, this behavior wasn't restricted to happen only
when reading from the_repository. Fix this.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 object-file.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index be4f94ecf3..2b988b7c36 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1614,7 +1614,8 @@ static int do_oid_object_info_extended(struct repository *r,
 				break;
 		}
 
-		if (register_all_submodule_odb_as_alternates())
+		if (r == the_repository &&
+		    register_all_submodule_odb_as_alternates())
 			/* We added some alternates; retry */
 			continue;
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 7/9] submodule: pass repo to check_has_commit()
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
                     ` (5 preceding siblings ...)
  2021-09-28 20:10   ` [PATCH v2 6/9] object-file: only register submodule ODB if needed Jonathan Tan
@ 2021-09-28 20:10   ` Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 8/9] refs: change refs_for_each_ref_in() to take repo Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 9/9] submodule: trace adding submodule ODB as alternate Jonathan Tan
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

Pass the repo explicitly when calling check_has_commit() to avoid
relying on add_submodule_odb(). With this commit and the parent commit,
several tests no longer rely on add_submodule_odb(), so mark these tests
accordingly.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 submodule.c                            | 16 +++++++++++++---
 t/t5526-fetch-submodules.sh            |  3 +++
 t/t5572-pull-submodule.sh              |  3 +++
 t/t7418-submodule-sparse-gitmodules.sh |  3 +++
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index bc3ec4a242..992890321a 100644
--- a/submodule.c
+++ b/submodule.c
@@ -934,23 +934,33 @@ struct has_commit_data {
 static int check_has_commit(const struct object_id *oid, void *data)
 {
 	struct has_commit_data *cb = data;
+	struct repository subrepo;
+	enum object_type type;
 
-	enum object_type type = oid_object_info(cb->repo, oid, NULL);
+	if (repo_submodule_init(&subrepo, cb->repo, cb->path, null_oid())) {
+		cb->result = 0;
+		goto cleanup;
+	}
+
+	type = oid_object_info(&subrepo, oid, NULL);
 
 	switch (type) {
 	case OBJ_COMMIT:
-		return 0;
+		goto cleanup;
 	case OBJ_BAD:
 		/*
 		 * Object is missing or invalid. If invalid, an error message
 		 * has already been printed.
 		 */
 		cb->result = 0;
-		return 0;
+		goto cleanup;
 	default:
 		die(_("submodule entry '%s' (%s) is a %s, not a commit"),
 		    cb->path, oid_to_hex(oid), type_name(type));
 	}
+cleanup:
+	repo_clear(&subrepo);
+	return 0;
 }
 
 static int submodule_has_commits(struct repository *r,
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index ed11569d8d..2dc75b80db 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -6,6 +6,9 @@ test_description='Recursive "git fetch" for submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 pwd=$(pwd)
diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
index 4f92a116e1..fa6b4cca65 100755
--- a/t/t5572-pull-submodule.sh
+++ b/t/t5572-pull-submodule.sh
@@ -2,6 +2,9 @@
 
 test_description='pull can handle submodules'
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-submodule-update.sh
 
diff --git a/t/t7418-submodule-sparse-gitmodules.sh b/t/t7418-submodule-sparse-gitmodules.sh
index 3f7f271883..f87e524d6d 100755
--- a/t/t7418-submodule-sparse-gitmodules.sh
+++ b/t/t7418-submodule-sparse-gitmodules.sh
@@ -12,6 +12,9 @@ The test setup uses a sparse checkout, however the same scenario can be set up
 also by committing .gitmodules and then just removing it from the filesystem.
 '
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 test_expect_success 'sparse checkout setup which hides .gitmodules' '
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 8/9] refs: change refs_for_each_ref_in() to take repo
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
                     ` (6 preceding siblings ...)
  2021-09-28 20:10   ` [PATCH v2 7/9] submodule: pass repo to check_has_commit() Jonathan Tan
@ 2021-09-28 20:10   ` Jonathan Tan
  2021-09-28 20:10   ` [PATCH v2 9/9] submodule: trace adding submodule ODB as alternate Jonathan Tan
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

Pass a repository to refs_for_each_ref_in() so that object accesses
during iteration (done to skip over invalid refs) are made with the
correct repository instead of relying on add_submodule_odb(). With this,
the last remaining tests no longer rely on add_submodule_odb(), so mark
them accordingly.

The test-ref-store test helper needed to be changed to reflect the new
API. For now, just pass the repository through a global variable.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c                         | 12 ++++++------
 refs.h                         |  2 +-
 t/helper/test-ref-store.c      | 20 +++++++++-----------
 t/t5531-deep-submodule-push.sh |  3 +++
 t/t5545-push-options.sh        |  3 +++
 5 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/refs.c b/refs.c
index 15a3aa47cf..5b0937ac20 100644
--- a/refs.c
+++ b/refs.c
@@ -410,7 +410,7 @@ void warn_dangling_symrefs(FILE *fp, const char *msg_fmt, const struct string_li
 
 int refs_for_each_tag_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/tags/", fn, cb_data);
+	return refs_for_each_ref_in(repo, "refs/tags/", fn, cb_data);
 }
 
 int for_each_tag_ref(each_ref_fn fn, void *cb_data)
@@ -420,7 +420,7 @@ int for_each_tag_ref(each_ref_fn fn, void *cb_data)
 
 int refs_for_each_branch_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/heads/", fn, cb_data);
+	return refs_for_each_ref_in(repo, "refs/heads/", fn, cb_data);
 }
 
 int for_each_branch_ref(each_ref_fn fn, void *cb_data)
@@ -430,7 +430,7 @@ int for_each_branch_ref(each_ref_fn fn, void *cb_data)
 
 int refs_for_each_remote_ref(struct repository *repo, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(get_main_ref_store(repo), "refs/remotes/", fn, cb_data);
+	return refs_for_each_ref_in(repo, "refs/remotes/", fn, cb_data);
 }
 
 int for_each_remote_ref(each_ref_fn fn, void *cb_data)
@@ -1514,15 +1514,15 @@ int for_each_ref(each_ref_fn fn, void *cb_data)
 	return refs_for_each_ref(the_repository, fn, cb_data);
 }
 
-int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
+int refs_for_each_ref_in(struct repository *repo, const char *prefix,
 			 each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, prefix, fn, strlen(prefix), the_repository, 0, cb_data);
+	return do_for_each_ref(get_main_ref_store(repo), prefix, fn, strlen(prefix), repo, 0, cb_data);
 }
 
 int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data)
 {
-	return refs_for_each_ref_in(get_main_ref_store(the_repository), prefix, fn, cb_data);
+	return refs_for_each_ref_in(the_repository, prefix, fn, cb_data);
 }
 
 int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data)
diff --git a/refs.h b/refs.h
index e9ecb5e54e..458d8eddde 100644
--- a/refs.h
+++ b/refs.h
@@ -320,7 +320,7 @@ int refs_head_ref(struct repository *repo,
 		  each_ref_fn fn, void *cb_data);
 int refs_for_each_ref(struct repository *repo,
 		      each_ref_fn fn, void *cb_data);
-int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
+int refs_for_each_ref_in(struct repository *repo, const char *prefix,
 			 each_ref_fn fn, void *cb_data);
 int refs_for_each_tag_ref(struct repository *repo,
 			  each_ref_fn fn, void *cb_data);
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index b314b81a45..1964cb349e 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -5,6 +5,8 @@
 #include "object-store.h"
 #include "repository.h"
 
+static struct repository *repo;
+
 static const char *notnull(const char *arg, const char *name)
 {
 	if (!arg)
@@ -24,18 +26,13 @@ static const char **get_store(const char **argv, struct ref_store **refs)
 	if (!argv[0]) {
 		die("ref store required");
 	} else if (!strcmp(argv[0], "main")) {
+		repo = the_repository;
 		*refs = get_main_ref_store(the_repository);
 	} else if (skip_prefix(argv[0], "submodule:", &gitdir)) {
-		struct strbuf sb = STRBUF_INIT;
-		int ret;
-
-		ret = strbuf_git_path_submodule(&sb, gitdir, "objects/");
-		if (ret)
-			die("strbuf_git_path_submodule failed: %d", ret);
-		add_to_alternates_memory(sb.buf);
-		strbuf_release(&sb);
-
-		*refs = get_submodule_ref_store(gitdir);
+		repo = xmalloc(sizeof(*repo));
+		if (repo_submodule_init(repo, the_repository, gitdir, null_oid()))
+			die("repo_submodule_init failed");
+		*refs = get_main_ref_store(repo);
 	} else if (skip_prefix(argv[0], "worktree:", &gitdir)) {
 		struct worktree **p, **worktrees = get_worktrees();
 
@@ -52,6 +49,7 @@ static const char **get_store(const char **argv, struct ref_store **refs)
 		if (!*p)
 			die("no such worktree: %s", gitdir);
 
+		repo = the_repository;
 		*refs = get_worktree_ref_store(*p);
 	} else
 		die("unknown backend %s", argv[0]);
@@ -113,7 +111,7 @@ static int cmd_for_each_ref(struct ref_store *refs, const char **argv)
 {
 	const char *prefix = notnull(*argv++, "prefix");
 
-	return refs_for_each_ref_in(refs, prefix, each_ref, NULL);
+	return refs_for_each_ref_in(repo, prefix, each_ref, NULL);
 }
 
 static int cmd_resolve_ref(struct ref_store *refs, const char **argv)
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index d573ca496a..3f58b515ce 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -5,6 +5,9 @@ test_description='test push with submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 58c7add7ee..214228349a 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -5,6 +5,9 @@ test_description='pushing to a repository using push options'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 mk_repo_pair () {
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 9/9] submodule: trace adding submodule ODB as alternate
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
                     ` (7 preceding siblings ...)
  2021-09-28 20:10   ` [PATCH v2 8/9] refs: change refs_for_each_ref_in() to take repo Jonathan Tan
@ 2021-09-28 20:10   ` Jonathan Tan
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-28 20:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, newren

Submodule ODBs are never added as alternates during the execution of the
test suite, but there may be a rare interaction that the test suite does
not have coverage of. Add a trace message when this happens, so that
users who trace their commands can notice such occurrences.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 submodule.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/submodule.c b/submodule.c
index 992890321a..188552531b 100644
--- a/submodule.c
+++ b/submodule.c
@@ -207,6 +207,8 @@ int register_all_submodule_odb_as_alternates(void)
 		add_to_alternates_memory(added_submodule_odb_paths.items[i].string);
 	if (ret) {
 		string_list_clear(&added_submodule_odb_paths, 0);
+		trace2_data_intmax("submodule", the_repository,
+				   "register_all_submodule_odb_as_alternates/registered", ret);
 		if (git_env_bool("GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB", 0))
 			BUG("register_all_submodule_odb_as_alternates() called");
 	}
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 1/9] refs: plumb repo param in begin-iterator functions
  2021-09-28 20:10   ` [PATCH v2 1/9] refs: plumb repo param in begin-iterator functions Jonathan Tan
@ 2021-09-28 22:24     ` Junio C Hamano
  0 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-09-28 22:24 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, newren

Jonathan Tan <jonathantanmy@google.com> writes:

> In preparation for the next 2 patches that adds (partial) support for
> arbitrary repositories, plumb a repository parameter in all functions
> that create iterators. There are no changes to program logic.

Makes sense.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 2/9] refs: teach arbitrary repo support to iterators
  2021-09-28 20:10   ` [PATCH v2 2/9] refs: teach arbitrary repo support to iterators Jonathan Tan
@ 2021-09-28 22:35     ` Junio C Hamano
  2021-09-29 17:04       ` Jonathan Tan
  0 siblings, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2021-09-28 22:35 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, newren

Jonathan Tan <jonathantanmy@google.com> writes:

> Note that should_pack_ref() is called when writing refs, which is only
> supported for the_repository, hence the_repository is hardcoded there.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  refs.c                | 3 ++-
>  refs/files-backend.c  | 5 ++++-
>  refs/packed-backend.c | 6 ++++--
>  refs/refs-internal.h  | 1 +
>  4 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 6f7b3447a7..5163e064ae 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -255,12 +255,13 @@ int refname_is_safe(const char *refname)
>   * does not exist, emit a warning and return false.
>   */
>  int ref_resolves_to_object(const char *refname,
> +			   struct repository *repo,
>  			   const struct object_id *oid,
>  			   unsigned int flags)
>  {
>  	if (flags & REF_ISBROKEN)
>  		return 0;
> -	if (!has_object_file(oid)) {
> +	if (!repo_has_object_file(repo, oid)) {
>  		error(_("%s does not point to a valid object!"), refname);
>  		return 0;
>  	}

OK.

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index f0cbea41c9..4d883d9a89 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -730,6 +730,7 @@ struct files_ref_iterator {
>  	struct ref_iterator base;
>  
>  	struct ref_iterator *iter0;
> +	struct repository *repo;
>  	unsigned int flags;
>  };

> @@ -776,6 +776,7 @@ struct packed_ref_iterator {
>  	struct object_id oid, peeled;
>  	struct strbuf refname_buf;
>  
> +	struct repository *repo;
>  	unsigned int flags;
>  };

The two steps so far seems to give the necessary information to code
paths that want them, so it is not wrong per-se, but this makes me
wonder a few things.

 - There may be multiple ref backends and iterators corresponding to
   them.  Is it reasonable to assume that there are backends that do
   not need "repo"?  Otherwise, shouldn't this be added to the base
   class "struct ref_iterator base"?

 - The iterator_begin() and other functions have been taught to take
   the repository in addition to the ref_store in the previous step,
   but

   . Doesn't iterator iterate over a single ref_store?  Shouldn't it
     have a pointer to the ref_store it is iterating over?

   . Doesn't a ref_store belong to a single repository?  Shouldn't
     it have a pointer to the repository it is part of?

   If the answers to both are 'yes', then we wouldn't need to add a
   repository pointer as a new parameter to functions that already
   took a ref store.

In other words, I am wondering if the right pieces of information
are stored in the right structure.

Thanks.



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 4/9] refs: teach refs_for_each_ref() arbitrary repos
  2021-09-28 20:10   ` [PATCH v2 4/9] refs: teach refs_for_each_ref() arbitrary repos Jonathan Tan
@ 2021-09-28 22:49     ` Junio C Hamano
  0 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-09-28 22:49 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, newren

Jonathan Tan <jonathantanmy@google.com> writes:

>  	} else if (!(flags & OPT_CACHED)) {
>  		struct object_id oid;
> -		struct ref_store *refs = get_submodule_ref_store(path);
> +		struct repository subrepo;
>  
> -		if (!refs) {
> +		if (repo_submodule_init(&subrepo, the_repository, path, null_oid())) {
>  			print_status(flags, '-', path, ce_oid, displaypath);
>  			goto cleanup;
>  		}
> -		if (refs_head_ref(refs, handle_submodule_head_ref, &oid))
> +		if (refs_head_ref(&subrepo, handle_submodule_head_ref, &oid))
>  			die(_("could not resolve HEAD ref inside the "
>  			      "submodule '%s'"), path);
> +		repo_clear(&subrepo);

While this makes perfect sense, if we extended the ref_store to know
what repository it belongs to, I suspect that we don't have to
change anything in a "user" codepath like this one.
get_submodule_ref_store() would repare a ref store that is bound to
the submodule repository, refs_head_ref() and other helpers that
take a ref_store would not have to gain an extra "repository"
parameter (because it is known via the ref_store) and does the
iteration in the right repository, etc...

> @@ -1018,9 +1019,12 @@ static void generate_submodule_summary(struct summary_cb *info,
>  
>  	if (!info->cached && oideq(&p->oid_dst, null_oid())) {
>  		if (S_ISGITLINK(p->mod_dst)) {
> -			struct ref_store *refs = get_submodule_ref_store(p->sm_path);
> -			if (refs)
> -				refs_head_ref(refs, handle_submodule_head_ref, &p->oid_dst);
> +			struct repository subrepo;
> +
> +			if (!repo_submodule_init(&subrepo, the_repository, p->sm_path, null_oid())) {
> +				refs_head_ref(&subrepo, handle_submodule_head_ref, &p->oid_dst);
> +				repo_clear(&subrepo);
> +			}

The story looks the same here, too.


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 2/9] refs: teach arbitrary repo support to iterators
  2021-09-28 22:35     ` Junio C Hamano
@ 2021-09-29 17:04       ` Jonathan Tan
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-29 17:04 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git, peff, newren

> The two steps so far seems to give the necessary information to code
> paths that want them, so it is not wrong per-se, but this makes me
> wonder a few things.
> 
>  - There may be multiple ref backends and iterators corresponding to
>    them.  Is it reasonable to assume that there are backends that do
>    not need "repo"?  Otherwise, shouldn't this be added to the base
>    class "struct ref_iterator base"?

All backends need repos, but not all iterators need backends - there is
a merge_ref_iterator and a prefix_ref_iterator, for example.

>  - The iterator_begin() and other functions have been taught to take
>    the repository in addition to the ref_store in the previous step,
>    but
> 
>    . Doesn't iterator iterate over a single ref_store?  Shouldn't it
>      have a pointer to the ref_store it is iterating over?

No - as above, merge_ref_iterator, for example, does not iterate over a
ref_store but combines the results of 2 other iterators.

>    . Doesn't a ref_store belong to a single repository?  Shouldn't
>      it have a pointer to the repository it is part of?
> 
>    If the answers to both are 'yes', then we wouldn't need to add a
>    repository pointer as a new parameter to functions that already
>    took a ref store.
> 
> In other words, I am wondering if the right pieces of information
> are stored in the right structure.
> 
> Thanks.

A ref_store does belong to a single repository. The reason why it
doesn't have a pointer to that repository is probably because struct
ref_store (00eebe351c ("refs: create a base class "ref_store" for
files_ref_store", 2016-09-09)) predates struct repository (359efeffc1
("repository: introduce the repository object", 2017-06-23)). I've been
avoiding introducing a pointer to the repository in struct ref_store to
avoid unnecessary coupling, but it is looking more and more necessary,
as you mention in your reply [1] to another patch about how this would
eliminate certain other "user" codepaths needing to know about the repo.
I'll take a look at introducing a pointer to the repo in struct
ref_store and report back with my findings.

[1] https://lore.kernel.org/git/xmqqh7e4iacw.fsf@gitster.g/

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v3 0/7] No more adding submodule ODB as alternate
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (10 preceding siblings ...)
  2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
@ 2021-09-29 23:06 ` Jonathan Tan
  2021-09-29 23:06   ` [PATCH v3 1/7] refs: plumb repo into ref stores Jonathan Tan
                     ` (7 more replies)
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
  12 siblings, 8 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-29 23:06 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

This is on a merge of jk/ref-paranoia and jt/add-submodule-odb-clean-up
(same as v2).

Here's the same patch set except that the repo is plumbed into the ref
stores. (Iterators currently do not have any reference to their ref
stores, so some of them still need repo fields. But because the ref
stores now know their repos, calling code does not need to pass a repo
when these iterators are instantiated.)

As you can see from the shorter patch list, this eliminates the need for
some patches.

Jonathan Tan (7):
  refs: plumb repo into ref stores
  refs: teach arbitrary repo support to iterators
  refs: peeling non-the_repository iterators is BUG
  merge-{ort,recursive}: remove add_submodule_odb()
  object-file: only register submodule ODB if needed
  submodule: pass repo to check_has_commit()
  submodule: trace adding submodule ODB as alternate

 merge-ort.c                            | 18 +++--------
 merge-recursive.c                      | 41 +++++++++++++-------------
 object-file.c                          |  3 +-
 refs.c                                 | 32 +++++++++++++++-----
 refs/files-backend.c                   | 16 ++++++----
 refs/packed-backend.c                  | 13 ++++++--
 refs/packed-backend.h                  |  3 +-
 refs/ref-cache.c                       | 10 +++++++
 refs/ref-cache.h                       |  1 +
 refs/refs-internal.h                   | 11 +++++--
 strbuf.c                               | 12 ++++++--
 strbuf.h                               |  6 ++--
 submodule.c                            | 18 +++++++++--
 t/t5526-fetch-submodules.sh            |  3 ++
 t/t5531-deep-submodule-push.sh         |  3 ++
 t/t5545-push-options.sh                |  3 ++
 t/t5572-pull-submodule.sh              |  3 ++
 t/t6437-submodule-merge.sh             |  3 ++
 t/t7418-submodule-sparse-gitmodules.sh |  3 ++
 19 files changed, 139 insertions(+), 63 deletions(-)

Range-diff against v2:
 1:  e364b13a37 <  -:  ---------- refs: plumb repo param in begin-iterator functions
 -:  ---------- >  1:  8067397538 refs: plumb repo into ref stores
 2:  ec153eff7b !  2:  c8799d408f refs: teach arbitrary repo support to iterators
    @@ refs/files-backend.c: static struct ref_iterator *files_ref_iterator_begin(
      	base_ref_iterator_init(ref_iterator, &files_ref_iterator_vtable,
      			       overlay_iter->ordered);
      	iter->iter0 = overlay_iter;
    -+	iter->repo = repo;
    ++	iter->repo = ref_store->repo;
      	iter->flags = flags;
      
      	return ref_iterator;
    @@ refs/packed-backend.c: static struct ref_iterator *packed_ref_iterator_begin(
      
      	iter->base.oid = &iter->oid;
      
    -+	iter->repo = repo;
    ++	iter->repo = ref_store->repo;
      	iter->flags = flags;
      
      	if (prefix && *prefix)
 3:  dd1a8871f4 !  3:  e7fb60b7e7 refs: peeling non-the_repository iterators is BUG
    @@ refs/files-backend.c: static struct ref_iterator *files_ref_iterator_begin(
      
      	loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs),
     -					      prefix, 1);
    -+					      prefix, repo, 1);
    ++					      prefix, ref_store->repo, 1);
      
      	/*
      	 * The packed-refs file might contain broken references, for
 4:  da0c9c2d44 <  -:  ---------- refs: teach refs_for_each_ref() arbitrary repos
 5:  dd70820d66 =  4:  e4a1be22c8 merge-{ort,recursive}: remove add_submodule_odb()
 6:  9c5ce004b2 =  5:  0200f1880b object-file: only register submodule ODB if needed
 7:  1fca3b1a25 !  6:  7a6a1ee5f9 submodule: pass repo to check_has_commit()
    @@ Commit message
     
         Pass the repo explicitly when calling check_has_commit() to avoid
         relying on add_submodule_odb(). With this commit and the parent commit,
    -    several tests no longer rely on add_submodule_odb(), so mark these tests
    -    accordingly.
    +    the last remaining tests no longer rely on add_submodule_odb(), so mark
    +    these tests accordingly.
     
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
    @@ t/t5526-fetch-submodules.sh: test_description='Recursive "git fetch" for submodu
      
      pwd=$(pwd)
     
    + ## t/t5531-deep-submodule-push.sh ##
    +@@ t/t5531-deep-submodule-push.sh: test_description='test push with submodules'
    + GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
    + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
    + 
    ++GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
    ++export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
    ++
    + . ./test-lib.sh
    + 
    + test_expect_success setup '
    +
    + ## t/t5545-push-options.sh ##
    +@@ t/t5545-push-options.sh: test_description='pushing to a repository using push options'
    + GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
    + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
    + 
    ++GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
    ++export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
    ++
    + . ./test-lib.sh
    + 
    + mk_repo_pair () {
    +
      ## t/t5572-pull-submodule.sh ##
     @@
      
 8:  7b5087a14d <  -:  ---------- refs: change refs_for_each_ref_in() to take repo
 9:  cef2a97840 =  7:  e4b6ee2186 submodule: trace adding submodule ODB as alternate
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v3 1/7] refs: plumb repo into ref stores
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
@ 2021-09-29 23:06   ` Jonathan Tan
  2021-09-30 11:13     ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
                       ` (2 more replies)
  2021-09-29 23:06   ` [PATCH v3 2/7] refs: teach arbitrary repo support to iterators Jonathan Tan
                     ` (6 subsequent siblings)
  7 siblings, 3 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-29 23:06 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

In preparation for the next 2 patches that adds (partial) support for
arbitrary repositories to ref iterators, plumb a repository into all ref
stores. There are no changes to program logic.

(The repository is plumbed into the ref stores instead of directly into
the ref iterators themselves, so that existing code that operates on ref
stores do not need to be modified to also handle repositories.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c                | 29 ++++++++++++++++++++++-------
 refs/files-backend.c  |  6 ++++--
 refs/packed-backend.c |  4 +++-
 refs/packed-backend.h |  3 ++-
 refs/refs-internal.h  | 10 ++++++++--
 5 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/refs.c b/refs.c
index 2be0d0f057..9c4e388153 100644
--- a/refs.c
+++ b/refs.c
@@ -1873,7 +1873,8 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
  * Create, record, and return a ref_store instance for the specified
  * gitdir.
  */
-static struct ref_store *ref_store_init(const char *gitdir,
+static struct ref_store *ref_store_init(struct repository *repo,
+					const char *gitdir,
 					unsigned int flags)
 {
 	const char *be_name = "files";
@@ -1883,7 +1884,7 @@ static struct ref_store *ref_store_init(const char *gitdir,
 	if (!be)
 		BUG("reference backend %s is unknown", be_name);
 
-	refs = be->init(gitdir, flags);
+	refs = be->init(repo, gitdir, flags);
 	return refs;
 }
 
@@ -1895,7 +1896,7 @@ struct ref_store *get_main_ref_store(struct repository *r)
 	if (!r->gitdir)
 		BUG("attempting to get main_ref_store outside of repository");
 
-	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
+	r->refs_private = ref_store_init(r, r->gitdir, REF_STORE_ALL_CAPS);
 	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
 	return r->refs_private;
 }
@@ -1925,6 +1926,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 	struct ref_store *refs;
 	char *to_free = NULL;
 	size_t len;
+	struct repository *subrepo;
 
 	if (!submodule)
 		return NULL;
@@ -1950,8 +1952,19 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 	if (submodule_to_gitdir(&submodule_sb, submodule))
 		goto done;
 
-	/* assume that add_submodule_odb() has been called */
-	refs = ref_store_init(submodule_sb.buf,
+	subrepo = xmalloc(sizeof(*subrepo));
+	/*
+	 * NEEDSWORK: Make get_submodule_ref_store() work with arbitrary
+	 * superprojects other than the_repository. This probably should be
+	 * done by making it take a struct repository * parameter instead of a
+	 * submodule path.
+	 */
+	if (repo_submodule_init(subrepo, the_repository, submodule,
+				null_oid())) {
+		free(subrepo);
+		goto done;
+	}
+	refs = ref_store_init(subrepo, submodule_sb.buf,
 			      REF_STORE_READ | REF_STORE_ODB);
 	register_ref_store_map(&submodule_ref_stores, "submodule",
 			       refs, submodule);
@@ -1977,10 +1990,12 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 		return refs;
 
 	if (wt->id)
-		refs = ref_store_init(git_common_path("worktrees/%s", wt->id),
+		refs = ref_store_init(the_repository,
+				      git_common_path("worktrees/%s", wt->id),
 				      REF_STORE_ALL_CAPS);
 	else
-		refs = ref_store_init(get_git_common_dir(),
+		refs = ref_store_init(the_repository,
+				      get_git_common_dir(),
 				      REF_STORE_ALL_CAPS);
 
 	if (refs)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 1148c0cf09..6a481e968f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -79,13 +79,15 @@ static void clear_loose_ref_cache(struct files_ref_store *refs)
  * Create a new submodule ref cache and add it to the internal
  * set of caches.
  */
-static struct ref_store *files_ref_store_create(const char *gitdir,
+static struct ref_store *files_ref_store_create(struct repository *repo,
+						const char *gitdir,
 						unsigned int flags)
 {
 	struct files_ref_store *refs = xcalloc(1, sizeof(*refs));
 	struct ref_store *ref_store = (struct ref_store *)refs;
 	struct strbuf sb = STRBUF_INIT;
 
+	ref_store->repo = repo;
 	ref_store->gitdir = xstrdup(gitdir);
 	base_ref_store_init(ref_store, &refs_be_files);
 	refs->store_flags = flags;
@@ -93,7 +95,7 @@ static struct ref_store *files_ref_store_create(const char *gitdir,
 	get_common_dir_noenv(&sb, gitdir);
 	refs->gitcommondir = strbuf_detach(&sb, NULL);
 	strbuf_addf(&sb, "%s/packed-refs", refs->gitcommondir);
-	refs->packed_ref_store = packed_ref_store_create(sb.buf, flags);
+	refs->packed_ref_store = packed_ref_store_create(repo, sb.buf, flags);
 	strbuf_release(&sb);
 
 	chdir_notify_reparent("files-backend $GIT_DIR", &refs->base.gitdir);
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f8aa97d799..ea3493b24e 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -193,13 +193,15 @@ static int release_snapshot(struct snapshot *snapshot)
 	}
 }
 
-struct ref_store *packed_ref_store_create(const char *path,
+struct ref_store *packed_ref_store_create(struct repository *repo,
+					  const char *path,
 					  unsigned int store_flags)
 {
 	struct packed_ref_store *refs = xcalloc(1, sizeof(*refs));
 	struct ref_store *ref_store = (struct ref_store *)refs;
 
 	base_ref_store_init(ref_store, &refs_be_packed);
+	ref_store->repo = repo;
 	ref_store->gitdir = xstrdup(path);
 	refs->store_flags = store_flags;
 
diff --git a/refs/packed-backend.h b/refs/packed-backend.h
index a01a0aff9c..942c908771 100644
--- a/refs/packed-backend.h
+++ b/refs/packed-backend.h
@@ -12,7 +12,8 @@ struct ref_transaction;
  * even among packed refs.
  */
 
-struct ref_store *packed_ref_store_create(const char *path,
+struct ref_store *packed_ref_store_create(struct repository *repo,
+					  const char *path,
 					  unsigned int store_flags);
 
 /*
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 96911fb26e..d28440c9cc 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -539,7 +539,8 @@ struct ref_store;
  * should call base_ref_store_init() to initialize the shared part of
  * the ref_store and to record the ref_store for later lookup.
  */
-typedef struct ref_store *ref_store_init_fn(const char *gitdir,
+typedef struct ref_store *ref_store_init_fn(struct repository *repo,
+					    const char *gitdir,
 					    unsigned int flags);
 
 typedef int ref_init_db_fn(struct ref_store *refs, struct strbuf *err);
@@ -697,7 +698,12 @@ struct ref_store {
 	/* The backend describing this ref_store's storage scheme: */
 	const struct ref_storage_be *be;
 
-	/* The gitdir that this ref_store applies to: */
+	struct repository *repo;
+
+	/*
+	 * The gitdir that this ref_store applies to. Note that this is not
+	 * necessarily repo->gitdir if the repo has multiple worktrees.
+	 */
 	char *gitdir;
 };
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v3 2/7] refs: teach arbitrary repo support to iterators
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
  2021-09-29 23:06   ` [PATCH v3 1/7] refs: plumb repo into ref stores Jonathan Tan
@ 2021-09-29 23:06   ` Jonathan Tan
  2021-10-07 19:31     ` Glen Choo
  2021-09-29 23:06   ` [PATCH v3 3/7] refs: peeling non-the_repository iterators is BUG Jonathan Tan
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 65+ messages in thread
From: Jonathan Tan @ 2021-09-29 23:06 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Note that should_pack_ref() is called when writing refs, which is only
supported for the_repository, hence the_repository is hardcoded there.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c                | 3 ++-
 refs/files-backend.c  | 5 ++++-
 refs/packed-backend.c | 6 ++++--
 refs/refs-internal.h  | 1 +
 4 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 9c4e388153..c07aeff6f4 100644
--- a/refs.c
+++ b/refs.c
@@ -255,12 +255,13 @@ int refname_is_safe(const char *refname)
  * does not exist, emit a warning and return false.
  */
 int ref_resolves_to_object(const char *refname,
+			   struct repository *repo,
 			   const struct object_id *oid,
 			   unsigned int flags)
 {
 	if (flags & REF_ISBROKEN)
 		return 0;
-	if (!has_object_file(oid)) {
+	if (!repo_has_object_file(repo, oid)) {
 		error(_("%s does not point to a valid object!"), refname);
 		return 0;
 	}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 6a481e968f..3f213d24b0 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -732,6 +732,7 @@ struct files_ref_iterator {
 	struct ref_iterator base;
 
 	struct ref_iterator *iter0;
+	struct repository *repo;
 	unsigned int flags;
 };
 
@@ -753,6 +754,7 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
 
 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
 		    !ref_resolves_to_object(iter->iter0->refname,
+					    iter->repo,
 					    iter->iter0->oid,
 					    iter->iter0->flags))
 			continue;
@@ -855,6 +857,7 @@ static struct ref_iterator *files_ref_iterator_begin(
 	base_ref_iterator_init(ref_iterator, &files_ref_iterator_vtable,
 			       overlay_iter->ordered);
 	iter->iter0 = overlay_iter;
+	iter->repo = ref_store->repo;
 	iter->flags = flags;
 
 	return ref_iterator;
@@ -1139,7 +1142,7 @@ static int should_pack_ref(const char *refname,
 		return 0;
 
 	/* Do not pack broken refs: */
-	if (!ref_resolves_to_object(refname, oid, ref_flags))
+	if (!ref_resolves_to_object(refname, the_repository, oid, ref_flags))
 		return 0;
 
 	return 1;
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index ea3493b24e..63f78bbaea 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -778,6 +778,7 @@ struct packed_ref_iterator {
 	struct object_id oid, peeled;
 	struct strbuf refname_buf;
 
+	struct repository *repo;
 	unsigned int flags;
 };
 
@@ -866,8 +867,8 @@ static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
 			continue;
 
 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
-		    !ref_resolves_to_object(iter->base.refname, &iter->oid,
-					    iter->flags))
+		    !ref_resolves_to_object(iter->base.refname, iter->repo,
+					    &iter->oid, iter->flags))
 			continue;
 
 		return ITER_OK;
@@ -956,6 +957,7 @@ static struct ref_iterator *packed_ref_iterator_begin(
 
 	iter->base.oid = &iter->oid;
 
+	iter->repo = ref_store->repo;
 	iter->flags = flags;
 
 	if (prefix && *prefix)
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index d28440c9cc..500d77864d 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -66,6 +66,7 @@ int refname_is_safe(const char *refname);
  * referred-to object does not exist, emit a warning and return false.
  */
 int ref_resolves_to_object(const char *refname,
+			   struct repository *repo,
 			   const struct object_id *oid,
 			   unsigned int flags);
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v3 3/7] refs: peeling non-the_repository iterators is BUG
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
  2021-09-29 23:06   ` [PATCH v3 1/7] refs: plumb repo into ref stores Jonathan Tan
  2021-09-29 23:06   ` [PATCH v3 2/7] refs: teach arbitrary repo support to iterators Jonathan Tan
@ 2021-09-29 23:06   ` Jonathan Tan
  2021-09-29 23:06   ` [PATCH v3 4/7] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-29 23:06 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

There is currently no support for peeling the current ref of an iterator
iterating over a non-the_repository ref store, and none is needed. Thus,
for now, BUG() if that happens.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs/files-backend.c  |  5 +++--
 refs/packed-backend.c |  3 +++
 refs/ref-cache.c      | 10 ++++++++++
 refs/ref-cache.h      |  1 +
 4 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 3f213d24b0..8ee6ac2103 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -833,7 +833,7 @@ static struct ref_iterator *files_ref_iterator_begin(
 	 */
 
 	loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs),
-					      prefix, 1);
+					      prefix, ref_store->repo, 1);
 
 	/*
 	 * The packed-refs file might contain broken references, for
@@ -1165,7 +1165,8 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
 
 	packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR, &err);
 
-	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
+	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL,
+					the_repository, 0);
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 		/*
 		 * If the loose reference can be packed, add an entry
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 63f78bbaea..2161218719 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -886,6 +886,9 @@ static int packed_ref_iterator_peel(struct ref_iterator *ref_iterator,
 	struct packed_ref_iterator *iter =
 		(struct packed_ref_iterator *)ref_iterator;
 
+	if (iter->repo != the_repository)
+		BUG("peeling for non-the_repository is not supported");
+
 	if ((iter->base.flags & REF_KNOWS_PEELED)) {
 		oidcpy(peeled, &iter->peeled);
 		return is_null_oid(&iter->peeled) ? -1 : 0;
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 49d732f6db..97a6ac349e 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -435,6 +435,8 @@ struct cache_ref_iterator {
 	 * on from there.)
 	 */
 	struct cache_ref_iterator_level *levels;
+
+	struct repository *repo;
 };
 
 static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
@@ -491,6 +493,11 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
 static int cache_ref_iterator_peel(struct ref_iterator *ref_iterator,
 				   struct object_id *peeled)
 {
+	struct cache_ref_iterator *iter =
+		(struct cache_ref_iterator *)ref_iterator;
+
+	if (iter->repo != the_repository)
+		BUG("peeling for non-the_repository is not supported");
 	return peel_object(ref_iterator->oid, peeled) ? -1 : 0;
 }
 
@@ -513,6 +520,7 @@ static struct ref_iterator_vtable cache_ref_iterator_vtable = {
 
 struct ref_iterator *cache_ref_iterator_begin(struct ref_cache *cache,
 					      const char *prefix,
+					      struct repository *repo,
 					      int prime_dir)
 {
 	struct ref_dir *dir;
@@ -547,5 +555,7 @@ struct ref_iterator *cache_ref_iterator_begin(struct ref_cache *cache,
 		level->prefix_state = PREFIX_CONTAINS_DIR;
 	}
 
+	iter->repo = repo;
+
 	return ref_iterator;
 }
diff --git a/refs/ref-cache.h b/refs/ref-cache.h
index 3bfb89d2b3..7877bf86ed 100644
--- a/refs/ref-cache.h
+++ b/refs/ref-cache.h
@@ -238,6 +238,7 @@ struct ref_entry *find_ref_entry(struct ref_dir *dir, const char *refname);
  */
 struct ref_iterator *cache_ref_iterator_begin(struct ref_cache *cache,
 					      const char *prefix,
+					      struct repository *repo,
 					      int prime_dir);
 
 #endif /* REFS_REF_CACHE_H */
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v3 4/7] merge-{ort,recursive}: remove add_submodule_odb()
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
                     ` (2 preceding siblings ...)
  2021-09-29 23:06   ` [PATCH v3 3/7] refs: peeling non-the_repository iterators is BUG Jonathan Tan
@ 2021-09-29 23:06   ` Jonathan Tan
  2021-10-07 18:34     ` Josh Steadmon
  2021-09-29 23:06   ` [PATCH v3 5/7] object-file: only register submodule ODB if needed Jonathan Tan
                     ` (3 subsequent siblings)
  7 siblings, 1 reply; 65+ messages in thread
From: Jonathan Tan @ 2021-09-29 23:06 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

After the parent commit and some of its ancestors, the only place
commits are being accessed through alternates is in the user-facing
message formatting code. Fix those, and remove the add_submodule_odb()
calls.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 merge-ort.c                | 18 ++++-------------
 merge-recursive.c          | 41 +++++++++++++++++++-------------------
 strbuf.c                   | 12 ++++++++---
 strbuf.h                   |  6 ++++--
 t/t6437-submodule-merge.sh |  3 +++
 5 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index b88475475d..fbc5c204c1 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -609,6 +609,7 @@ static int err(struct merge_options *opt, const char *err, ...)
 
 static void format_commit(struct strbuf *sb,
 			  int indent,
+			  struct repository *repo,
 			  struct commit *commit)
 {
 	struct merge_remote_desc *desc;
@@ -622,7 +623,7 @@ static void format_commit(struct strbuf *sb,
 		return;
 	}
 
-	format_commit_message(commit, "%h %s", sb, &ctx);
+	repo_format_commit_message(repo, commit, "%h %s", sb, &ctx);
 	strbuf_addch(sb, '\n');
 }
 
@@ -1578,17 +1579,6 @@ static int merge_submodule(struct merge_options *opt,
 	if (is_null_oid(b))
 		return 0;
 
-	/*
-	 * NEEDSWORK: Remove this when all submodule object accesses are
-	 * through explicitly specified repositores.
-	 */
-	if (add_submodule_odb(path)) {
-		path_msg(opt, path, 0,
-			 _("Failed to merge submodule %s (not checked out)"),
-			 path);
-		return 0;
-	}
-
 	if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
 		path_msg(opt, path, 0,
 				_("Failed to merge submodule %s (not checked out)"),
@@ -1653,7 +1643,7 @@ static int merge_submodule(struct merge_options *opt,
 		break;
 
 	case 1:
-		format_commit(&sb, 4,
+		format_commit(&sb, 4, &subrepo,
 			      (struct commit *)merges.objects[0].item);
 		path_msg(opt, path, 0,
 			 _("Failed to merge submodule %s, but a possible merge "
@@ -1670,7 +1660,7 @@ static int merge_submodule(struct merge_options *opt,
 		break;
 	default:
 		for (i = 0; i < merges.nr; i++)
-			format_commit(&sb, 4,
+			format_commit(&sb, 4, &subrepo,
 				      (struct commit *)merges.objects[i].item);
 		path_msg(opt, path, 0,
 			 _("Failed to merge submodule %s, but multiple "
diff --git a/merge-recursive.c b/merge-recursive.c
index 5a2d8a60c0..80594153f1 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -334,7 +334,9 @@ static void output(struct merge_options *opt, int v, const char *fmt, ...)
 		flush_output(opt);
 }
 
-static void output_commit_title(struct merge_options *opt, struct commit *commit)
+static void repo_output_commit_title(struct merge_options *opt,
+				     struct repository *repo,
+				     struct commit *commit)
 {
 	struct merge_remote_desc *desc;
 
@@ -343,23 +345,29 @@ static void output_commit_title(struct merge_options *opt, struct commit *commit
 	if (desc)
 		strbuf_addf(&opt->obuf, "virtual %s\n", desc->name);
 	else {
-		strbuf_add_unique_abbrev(&opt->obuf, &commit->object.oid,
-					 DEFAULT_ABBREV);
+		strbuf_repo_add_unique_abbrev(&opt->obuf, repo,
+					      &commit->object.oid,
+					      DEFAULT_ABBREV);
 		strbuf_addch(&opt->obuf, ' ');
-		if (parse_commit(commit) != 0)
+		if (repo_parse_commit(repo, commit) != 0)
 			strbuf_addstr(&opt->obuf, _("(bad commit)\n"));
 		else {
 			const char *title;
-			const char *msg = get_commit_buffer(commit, NULL);
+			const char *msg = repo_get_commit_buffer(repo, commit, NULL);
 			int len = find_commit_subject(msg, &title);
 			if (len)
 				strbuf_addf(&opt->obuf, "%.*s\n", len, title);
-			unuse_commit_buffer(commit, msg);
+			repo_unuse_commit_buffer(repo, commit, msg);
 		}
 	}
 	flush_output(opt);
 }
 
+static void output_commit_title(struct merge_options *opt, struct commit *commit)
+{
+	repo_output_commit_title(opt, the_repository, commit);
+}
+
 static int add_cacheinfo(struct merge_options *opt,
 			 const struct diff_filespec *blob,
 			 const char *path, int stage, int refresh, int options)
@@ -1149,14 +1157,14 @@ static int find_first_merges(struct repository *repo,
 	return result->nr;
 }
 
-static void print_commit(struct commit *commit)
+static void print_commit(struct repository *repo, struct commit *commit)
 {
 	struct strbuf sb = STRBUF_INIT;
 	struct pretty_print_context ctx = {0};
 	ctx.date_mode.type = DATE_NORMAL;
 	/* FIXME: Merge this with output_commit_title() */
 	assert(!merge_remote_util(commit));
-	format_commit_message(commit, " %h: %m %s", &sb, &ctx);
+	repo_format_commit_message(repo, commit, " %h: %m %s", &sb, &ctx);
 	fprintf(stderr, "%s\n", sb.buf);
 	strbuf_release(&sb);
 }
@@ -1196,15 +1204,6 @@ static int merge_submodule(struct merge_options *opt,
 	if (is_null_oid(b))
 		return 0;
 
-	/*
-	 * NEEDSWORK: Remove this when all submodule object accesses are
-	 * through explicitly specified repositores.
-	 */
-	if (add_submodule_odb(path)) {
-		output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
-		return 0;
-	}
-
 	if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
 		output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
 		return 0;
@@ -1229,7 +1228,7 @@ static int merge_submodule(struct merge_options *opt,
 		oidcpy(result, b);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
-			output_commit_title(opt, commit_b);
+			repo_output_commit_title(opt, &subrepo, commit_b);
 		} else if (show(opt, 2))
 			output(opt, 2, _("Fast-forwarding submodule %s"), path);
 		else
@@ -1242,7 +1241,7 @@ static int merge_submodule(struct merge_options *opt,
 		oidcpy(result, a);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
-			output_commit_title(opt, commit_a);
+			repo_output_commit_title(opt, &subrepo, commit_a);
 		} else if (show(opt, 2))
 			output(opt, 2, _("Fast-forwarding submodule %s"), path);
 		else
@@ -1274,7 +1273,7 @@ static int merge_submodule(struct merge_options *opt,
 	case 1:
 		output(opt, 1, _("Failed to merge submodule %s (not fast-forward)"), path);
 		output(opt, 2, _("Found a possible merge resolution for the submodule:\n"));
-		print_commit((struct commit *) merges.objects[0].item);
+		print_commit(&subrepo, (struct commit *) merges.objects[0].item);
 		output(opt, 2, _(
 		       "If this is correct simply add it to the index "
 		       "for example\n"
@@ -1287,7 +1286,7 @@ static int merge_submodule(struct merge_options *opt,
 	default:
 		output(opt, 1, _("Failed to merge submodule %s (multiple merges found)"), path);
 		for (i = 0; i < merges.nr; i++)
-			print_commit((struct commit *) merges.objects[i].item);
+			print_commit(&subrepo, (struct commit *) merges.objects[i].item);
 	}
 
 	object_array_clear(&merges);
diff --git a/strbuf.c b/strbuf.c
index c8a5789694..b22e981655 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1059,15 +1059,21 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
 	strbuf_setlen(sb, sb->len + len);
 }
 
-void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
-			      int abbrev_len)
+void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
+				   const struct object_id *oid, int abbrev_len)
 {
 	int r;
 	strbuf_grow(sb, GIT_MAX_HEXSZ + 1);
-	r = find_unique_abbrev_r(sb->buf + sb->len, oid, abbrev_len);
+	r = repo_find_unique_abbrev_r(repo, sb->buf + sb->len, oid, abbrev_len);
 	strbuf_setlen(sb, sb->len + r);
 }
 
+void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
+			      int abbrev_len)
+{
+	strbuf_repo_add_unique_abbrev(sb, the_repository, oid, abbrev_len);
+}
+
 /*
  * Returns the length of a line, without trailing spaces.
  *
diff --git a/strbuf.h b/strbuf.h
index 5b1113abf8..2d9e01c16f 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -634,8 +634,10 @@ void strbuf_list_free(struct strbuf **list);
  * Add the abbreviation, as generated by find_unique_abbrev, of `sha1` to
  * the strbuf `sb`.
  */
-void strbuf_add_unique_abbrev(struct strbuf *sb,
-			      const struct object_id *oid,
+struct repository;
+void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
+				   const struct object_id *oid, int abbrev_len);
+void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
 			      int abbrev_len);
 
 /**
diff --git a/t/t6437-submodule-merge.sh b/t/t6437-submodule-merge.sh
index e5e89c2045..178413c22f 100755
--- a/t/t6437-submodule-merge.sh
+++ b/t/t6437-submodule-merge.sh
@@ -5,6 +5,9 @@ test_description='merging with submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v3 5/7] object-file: only register submodule ODB if needed
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
                     ` (3 preceding siblings ...)
  2021-09-29 23:06   ` [PATCH v3 4/7] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
@ 2021-09-29 23:06   ` Jonathan Tan
  2021-10-07 18:34     ` Josh Steadmon
  2021-09-29 23:06   ` [PATCH v3 6/7] submodule: pass repo to check_has_commit() Jonathan Tan
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 65+ messages in thread
From: Jonathan Tan @ 2021-09-29 23:06 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

In a35e03dee0 ("submodule: lazily add submodule ODBs as alternates",
2021-09-08), Git was taught to add all known submodule ODBs as
alternates when attempting to read an object that doesn't exist, as a
fallback for when a submodule object is read as if it were in
the_repository. However, this behavior wasn't restricted to happen only
when reading from the_repository. Fix this.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 object-file.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index be4f94ecf3..2b988b7c36 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1614,7 +1614,8 @@ static int do_oid_object_info_extended(struct repository *r,
 				break;
 		}
 
-		if (register_all_submodule_odb_as_alternates())
+		if (r == the_repository &&
+		    register_all_submodule_odb_as_alternates())
 			/* We added some alternates; retry */
 			continue;
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v3 6/7] submodule: pass repo to check_has_commit()
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
                     ` (4 preceding siblings ...)
  2021-09-29 23:06   ` [PATCH v3 5/7] object-file: only register submodule ODB if needed Jonathan Tan
@ 2021-09-29 23:06   ` Jonathan Tan
  2021-09-29 23:06   ` [PATCH v3 7/7] submodule: trace adding submodule ODB as alternate Jonathan Tan
  2021-10-07 18:32   ` [PATCH v3 0/7] No more " Josh Steadmon
  7 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-09-29 23:06 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Pass the repo explicitly when calling check_has_commit() to avoid
relying on add_submodule_odb(). With this commit and the parent commit,
the last remaining tests no longer rely on add_submodule_odb(), so mark
these tests accordingly.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 submodule.c                            | 16 +++++++++++++---
 t/t5526-fetch-submodules.sh            |  3 +++
 t/t5531-deep-submodule-push.sh         |  3 +++
 t/t5545-push-options.sh                |  3 +++
 t/t5572-pull-submodule.sh              |  3 +++
 t/t7418-submodule-sparse-gitmodules.sh |  3 +++
 6 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index 62beb8fd5f..4bf552b0e5 100644
--- a/submodule.c
+++ b/submodule.c
@@ -928,23 +928,33 @@ struct has_commit_data {
 static int check_has_commit(const struct object_id *oid, void *data)
 {
 	struct has_commit_data *cb = data;
+	struct repository subrepo;
+	enum object_type type;
 
-	enum object_type type = oid_object_info(cb->repo, oid, NULL);
+	if (repo_submodule_init(&subrepo, cb->repo, cb->path, null_oid())) {
+		cb->result = 0;
+		goto cleanup;
+	}
+
+	type = oid_object_info(&subrepo, oid, NULL);
 
 	switch (type) {
 	case OBJ_COMMIT:
-		return 0;
+		goto cleanup;
 	case OBJ_BAD:
 		/*
 		 * Object is missing or invalid. If invalid, an error message
 		 * has already been printed.
 		 */
 		cb->result = 0;
-		return 0;
+		goto cleanup;
 	default:
 		die(_("submodule entry '%s' (%s) is a %s, not a commit"),
 		    cb->path, oid_to_hex(oid), type_name(type));
 	}
+cleanup:
+	repo_clear(&subrepo);
+	return 0;
 }
 
 static int submodule_has_commits(struct repository *r,
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index ed11569d8d..2dc75b80db 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -6,6 +6,9 @@ test_description='Recursive "git fetch" for submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 pwd=$(pwd)
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index d573ca496a..3f58b515ce 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -5,6 +5,9 @@ test_description='test push with submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 58c7add7ee..214228349a 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -5,6 +5,9 @@ test_description='pushing to a repository using push options'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 mk_repo_pair () {
diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
index 4f92a116e1..fa6b4cca65 100755
--- a/t/t5572-pull-submodule.sh
+++ b/t/t5572-pull-submodule.sh
@@ -2,6 +2,9 @@
 
 test_description='pull can handle submodules'
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-submodule-update.sh
 
diff --git a/t/t7418-submodule-sparse-gitmodules.sh b/t/t7418-submodule-sparse-gitmodules.sh
index 3f7f271883..f87e524d6d 100755
--- a/t/t7418-submodule-sparse-gitmodules.sh
+++ b/t/t7418-submodule-sparse-gitmodules.sh
@@ -12,6 +12,9 @@ The test setup uses a sparse checkout, however the same scenario can be set up
 also by committing .gitmodules and then just removing it from the filesystem.
 '
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 test_expect_success 'sparse checkout setup which hides .gitmodules' '
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v3 7/7] submodule: trace adding submodule ODB as alternate
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
                     ` (5 preceding siblings ...)
  2021-09-29 23:06   ` [PATCH v3 6/7] submodule: pass repo to check_has_commit() Jonathan Tan
@ 2021-09-29 23:06   ` Jonathan Tan
  2021-10-07 18:34     ` Josh Steadmon
  2021-10-07 18:32   ` [PATCH v3 0/7] No more " Josh Steadmon
  7 siblings, 1 reply; 65+ messages in thread
From: Jonathan Tan @ 2021-09-29 23:06 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Submodule ODBs are never added as alternates during the execution of the
test suite, but there may be a rare interaction that the test suite does
not have coverage of. Add a trace message when this happens, so that
users who trace their commands can notice such occurrences.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 submodule.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/submodule.c b/submodule.c
index 4bf552b0e5..61575e5a56 100644
--- a/submodule.c
+++ b/submodule.c
@@ -201,6 +201,8 @@ int register_all_submodule_odb_as_alternates(void)
 		add_to_alternates_memory(added_submodule_odb_paths.items[i].string);
 	if (ret) {
 		string_list_clear(&added_submodule_odb_paths, 0);
+		trace2_data_intmax("submodule", the_repository,
+				   "register_all_submodule_odb_as_alternates/registered", ret);
 		if (git_env_bool("GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB", 0))
 			BUG("register_all_submodule_odb_as_alternates() called");
 	}
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH] fixup! refs: plumb repo into ref stores
  2021-09-29 23:06   ` [PATCH v3 1/7] refs: plumb repo into ref stores Jonathan Tan
@ 2021-09-30 11:13     ` Carlo Marcelo Arenas Belón
  2021-10-06 17:42     ` Glen Choo
  2021-10-07 18:33     ` [PATCH v3 1/7] " Josh Steadmon
  2 siblings, 0 replies; 65+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2021-09-30 11:13 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Carlo Marcelo Arenas Belón

fails hdr-check in CI[1]

[1] https://github.com/carenas/git/runs/3754992076
---
 refs/packed-backend.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/refs/packed-backend.h b/refs/packed-backend.h
index 942c908771..f61a73ec25 100644
--- a/refs/packed-backend.h
+++ b/refs/packed-backend.h
@@ -1,6 +1,7 @@
 #ifndef REFS_PACKED_BACKEND_H
 #define REFS_PACKED_BACKEND_H
 
+struct repository;
 struct ref_transaction;
 
 /*
-- 
2.33.0.955.gee03ddbf0e


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH] fixup! refs: plumb repo into ref stores
  2021-09-29 23:06   ` [PATCH v3 1/7] refs: plumb repo into ref stores Jonathan Tan
  2021-09-30 11:13     ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
@ 2021-10-06 17:42     ` Glen Choo
  2021-10-08 20:05       ` Jonathan Tan
  2021-10-08 20:07       ` Jonathan Tan
  2021-10-07 18:33     ` [PATCH v3 1/7] " Josh Steadmon
  2 siblings, 2 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-06 17:42 UTC (permalink / raw)
  To: git; +Cc: Glen Choo

If we are plumbing repo into ref stores, it makes sense to get rid of
the_repository in refs/files-backend.c and use ref_store.repo instead.

Signed-off-by: Glen Choo <chooglen@google.com>
---
In [1], I made some changes to refs/files-backend.c to get rid of
the_repository and accept struct repository as a parameter instead. But,
if we're changing ref stores to contain their own repository, it makes
sense to use this new interface.

I think the most natural place for this is this series. Let me know what
you think :)

[1] https://lore.kernel.org/git/20210921232529.81811-2-chooglen@google.com/

 refs/files-backend.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 9d50fc91f8..0358268aba 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1347,7 +1347,7 @@ static int rename_tmp_log(struct files_ref_store *refs, const char *newrefname)
 	return ret;
 }
 
-static int write_ref_to_lockfile(struct ref_lock *lock,
+static int write_ref_to_lockfile(struct repository *repo, struct ref_lock *lock,
 				 const struct object_id *oid, struct strbuf *err);
 static int commit_ref_update(struct files_ref_store *refs,
 			     struct ref_lock *lock,
@@ -1465,7 +1465,7 @@ static int files_copy_or_rename_ref(struct ref_store *ref_store,
 	}
 	oidcpy(&lock->old_oid, &orig_oid);
 
-	if (write_ref_to_lockfile(lock, &orig_oid, &err) ||
+	if (write_ref_to_lockfile(ref_store->repo, lock, &orig_oid, &err) ||
 	    commit_ref_update(refs, lock, &orig_oid, logmsg, &err)) {
 		error("unable to write current sha1 into %s: %s", newrefname, err.buf);
 		strbuf_release(&err);
@@ -1485,7 +1485,7 @@ static int files_copy_or_rename_ref(struct ref_store *ref_store,
 
 	flag = log_all_ref_updates;
 	log_all_ref_updates = LOG_REFS_NONE;
-	if (write_ref_to_lockfile(lock, &orig_oid, &err) ||
+	if (write_ref_to_lockfile(ref_store->repo, lock, &orig_oid, &err) ||
 	    commit_ref_update(refs, lock, &orig_oid, NULL, &err)) {
 		error("unable to write current sha1 into %s: %s", oldrefname, err.buf);
 		strbuf_release(&err);
@@ -1720,14 +1720,14 @@ static int files_log_ref_write(struct files_ref_store *refs,
  * Write oid into the open lockfile, then close the lockfile. On
  * errors, rollback the lockfile, fill in *err and return -1.
  */
-static int write_ref_to_lockfile(struct ref_lock *lock,
+static int write_ref_to_lockfile(struct repository *repo, struct ref_lock *lock,
 				 const struct object_id *oid, struct strbuf *err)
 {
 	static char term = '\n';
 	struct object *o;
 	int fd;
 
-	o = parse_object(the_repository, oid);
+	o = parse_object(repo, oid);
 	if (!o) {
 		strbuf_addf(err,
 			    "trying to write ref '%s' with nonexistent object %s",
@@ -2531,7 +2531,8 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 			 * The reference already has the desired
 			 * value, so we don't need to write it.
 			 */
-		} else if (write_ref_to_lockfile(lock, &update->new_oid,
+		} else if (write_ref_to_lockfile(refs->base.repo, lock,
+						 &update->new_oid,
 						 err)) {
 			char *write_err = strbuf_detach(err, NULL);
 
-- 
2.33.0.800.g4c38ced690-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 0/7] No more adding submodule ODB as alternate
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
                     ` (6 preceding siblings ...)
  2021-09-29 23:06   ` [PATCH v3 7/7] submodule: trace adding submodule ODB as alternate Jonathan Tan
@ 2021-10-07 18:32   ` Josh Steadmon
  7 siblings, 0 replies; 65+ messages in thread
From: Josh Steadmon @ 2021-10-07 18:32 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git



On 2021.09.29 16:06, Jonathan Tan wrote:
> This is on a merge of jk/ref-paranoia and jt/add-submodule-odb-clean-up
> (same as v2).
> 
> Here's the same patch set except that the repo is plumbed into the ref
> stores. (Iterators currently do not have any reference to their ref
> stores, so some of them still need repo fields. But because the ref
> stores now know their repos, calling code does not need to pass a repo
> when these iterators are instantiated.)
> 
> As you can see from the shorter patch list, this eliminates the need for
> some patches.
> 
> Jonathan Tan (7):
>   refs: plumb repo into ref stores
>   refs: teach arbitrary repo support to iterators
>   refs: peeling non-the_repository iterators is BUG
>   merge-{ort,recursive}: remove add_submodule_odb()
>   object-file: only register submodule ODB if needed
>   submodule: pass repo to check_has_commit()
>   submodule: trace adding submodule ODB as alternate

This is a summary of discussion from our Review Club meeting today;
feedback comes from Jonathan Nieder, Emily Shaffer, Glen Choo, and
me.

(Side note for the list: we're trying to be more open with our Review
Club meetings. If you're interested in having video-chat discussions of
patch series with us every other week, please get in touch with me.)

Summary: this series looks good to me, with a few nits (replied to the
patches directly for these). So, once those are addressed, this series
has my
Reviewed-by: Josh Steadmon <steadmon@google.com>

Eliminating submodules as alternates has (at least) a couple of nice
benefits, namely:
* we don't have to scan through the alternates one by one looking
  for missing objects
* having submodule objects show up as alternates makes it tricky to
  support partial clones of submodules, since the partial clone
  machinery doesn't know to talk to the submodule remote to find its
  objects

This series builds on the approach in
jt/grep-wo-submodule-odb-as-alternate, which recently graduated to
master. Specifically, we plumb `struct repository` pointers to various
places that previously relied on `the_repository`, then we use
GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB to show that no code paths in the
test suite still rely on registering submodules as alternates.

Specifically to this series, we're adding struct repository pointers to
`struct ref_store` and various iterators. This seems reasonable as we
don't keep a lot of these structs around, so the additional memory usage
isn't much of a concern.

Thanks for this series, really glad to see the end of submodule
alternates!

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 1/7] refs: plumb repo into ref stores
  2021-09-29 23:06   ` [PATCH v3 1/7] refs: plumb repo into ref stores Jonathan Tan
  2021-09-30 11:13     ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
  2021-10-06 17:42     ` Glen Choo
@ 2021-10-07 18:33     ` Josh Steadmon
  2021-10-08 20:08       ` Jonathan Tan
  2 siblings, 1 reply; 65+ messages in thread
From: Josh Steadmon @ 2021-10-07 18:33 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On 2021.09.29 16:06, Jonathan Tan wrote:
> In preparation for the next 2 patches that adds (partial) support for
> arbitrary repositories to ref iterators, plumb a repository into all ref
> stores. There are no changes to program logic.
> 
> (The repository is plumbed into the ref stores instead of directly into
> the ref iterators themselves, so that existing code that operates on ref
> stores do not need to be modified to also handle repositories.)

The second paragraph is a bit confusing, as in patch 2 we do end up
adding repository pointers into various iterators. Could you clarify
this a bit?

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 4/7] merge-{ort,recursive}: remove add_submodule_odb()
  2021-09-29 23:06   ` [PATCH v3 4/7] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
@ 2021-10-07 18:34     ` Josh Steadmon
  2021-10-08 20:19       ` Jonathan Tan
  0 siblings, 1 reply; 65+ messages in thread
From: Josh Steadmon @ 2021-10-07 18:34 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On 2021.09.29 16:06, Jonathan Tan wrote:
> After the parent commit and some of its ancestors, the only place
> commits are being accessed through alternates is in the user-facing
> message formatting code. Fix those, and remove the add_submodule_odb()
> calls.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  merge-ort.c                | 18 ++++-------------
>  merge-recursive.c          | 41 +++++++++++++++++++-------------------
>  strbuf.c                   | 12 ++++++++---
>  strbuf.h                   |  6 ++++--
>  t/t6437-submodule-merge.sh |  3 +++
>  5 files changed, 40 insertions(+), 40 deletions(-)
> 
> diff --git a/strbuf.c b/strbuf.c
> index c8a5789694..b22e981655 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -1059,15 +1059,21 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
>  	strbuf_setlen(sb, sb->len + len);
>  }
>  
> -void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
> -			      int abbrev_len)
> +void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
> +				   const struct object_id *oid, int abbrev_len)
>  {
>  	int r;
>  	strbuf_grow(sb, GIT_MAX_HEXSZ + 1);
> -	r = find_unique_abbrev_r(sb->buf + sb->len, oid, abbrev_len);
> +	r = repo_find_unique_abbrev_r(repo, sb->buf + sb->len, oid, abbrev_len);
>  	strbuf_setlen(sb, sb->len + r);
>  }
>  
> +void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
> +			      int abbrev_len)
> +{
> +	strbuf_repo_add_unique_abbrev(sb, the_repository, oid, abbrev_len);
> +}
> +

Should strbuf_add_unique_abbrev() be inlined and moved to the header?

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 5/7] object-file: only register submodule ODB if needed
  2021-09-29 23:06   ` [PATCH v3 5/7] object-file: only register submodule ODB if needed Jonathan Tan
@ 2021-10-07 18:34     ` Josh Steadmon
  2021-10-08 20:22       ` Jonathan Tan
  0 siblings, 1 reply; 65+ messages in thread
From: Josh Steadmon @ 2021-10-07 18:34 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On 2021.09.29 16:06, Jonathan Tan wrote:
> In a35e03dee0 ("submodule: lazily add submodule ODBs as alternates",
> 2021-09-08), Git was taught to add all known submodule ODBs as
> alternates when attempting to read an object that doesn't exist, as a
> fallback for when a submodule object is read as if it were in
> the_repository. However, this behavior wasn't restricted to happen only
> when reading from the_repository. Fix this.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  object-file.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/object-file.c b/object-file.c
> index be4f94ecf3..2b988b7c36 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1614,7 +1614,8 @@ static int do_oid_object_info_extended(struct repository *r,
>  				break;
>  		}
>  
> -		if (register_all_submodule_odb_as_alternates())
> +		if (r == the_repository &&
> +		    register_all_submodule_odb_as_alternates())
>  			/* We added some alternates; retry */
>  			continue;
>  
> -- 
> 2.33.0.685.g46640cef36-goog
> 

It looks like this is just a small bugfix, but can you expand on the
implications here? What happens if r != the_repository?

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 7/7] submodule: trace adding submodule ODB as alternate
  2021-09-29 23:06   ` [PATCH v3 7/7] submodule: trace adding submodule ODB as alternate Jonathan Tan
@ 2021-10-07 18:34     ` Josh Steadmon
  2021-10-08 20:23       ` Jonathan Tan
  0 siblings, 1 reply; 65+ messages in thread
From: Josh Steadmon @ 2021-10-07 18:34 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On 2021.09.29 16:06, Jonathan Tan wrote:
> Submodule ODBs are never added as alternates during the execution of the
> test suite, but there may be a rare interaction that the test suite does
> not have coverage of. Add a trace message when this happens, so that
> users who trace their commands can notice such occurrences.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  submodule.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/submodule.c b/submodule.c
> index 4bf552b0e5..61575e5a56 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -201,6 +201,8 @@ int register_all_submodule_odb_as_alternates(void)
>  		add_to_alternates_memory(added_submodule_odb_paths.items[i].string);
>  	if (ret) {
>  		string_list_clear(&added_submodule_odb_paths, 0);
> +		trace2_data_intmax("submodule", the_repository,
> +				   "register_all_submodule_odb_as_alternates/registered", ret);
>  		if (git_env_bool("GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB", 0))
>  			BUG("register_all_submodule_odb_as_alternates() called");
>  	}
> -- 
> 2.33.0.685.g46640cef36-goog
> 

Can you also update the GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB note in
t/README about tracing in this case?

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 2/7] refs: teach arbitrary repo support to iterators
  2021-09-29 23:06   ` [PATCH v3 2/7] refs: teach arbitrary repo support to iterators Jonathan Tan
@ 2021-10-07 19:31     ` Glen Choo
  2021-10-08 20:12       ` Jonathan Tan
  0 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-10-07 19:31 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: Jonathan Tan

> @@ -1139,7 +1142,7 @@ static int should_pack_ref(const char *refname,
>  		return 0;
>  
>  	/* Do not pack broken refs: */
> -	if (!ref_resolves_to_object(refname, oid, ref_flags))
> +	if (!ref_resolves_to_object(refname, the_repository, oid, ref_flags))
>  		return 0;

Nit: It would be nice to have a NEEDSWORK the explain why the_repository is
hardcoded instead of the commit message. Without having something in the
code, it is a little surprising to see the_repository when we are so
close to removing the_repository from refs/ altogether.

This in the spirit of your check in patch 3, where we explicitly warn
readers when non-the_repository is not supported.

  +	if (iter->repo != the_repository)
  +		BUG("peeling for non-the_repository is not supported");


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] fixup! refs: plumb repo into ref stores
  2021-10-06 17:42     ` Glen Choo
@ 2021-10-08 20:05       ` Jonathan Tan
  2021-10-08 20:07       ` Jonathan Tan
  1 sibling, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 20:05 UTC (permalink / raw)
  To: chooglen; +Cc: git, Jonathan Tan

> If we are plumbing repo into ref stores, it makes sense to get rid of
> the_repository in refs/files-backend.c and use ref_store.repo instead.

Doing that would mean changing some functions (e.g. should_pack_ref())
to take a ref store or repository parameter, and as it is, I'm not sure
if there are still implicit references to the_repository. I think that
all these can be done in a separate patch set.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] fixup! refs: plumb repo into ref stores
  2021-10-06 17:42     ` Glen Choo
  2021-10-08 20:05       ` Jonathan Tan
@ 2021-10-08 20:07       ` Jonathan Tan
  1 sibling, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 20:07 UTC (permalink / raw)
  To: chooglen; +Cc: git, Jonathan Tan

> If we are plumbing repo into ref stores, it makes sense to get rid of
> the_repository in refs/files-backend.c and use ref_store.repo instead.
> 
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
> In [1], I made some changes to refs/files-backend.c to get rid of
> the_repository and accept struct repository as a parameter instead. But,
> if we're changing ref stores to contain their own repository, it makes
> sense to use this new interface.
> 
> I think the most natural place for this is this series. Let me know what
> you think :)
> 
> [1] https://lore.kernel.org/git/20210921232529.81811-2-chooglen@google.com/

[snip]

> @@ -1347,7 +1347,7 @@ static int rename_tmp_log(struct files_ref_store *refs, const char *newrefname)
>  	return ret;
>  }
>  
> -static int write_ref_to_lockfile(struct ref_lock *lock,
> +static int write_ref_to_lockfile(struct repository *repo, struct ref_lock *lock,

Ah sorry, I didn't see that you already did this. I don't think that
it's natural to do it here - it's probably better to do it in another
patch set that also verifies that there are no implicit references to
the_repository.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 1/7] refs: plumb repo into ref stores
  2021-10-07 18:33     ` [PATCH v3 1/7] " Josh Steadmon
@ 2021-10-08 20:08       ` Jonathan Tan
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 20:08 UTC (permalink / raw)
  To: steadmon; +Cc: jonathantanmy, git

> On 2021.09.29 16:06, Jonathan Tan wrote:
> > In preparation for the next 2 patches that adds (partial) support for
> > arbitrary repositories to ref iterators, plumb a repository into all ref
> > stores. There are no changes to program logic.
> > 
> > (The repository is plumbed into the ref stores instead of directly into
> > the ref iterators themselves, so that existing code that operates on ref
> > stores do not need to be modified to also handle repositories.)
> 
> The second paragraph is a bit confusing, as in patch 2 we do end up
> adding repository pointers into various iterators. Could you clarify
> this a bit?

Ah, good catch. I don't think the paragraph is adding much, so I'll just
remove it.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 2/7] refs: teach arbitrary repo support to iterators
  2021-10-07 19:31     ` Glen Choo
@ 2021-10-08 20:12       ` Jonathan Tan
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 20:12 UTC (permalink / raw)
  To: chooglen; +Cc: jonathantanmy, git

> > @@ -1139,7 +1142,7 @@ static int should_pack_ref(const char *refname,
> >  		return 0;
> >  
> >  	/* Do not pack broken refs: */
> > -	if (!ref_resolves_to_object(refname, oid, ref_flags))
> > +	if (!ref_resolves_to_object(refname, the_repository, oid, ref_flags))
> >  		return 0;
> 
> Nit: It would be nice to have a NEEDSWORK the explain why the_repository is
> hardcoded instead of the commit message. Without having something in the
> code, it is a little surprising to see the_repository when we are so
> close to removing the_repository from refs/ altogether.
> 
> This in the spirit of your check in patch 3, where we explicitly warn
> readers when non-the_repository is not supported.
> 
>   +	if (iter->repo != the_repository)
>   +		BUG("peeling for non-the_repository is not supported");

This doesn't bring us closer or further from removing the_repository
from refs/ - originally, this call to ref_resolves_to_object()
implicitly referenced the_repository, and now it explicitly does, so
there's no increase or decrease in usage of the_repository. I think that
the true NEEDSWORK is that the entire ref writing system operates only
on the_repository, and this is probably not the place to note it.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 4/7] merge-{ort,recursive}: remove add_submodule_odb()
  2021-10-07 18:34     ` Josh Steadmon
@ 2021-10-08 20:19       ` Jonathan Tan
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 20:19 UTC (permalink / raw)
  To: steadmon; +Cc: jonathantanmy, git

> > +void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
> > +			      int abbrev_len)
> > +{
> > +	strbuf_repo_add_unique_abbrev(sb, the_repository, oid, abbrev_len);
> > +}
> > +
> 
> Should strbuf_add_unique_abbrev() be inlined and moved to the header?

I felt that it wasn't worth writing '#include "repository.h"' in
strbuf.h just so that I could inline it. (The function signature just
uses "struct repository" opaquely so "struct repository;" is fine, but
the function definition itself will require full information about
the_repository so we would need to include the file.)

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 5/7] object-file: only register submodule ODB if needed
  2021-10-07 18:34     ` Josh Steadmon
@ 2021-10-08 20:22       ` Jonathan Tan
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 20:22 UTC (permalink / raw)
  To: steadmon; +Cc: jonathantanmy, git

> > @@ -1614,7 +1614,8 @@ static int do_oid_object_info_extended(struct repository *r,
> >  				break;
> >  		}
> >  
> > -		if (register_all_submodule_odb_as_alternates())
> > +		if (r == the_repository &&
> > +		    register_all_submodule_odb_as_alternates())
> >  			/* We added some alternates; retry */
> >  			continue;
> >  
> > -- 
> > 2.33.0.685.g46640cef36-goog
> > 
> 
> It looks like this is just a small bugfix, but can you expand on the
> implications here? What happens if r != the_repository?

The purpose of the quoted block (before and after the check added in
this patch) is so that reading submodule objects by reading them as if
they were in the_repository works, and it follows that this is only
necessary if we pass the_repository (because we are reading them as if
they were in the_repository). I'll add a comment explaining this.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 7/7] submodule: trace adding submodule ODB as alternate
  2021-10-07 18:34     ` Josh Steadmon
@ 2021-10-08 20:23       ` Jonathan Tan
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 20:23 UTC (permalink / raw)
  To: steadmon; +Cc: jonathantanmy, git

> On 2021.09.29 16:06, Jonathan Tan wrote:
> > Submodule ODBs are never added as alternates during the execution of the
> > test suite, but there may be a rare interaction that the test suite does
> > not have coverage of. Add a trace message when this happens, so that
> > users who trace their commands can notice such occurrences.
> > 
> > Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> > ---
> >  submodule.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/submodule.c b/submodule.c
> > index 4bf552b0e5..61575e5a56 100644
> > --- a/submodule.c
> > +++ b/submodule.c
> > @@ -201,6 +201,8 @@ int register_all_submodule_odb_as_alternates(void)
> >  		add_to_alternates_memory(added_submodule_odb_paths.items[i].string);
> >  	if (ret) {
> >  		string_list_clear(&added_submodule_odb_paths, 0);
> > +		trace2_data_intmax("submodule", the_repository,
> > +				   "register_all_submodule_odb_as_alternates/registered", ret);
> >  		if (git_env_bool("GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB", 0))
> >  			BUG("register_all_submodule_odb_as_alternates() called");
> >  	}
> > -- 
> > 2.33.0.685.g46640cef36-goog
> > 
> 
> Can you also update the GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB note in
> t/README about tracing in this case?

Good catch - I'll do that.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 0/7] No more adding submodule ODB as alternate
  2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
                   ` (11 preceding siblings ...)
  2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
@ 2021-10-08 21:08 ` Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 1/7] refs: plumb repo into ref stores Jonathan Tan
                     ` (8 more replies)
  12 siblings, 9 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, carenas, chooglen, steadmon

Thanks everyone for your reviews. Here's an updated patch set, including
Carlo's fixup squashed.

Jonathan Tan (7):
  refs: plumb repo into ref stores
  refs: teach arbitrary repo support to iterators
  refs: peeling non-the_repository iterators is BUG
  merge-{ort,recursive}: remove add_submodule_odb()
  object-file: only register submodule ODB if needed
  submodule: pass repo to check_has_commit()
  submodule: trace adding submodule ODB as alternate

 merge-ort.c                            | 18 +++--------
 merge-recursive.c                      | 41 +++++++++++++-------------
 object-file.c                          |  9 +++++-
 refs.c                                 | 32 +++++++++++++++-----
 refs/files-backend.c                   | 16 ++++++----
 refs/packed-backend.c                  | 13 ++++++--
 refs/packed-backend.h                  |  4 ++-
 refs/ref-cache.c                       | 10 +++++++
 refs/ref-cache.h                       |  1 +
 refs/refs-internal.h                   | 11 +++++--
 strbuf.c                               | 12 ++++++--
 strbuf.h                               |  6 ++--
 submodule.c                            | 18 +++++++++--
 t/README                               |  7 ++---
 t/t5526-fetch-submodules.sh            |  3 ++
 t/t5531-deep-submodule-push.sh         |  3 ++
 t/t5545-push-options.sh                |  3 ++
 t/t5572-pull-submodule.sh              |  3 ++
 t/t6437-submodule-merge.sh             |  3 ++
 t/t7418-submodule-sparse-gitmodules.sh |  3 ++
 20 files changed, 148 insertions(+), 68 deletions(-)

Range-diff against v3:
1:  878c4dd288 ! 1:  f050191d4c refs: plumb repo into ref stores
    @@ Commit message
         arbitrary repositories to ref iterators, plumb a repository into all ref
         stores. There are no changes to program logic.
     
    -    (The repository is plumbed into the ref stores instead of directly into
    -    the ref iterators themselves, so that existing code that operates on ref
    -    stores do not need to be modified to also handle repositories.)
    -
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    @@ refs/packed-backend.c: static int release_snapshot(struct snapshot *snapshot)
      
     
      ## refs/packed-backend.h ##
    +@@
    + #ifndef REFS_PACKED_BACKEND_H
    + #define REFS_PACKED_BACKEND_H
    + 
    ++struct repository;
    + struct ref_transaction;
    + 
    + /*
     @@ refs/packed-backend.h: struct ref_transaction;
       * even among packed refs.
       */
2:  7180f622b1 = 2:  6418256919 refs: teach arbitrary repo support to iterators
3:  1a2e2e3e08 = 3:  d624c198d6 refs: peeling non-the_repository iterators is BUG
4:  89347503af = 4:  f3df7a31cb merge-{ort,recursive}: remove add_submodule_odb()
5:  17d6c0a793 ! 5:  78473b0f89 object-file: only register submodule ODB if needed
    @@ object-file.c: static int do_oid_object_info_extended(struct repository *r,
      		}
      
     -		if (register_all_submodule_odb_as_alternates())
    ++		/*
    ++		 * If r is the_repository, this might be an attempt at
    ++		 * accessing a submodule object as if it were in the_repository
    ++		 * (having called add_submodule_odb() on that submodule's ODB).
    ++		 * If any such ODBs exist, register them and try again.
    ++		 */
     +		if (r == the_repository &&
     +		    register_all_submodule_odb_as_alternates())
      			/* We added some alternates; retry */
6:  1eb2dda2dc = 6:  f4241ea2e7 submodule: pass repo to check_has_commit()
7:  36e741dda8 ! 7:  8922bf48a2 submodule: trace adding submodule ODB as alternate
    @@ submodule.c: int register_all_submodule_odb_as_alternates(void)
      		if (git_env_bool("GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB", 0))
      			BUG("register_all_submodule_odb_as_alternates() called");
      	}
    +
    + ## t/README ##
    +@@ t/README: GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=<boolean>, when true, makes
    + registering submodule ODBs as alternates a fatal action. Support for
    + this environment variable can be removed once the migration to
    + explicitly providing repositories when accessing submodule objects is
    +-complete (in which case we might want to replace this with a trace2
    +-call so that users can make it visible if accessing submodule objects
    +-without an explicit repository still happens) or needs to be abandoned
    +-for whatever reason (in which case the migrated codepaths still retain
    +-their performance benefits).
    ++complete or needs to be abandoned for whatever reason (in which case the
    ++migrated codepaths still retain their performance benefits).
    + 
    + Naming Tests
    + ------------
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 1/7] refs: plumb repo into ref stores
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
@ 2021-10-08 21:08   ` Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 2/7] refs: teach arbitrary repo support to iterators Jonathan Tan
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, carenas, chooglen, steadmon, Junio C Hamano

In preparation for the next 2 patches that adds (partial) support for
arbitrary repositories to ref iterators, plumb a repository into all ref
stores. There are no changes to program logic.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.c                | 29 ++++++++++++++++++++++-------
 refs/files-backend.c  |  6 ++++--
 refs/packed-backend.c |  4 +++-
 refs/packed-backend.h |  4 +++-
 refs/refs-internal.h  | 10 ++++++++--
 5 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/refs.c b/refs.c
index 2be0d0f057..9c4e388153 100644
--- a/refs.c
+++ b/refs.c
@@ -1873,7 +1873,8 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
  * Create, record, and return a ref_store instance for the specified
  * gitdir.
  */
-static struct ref_store *ref_store_init(const char *gitdir,
+static struct ref_store *ref_store_init(struct repository *repo,
+					const char *gitdir,
 					unsigned int flags)
 {
 	const char *be_name = "files";
@@ -1883,7 +1884,7 @@ static struct ref_store *ref_store_init(const char *gitdir,
 	if (!be)
 		BUG("reference backend %s is unknown", be_name);
 
-	refs = be->init(gitdir, flags);
+	refs = be->init(repo, gitdir, flags);
 	return refs;
 }
 
@@ -1895,7 +1896,7 @@ struct ref_store *get_main_ref_store(struct repository *r)
 	if (!r->gitdir)
 		BUG("attempting to get main_ref_store outside of repository");
 
-	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
+	r->refs_private = ref_store_init(r, r->gitdir, REF_STORE_ALL_CAPS);
 	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
 	return r->refs_private;
 }
@@ -1925,6 +1926,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 	struct ref_store *refs;
 	char *to_free = NULL;
 	size_t len;
+	struct repository *subrepo;
 
 	if (!submodule)
 		return NULL;
@@ -1950,8 +1952,19 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 	if (submodule_to_gitdir(&submodule_sb, submodule))
 		goto done;
 
-	/* assume that add_submodule_odb() has been called */
-	refs = ref_store_init(submodule_sb.buf,
+	subrepo = xmalloc(sizeof(*subrepo));
+	/*
+	 * NEEDSWORK: Make get_submodule_ref_store() work with arbitrary
+	 * superprojects other than the_repository. This probably should be
+	 * done by making it take a struct repository * parameter instead of a
+	 * submodule path.
+	 */
+	if (repo_submodule_init(subrepo, the_repository, submodule,
+				null_oid())) {
+		free(subrepo);
+		goto done;
+	}
+	refs = ref_store_init(subrepo, submodule_sb.buf,
 			      REF_STORE_READ | REF_STORE_ODB);
 	register_ref_store_map(&submodule_ref_stores, "submodule",
 			       refs, submodule);
@@ -1977,10 +1990,12 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 		return refs;
 
 	if (wt->id)
-		refs = ref_store_init(git_common_path("worktrees/%s", wt->id),
+		refs = ref_store_init(the_repository,
+				      git_common_path("worktrees/%s", wt->id),
 				      REF_STORE_ALL_CAPS);
 	else
-		refs = ref_store_init(get_git_common_dir(),
+		refs = ref_store_init(the_repository,
+				      get_git_common_dir(),
 				      REF_STORE_ALL_CAPS);
 
 	if (refs)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 1148c0cf09..6a481e968f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -79,13 +79,15 @@ static void clear_loose_ref_cache(struct files_ref_store *refs)
  * Create a new submodule ref cache and add it to the internal
  * set of caches.
  */
-static struct ref_store *files_ref_store_create(const char *gitdir,
+static struct ref_store *files_ref_store_create(struct repository *repo,
+						const char *gitdir,
 						unsigned int flags)
 {
 	struct files_ref_store *refs = xcalloc(1, sizeof(*refs));
 	struct ref_store *ref_store = (struct ref_store *)refs;
 	struct strbuf sb = STRBUF_INIT;
 
+	ref_store->repo = repo;
 	ref_store->gitdir = xstrdup(gitdir);
 	base_ref_store_init(ref_store, &refs_be_files);
 	refs->store_flags = flags;
@@ -93,7 +95,7 @@ static struct ref_store *files_ref_store_create(const char *gitdir,
 	get_common_dir_noenv(&sb, gitdir);
 	refs->gitcommondir = strbuf_detach(&sb, NULL);
 	strbuf_addf(&sb, "%s/packed-refs", refs->gitcommondir);
-	refs->packed_ref_store = packed_ref_store_create(sb.buf, flags);
+	refs->packed_ref_store = packed_ref_store_create(repo, sb.buf, flags);
 	strbuf_release(&sb);
 
 	chdir_notify_reparent("files-backend $GIT_DIR", &refs->base.gitdir);
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f8aa97d799..ea3493b24e 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -193,13 +193,15 @@ static int release_snapshot(struct snapshot *snapshot)
 	}
 }
 
-struct ref_store *packed_ref_store_create(const char *path,
+struct ref_store *packed_ref_store_create(struct repository *repo,
+					  const char *path,
 					  unsigned int store_flags)
 {
 	struct packed_ref_store *refs = xcalloc(1, sizeof(*refs));
 	struct ref_store *ref_store = (struct ref_store *)refs;
 
 	base_ref_store_init(ref_store, &refs_be_packed);
+	ref_store->repo = repo;
 	ref_store->gitdir = xstrdup(path);
 	refs->store_flags = store_flags;
 
diff --git a/refs/packed-backend.h b/refs/packed-backend.h
index a01a0aff9c..f61a73ec25 100644
--- a/refs/packed-backend.h
+++ b/refs/packed-backend.h
@@ -1,6 +1,7 @@
 #ifndef REFS_PACKED_BACKEND_H
 #define REFS_PACKED_BACKEND_H
 
+struct repository;
 struct ref_transaction;
 
 /*
@@ -12,7 +13,8 @@ struct ref_transaction;
  * even among packed refs.
  */
 
-struct ref_store *packed_ref_store_create(const char *path,
+struct ref_store *packed_ref_store_create(struct repository *repo,
+					  const char *path,
 					  unsigned int store_flags);
 
 /*
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 96911fb26e..d28440c9cc 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -539,7 +539,8 @@ struct ref_store;
  * should call base_ref_store_init() to initialize the shared part of
  * the ref_store and to record the ref_store for later lookup.
  */
-typedef struct ref_store *ref_store_init_fn(const char *gitdir,
+typedef struct ref_store *ref_store_init_fn(struct repository *repo,
+					    const char *gitdir,
 					    unsigned int flags);
 
 typedef int ref_init_db_fn(struct ref_store *refs, struct strbuf *err);
@@ -697,7 +698,12 @@ struct ref_store {
 	/* The backend describing this ref_store's storage scheme: */
 	const struct ref_storage_be *be;
 
-	/* The gitdir that this ref_store applies to: */
+	struct repository *repo;
+
+	/*
+	 * The gitdir that this ref_store applies to. Note that this is not
+	 * necessarily repo->gitdir if the repo has multiple worktrees.
+	 */
 	char *gitdir;
 };
 
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 2/7] refs: teach arbitrary repo support to iterators
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 1/7] refs: plumb repo into ref stores Jonathan Tan
@ 2021-10-08 21:08   ` Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 3/7] refs: peeling non-the_repository iterators is BUG Jonathan Tan
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, carenas, chooglen, steadmon, Junio C Hamano

Note that should_pack_ref() is called when writing refs, which is only
supported for the_repository, hence the_repository is hardcoded there.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.c                | 3 ++-
 refs/files-backend.c  | 5 ++++-
 refs/packed-backend.c | 6 ++++--
 refs/refs-internal.h  | 1 +
 4 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 9c4e388153..c07aeff6f4 100644
--- a/refs.c
+++ b/refs.c
@@ -255,12 +255,13 @@ int refname_is_safe(const char *refname)
  * does not exist, emit a warning and return false.
  */
 int ref_resolves_to_object(const char *refname,
+			   struct repository *repo,
 			   const struct object_id *oid,
 			   unsigned int flags)
 {
 	if (flags & REF_ISBROKEN)
 		return 0;
-	if (!has_object_file(oid)) {
+	if (!repo_has_object_file(repo, oid)) {
 		error(_("%s does not point to a valid object!"), refname);
 		return 0;
 	}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 6a481e968f..3f213d24b0 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -732,6 +732,7 @@ struct files_ref_iterator {
 	struct ref_iterator base;
 
 	struct ref_iterator *iter0;
+	struct repository *repo;
 	unsigned int flags;
 };
 
@@ -753,6 +754,7 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
 
 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
 		    !ref_resolves_to_object(iter->iter0->refname,
+					    iter->repo,
 					    iter->iter0->oid,
 					    iter->iter0->flags))
 			continue;
@@ -855,6 +857,7 @@ static struct ref_iterator *files_ref_iterator_begin(
 	base_ref_iterator_init(ref_iterator, &files_ref_iterator_vtable,
 			       overlay_iter->ordered);
 	iter->iter0 = overlay_iter;
+	iter->repo = ref_store->repo;
 	iter->flags = flags;
 
 	return ref_iterator;
@@ -1139,7 +1142,7 @@ static int should_pack_ref(const char *refname,
 		return 0;
 
 	/* Do not pack broken refs: */
-	if (!ref_resolves_to_object(refname, oid, ref_flags))
+	if (!ref_resolves_to_object(refname, the_repository, oid, ref_flags))
 		return 0;
 
 	return 1;
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index ea3493b24e..63f78bbaea 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -778,6 +778,7 @@ struct packed_ref_iterator {
 	struct object_id oid, peeled;
 	struct strbuf refname_buf;
 
+	struct repository *repo;
 	unsigned int flags;
 };
 
@@ -866,8 +867,8 @@ static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
 			continue;
 
 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
-		    !ref_resolves_to_object(iter->base.refname, &iter->oid,
-					    iter->flags))
+		    !ref_resolves_to_object(iter->base.refname, iter->repo,
+					    &iter->oid, iter->flags))
 			continue;
 
 		return ITER_OK;
@@ -956,6 +957,7 @@ static struct ref_iterator *packed_ref_iterator_begin(
 
 	iter->base.oid = &iter->oid;
 
+	iter->repo = ref_store->repo;
 	iter->flags = flags;
 
 	if (prefix && *prefix)
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index d28440c9cc..500d77864d 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -66,6 +66,7 @@ int refname_is_safe(const char *refname);
  * referred-to object does not exist, emit a warning and return false.
  */
 int ref_resolves_to_object(const char *refname,
+			   struct repository *repo,
 			   const struct object_id *oid,
 			   unsigned int flags);
 
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 3/7] refs: peeling non-the_repository iterators is BUG
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 1/7] refs: plumb repo into ref stores Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 2/7] refs: teach arbitrary repo support to iterators Jonathan Tan
@ 2021-10-08 21:08   ` Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 4/7] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, carenas, chooglen, steadmon, Junio C Hamano

There is currently no support for peeling the current ref of an iterator
iterating over a non-the_repository ref store, and none is needed. Thus,
for now, BUG() if that happens.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs/files-backend.c  |  5 +++--
 refs/packed-backend.c |  3 +++
 refs/ref-cache.c      | 10 ++++++++++
 refs/ref-cache.h      |  1 +
 4 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 3f213d24b0..8ee6ac2103 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -833,7 +833,7 @@ static struct ref_iterator *files_ref_iterator_begin(
 	 */
 
 	loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs),
-					      prefix, 1);
+					      prefix, ref_store->repo, 1);
 
 	/*
 	 * The packed-refs file might contain broken references, for
@@ -1165,7 +1165,8 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
 
 	packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR, &err);
 
-	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
+	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL,
+					the_repository, 0);
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 		/*
 		 * If the loose reference can be packed, add an entry
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 63f78bbaea..2161218719 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -886,6 +886,9 @@ static int packed_ref_iterator_peel(struct ref_iterator *ref_iterator,
 	struct packed_ref_iterator *iter =
 		(struct packed_ref_iterator *)ref_iterator;
 
+	if (iter->repo != the_repository)
+		BUG("peeling for non-the_repository is not supported");
+
 	if ((iter->base.flags & REF_KNOWS_PEELED)) {
 		oidcpy(peeled, &iter->peeled);
 		return is_null_oid(&iter->peeled) ? -1 : 0;
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 49d732f6db..97a6ac349e 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -435,6 +435,8 @@ struct cache_ref_iterator {
 	 * on from there.)
 	 */
 	struct cache_ref_iterator_level *levels;
+
+	struct repository *repo;
 };
 
 static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
@@ -491,6 +493,11 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
 static int cache_ref_iterator_peel(struct ref_iterator *ref_iterator,
 				   struct object_id *peeled)
 {
+	struct cache_ref_iterator *iter =
+		(struct cache_ref_iterator *)ref_iterator;
+
+	if (iter->repo != the_repository)
+		BUG("peeling for non-the_repository is not supported");
 	return peel_object(ref_iterator->oid, peeled) ? -1 : 0;
 }
 
@@ -513,6 +520,7 @@ static struct ref_iterator_vtable cache_ref_iterator_vtable = {
 
 struct ref_iterator *cache_ref_iterator_begin(struct ref_cache *cache,
 					      const char *prefix,
+					      struct repository *repo,
 					      int prime_dir)
 {
 	struct ref_dir *dir;
@@ -547,5 +555,7 @@ struct ref_iterator *cache_ref_iterator_begin(struct ref_cache *cache,
 		level->prefix_state = PREFIX_CONTAINS_DIR;
 	}
 
+	iter->repo = repo;
+
 	return ref_iterator;
 }
diff --git a/refs/ref-cache.h b/refs/ref-cache.h
index 3bfb89d2b3..7877bf86ed 100644
--- a/refs/ref-cache.h
+++ b/refs/ref-cache.h
@@ -238,6 +238,7 @@ struct ref_entry *find_ref_entry(struct ref_dir *dir, const char *refname);
  */
 struct ref_iterator *cache_ref_iterator_begin(struct ref_cache *cache,
 					      const char *prefix,
+					      struct repository *repo,
 					      int prime_dir);
 
 #endif /* REFS_REF_CACHE_H */
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 4/7] merge-{ort,recursive}: remove add_submodule_odb()
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
                     ` (2 preceding siblings ...)
  2021-10-08 21:08   ` [PATCH v4 3/7] refs: peeling non-the_repository iterators is BUG Jonathan Tan
@ 2021-10-08 21:08   ` Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 5/7] object-file: only register submodule ODB if needed Jonathan Tan
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, carenas, chooglen, steadmon, Junio C Hamano

After the parent commit and some of its ancestors, the only place
commits are being accessed through alternates is in the user-facing
message formatting code. Fix those, and remove the add_submodule_odb()
calls.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-ort.c                | 18 ++++-------------
 merge-recursive.c          | 41 +++++++++++++++++++-------------------
 strbuf.c                   | 12 ++++++++---
 strbuf.h                   |  6 ++++--
 t/t6437-submodule-merge.sh |  3 +++
 5 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index b88475475d..fbc5c204c1 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -609,6 +609,7 @@ static int err(struct merge_options *opt, const char *err, ...)
 
 static void format_commit(struct strbuf *sb,
 			  int indent,
+			  struct repository *repo,
 			  struct commit *commit)
 {
 	struct merge_remote_desc *desc;
@@ -622,7 +623,7 @@ static void format_commit(struct strbuf *sb,
 		return;
 	}
 
-	format_commit_message(commit, "%h %s", sb, &ctx);
+	repo_format_commit_message(repo, commit, "%h %s", sb, &ctx);
 	strbuf_addch(sb, '\n');
 }
 
@@ -1578,17 +1579,6 @@ static int merge_submodule(struct merge_options *opt,
 	if (is_null_oid(b))
 		return 0;
 
-	/*
-	 * NEEDSWORK: Remove this when all submodule object accesses are
-	 * through explicitly specified repositores.
-	 */
-	if (add_submodule_odb(path)) {
-		path_msg(opt, path, 0,
-			 _("Failed to merge submodule %s (not checked out)"),
-			 path);
-		return 0;
-	}
-
 	if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
 		path_msg(opt, path, 0,
 				_("Failed to merge submodule %s (not checked out)"),
@@ -1653,7 +1643,7 @@ static int merge_submodule(struct merge_options *opt,
 		break;
 
 	case 1:
-		format_commit(&sb, 4,
+		format_commit(&sb, 4, &subrepo,
 			      (struct commit *)merges.objects[0].item);
 		path_msg(opt, path, 0,
 			 _("Failed to merge submodule %s, but a possible merge "
@@ -1670,7 +1660,7 @@ static int merge_submodule(struct merge_options *opt,
 		break;
 	default:
 		for (i = 0; i < merges.nr; i++)
-			format_commit(&sb, 4,
+			format_commit(&sb, 4, &subrepo,
 				      (struct commit *)merges.objects[i].item);
 		path_msg(opt, path, 0,
 			 _("Failed to merge submodule %s, but multiple "
diff --git a/merge-recursive.c b/merge-recursive.c
index 5a2d8a60c0..80594153f1 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -334,7 +334,9 @@ static void output(struct merge_options *opt, int v, const char *fmt, ...)
 		flush_output(opt);
 }
 
-static void output_commit_title(struct merge_options *opt, struct commit *commit)
+static void repo_output_commit_title(struct merge_options *opt,
+				     struct repository *repo,
+				     struct commit *commit)
 {
 	struct merge_remote_desc *desc;
 
@@ -343,23 +345,29 @@ static void output_commit_title(struct merge_options *opt, struct commit *commit
 	if (desc)
 		strbuf_addf(&opt->obuf, "virtual %s\n", desc->name);
 	else {
-		strbuf_add_unique_abbrev(&opt->obuf, &commit->object.oid,
-					 DEFAULT_ABBREV);
+		strbuf_repo_add_unique_abbrev(&opt->obuf, repo,
+					      &commit->object.oid,
+					      DEFAULT_ABBREV);
 		strbuf_addch(&opt->obuf, ' ');
-		if (parse_commit(commit) != 0)
+		if (repo_parse_commit(repo, commit) != 0)
 			strbuf_addstr(&opt->obuf, _("(bad commit)\n"));
 		else {
 			const char *title;
-			const char *msg = get_commit_buffer(commit, NULL);
+			const char *msg = repo_get_commit_buffer(repo, commit, NULL);
 			int len = find_commit_subject(msg, &title);
 			if (len)
 				strbuf_addf(&opt->obuf, "%.*s\n", len, title);
-			unuse_commit_buffer(commit, msg);
+			repo_unuse_commit_buffer(repo, commit, msg);
 		}
 	}
 	flush_output(opt);
 }
 
+static void output_commit_title(struct merge_options *opt, struct commit *commit)
+{
+	repo_output_commit_title(opt, the_repository, commit);
+}
+
 static int add_cacheinfo(struct merge_options *opt,
 			 const struct diff_filespec *blob,
 			 const char *path, int stage, int refresh, int options)
@@ -1149,14 +1157,14 @@ static int find_first_merges(struct repository *repo,
 	return result->nr;
 }
 
-static void print_commit(struct commit *commit)
+static void print_commit(struct repository *repo, struct commit *commit)
 {
 	struct strbuf sb = STRBUF_INIT;
 	struct pretty_print_context ctx = {0};
 	ctx.date_mode.type = DATE_NORMAL;
 	/* FIXME: Merge this with output_commit_title() */
 	assert(!merge_remote_util(commit));
-	format_commit_message(commit, " %h: %m %s", &sb, &ctx);
+	repo_format_commit_message(repo, commit, " %h: %m %s", &sb, &ctx);
 	fprintf(stderr, "%s\n", sb.buf);
 	strbuf_release(&sb);
 }
@@ -1196,15 +1204,6 @@ static int merge_submodule(struct merge_options *opt,
 	if (is_null_oid(b))
 		return 0;
 
-	/*
-	 * NEEDSWORK: Remove this when all submodule object accesses are
-	 * through explicitly specified repositores.
-	 */
-	if (add_submodule_odb(path)) {
-		output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
-		return 0;
-	}
-
 	if (repo_submodule_init(&subrepo, opt->repo, path, null_oid())) {
 		output(opt, 1, _("Failed to merge submodule %s (not checked out)"), path);
 		return 0;
@@ -1229,7 +1228,7 @@ static int merge_submodule(struct merge_options *opt,
 		oidcpy(result, b);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
-			output_commit_title(opt, commit_b);
+			repo_output_commit_title(opt, &subrepo, commit_b);
 		} else if (show(opt, 2))
 			output(opt, 2, _("Fast-forwarding submodule %s"), path);
 		else
@@ -1242,7 +1241,7 @@ static int merge_submodule(struct merge_options *opt,
 		oidcpy(result, a);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
-			output_commit_title(opt, commit_a);
+			repo_output_commit_title(opt, &subrepo, commit_a);
 		} else if (show(opt, 2))
 			output(opt, 2, _("Fast-forwarding submodule %s"), path);
 		else
@@ -1274,7 +1273,7 @@ static int merge_submodule(struct merge_options *opt,
 	case 1:
 		output(opt, 1, _("Failed to merge submodule %s (not fast-forward)"), path);
 		output(opt, 2, _("Found a possible merge resolution for the submodule:\n"));
-		print_commit((struct commit *) merges.objects[0].item);
+		print_commit(&subrepo, (struct commit *) merges.objects[0].item);
 		output(opt, 2, _(
 		       "If this is correct simply add it to the index "
 		       "for example\n"
@@ -1287,7 +1286,7 @@ static int merge_submodule(struct merge_options *opt,
 	default:
 		output(opt, 1, _("Failed to merge submodule %s (multiple merges found)"), path);
 		for (i = 0; i < merges.nr; i++)
-			print_commit((struct commit *) merges.objects[i].item);
+			print_commit(&subrepo, (struct commit *) merges.objects[i].item);
 	}
 
 	object_array_clear(&merges);
diff --git a/strbuf.c b/strbuf.c
index c8a5789694..b22e981655 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1059,15 +1059,21 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
 	strbuf_setlen(sb, sb->len + len);
 }
 
-void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
-			      int abbrev_len)
+void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
+				   const struct object_id *oid, int abbrev_len)
 {
 	int r;
 	strbuf_grow(sb, GIT_MAX_HEXSZ + 1);
-	r = find_unique_abbrev_r(sb->buf + sb->len, oid, abbrev_len);
+	r = repo_find_unique_abbrev_r(repo, sb->buf + sb->len, oid, abbrev_len);
 	strbuf_setlen(sb, sb->len + r);
 }
 
+void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
+			      int abbrev_len)
+{
+	strbuf_repo_add_unique_abbrev(sb, the_repository, oid, abbrev_len);
+}
+
 /*
  * Returns the length of a line, without trailing spaces.
  *
diff --git a/strbuf.h b/strbuf.h
index 5b1113abf8..2d9e01c16f 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -634,8 +634,10 @@ void strbuf_list_free(struct strbuf **list);
  * Add the abbreviation, as generated by find_unique_abbrev, of `sha1` to
  * the strbuf `sb`.
  */
-void strbuf_add_unique_abbrev(struct strbuf *sb,
-			      const struct object_id *oid,
+struct repository;
+void strbuf_repo_add_unique_abbrev(struct strbuf *sb, struct repository *repo,
+				   const struct object_id *oid, int abbrev_len);
+void strbuf_add_unique_abbrev(struct strbuf *sb, const struct object_id *oid,
 			      int abbrev_len);
 
 /**
diff --git a/t/t6437-submodule-merge.sh b/t/t6437-submodule-merge.sh
index e5e89c2045..178413c22f 100755
--- a/t/t6437-submodule-merge.sh
+++ b/t/t6437-submodule-merge.sh
@@ -5,6 +5,9 @@ test_description='merging with submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 5/7] object-file: only register submodule ODB if needed
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
                     ` (3 preceding siblings ...)
  2021-10-08 21:08   ` [PATCH v4 4/7] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
@ 2021-10-08 21:08   ` Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 6/7] submodule: pass repo to check_has_commit() Jonathan Tan
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, carenas, chooglen, steadmon, Junio C Hamano

In a35e03dee0 ("submodule: lazily add submodule ODBs as alternates",
2021-09-08), Git was taught to add all known submodule ODBs as
alternates when attempting to read an object that doesn't exist, as a
fallback for when a submodule object is read as if it were in
the_repository. However, this behavior wasn't restricted to happen only
when reading from the_repository. Fix this.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index be4f94ecf3..0a1835fe30 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1614,7 +1614,14 @@ static int do_oid_object_info_extended(struct repository *r,
 				break;
 		}
 
-		if (register_all_submodule_odb_as_alternates())
+		/*
+		 * If r is the_repository, this might be an attempt at
+		 * accessing a submodule object as if it were in the_repository
+		 * (having called add_submodule_odb() on that submodule's ODB).
+		 * If any such ODBs exist, register them and try again.
+		 */
+		if (r == the_repository &&
+		    register_all_submodule_odb_as_alternates())
 			/* We added some alternates; retry */
 			continue;
 
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 6/7] submodule: pass repo to check_has_commit()
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
                     ` (4 preceding siblings ...)
  2021-10-08 21:08   ` [PATCH v4 5/7] object-file: only register submodule ODB if needed Jonathan Tan
@ 2021-10-08 21:08   ` Jonathan Tan
  2021-10-08 21:08   ` [PATCH v4 7/7] submodule: trace adding submodule ODB as alternate Jonathan Tan
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, carenas, chooglen, steadmon, Junio C Hamano

Pass the repo explicitly when calling check_has_commit() to avoid
relying on add_submodule_odb(). With this commit and the parent commit,
the last remaining tests no longer rely on add_submodule_odb(), so mark
these tests accordingly.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 submodule.c                            | 16 +++++++++++++---
 t/t5526-fetch-submodules.sh            |  3 +++
 t/t5531-deep-submodule-push.sh         |  3 +++
 t/t5545-push-options.sh                |  3 +++
 t/t5572-pull-submodule.sh              |  3 +++
 t/t7418-submodule-sparse-gitmodules.sh |  3 +++
 6 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index 62beb8fd5f..4bf552b0e5 100644
--- a/submodule.c
+++ b/submodule.c
@@ -928,23 +928,33 @@ struct has_commit_data {
 static int check_has_commit(const struct object_id *oid, void *data)
 {
 	struct has_commit_data *cb = data;
+	struct repository subrepo;
+	enum object_type type;
 
-	enum object_type type = oid_object_info(cb->repo, oid, NULL);
+	if (repo_submodule_init(&subrepo, cb->repo, cb->path, null_oid())) {
+		cb->result = 0;
+		goto cleanup;
+	}
+
+	type = oid_object_info(&subrepo, oid, NULL);
 
 	switch (type) {
 	case OBJ_COMMIT:
-		return 0;
+		goto cleanup;
 	case OBJ_BAD:
 		/*
 		 * Object is missing or invalid. If invalid, an error message
 		 * has already been printed.
 		 */
 		cb->result = 0;
-		return 0;
+		goto cleanup;
 	default:
 		die(_("submodule entry '%s' (%s) is a %s, not a commit"),
 		    cb->path, oid_to_hex(oid), type_name(type));
 	}
+cleanup:
+	repo_clear(&subrepo);
+	return 0;
 }
 
 static int submodule_has_commits(struct repository *r,
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index ed11569d8d..2dc75b80db 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -6,6 +6,9 @@ test_description='Recursive "git fetch" for submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 pwd=$(pwd)
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index d573ca496a..3f58b515ce 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -5,6 +5,9 @@ test_description='test push with submodules'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 58c7add7ee..214228349a 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -5,6 +5,9 @@ test_description='pushing to a repository using push options'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 mk_repo_pair () {
diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
index 4f92a116e1..fa6b4cca65 100755
--- a/t/t5572-pull-submodule.sh
+++ b/t/t5572-pull-submodule.sh
@@ -2,6 +2,9 @@
 
 test_description='pull can handle submodules'
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-submodule-update.sh
 
diff --git a/t/t7418-submodule-sparse-gitmodules.sh b/t/t7418-submodule-sparse-gitmodules.sh
index 3f7f271883..f87e524d6d 100755
--- a/t/t7418-submodule-sparse-gitmodules.sh
+++ b/t/t7418-submodule-sparse-gitmodules.sh
@@ -12,6 +12,9 @@ The test setup uses a sparse checkout, however the same scenario can be set up
 also by committing .gitmodules and then just removing it from the filesystem.
 '
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
+export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+
 . ./test-lib.sh
 
 test_expect_success 'sparse checkout setup which hides .gitmodules' '
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 7/7] submodule: trace adding submodule ODB as alternate
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
                     ` (5 preceding siblings ...)
  2021-10-08 21:08   ` [PATCH v4 6/7] submodule: pass repo to check_has_commit() Jonathan Tan
@ 2021-10-08 21:08   ` Jonathan Tan
  2021-10-12 22:10   ` [PATCH v4 0/7] No more " Glen Choo
  2021-10-12 22:40   ` Josh Steadmon
  8 siblings, 0 replies; 65+ messages in thread
From: Jonathan Tan @ 2021-10-08 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, carenas, chooglen, steadmon, Junio C Hamano

Submodule ODBs are never added as alternates during the execution of the
test suite, but there may be a rare interaction that the test suite does
not have coverage of. Add a trace message when this happens, so that
users who trace their commands can notice such occurrences.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 submodule.c | 2 ++
 t/README    | 7 ++-----
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/submodule.c b/submodule.c
index 4bf552b0e5..61575e5a56 100644
--- a/submodule.c
+++ b/submodule.c
@@ -201,6 +201,8 @@ int register_all_submodule_odb_as_alternates(void)
 		add_to_alternates_memory(added_submodule_odb_paths.items[i].string);
 	if (ret) {
 		string_list_clear(&added_submodule_odb_paths, 0);
+		trace2_data_intmax("submodule", the_repository,
+				   "register_all_submodule_odb_as_alternates/registered", ret);
 		if (git_env_bool("GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB", 0))
 			BUG("register_all_submodule_odb_as_alternates() called");
 	}
diff --git a/t/README b/t/README
index 51065d0800..b677caaf68 100644
--- a/t/README
+++ b/t/README
@@ -456,11 +456,8 @@ GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=<boolean>, when true, makes
 registering submodule ODBs as alternates a fatal action. Support for
 this environment variable can be removed once the migration to
 explicitly providing repositories when accessing submodule objects is
-complete (in which case we might want to replace this with a trace2
-call so that users can make it visible if accessing submodule objects
-without an explicit repository still happens) or needs to be abandoned
-for whatever reason (in which case the migrated codepaths still retain
-their performance benefits).
+complete or needs to be abandoned for whatever reason (in which case the
+migrated codepaths still retain their performance benefits).
 
 Naming Tests
 ------------
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/7] No more adding submodule ODB as alternate
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
                     ` (6 preceding siblings ...)
  2021-10-08 21:08   ` [PATCH v4 7/7] submodule: trace adding submodule ODB as alternate Jonathan Tan
@ 2021-10-12 22:10   ` Glen Choo
  2021-10-12 22:40   ` Josh Steadmon
  8 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-12 22:10 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: Jonathan Tan, carenas, steadmon

Jonathan Tan <jonathantanmy@google.com> writes:

> Thanks everyone for your reviews. Here's an updated patch set, including
> Carlo's fixup squashed.

This series LGTM. My comments on v3 center around the remaining
references to the_repository, but as you have noted, we aren't at a
stage where we can remove the_repository. Rather, we are making
the_repository explicit in anticipation of removing the_repository from
the entire ref writing system.

I have some reservations about adding the backpointer from ref_store to
repository. I don't think this is the best long-term API, but it is a
reasonable step towards removing the_repository from the ref writing
system, and as Josh mentioned:

  This seems reasonable as we don't keep a lot of these structs around,
  so the additional memory usage isn't much of a concern.

Reviewed-by: Glen Choo <chooglen@google.com>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/7] No more adding submodule ODB as alternate
  2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
                     ` (7 preceding siblings ...)
  2021-10-12 22:10   ` [PATCH v4 0/7] No more " Glen Choo
@ 2021-10-12 22:40   ` Josh Steadmon
  2021-10-12 22:49     ` Junio C Hamano
  8 siblings, 1 reply; 65+ messages in thread
From: Josh Steadmon @ 2021-10-12 22:40 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, carenas, chooglen

On 2021.10.08 14:08, Jonathan Tan wrote:
> Thanks everyone for your reviews. Here's an updated patch set, including
> Carlo's fixup squashed.
> 
> Jonathan Tan (7):
>   refs: plumb repo into ref stores
>   refs: teach arbitrary repo support to iterators
>   refs: peeling non-the_repository iterators is BUG
>   merge-{ort,recursive}: remove add_submodule_odb()
>   object-file: only register submodule ODB if needed
>   submodule: pass repo to check_has_commit()
>   submodule: trace adding submodule ODB as alternate

This looks good to me. All my concerns from v3 have been addressed, so:

Reviewed-by: Josh Steadmon <steadmon@google.com>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/7] No more adding submodule ODB as alternate
  2021-10-12 22:40   ` Josh Steadmon
@ 2021-10-12 22:49     ` Junio C Hamano
  0 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-10-12 22:49 UTC (permalink / raw)
  To: Josh Steadmon; +Cc: Jonathan Tan, git, carenas, chooglen

Josh Steadmon <steadmon@google.com> writes:

> On 2021.10.08 14:08, Jonathan Tan wrote:
>> Thanks everyone for your reviews. Here's an updated patch set, including
>> Carlo's fixup squashed.
>> 
>> Jonathan Tan (7):
>>   refs: plumb repo into ref stores
>>   refs: teach arbitrary repo support to iterators
>>   refs: peeling non-the_repository iterators is BUG
>>   merge-{ort,recursive}: remove add_submodule_odb()
>>   object-file: only register submodule ODB if needed
>>   submodule: pass repo to check_has_commit()
>>   submodule: trace adding submodule ODB as alternate
>
> This looks good to me. All my concerns from v3 have been addressed, so:
>
> Reviewed-by: Josh Steadmon <steadmon@google.com>

Thanks, all.  Let's mark the topic for 'next' now.

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2021-10-12 22:49 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-21 16:51 [PATCH 0/9] No more adding submodule ODB as alternate Jonathan Tan
2021-09-21 16:51 ` [PATCH 1/9] refs: make _advance() check struct repo, not flag Jonathan Tan
2021-09-23  1:00   ` Junio C Hamano
2021-09-24 17:56     ` Jonathan Tan
2021-09-24 19:55       ` Junio C Hamano
2021-09-24 18:13   ` Jeff King
2021-09-24 18:28     ` Jonathan Tan
2021-09-21 16:51 ` [PATCH 2/9] refs: add repo paramater to _iterator_peel() Jonathan Tan
2021-09-21 16:51 ` [PATCH 3/9] refs iterator: support non-the_repository advance Jonathan Tan
2021-09-21 16:51 ` [PATCH 4/9] refs: teach refs_for_each_ref() arbitrary repos Jonathan Tan
2021-09-21 16:51 ` [PATCH 5/9] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
2021-09-28  0:29   ` Elijah Newren
2021-09-21 16:51 ` [PATCH 6/9] object-file: only register submodule ODB if needed Jonathan Tan
2021-09-21 16:51 ` [PATCH 7/9] submodule: pass repo to check_has_commit() Jonathan Tan
2021-09-21 16:51 ` [PATCH 8/9] refs: change refs_for_each_ref_in() to take repo Jonathan Tan
2021-09-21 16:51 ` [PATCH 9/9] submodule: trace adding submodule ODB as alternate Jonathan Tan
2021-09-23 18:05 ` [PATCH 0/9] No more " Junio C Hamano
2021-09-28 20:10 ` [PATCH v2 " Jonathan Tan
2021-09-28 20:10   ` [PATCH v2 1/9] refs: plumb repo param in begin-iterator functions Jonathan Tan
2021-09-28 22:24     ` Junio C Hamano
2021-09-28 20:10   ` [PATCH v2 2/9] refs: teach arbitrary repo support to iterators Jonathan Tan
2021-09-28 22:35     ` Junio C Hamano
2021-09-29 17:04       ` Jonathan Tan
2021-09-28 20:10   ` [PATCH v2 3/9] refs: peeling non-the_repository iterators is BUG Jonathan Tan
2021-09-28 20:10   ` [PATCH v2 4/9] refs: teach refs_for_each_ref() arbitrary repos Jonathan Tan
2021-09-28 22:49     ` Junio C Hamano
2021-09-28 20:10   ` [PATCH v2 5/9] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
2021-09-28 20:10   ` [PATCH v2 6/9] object-file: only register submodule ODB if needed Jonathan Tan
2021-09-28 20:10   ` [PATCH v2 7/9] submodule: pass repo to check_has_commit() Jonathan Tan
2021-09-28 20:10   ` [PATCH v2 8/9] refs: change refs_for_each_ref_in() to take repo Jonathan Tan
2021-09-28 20:10   ` [PATCH v2 9/9] submodule: trace adding submodule ODB as alternate Jonathan Tan
2021-09-29 23:06 ` [PATCH v3 0/7] No more " Jonathan Tan
2021-09-29 23:06   ` [PATCH v3 1/7] refs: plumb repo into ref stores Jonathan Tan
2021-09-30 11:13     ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
2021-10-06 17:42     ` Glen Choo
2021-10-08 20:05       ` Jonathan Tan
2021-10-08 20:07       ` Jonathan Tan
2021-10-07 18:33     ` [PATCH v3 1/7] " Josh Steadmon
2021-10-08 20:08       ` Jonathan Tan
2021-09-29 23:06   ` [PATCH v3 2/7] refs: teach arbitrary repo support to iterators Jonathan Tan
2021-10-07 19:31     ` Glen Choo
2021-10-08 20:12       ` Jonathan Tan
2021-09-29 23:06   ` [PATCH v3 3/7] refs: peeling non-the_repository iterators is BUG Jonathan Tan
2021-09-29 23:06   ` [PATCH v3 4/7] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
2021-10-07 18:34     ` Josh Steadmon
2021-10-08 20:19       ` Jonathan Tan
2021-09-29 23:06   ` [PATCH v3 5/7] object-file: only register submodule ODB if needed Jonathan Tan
2021-10-07 18:34     ` Josh Steadmon
2021-10-08 20:22       ` Jonathan Tan
2021-09-29 23:06   ` [PATCH v3 6/7] submodule: pass repo to check_has_commit() Jonathan Tan
2021-09-29 23:06   ` [PATCH v3 7/7] submodule: trace adding submodule ODB as alternate Jonathan Tan
2021-10-07 18:34     ` Josh Steadmon
2021-10-08 20:23       ` Jonathan Tan
2021-10-07 18:32   ` [PATCH v3 0/7] No more " Josh Steadmon
2021-10-08 21:08 ` [PATCH v4 " Jonathan Tan
2021-10-08 21:08   ` [PATCH v4 1/7] refs: plumb repo into ref stores Jonathan Tan
2021-10-08 21:08   ` [PATCH v4 2/7] refs: teach arbitrary repo support to iterators Jonathan Tan
2021-10-08 21:08   ` [PATCH v4 3/7] refs: peeling non-the_repository iterators is BUG Jonathan Tan
2021-10-08 21:08   ` [PATCH v4 4/7] merge-{ort,recursive}: remove add_submodule_odb() Jonathan Tan
2021-10-08 21:08   ` [PATCH v4 5/7] object-file: only register submodule ODB if needed Jonathan Tan
2021-10-08 21:08   ` [PATCH v4 6/7] submodule: pass repo to check_has_commit() Jonathan Tan
2021-10-08 21:08   ` [PATCH v4 7/7] submodule: trace adding submodule ODB as alternate Jonathan Tan
2021-10-12 22:10   ` [PATCH v4 0/7] No more " Glen Choo
2021-10-12 22:40   ` Josh Steadmon
2021-10-12 22:49     ` Junio C Hamano

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).