[PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories
  2018-10-11 21:17 [RFC PATCH 00/19] Bring more repository handles into our code base Stefan Beller
@ 2018-10-11 21:17 ` Stefan Beller
  2018-10-11 22:01   ` Jonathan Tan
  0 siblings, 1 reply; 49+ messages in thread
From: Stefan Beller @ 2018-10-11 21:17 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

As read_object_file is a widely used function (which is also regularly used
in new code in flight between master..pu), changing its signature is painful
is hard, as other series in flight rely on the original signature. It would
burden the maintainer if we'd just change the signature.

Introduce repo_read_object_file which takes the repository argument, and
hide the original read_object_file as a macro behind
NO_THE_REPOSITORY_COMPATIBILITY_MACROS, which we planned for in
e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21)

Add a coccinelle patch to convert existing callers, but do not apply
the resulting patch from 'make coccicheck' to keep the diff of this
patch small.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 contrib/coccinelle/the_repository.cocci | 13 +++++++++++++
 object-store.h                          | 10 ++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 contrib/coccinelle/the_repository.cocci

diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
new file mode 100644
index 0000000000..3c7fa70502
--- /dev/null
+++ b/contrib/coccinelle/the_repository.cocci
@@ -0,0 +1,13 @@
+// This file is used for the ongoing refactoring of
+// bringing the index or repository struct in all of
+// our code base.
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- read_object_file(
++ repo_read_object_file(the_repository,
+  E, F, G)
+
diff --git a/object-store.h b/object-store.h
index 6bb0ccbf05..41ceebca48 100644
--- a/object-store.h
+++ b/object-store.h
@@ -150,10 +150,16 @@ extern void *read_object_file_extended(struct repository *r,
 				       const struct object_id *oid,
 				       enum object_type *type,
 				       unsigned long *size, int lookup_replace);
-static inline void *read_object_file(const struct object_id *oid, enum object_type *type, unsigned long *size)
+static inline void *repo_read_object_file(struct repository *r,
+					  const struct object_id *oid,
+					  enum object_type *type,
+					  unsigned long *size)
 {
-	return read_object_file_extended(the_repository, oid, type, size, 1);
+	return read_object_file_extended(r, oid, type, size, 1);
 }
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define read_object_file(oid, type, size) repo_read_object_file(the_repository, oid, type, size)
+#endif
 
 /* Read and unpack an object file into memory, write memory to an object file */
 int oid_object_info(struct repository *r, const struct object_id *, unsigned long *);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories
  2018-10-11 21:17 ` [PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories Stefan Beller
@ 2018-10-11 22:01   ` Jonathan Tan
  2018-10-11 22:02     ` Stefan Beller
  0 siblings, 1 reply; 49+ messages in thread
From: Jonathan Tan @ 2018-10-11 22:01 UTC (permalink / raw)
  To: sbeller; +Cc: git, jonathantanmy

> Introduce repo_read_object_file which takes the repository argument, and
> hide the original read_object_file as a macro behind
> NO_THE_REPOSITORY_COMPATIBILITY_MACROS, which we planned for in
> e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21)

That commit didn't seem to plan for anything - it just seems to add a
new function with the name "repo_" preprended and define a macro if
NO_THE_REPOSITORY_COMPATIBILITY_MACROS is not set, just like this patch.
Maybe s/which we planned for in/just like in/.

The patch itself looks good.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories
  2018-10-11 22:01   ` Jonathan Tan
@ 2018-10-11 22:02     ` Stefan Beller
  0 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-11 22:02 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Thu, Oct 11, 2018 at 3:01 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > Introduce repo_read_object_file which takes the repository argument, and
> > hide the original read_object_file as a macro behind
> > NO_THE_REPOSITORY_COMPATIBILITY_MACROS, which we planned for in
> > e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21)
>
> That commit didn't seem to plan for anything - it just seems to add a
> new function with the name "repo_" preprended and define a macro if
> NO_THE_REPOSITORY_COMPATIBILITY_MACROS is not set, just like this patch.
> Maybe s/which we planned for in/just like in/.

I was reading too much into

    The plan is these macros will always be defined for all library files
    and the macros are only accessible in builtin/

of that commit message.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 00/19] Bring more repository handles into our code base
@ 2018-10-16 23:35 Stefan Beller
  2018-10-16 23:35 ` [PATCH 01/19] sha1_file: allow read_object to read objects in arbitrary repositories Stefan Beller
                   ` (21 more replies)
  0 siblings, 22 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

This rerolls sb/more-repo-in-api.
It applies on nd/the-index merged with ds/reachable and is available via
    git fetch https://github.com/stefanbeller/git object-store-final-3

I addressed all of Jonathans comments, see range-diff below;
the last patch (applying the semantic patches) has been omitted as that
would produce a lot of merge conflicts. Without that patch, this merges
cleanly to next.

As for when to apply the semantic patches, I wondered if we would prefer a dirty merge
(created via "make coccicheck && git apply contrib/coccinelle/the_repository.cocci.patch")
of the semantic patches, or if we'd actually trickle in the changes over some time?

Thanks,
Stefan

An earlier RFC was sent out at
https://public-inbox.org/git/20181011211754.31369-1-sbeller@google.com/ and said:
    This applies on nd/the-index (b3c7eef9b05) and is the logical continuation of
    the object store series, which I sent over the last year.
    
    The previous series did take a very slow and pedantic approach,
    using a #define trick, see cfc62fc98c for details, but it turns out,
    that it doesn't work:
       When changing the signature of widely used functions, it burdens the
       maintainer in resolving the semantic conflicts.
       
       In the orginal approach this was called a feature, as then we can ensure
       that not bugs creep into the code base during the merge window (while such
       a refactoring series wanders from pu to master). It turns out this
       was not well received and was just burdensome.
       
       The #define trick doesn't buy us much to begin with when dealing with
       non-merge-conflicts.  For example, see deref_tag at tag.c:68, which got
       the repository argument in 286d258d4f (tag.c: allow deref_tag to handle
       arbitrary repositories, 2018-06-28) but lost its property of working on any
       repository while 8c4cc32689 (tag: don't warn if target is missing but
       promised, 2018-07-12) was in flight simultaneously.
       
       Another example of failure of this approach is seen in patch 5, which
       shows that the pedantry was missed.
            
    This series takes another approach as it doesn't change the signature of
    functions, but introduces new functions that can deal with arbitrary 
    repositories, keeping the old function signature around using a shallow wrapper.
    
    Additionally each patch adds a semantic patch, that would port from the old to
    the new function. These semantic patches are all applied in the very last patch,
    but we could omit applying the last patch if it causes too many merge conflicts
    and trickl in the semantic patches over time when there are no merge conflicts.
    
    
    The original goal of all these refactoring series was to remove add_submodule_odb 
    in submodule.c, which was partially reached with this series. I'll investigate the
    remaining calls in another series, but it shows we're close to be done with these
    large refactorings as far as I am concerned.
    
    Thanks,
    Stefan
    

Stefan Beller (19):
  sha1_file: allow read_object to read objects in arbitrary repositories
  packfile: allow has_packed_and_bad to handle arbitrary repositories
  object-store: allow read_object_file_extended to read from arbitrary
    repositories
  object-store: prepare read_object_file to deal with arbitrary
    repositories
  object-store: prepare has_{sha1, object}_file[_with_flags] to handle
    arbitrary repositories
  object: parse_object to honor its repository argument
  commit: allow parse_commit* to handle arbitrary repositories
  commit-reach.c: allow paint_down_to_common to handle arbitrary
    repositories
  commit-reach.c: allow merge_bases_many to handle arbitrary
    repositories
  commit-reach.c: allow remove_redundant to handle arbitrary
    repositories
  commit-reach.c: allow get_merge_bases_many_0 to handle arbitrary
    repositories
  commit-reach: prepare get_merge_bases to handle arbitrary repositories
  commit-reach: prepare in_merge_bases[_many] to handle arbitrary
    repositories
  commit: prepare get_commit_buffer to handle arbitrary repositories
  commit: prepare repo_unuse_commit_buffer to handle arbitrary
    repositories
  commit: prepare logmsg_reencode to handle arbitrary repositories
  pretty: prepare format_commit_message to handle arbitrary repositories
  submodule: use submodule repos for object lookup
  submodule: don't add submodule as odb for push

 commit-reach.c                          |  73 +++++++-----
 commit-reach.h                          |  38 +++++--
 commit.c                                |  32 ++++--
 commit.h                                |  39 ++++++-
 contrib/coccinelle/the_repository.cocci | 144 ++++++++++++++++++++++++
 object-store.h                          |  35 ++++--
 object.c                                |   6 +-
 packfile.c                              |   5 +-
 packfile.h                              |   2 +-
 pretty.c                                |  28 ++---
 pretty.h                                |   7 +-
 sha1-file.c                             |  34 +++---
 streaming.c                             |   2 +-
 submodule.c                             |  78 ++++++++++---
 14 files changed, 402 insertions(+), 121 deletions(-)
 create mode 100644 contrib/coccinelle/the_repository.cocci

 git range-diff origin/sb/more-repo-in-api...

 -:  ---------- >  1:  5227c38566 commit-reach: move walk methods from commit.c
 -:  ---------- >  2:  6404355657 commit.h: remove method declarations
 -:  ---------- >  3:  1d614d41e5 commit-reach: move ref_newer from remote.c
 -:  ---------- >  4:  920f93ca1c commit-reach: move commit_contains from ref-filter
 -:  ---------- >  5:  f044bb49ad upload-pack: make reachable() more generic
 -:  ---------- >  6:  921bf7734f upload-pack: refactor ok_to_give_up()
 -:  ---------- >  7:  118be5785e upload-pack: generalize commit date cutoff
 -:  ---------- >  8:  ba3ca1edce commit-reach: move can_all_from_reach_with_flags
 -:  ---------- >  9:  ab176ac4ae test-reach: create new test tool for ref_newer
 -:  ---------- > 10:  5cd52de326 test-reach: test in_merge_bases
 -:  ---------- > 11:  6255232ec1 test-reach: test is_descendant_of
 -:  ---------- > 12:  324dec0191 test-reach: test get_merge_bases_many
 -:  ---------- > 13:  0c89f715d0 test-reach: test reduce_heads
 -:  ---------- > 14:  1792bc1250 test-reach: test can_all_from_reach_with_flags
 -:  ---------- > 15:  1fee124257 test-reach: test commit_contains
 -:  ---------- > 16:  1e3497a24c commit-reach: replace ref_newer logic
 -:  ---------- > 17:  4fbcca4eff commit-reach: make can_all_from_reach... linear
 -:  ---------- > 18:  6cc017431c commit-reach: use can_all_from_reach
 -:  ---------- > 19:  6621c83874 commit-reach: correct accidental #include of C file
 -:  ---------- > 20:  b67f6b26e3 commit-reach: properly peel tags
 -:  ---------- > 21:  4067a64672 commit-reach: fix memory and flag leaks
 1:  2c13c6101f = 22:  14c12a5350 sha1_file: allow read_object to read objects in arbitrary repositories
 2:  82848c2571 = 23:  f978950604 packfile: allow has_packed_and_bad to handle arbitrary repositories
 3:  717023d801 = 24:  852f6696d1 object-store: allow read_object_file_extended to read from arbitrary repositories
 4:  fab6bdfad0 ! 25:  b185b88ad8 object-store: prepare read_object_file to deal with arbitrary repositories
    @@ -9,7 +9,7 @@
     
         Introduce repo_read_object_file which takes the repository argument, and
         hide the original read_object_file as a macro behind
    -    NO_THE_REPOSITORY_COMPATIBILITY_MACROS, which we planned for in
    +    NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to
         e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21)
     
         Add a coccinelle patch to convert existing callers, but do not apply
 -:  ---------- > 26:  009d6f350b object-store: prepare has_{sha1, object}_file[_with_flags] to handle arbitrary repositories
 5:  9e34d76c53 ! 27:  df2feb2baf object: parse_object to honor its repository argument
    @@ -12,6 +12,17 @@
     diff --git a/object.c b/object.c
     --- a/object.c
     +++ b/object.c
    +@@
    + 	if (obj && obj->parsed)
    + 		return obj;
    + 
    +-	if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
    +-	    (!obj && has_object_file(oid) &&
    ++	if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
    ++	    (!obj && repo_has_object_file(r, oid) &&
    + 	     oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
    + 		if (check_object_signature(repl, NULL, 0, NULL) < 0) {
    + 			error(_("sha1 mismatch %s"), oid_to_hex(oid));
     @@
      		return lookup_object(r, oid->hash);
      	}
 6:  9cb7efc7b2 ! 28:  18b7073c25 commit: allow parse_commit* to handle arbitrary repositories
    @@ -97,9 +97,10 @@
     --- a/contrib/coccinelle/the_repository.cocci
     +++ b/contrib/coccinelle/the_repository.cocci
     @@
    - + repo_read_object_file(the_repository,
    -   E, F, G)
    - 
    + - has_object_file_with_flags(
    + + repo_has_object_file_with_flags(the_repository,
    +   E)
    ++
     +@@
     +expression E;
     +expression F;
    @@ -123,4 +124,3 @@
     +- parse_commit(
     ++ repo_parse_commit(the_repository,
     +  E)
    -+
 7:  fcab4a4000 ! 29:  48f8999e95 commit.c: allow paint_down_to_common to handle arbitrary repositories
    @@ -1,15 +1,14 @@
     Author: Stefan Beller <sbeller@google.com>
     
    -    commit.c: allow paint_down_to_common to handle arbitrary repositories
    +    commit-reach.c: allow paint_down_to_common to handle arbitrary repositories
     
         As the function is file local and not widely used, migrate it all at once.
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    -diff --git a/commit.c b/commit.c
    ---- a/commit.c
    -+++ b/commit.c
    +diff --git a/commit-reach.c b/commit-reach.c
    +--- a/commit-reach.c
    ++++ b/commit-reach.c
     @@
      }
      
 8:  324422f0d3 ! 30:  b84d32c972 commit.c: allow merge_bases_many to handle arbitrary repositories
    @@ -1,13 +1,12 @@
     Author: Stefan Beller <sbeller@google.com>
     
    -    commit.c: allow merge_bases_many to handle arbitrary repositories
    +    commit-reach.c: allow merge_bases_many to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    -diff --git a/commit.c b/commit.c
    ---- a/commit.c
    -+++ b/commit.c
    +diff --git a/commit-reach.c b/commit-reach.c
    +--- a/commit-reach.c
    ++++ b/commit-reach.c
     @@
      	return result;
      }
 9:  9c43e74d6c ! 31:  cd0e595b8d commit.c: allow remove_redundant to handle arbitrary repositories
    @@ -1,13 +1,12 @@
     Author: Stefan Beller <sbeller@google.com>
     
    -    commit.c: allow remove_redundant to handle arbitrary repositories
    +    commit-reach.c: allow remove_redundant to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    -diff --git a/commit.c b/commit.c
    ---- a/commit.c
    -+++ b/commit.c
    +diff --git a/commit-reach.c b/commit-reach.c
    +--- a/commit-reach.c
    ++++ b/commit-reach.c
     @@
      	return ret;
      }
10:  e5c2e2fa8e ! 32:  40ebd46acd commit: allow get_merge_bases_many_0 to handle arbitrary repositories
    @@ -1,13 +1,12 @@
     Author: Stefan Beller <sbeller@google.com>
     
    -    commit: allow get_merge_bases_many_0 to handle arbitrary repositories
    +    commit-reach.c: allow get_merge_bases_many_0 to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    -diff --git a/commit.c b/commit.c
    ---- a/commit.c
    -+++ b/commit.c
    +diff --git a/commit-reach.c b/commit-reach.c
    +--- a/commit-reach.c
    ++++ b/commit-reach.c
     @@
      	return filled;
      }
11:  27d4669ccb ! 33:  151cc8cd9b commit: prepare get_merge_bases to handle arbitrary repositories
    @@ -1,6 +1,6 @@
     Author: Stefan Beller <sbeller@google.com>
     
    -    commit: prepare get_merge_bases to handle arbitrary repositories
    +    commit-reach: prepare get_merge_bases to handle arbitrary repositories
     
         Similarly to previous patches, the get_merge_base functions are used
         often in the code base, which makes migrating them hard.
    @@ -9,11 +9,10 @@
         functions behind a wrapper macro.
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    -diff --git a/commit.c b/commit.c
    ---- a/commit.c
    -+++ b/commit.c
    +diff --git a/commit-reach.c b/commit-reach.c
    +--- a/commit-reach.c
    ++++ b/commit-reach.c
     @@
      	return result;
      }
    @@ -53,16 +52,21 @@
      
      /*
     
    -diff --git a/commit.h b/commit.h
    ---- a/commit.h
    -+++ b/commit.h
    +diff --git a/commit-reach.h b/commit-reach.h
    +--- a/commit-reach.h
    ++++ b/commit-reach.h
     @@
    - int register_commit_graft(struct repository *r, struct commit_graft *, int);
    - struct commit_graft *lookup_commit_graft(struct repository *r, const struct object_id *oid);
    + struct contains_cache;
    + struct ref_filter;
      
    --extern struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2);
    --extern struct commit_list *get_merge_bases_many(struct commit *one, int n, struct commit **twos);
    --extern struct commit_list *get_octopus_merge_bases(struct commit_list *in);
    +-struct commit_list *get_merge_bases_many(struct commit *one,
    +-					 int n,
    +-					 struct commit **twos);
    +-struct commit_list *get_merge_bases_many_dirty(struct commit *one,
    +-					       int n,
    +-					       struct commit **twos);
    +-struct commit_list *get_merge_bases(struct commit *one, struct commit *two);
    +-struct commit_list *get_octopus_merge_bases(struct commit_list *in);
     -
     +struct commit_list *repo_get_merge_bases(struct repository *r,
     +					 struct commit *rev1,
    @@ -71,7 +75,7 @@
     +					      struct commit *one, int n,
     +					      struct commit **twos);
      /* To be used only when object flags after this call no longer matter */
    --extern struct commit_list *get_merge_bases_many_dirty(struct commit *one, int n, struct commit **twos);
    +-struct commit_list *get_merge_bases_many_dirty(struct commit *one, int n, struct commit **twos);
     +struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
     +						    struct commit *one, int n,
     +						    struct commit **twos);
    @@ -80,18 +84,20 @@
     +#define get_merge_bases_many(one, n, two) repo_get_merge_bases_many(the_repository, one, n, two)
     +#define get_merge_bases_many_dirty(one, n, twos) repo_get_merge_bases_many_dirty(the_repository, one, n, twos)
     +#endif
    -+extern struct commit_list *get_octopus_merge_bases(struct commit_list *in);
    ++
    ++struct commit_list *get_octopus_merge_bases(struct commit_list *in);
      
    - /* largest positive number a signed 32-bit integer can contain */
    - #define INFINITE_DEPTH 0x7fffffff
    + int is_descendant_of(struct commit *commit, struct commit_list *with_commit);
    + int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference);
     
     diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
     --- a/contrib/coccinelle/the_repository.cocci
     +++ b/contrib/coccinelle/the_repository.cocci
     @@
    + - parse_commit(
      + repo_parse_commit(the_repository,
        E)
    - 
    ++
     +@@
     +expression E;
     +expression F;
13:  977359ec86 ! 34:  b508acf4fd commit: prepare in_merge_bases[_many] to handle arbitrary repositories
    @@ -1,13 +1,12 @@
     Author: Stefan Beller <sbeller@google.com>
     
    -    commit: prepare in_merge_bases[_many] to handle arbitrary repositories
    +    commit-reach: prepare in_merge_bases[_many] to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    -diff --git a/commit.c b/commit.c
    ---- a/commit.c
    -+++ b/commit.c
    +diff --git a/commit-reach.c b/commit-reach.c
    +--- a/commit-reach.c
    ++++ b/commit-reach.c
     @@
      /*
       * Is "commit" an ancestor of one of the "references"?
    @@ -53,33 +52,36 @@
      
      struct commit_list *reduce_heads(struct commit_list *heads)
     
    -diff --git a/commit.h b/commit.h
    ---- a/commit.h
    -+++ b/commit.h
    +diff --git a/commit-reach.h b/commit-reach.h
    +--- a/commit-reach.h
    ++++ b/commit-reach.h
     @@
    - extern struct trace_key trace_shallow;
    + struct commit_list *get_octopus_merge_bases(struct commit_list *in);
      
    - int is_descendant_of(struct commit *, struct commit_list *);
    --int in_merge_bases(struct commit *, struct commit *);
    --int in_merge_bases_many(struct commit *, int, struct commit **);
    -+int repo_in_merge_bases(struct repository *r, struct commit *, struct commit *);
    -+int repo_in_merge_bases_many(struct repository *r, struct commit *, int, struct commit **);
    + int is_descendant_of(struct commit *commit, struct commit_list *with_commit);
    +-int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference);
    +-int in_merge_bases(struct commit *commit, struct commit *reference);
    ++int repo_in_merge_bases(struct repository *r,
    ++			struct commit *commit,
    ++			struct commit *reference);
    ++int repo_in_merge_bases_many(struct repository *r,
    ++			     struct commit *commit,
    ++			     int nr_reference, struct commit **reference);
     +#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
     +#define in_merge_bases(c1, c2) repo_in_merge_bases(the_repository, c1, c2)
     +#define in_merge_bases_many(c1, n, cs) repo_in_merge_bases_many(the_repository, c1, n, cs)
     +#endif
      
    - extern int interactive_add(int argc, const char **argv, const char *prefix, int patch);
    - extern int run_add_interactive(const char *revision, const char *patch_mode,
    + /*
    +  * Takes a list of commits and returns a new list where those
     
     diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
     --- a/contrib/coccinelle/the_repository.cocci
     +++ b/contrib/coccinelle/the_repository.cocci
     @@
    - - get_commit_buffer(
    - + repo_get_commit_buffer(the_repository,
    -   E, F);
    -+
    + + repo_get_merge_bases_many_dirty(the_repository,
    +   E, F, G);
    + 
     +@@
     +expression E;
     +expression F;
12:  044bf2917c ! 35:  c825f2703c commit: prepare get_commit_buffer to handle arbitrary repositories
    @@ -3,7 +3,6 @@
         commit: prepare get_commit_buffer to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
     diff --git a/commit.c b/commit.c
     --- a/commit.c
    @@ -50,9 +49,10 @@
     --- a/contrib/coccinelle/the_repository.cocci
     +++ b/contrib/coccinelle/the_repository.cocci
     @@
    - + repo_get_merge_bases_many_dirty(the_repository,
    + - in_merge_bases_many(
    + + repo_in_merge_bases_many(the_repository,
        E, F, G);
    - 
    ++
     +@@
     +expression E;
     +expression F;
14:  65771970b9 ! 36:  bea6ff2e6d commit: prepare repo_unuse_commit_buffer to handle arbitrary repositories
    @@ -3,7 +3,6 @@
         commit: prepare repo_unuse_commit_buffer to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
     diff --git a/commit.c b/commit.c
     --- a/commit.c
    @@ -46,9 +45,10 @@
     --- a/contrib/coccinelle/the_repository.cocci
     +++ b/contrib/coccinelle/the_repository.cocci
     @@
    + - get_commit_buffer(
      + repo_get_commit_buffer(the_repository,
        E, F);
    - 
    ++
     +@@
     +expression E;
     +expression F;
    @@ -56,7 +56,3 @@
     +- unuse_commit_buffer(
     ++ repo_unuse_commit_buffer(the_repository,
     +  E, F);
    -+
    - @@
    - expression E;
    - expression F;
15:  ddef24685d ! 37:  11ea084cd2 commit: prepare logmsg_reencode to handle arbitrary repositories
    @@ -28,9 +28,9 @@
     --- a/contrib/coccinelle/the_repository.cocci
     +++ b/contrib/coccinelle/the_repository.cocci
     @@
    - - in_merge_bases_many(
    - + repo_in_merge_bases_many(the_repository,
    -   E, F, G);
    + - unuse_commit_buffer(
    + + repo_unuse_commit_buffer(the_repository,
    +   E, F);
     +
     +@@
     +expression E;
16:  8612153e23 = 38:  9f50edd155 pretty: prepare format_commit_message to handle arbitrary repositories
17:  eba296fc18 ! 39:  1a4a392233 submodule: use submodule repos for object lookup
    @@ -7,7 +7,6 @@
         are not added to the main object store.
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
     diff --git a/submodule.c b/submodule.c
     --- a/submodule.c
    @@ -35,12 +34,15 @@
      			 DEFAULT_GIT_DIR_ENVIRONMENT);
      }
      
    +-/* Helper function to display the submodule header line prior to the full
    +- * summary output. If it can locate the submodule objects directory it will
    +- * attempt to lookup both the left and right commits and put them into the
    +- * left and right pointers.
     +/*
     + * Initialize 'out' based on the provided submodule path.
     + *
     + * Unlike repo_submodule_init, this tolerates submodules not present
    -+ * in .gitmodules. NEEDSWORK: The repo_submodule_init behavior is
    -+ * preferrable. This function exists only to preserve historical behavior.
    ++ * in .gitmodules. This function exists only to preserve historical behavior,
     + *
     + * Returns 0 on success, -1 when the submodule is not present.
     + */
    @@ -54,28 +56,38 @@
     +	}
     +
     +	out->submodule_prefix = xstrdup(path);
    ++	out->submodule_prefix = xstrfmt("%s%s/",
    ++					the_repository->submodule_prefix ?
    ++					the_repository->submodule_prefix :
    ++					"", path);
     +
     +	strbuf_release(&sb);
     +	return 0;
     +}
     +
    - /* Helper function to display the submodule header line prior to the full
    -  * summary output. If it can locate the submodule objects directory it will
    -  * attempt to lookup both the left and right commits and put them into the
    -  * left and right pointers.
    ++/*
    ++ * Helper function to display the submodule header line prior to the full
    ++ * summary output.
    ++ *
    ++ * If it can locate the submodule git directory it will create a repository
    ++ * handle for the submodule and lookup both the left and right commits and
    ++ * put them into the left and right pointers.
       */
     -static void show_submodule_header(struct diff_options *o, const char *path,
    -+static void show_submodule_header(struct diff_options *o, struct repository *sub,
    ++static void show_submodule_header(struct diff_options *o,
     +		const char *path,
      		struct object_id *one, struct object_id *two,
      		unsigned dirty_submodule,
    ++		struct repository *sub,
      		struct commit **left, struct commit **right,
    + 		struct commit_list **merge_bases)
    + {
     @@
      	else if (is_null_oid(two))
      		message = "(submodule deleted)";
      
     -	if (add_submodule_odb(path)) {
    -+	if (open_submodule(sub, path) < 0) {
    ++	if (!sub) {
      		if (!message)
      			message = "(commits not present)";
      		goto output_header;
    @@ -103,30 +115,63 @@
      	struct rev_info rev;
      	struct commit *left = NULL, *right = NULL;
      	struct commit_list *merge_bases = NULL;
    -+	struct repository sub;
    ++	struct repository subrepo, *sub = &subrepo;
    ++
    ++	if (open_submodule(&subrepo, path) < 0)
    ++		sub = NULL;
      
    --	show_submodule_header(o, path, one, two, dirty_submodule,
    -+	show_submodule_header(o, &sub, path, one, two, dirty_submodule,
    - 			      &left, &right, &merge_bases);
    + 	show_submodule_header(o, path, one, two, dirty_submodule,
    +-			      &left, &right, &merge_bases);
    ++			      sub, &left, &right, &merge_bases);
      
      	/*
    + 	 * If we don't have both a left and a right pointer, there is no
    + 	 * reason to try and display a summary. The header line should contain
    + 	 * all the information the user needs.
    + 	 */
    +-	if (!left || !right)
    ++	if (!left || !right || !sub)
    + 		goto out;
    + 
    + 	/* Treat revision walker failure the same as missing commits */
     @@
      		goto out;
      	}
      
     -	print_submodule_summary(&rev, o);
    -+	print_submodule_summary(&sub, &rev, o);
    ++	print_submodule_summary(sub, &rev, o);
      
      out:
      	if (merge_bases)
    + 		free_commit_list(merge_bases);
    + 	clear_commit_marks(left, ~0);
    + 	clear_commit_marks(right, ~0);
    ++	if (sub)
    ++		repo_clear(sub);
    + }
    + 
    + void show_submodule_inline_diff(struct diff_options *o, const char *path,
     @@
      	struct commit_list *merge_bases = NULL;
      	struct child_process cp = CHILD_PROCESS_INIT;
      	struct strbuf sb = STRBUF_INIT;
    -+	struct repository sub;
    ++	struct repository subrepo, *sub = &subrepo;
    ++
    ++	if (open_submodule(&subrepo, path) < 0)
    ++		sub = NULL;
      
    --	show_submodule_header(o, path, one, two, dirty_submodule,
    -+	show_submodule_header(o, &sub, path, one, two, dirty_submodule,
    - 			      &left, &right, &merge_bases);
    + 	show_submodule_header(o, path, one, two, dirty_submodule,
    +-			      &left, &right, &merge_bases);
    ++			      sub, &left, &right, &merge_bases);
      
      	/* We need a valid left and right commit to display a difference */
    + 	if (!(left || is_null_oid(one)) ||
    +@@
    + 		clear_commit_marks(left, ~0);
    + 	if (right)
    + 		clear_commit_marks(right, ~0);
    ++	if (sub)
    ++		repo_clear(sub);
    + }
    + 
    + int should_update_submodules(void)
18:  4b2033017a ! 40:  1ce2d0a867 submodule: don't add submodule as odb for push
    @@ -2,20 +2,13 @@
     
         submodule: don't add submodule as odb for push
     
    -    The submodule was added as an alternative in eb21c732d6 (push: teach
    -    --recurse-submodules the on-demand option, 2012-03-29), but was
    -    not explained, why.
    -
    -    In similar code, submodule_has_commits, the submodule is added as an
    -    alternative to perform a quick check if we need to dive into the submodule.
    -
    -    However in push_submodule
    -    (a) for_each_remote_ref_submodule will also provide the quick check and
    -    (b) after that we don't need to have submodule objects around, as all
    -        further code is to spawn a separate process.
    +    In push_submodule(), because we do not actually need access to objects
    +    in the submodule, do not invoke add_submodule_odb().
    +    (for_each_remote_ref_submodule() does not require access to those
    +    objects, and the actual push is done by spawning another process,
    +    which handles object access by itself.)
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
     diff --git a/submodule.c b/submodule.c
     --- a/submodule.c
19:  c74af1496b <  -:  ---------- Apply semantic patches from previous patches

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 01/19] sha1_file: allow read_object to read objects in arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 02/19] packfile: allow has_packed_and_bad to handle " Stefan Beller
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller, Junio C Hamano

Allow read_object (a file local functon in sha1_file) to
handle arbitrary repositories by passing the repository down
to oid_object_info_extended.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sha1-file.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/sha1-file.c b/sha1-file.c
index 308d5e20e2..647068a836 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1361,7 +1361,9 @@ int oid_object_info(struct repository *r,
 	return type;
 }
 
-static void *read_object(const unsigned char *sha1, enum object_type *type,
+static void *read_object(struct repository *r,
+			 const unsigned char *sha1,
+			 enum object_type *type,
 			 unsigned long *size)
 {
 	struct object_id oid;
@@ -1373,7 +1375,7 @@ static void *read_object(const unsigned char *sha1, enum object_type *type,
 
 	hashcpy(oid.hash, sha1);
 
-	if (oid_object_info_extended(the_repository, &oid, &oi, 0) < 0)
+	if (oid_object_info_extended(r, &oid, &oi, 0) < 0)
 		return NULL;
 	return content;
 }
@@ -1414,7 +1416,7 @@ void *read_object_file_extended(const struct object_id *oid,
 		lookup_replace_object(the_repository, oid) : oid;
 
 	errno = 0;
-	data = read_object(repl->hash, type, size);
+	data = read_object(the_repository, repl->hash, type, size);
 	if (data)
 		return data;
 
@@ -1755,7 +1757,7 @@ int force_object_loose(const struct object_id *oid, time_t mtime)
 
 	if (has_loose_object(oid))
 		return 0;
-	buf = read_object(oid->hash, &type, &len);
+	buf = read_object(the_repository, oid->hash, &type, &len);
 	if (!buf)
 		return error(_("cannot read sha1_file for %s"), oid_to_hex(oid));
 	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", type_name(type), len) + 1;
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 02/19] packfile: allow has_packed_and_bad to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
  2018-10-16 23:35 ` [PATCH 01/19] sha1_file: allow read_object to read objects in arbitrary repositories Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 03/19] object-store: allow read_object_file_extended to read from " Stefan Beller
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller, Junio C Hamano

has_packed_and_bad is not widely used, so just migrate it all at once.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 packfile.c  | 5 +++--
 packfile.h  | 2 +-
 sha1-file.c | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/packfile.c b/packfile.c
index ebcb5742ec..40085fe160 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1024,12 +1024,13 @@ void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1)
 	p->num_bad_objects++;
 }
 
-const struct packed_git *has_packed_and_bad(const unsigned char *sha1)
+const struct packed_git *has_packed_and_bad(struct repository *r,
+					    const unsigned char *sha1)
 {
 	struct packed_git *p;
 	unsigned i;
 
-	for (p = the_repository->objects->packed_git; p; p = p->next)
+	for (p = r->objects->packed_git; p; p = p->next)
 		for (i = 0; i < p->num_bad_objects; i++)
 			if (!hashcmp(sha1,
 				     p->bad_object_sha1 + the_hash_algo->rawsz * i))
diff --git a/packfile.h b/packfile.h
index 630f35cf31..1d2d170bf8 100644
--- a/packfile.h
+++ b/packfile.h
@@ -136,7 +136,7 @@ extern int packed_object_info(struct repository *r,
 			      off_t offset, struct object_info *);
 
 extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
-extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1);
+extern const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1);
 
 /*
  * Iff a pack file in the given repository contains the object named by sha1,
diff --git a/sha1-file.c b/sha1-file.c
index 647068a836..13b3c5cb79 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1432,7 +1432,7 @@ void *read_object_file_extended(const struct object_id *oid,
 		die(_("loose object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), path);
 
-	if ((p = has_packed_and_bad(repl->hash)) != NULL)
+	if ((p = has_packed_and_bad(the_repository, repl->hash)) != NULL)
 		die(_("packed object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), p->pack_name);
 
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 03/19] object-store: allow read_object_file_extended to read from arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
  2018-10-16 23:35 ` [PATCH 01/19] sha1_file: allow read_object to read objects in arbitrary repositories Stefan Beller
  2018-10-16 23:35 ` [PATCH 02/19] packfile: allow has_packed_and_bad to handle " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 04/19] object-store: prepare read_object_file to deal with " Stefan Beller
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller, Junio C Hamano

read_object_file_extended is not widely used, so migrate it all at once.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-store.h |  5 +++--
 sha1-file.c    | 11 ++++++-----
 streaming.c    |  2 +-
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/object-store.h b/object-store.h
index 67e66227d9..6bb0ccbf05 100644
--- a/object-store.h
+++ b/object-store.h
@@ -146,12 +146,13 @@ void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned cha
 
 void *map_sha1_file(struct repository *r, const unsigned char *sha1, unsigned long *size);
 
-extern void *read_object_file_extended(const struct object_id *oid,
+extern void *read_object_file_extended(struct repository *r,
+				       const struct object_id *oid,
 				       enum object_type *type,
 				       unsigned long *size, int lookup_replace);
 static inline void *read_object_file(const struct object_id *oid, enum object_type *type, unsigned long *size)
 {
-	return read_object_file_extended(oid, type, size, 1);
+	return read_object_file_extended(the_repository, oid, type, size, 1);
 }
 
 /* Read and unpack an object file into memory, write memory to an object file */
diff --git a/sha1-file.c b/sha1-file.c
index 13b3c5cb79..ce47524679 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1403,7 +1403,8 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type,
  * deal with them should arrange to call read_object() and give error
  * messages themselves.
  */
-void *read_object_file_extended(const struct object_id *oid,
+void *read_object_file_extended(struct repository *r,
+				const struct object_id *oid,
 				enum object_type *type,
 				unsigned long *size,
 				int lookup_replace)
@@ -1413,10 +1414,10 @@ void *read_object_file_extended(const struct object_id *oid,
 	const char *path;
 	struct stat st;
 	const struct object_id *repl = lookup_replace ?
-		lookup_replace_object(the_repository, oid) : oid;
+		lookup_replace_object(r, oid) : oid;
 
 	errno = 0;
-	data = read_object(the_repository, repl->hash, type, size);
+	data = read_object(r, repl->hash, type, size);
 	if (data)
 		return data;
 
@@ -1428,11 +1429,11 @@ void *read_object_file_extended(const struct object_id *oid,
 		die(_("replacement %s not found for %s"),
 		    oid_to_hex(repl), oid_to_hex(oid));
 
-	if (!stat_sha1_file(the_repository, repl->hash, &st, &path))
+	if (!stat_sha1_file(r, repl->hash, &st, &path))
 		die(_("loose object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), path);
 
-	if ((p = has_packed_and_bad(the_repository, repl->hash)) != NULL)
+	if ((p = has_packed_and_bad(r, repl->hash)) != NULL)
 		die(_("packed object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), p->pack_name);
 
diff --git a/streaming.c b/streaming.c
index d1e6b2dce6..c843a1230f 100644
--- a/streaming.c
+++ b/streaming.c
@@ -490,7 +490,7 @@ static struct stream_vtbl incore_vtbl = {
 
 static open_method_decl(incore)
 {
-	st->u.incore.buf = read_object_file_extended(oid, type, &st->size, 0);
+	st->u.incore.buf = read_object_file_extended(the_repository, oid, type, &st->size, 0);
 	st->u.incore.read_ptr = 0;
 	st->vtbl = &incore_vtbl;
 
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (2 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 03/19] object-store: allow read_object_file_extended to read from " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 05/19] object-store: prepare has_{sha1, object}_file[_with_flags] to handle " Stefan Beller
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller, Junio C Hamano

As read_object_file is a widely used function (which is also regularly used
in new code in flight between master..pu), changing its signature is painful
is hard, as other series in flight rely on the original signature. It would
burden the maintainer if we'd just change the signature.

Introduce repo_read_object_file which takes the repository argument, and
hide the original read_object_file as a macro behind
NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to
e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21)

Add a coccinelle patch to convert existing callers, but do not apply
the resulting patch from 'make coccicheck' to keep the diff of this
patch small.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/coccinelle/the_repository.cocci | 13 +++++++++++++
 object-store.h                          | 10 ++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 contrib/coccinelle/the_repository.cocci

diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
new file mode 100644
index 0000000000..3c7fa70502
--- /dev/null
+++ b/contrib/coccinelle/the_repository.cocci
@@ -0,0 +1,13 @@
+// This file is used for the ongoing refactoring of
+// bringing the index or repository struct in all of
+// our code base.
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- read_object_file(
++ repo_read_object_file(the_repository,
+  E, F, G)
+
diff --git a/object-store.h b/object-store.h
index 6bb0ccbf05..41ceebca48 100644
--- a/object-store.h
+++ b/object-store.h
@@ -150,10 +150,16 @@ extern void *read_object_file_extended(struct repository *r,
 				       const struct object_id *oid,
 				       enum object_type *type,
 				       unsigned long *size, int lookup_replace);
-static inline void *read_object_file(const struct object_id *oid, enum object_type *type, unsigned long *size)
+static inline void *repo_read_object_file(struct repository *r,
+					  const struct object_id *oid,
+					  enum object_type *type,
+					  unsigned long *size)
 {
-	return read_object_file_extended(the_repository, oid, type, size, 1);
+	return read_object_file_extended(r, oid, type, size, 1);
 }
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define read_object_file(oid, type, size) repo_read_object_file(the_repository, oid, type, size)
+#endif
 
 /* Read and unpack an object file into memory, write memory to an object file */
 int oid_object_info(struct repository *r, const struct object_id *, unsigned long *);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 05/19] object-store: prepare has_{sha1, object}_file[_with_flags] to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (3 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 04/19] object-store: prepare read_object_file to deal with " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 06/19] object: parse_object to honor its repository argument Stefan Beller
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 contrib/coccinelle/the_repository.cocci | 29 +++++++++++++++++++++++++
 object-store.h                          | 22 ++++++++++++++-----
 sha1-file.c                             | 15 ++++++++-----
 3 files changed, 55 insertions(+), 11 deletions(-)

diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
index 3c7fa70502..46f3a1b23a 100644
--- a/contrib/coccinelle/the_repository.cocci
+++ b/contrib/coccinelle/the_repository.cocci
@@ -11,3 +11,32 @@ expression G;
 + repo_read_object_file(the_repository,
   E, F, G)
 
+@@
+expression E;
+@@
+- has_sha1_file(
++ repo_has_sha1_file(the_repository,
+  E)
+
+@@
+expression E;
+expression F;
+@@
+- has_sha1_file_with_flags(
++ repo_has_sha1_file_with_flags(the_repository,
+  E)
+
+@@
+expression E;
+@@
+- has_object_file(
++ repo_has_object_file(the_repository,
+  E)
+
+@@
+expression E;
+expression F;
+@@
+- has_object_file_with_flags(
++ repo_has_object_file_with_flags(the_repository,
+  E)
diff --git a/object-store.h b/object-store.h
index 41ceebca48..e1c68d0774 100644
--- a/object-store.h
+++ b/object-store.h
@@ -197,15 +197,27 @@ int read_loose_object(const char *path,
  * object_info. OBJECT_INFO_SKIP_CACHED is automatically set; pass
  * nonzero flags to also set other flags.
  */
-extern int has_sha1_file_with_flags(const unsigned char *sha1, int flags);
-static inline int has_sha1_file(const unsigned char *sha1)
+int repo_has_sha1_file_with_flags(struct repository *r,
+				  const unsigned char *sha1, int flags);
+static inline int repo_has_sha1_file(struct repository *r,
+				     const unsigned char *sha1)
 {
-	return has_sha1_file_with_flags(sha1, 0);
+	return repo_has_sha1_file_with_flags(r, sha1, 0);
 }
 
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define has_sha1_file_with_flags(sha1, flags) repo_has_sha1_file_with_flags(the_repository, sha1, flags)
+#define has_sha1_file(sha1) repo_has_sha1_file(the_repository, sha1)
+#endif
+
 /* Same as the above, except for struct object_id. */
-extern int has_object_file(const struct object_id *oid);
-extern int has_object_file_with_flags(const struct object_id *oid, int flags);
+int repo_has_object_file(struct repository *r, const struct object_id *oid);
+int repo_has_object_file_with_flags(struct repository *r,
+				    const struct object_id *oid, int flags);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define has_object_file(oid) repo_has_object_file(the_repository, oid)
+#define has_object_file_with_flags(oid, flags) repo_has_object_file_with_flags(the_repository, oid, flags)
+#endif
 
 /*
  * Return true iff an alternate object database has a loose object
diff --git a/sha1-file.c b/sha1-file.c
index ce47524679..6153cedcf3 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1768,24 +1768,27 @@ int force_object_loose(const struct object_id *oid, time_t mtime)
 	return ret;
 }
 
-int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
+int repo_has_sha1_file_with_flags(struct repository *r,
+				  const unsigned char *sha1, int flags)
 {
 	struct object_id oid;
 	if (!startup_info->have_repository)
 		return 0;
 	hashcpy(oid.hash, sha1);
-	return oid_object_info_extended(the_repository, &oid, NULL,
+	return oid_object_info_extended(r, &oid, NULL,
 					flags | OBJECT_INFO_SKIP_CACHED) >= 0;
 }
 
-int has_object_file(const struct object_id *oid)
+int repo_has_object_file(struct repository *r,
+			 const struct object_id *oid)
 {
-	return has_sha1_file(oid->hash);
+	return repo_has_sha1_file(r, oid->hash);
 }
 
-int has_object_file_with_flags(const struct object_id *oid, int flags)
+int repo_has_object_file_with_flags(struct repository *r,
+				    const struct object_id *oid, int flags)
 {
-	return has_sha1_file_with_flags(oid->hash, flags);
+	return repo_has_sha1_file_with_flags(r, oid->hash, flags);
 }
 
 static void check_tree(const void *buf, size_t size)
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 06/19] object: parse_object to honor its repository argument
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (4 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 05/19] object-store: prepare has_{sha1, object}_file[_with_flags] to handle " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 07/19] commit: allow parse_commit* to handle arbitrary repositories Stefan Beller
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller, Junio C Hamano

In 8e4b0b6047 (object.c: allow parse_object to handle
arbitrary repositories, 2018-06-28), we forgot to pass the
repository down to the read_object_file.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/object.c b/object.c
index 51c4594515..a32afa14df 100644
--- a/object.c
+++ b/object.c
@@ -259,8 +259,8 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 	if (obj && obj->parsed)
 		return obj;
 
-	if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
-	    (!obj && has_object_file(oid) &&
+	if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
+	    (!obj && repo_has_object_file(r, oid) &&
 	     oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
 		if (check_object_signature(repl, NULL, 0, NULL) < 0) {
 			error(_("sha1 mismatch %s"), oid_to_hex(oid));
@@ -270,7 +270,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 		return lookup_object(r, oid->hash);
 	}
 
-	buffer = read_object_file(oid, &type, &size);
+	buffer = repo_read_object_file(r, oid, &type, &size);
 	if (buffer) {
 		if (check_object_signature(repl, buffer, size, type_name(type)) < 0) {
 			free(buffer);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 07/19] commit: allow parse_commit* to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (5 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 06/19] object: parse_object to honor its repository argument Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 08/19] commit-reach.c: allow paint_down_to_common " Stefan Beller
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller, Junio C Hamano

Just like the previous commit, parse_commit and friends are used a lot
and are found in new patches, so we cannot change their signature easily.

Re-introduce these function prefixed with 'repo_' that take a repository
argument and keep the original as a shallow macro.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit.c                                | 18 +++++++++++-------
 commit.h                                | 17 +++++++++++++----
 contrib/coccinelle/the_repository.cocci | 24 ++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/commit.c b/commit.c
index 5781f9bc67..fe942754e2 100644
--- a/commit.c
+++ b/commit.c
@@ -443,7 +443,10 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
 	return 0;
 }
 
-int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph)
+int repo_parse_commit_internal(struct repository *r,
+			       struct commit *item,
+			       int quiet_on_missing,
+			       int use_commit_graph)
 {
 	enum object_type type;
 	void *buffer;
@@ -454,9 +457,9 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	if (use_commit_graph && parse_commit_in_graph(the_repository, item))
+	if (use_commit_graph && parse_commit_in_graph(r, item))
 		return 0;
-	buffer = read_object_file(&item->object.oid, &type, &size);
+	buffer = repo_read_object_file(r, &item->object.oid, &type, &size);
 	if (!buffer)
 		return quiet_on_missing ? -1 :
 			error("Could not read %s",
@@ -467,18 +470,19 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
 			     oid_to_hex(&item->object.oid));
 	}
 
-	ret = parse_commit_buffer(the_repository, item, buffer, size, 0);
+	ret = parse_commit_buffer(r, item, buffer, size, 0);
 	if (save_commit_buffer && !ret) {
-		set_commit_buffer(the_repository, item, buffer, size);
+		set_commit_buffer(r, item, buffer, size);
 		return 0;
 	}
 	free(buffer);
 	return ret;
 }
 
-int parse_commit_gently(struct commit *item, int quiet_on_missing)
+int repo_parse_commit_gently(struct repository *r,
+			     struct commit *item, int quiet_on_missing)
 {
-	return parse_commit_internal(item, quiet_on_missing, 1);
+	return repo_parse_commit_internal(r, item, quiet_on_missing, 1);
 }
 
 void parse_commit_or_die(struct commit *item)
diff --git a/commit.h b/commit.h
index e2c99d9b04..58a0c5409b 100644
--- a/commit.h
+++ b/commit.h
@@ -79,12 +79,21 @@ struct commit *lookup_commit_reference_by_name(const char *name);
 struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref_name);
 
 int parse_commit_buffer(struct repository *r, struct commit *item, const void *buffer, unsigned long size, int check_graph);
-int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph);
-int parse_commit_gently(struct commit *item, int quiet_on_missing);
-static inline int parse_commit(struct commit *item)
+int repo_parse_commit_internal(struct repository *r, struct commit *item,
+			       int quiet_on_missing, int use_commit_graph);
+int repo_parse_commit_gently(struct repository *r,
+			     struct commit *item,
+			     int quiet_on_missing);
+static inline int repo_parse_commit(struct repository *r, struct commit *item)
 {
-	return parse_commit_gently(item, 0);
+	return repo_parse_commit_gently(r, item, 0);
 }
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define parse_commit_internal(item, quiet, use) repo_parse_commit_internal(the_repository, item, quiet, use)
+#define parse_commit_gently(item, quiet) repo_parse_commit_gently(the_repository, item, quiet)
+#define parse_commit(item) repo_parse_commit(the_repository, item)
+#endif
+
 void parse_commit_or_die(struct commit *item);
 
 struct buffer_slab;
diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
index 46f3a1b23a..b185fe0a1d 100644
--- a/contrib/coccinelle/the_repository.cocci
+++ b/contrib/coccinelle/the_repository.cocci
@@ -40,3 +40,27 @@ expression F;
 - has_object_file_with_flags(
 + repo_has_object_file_with_flags(the_repository,
   E)
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- parse_commit_internal(
++ repo_parse_commit_internal(the_repository,
+  E, F, G)
+
+@@
+expression E;
+expression F;
+@@
+- parse_commit_gently(
++ repo_parse_commit_gently(the_repository,
+  E, F)
+
+@@
+expression E;
+@@
+- parse_commit(
++ repo_parse_commit(the_repository,
+  E)
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 08/19] commit-reach.c: allow paint_down_to_common to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (6 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 07/19] commit: allow parse_commit* to handle arbitrary repositories Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 09/19] commit-reach.c: allow merge_bases_many " Stefan Beller
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

As the function is file local and not widely used, migrate it all at once.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 5a845440a9..2f7ae20bd4 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -30,7 +30,8 @@ static int queue_has_nonstale(struct prio_queue *queue)
 }
 
 /* all input commits in one and twos[] must have been parsed! */
-static struct commit_list *paint_down_to_common(struct commit *one, int n,
+static struct commit_list *paint_down_to_common(struct repository *r,
+						struct commit *one, int n,
 						struct commit **twos,
 						int min_generation)
 {
@@ -80,7 +81,7 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n,
 			parents = parents->next;
 			if ((p->object.flags & flags) == flags)
 				continue;
-			if (parse_commit(p))
+			if (repo_parse_commit(r, p))
 				return NULL;
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
@@ -113,7 +114,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co
 			return NULL;
 	}
 
-	list = paint_down_to_common(one, n, twos, 0);
+	list = paint_down_to_common(the_repository, one, n, twos, 0);
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -184,8 +185,8 @@ static int remove_redundant(struct commit **array, int cnt)
 			if (array[j]->generation < min_generation)
 				min_generation = array[j]->generation;
 		}
-		common = paint_down_to_common(array[i], filled, work,
-					      min_generation);
+		common = paint_down_to_common(the_repository, array[i], filled,
+					      work, min_generation);
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -319,7 +320,9 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit *
 	if (commit->generation > min_generation)
 		return ret;
 
-	bases = paint_down_to_common(commit, nr_reference, reference, commit->generation);
+	bases = paint_down_to_common(the_repository, commit,
+				     nr_reference, reference,
+				     commit->generation);
 	if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 09/19] commit-reach.c: allow merge_bases_many to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (7 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 08/19] commit-reach.c: allow paint_down_to_common " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 10/19] commit-reach.c: allow remove_redundant " Stefan Beller
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 2f7ae20bd4..aacd8cdc1e 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -92,7 +92,9 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 	return result;
 }
 
-static struct commit_list *merge_bases_many(struct commit *one, int n, struct commit **twos)
+static struct commit_list *merge_bases_many(struct repository *r,
+					    struct commit *one, int n,
+					    struct commit **twos)
 {
 	struct commit_list *list = NULL;
 	struct commit_list *result = NULL;
@@ -107,14 +109,14 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co
 			return commit_list_insert(one, &result);
 	}
 
-	if (parse_commit(one))
+	if (repo_parse_commit(r, one))
 		return NULL;
 	for (i = 0; i < n; i++) {
-		if (parse_commit(twos[i]))
+		if (repo_parse_commit(r, twos[i]))
 			return NULL;
 	}
 
-	list = paint_down_to_common(the_repository, one, n, twos, 0);
+	list = paint_down_to_common(r, one, n, twos, 0);
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -221,7 +223,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one,
 	struct commit_list *result;
 	int cnt, i;
 
-	result = merge_bases_many(one, n, twos);
+	result = merge_bases_many(the_repository, one, n, twos);
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
 			return result;
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 10/19] commit-reach.c: allow remove_redundant to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (8 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 09/19] commit-reach.c: allow merge_bases_many " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 11/19] commit-reach.c: allow get_merge_bases_many_0 " Stefan Beller
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index aacd8cdc1e..bce4169f1f 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -153,7 +153,7 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 	return ret;
 }
 
-static int remove_redundant(struct commit **array, int cnt)
+static int remove_redundant(struct repository *r, struct commit **array, int cnt)
 {
 	/*
 	 * Some commit in the array may be an ancestor of
@@ -171,7 +171,7 @@ static int remove_redundant(struct commit **array, int cnt)
 	ALLOC_ARRAY(filled_index, cnt - 1);
 
 	for (i = 0; i < cnt; i++)
-		parse_commit(array[i]);
+		repo_parse_commit(r, array[i]);
 	for (i = 0; i < cnt; i++) {
 		struct commit_list *common;
 		uint32_t min_generation = array[i]->generation;
@@ -187,7 +187,7 @@ static int remove_redundant(struct commit **array, int cnt)
 			if (array[j]->generation < min_generation)
 				min_generation = array[j]->generation;
 		}
-		common = paint_down_to_common(the_repository, array[i], filled,
+		common = paint_down_to_common(r, array[i], filled,
 					      work, min_generation);
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
@@ -246,7 +246,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one,
 	clear_commit_marks(one, all_flags);
 	clear_commit_marks_many(n, twos, all_flags);
 
-	cnt = remove_redundant(rslt, cnt);
+	cnt = remove_redundant(the_repository, rslt, cnt);
 	result = NULL;
 	for (i = 0; i < cnt; i++)
 		commit_list_insert_by_date(rslt[i], &result);
@@ -367,7 +367,7 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 			p->item->object.flags &= ~STALE;
 		}
 	}
-	num_head = remove_redundant(array, num_head);
+	num_head = remove_redundant(the_repository, array, num_head);
 	for (i = 0; i < num_head; i++)
 		tail = &commit_list_insert(array[i], tail)->next;
 	free(array);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 11/19] commit-reach.c: allow get_merge_bases_many_0 to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (9 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 10/19] commit-reach.c: allow remove_redundant " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 12/19] commit-reach: prepare get_merge_bases " Stefan Beller
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index bce4169f1f..df93274966 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -213,7 +213,8 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt
 	return filled;
 }
 
-static struct commit_list *get_merge_bases_many_0(struct commit *one,
+static struct commit_list *get_merge_bases_many_0(struct repository *r,
+						  struct commit *one,
 						  int n,
 						  struct commit **twos,
 						  int cleanup)
@@ -223,7 +224,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one,
 	struct commit_list *result;
 	int cnt, i;
 
-	result = merge_bases_many(the_repository, one, n, twos);
+	result = merge_bases_many(r, one, n, twos);
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
 			return result;
@@ -246,7 +247,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one,
 	clear_commit_marks(one, all_flags);
 	clear_commit_marks_many(n, twos, all_flags);
 
-	cnt = remove_redundant(the_repository, rslt, cnt);
+	cnt = remove_redundant(r, rslt, cnt);
 	result = NULL;
 	for (i = 0; i < cnt; i++)
 		commit_list_insert_by_date(rslt[i], &result);
@@ -258,19 +259,19 @@ struct commit_list *get_merge_bases_many(struct commit *one,
 					 int n,
 					 struct commit **twos)
 {
-	return get_merge_bases_many_0(one, n, twos, 1);
+	return get_merge_bases_many_0(the_repository, one, n, twos, 1);
 }
 
 struct commit_list *get_merge_bases_many_dirty(struct commit *one,
 					       int n,
 					       struct commit **twos)
 {
-	return get_merge_bases_many_0(one, n, twos, 0);
+	return get_merge_bases_many_0(the_repository, one, n, twos, 0);
 }
 
 struct commit_list *get_merge_bases(struct commit *one, struct commit *two)
 {
-	return get_merge_bases_many_0(one, 1, &two, 1);
+	return get_merge_bases_many_0(the_repository, one, 1, &two, 1);
 }
 
 /*
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 12/19] commit-reach: prepare get_merge_bases to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (10 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 11/19] commit-reach.c: allow get_merge_bases_many_0 " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 13/19] commit-reach: prepare in_merge_bases[_many] " Stefan Beller
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

Similarly to previous patches, the get_merge_base functions are used
often in the code base, which makes migrating them hard.

Implement the new functions, prefixed with 'repo_' and hide the old
functions behind a wrapper macro.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c                          | 24 +++++++++++++---------
 commit-reach.h                          | 26 +++++++++++++++---------
 contrib/coccinelle/the_repository.cocci | 27 +++++++++++++++++++++++++
 3 files changed, 57 insertions(+), 20 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index df93274966..aa3cde5c8a 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -255,23 +255,27 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	return result;
 }
 
-struct commit_list *get_merge_bases_many(struct commit *one,
-					 int n,
-					 struct commit **twos)
+struct commit_list *repo_get_merge_bases_many(struct repository *r,
+					      struct commit *one,
+					      int n,
+					      struct commit **twos)
 {
-	return get_merge_bases_many_0(the_repository, one, n, twos, 1);
+	return get_merge_bases_many_0(r, one, n, twos, 1);
 }
 
-struct commit_list *get_merge_bases_many_dirty(struct commit *one,
-					       int n,
-					       struct commit **twos)
+struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
+						    struct commit *one,
+						    int n,
+						    struct commit **twos)
 {
-	return get_merge_bases_many_0(the_repository, one, n, twos, 0);
+	return get_merge_bases_many_0(r, one, n, twos, 0);
 }
 
-struct commit_list *get_merge_bases(struct commit *one, struct commit *two)
+struct commit_list *repo_get_merge_bases(struct repository *r,
+					 struct commit *one,
+					 struct commit *two)
 {
-	return get_merge_bases_many_0(the_repository, one, 1, &two, 1);
+	return get_merge_bases_many_0(r, one, 1, &two, 1);
 }
 
 /*
diff --git a/commit-reach.h b/commit-reach.h
index 7d313e2975..52667d64ac 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -8,17 +8,23 @@ struct commit_list;
 struct contains_cache;
 struct ref_filter;
 
-struct commit_list *get_merge_bases_many(struct commit *one,
-					 int n,
-					 struct commit **twos);
-struct commit_list *get_merge_bases_many_dirty(struct commit *one,
-					       int n,
-					       struct commit **twos);
-struct commit_list *get_merge_bases(struct commit *one, struct commit *two);
-struct commit_list *get_octopus_merge_bases(struct commit_list *in);
-
+struct commit_list *repo_get_merge_bases(struct repository *r,
+					 struct commit *rev1,
+					 struct commit *rev2);
+struct commit_list *repo_get_merge_bases_many(struct repository *r,
+					      struct commit *one, int n,
+					      struct commit **twos);
 /* To be used only when object flags after this call no longer matter */
-struct commit_list *get_merge_bases_many_dirty(struct commit *one, int n, struct commit **twos);
+struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
+						    struct commit *one, int n,
+						    struct commit **twos);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define get_merge_bases(r1, r2)           repo_get_merge_bases(the_repository, r1, r2)
+#define get_merge_bases_many(one, n, two) repo_get_merge_bases_many(the_repository, one, n, two)
+#define get_merge_bases_many_dirty(one, n, twos) repo_get_merge_bases_many_dirty(the_repository, one, n, twos)
+#endif
+
+struct commit_list *get_octopus_merge_bases(struct commit_list *in);
 
 int is_descendant_of(struct commit *commit, struct commit_list *with_commit);
 int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference);
diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
index b185fe0a1d..5e037fe428 100644
--- a/contrib/coccinelle/the_repository.cocci
+++ b/contrib/coccinelle/the_repository.cocci
@@ -64,3 +64,30 @@ expression E;
 - parse_commit(
 + repo_parse_commit(the_repository,
   E)
+
+@@
+expression E;
+expression F;
+@@
+- get_merge_bases(
++ repo_get_merge_bases(the_repository,
+  E, F);
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- get_merge_bases_many(
++ repo_get_merge_bases_many(the_repository,
+  E, F, G);
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- get_merge_bases_many_dirty(
++ repo_get_merge_bases_many_dirty(the_repository,
+  E, F, G);
+
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 13/19] commit-reach: prepare in_merge_bases[_many] to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (11 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 12/19] commit-reach: prepare get_merge_bases " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 14/19] commit: prepare get_commit_buffer " Stefan Beller
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c                          | 15 +++++++++------
 commit-reach.h                          | 12 ++++++++++--
 contrib/coccinelle/the_repository.cocci | 16 ++++++++++++++++
 3 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index aa3cde5c8a..1877f356e3 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -309,16 +309,17 @@ int is_descendant_of(struct commit *commit, struct commit_list *with_commit)
 /*
  * Is "commit" an ancestor of one of the "references"?
  */
-int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference)
+int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
+			     int nr_reference, struct commit **reference)
 {
 	struct commit_list *bases;
 	int ret = 0, i;
 	uint32_t min_generation = GENERATION_NUMBER_INFINITY;
 
-	if (parse_commit(commit))
+	if (repo_parse_commit(r, commit))
 		return ret;
 	for (i = 0; i < nr_reference; i++) {
-		if (parse_commit(reference[i]))
+		if (repo_parse_commit(r, reference[i]))
 			return ret;
 		if (reference[i]->generation < min_generation)
 			min_generation = reference[i]->generation;
@@ -327,7 +328,7 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit *
 	if (commit->generation > min_generation)
 		return ret;
 
-	bases = paint_down_to_common(the_repository, commit,
+	bases = paint_down_to_common(r, commit,
 				     nr_reference, reference,
 				     commit->generation);
 	if (commit->object.flags & PARENT2)
@@ -341,9 +342,11 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit *
 /*
  * Is "commit" an ancestor of (i.e. reachable from) the "reference"?
  */
-int in_merge_bases(struct commit *commit, struct commit *reference)
+int repo_in_merge_bases(struct repository *r,
+			struct commit *commit,
+			struct commit *reference)
 {
-	return in_merge_bases_many(commit, 1, &reference);
+	return repo_in_merge_bases_many(r, commit, 1, &reference);
 }
 
 struct commit_list *reduce_heads(struct commit_list *heads)
diff --git a/commit-reach.h b/commit-reach.h
index 52667d64ac..a0d4a29d25 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -27,8 +27,16 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 struct commit_list *get_octopus_merge_bases(struct commit_list *in);
 
 int is_descendant_of(struct commit *commit, struct commit_list *with_commit);
-int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference);
-int in_merge_bases(struct commit *commit, struct commit *reference);
+int repo_in_merge_bases(struct repository *r,
+			struct commit *commit,
+			struct commit *reference);
+int repo_in_merge_bases_many(struct repository *r,
+			     struct commit *commit,
+			     int nr_reference, struct commit **reference);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define in_merge_bases(c1, c2) repo_in_merge_bases(the_repository, c1, c2)
+#define in_merge_bases_many(c1, n, cs) repo_in_merge_bases_many(the_repository, c1, n, cs)
+#endif
 
 /*
  * Takes a list of commits and returns a new list where those
diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
index 5e037fe428..8c6a71bf64 100644
--- a/contrib/coccinelle/the_repository.cocci
+++ b/contrib/coccinelle/the_repository.cocci
@@ -91,3 +91,19 @@ expression G;
 + repo_get_merge_bases_many_dirty(the_repository,
   E, F, G);
 
+@@
+expression E;
+expression F;
+@@
+- in_merge_bases(
++ repo_in_merge_bases(the_repository,
+  E, F);
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- in_merge_bases_many(
++ repo_in_merge_bases_many(the_repository,
+  E, F, G);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 14/19] commit: prepare get_commit_buffer to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (12 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 13/19] commit-reach: prepare in_merge_bases[_many] " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 15/19] commit: prepare repo_unuse_commit_buffer " Stefan Beller
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit.c                                | 8 +++++---
 commit.h                                | 7 ++++++-
 contrib/coccinelle/the_repository.cocci | 8 ++++++++
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/commit.c b/commit.c
index fe942754e2..cc5321af8c 100644
--- a/commit.c
+++ b/commit.c
@@ -297,13 +297,15 @@ const void *get_cached_commit_buffer(struct repository *r, const struct commit *
 	return v->buffer;
 }
 
-const void *get_commit_buffer(const struct commit *commit, unsigned long *sizep)
+const void *repo_get_commit_buffer(struct repository *r,
+				   const struct commit *commit,
+				   unsigned long *sizep)
 {
-	const void *ret = get_cached_commit_buffer(the_repository, commit, sizep);
+	const void *ret = get_cached_commit_buffer(r, commit, sizep);
 	if (!ret) {
 		enum object_type type;
 		unsigned long size;
-		ret = read_object_file(&commit->object.oid, &type, &size);
+		ret = repo_read_object_file(r, &commit->object.oid, &type, &size);
 		if (!ret)
 			die("cannot read commit object %s",
 			    oid_to_hex(&commit->object.oid));
diff --git a/commit.h b/commit.h
index 58a0c5409b..b8437f66e2 100644
--- a/commit.h
+++ b/commit.h
@@ -117,7 +117,12 @@ const void *get_cached_commit_buffer(struct repository *, const struct commit *,
  * from disk. The resulting memory should not be modified, and must be given
  * to unuse_commit_buffer when the caller is done.
  */
-const void *get_commit_buffer(const struct commit *, unsigned long *size);
+const void *repo_get_commit_buffer(struct repository *r,
+				   const struct commit *,
+				   unsigned long *size);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define get_commit_buffer(c, s) repo_get_commit_buffer(the_repository, c, s)
+#endif
 
 /*
  * Tell the commit subsytem that we are done with a particular commit buffer.
diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
index 8c6a71bf64..4018e6eaf7 100644
--- a/contrib/coccinelle/the_repository.cocci
+++ b/contrib/coccinelle/the_repository.cocci
@@ -107,3 +107,11 @@ expression G;
 - in_merge_bases_many(
 + repo_in_merge_bases_many(the_repository,
   E, F, G);
+
+@@
+expression E;
+expression F;
+@@
+- get_commit_buffer(
++ repo_get_commit_buffer(the_repository,
+  E, F);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 15/19] commit: prepare repo_unuse_commit_buffer to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (13 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 14/19] commit: prepare get_commit_buffer " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 16/19] commit: prepare logmsg_reencode " Stefan Beller
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit.c                                | 6 ++++--
 commit.h                                | 7 ++++++-
 contrib/coccinelle/the_repository.cocci | 8 ++++++++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/commit.c b/commit.c
index cc5321af8c..8e7b1138f7 100644
--- a/commit.c
+++ b/commit.c
@@ -318,10 +318,12 @@ const void *repo_get_commit_buffer(struct repository *r,
 	return ret;
 }
 
-void unuse_commit_buffer(const struct commit *commit, const void *buffer)
+void repo_unuse_commit_buffer(struct repository *r,
+			      const struct commit *commit,
+			      const void *buffer)
 {
 	struct commit_buffer *v = buffer_slab_peek(
-		the_repository->parsed_objects->buffer_slab, commit);
+		r->parsed_objects->buffer_slab, commit);
 	if (!(v && v->buffer == buffer))
 		free((void *)buffer);
 }
diff --git a/commit.h b/commit.h
index b8437f66e2..f3ef497723 100644
--- a/commit.h
+++ b/commit.h
@@ -130,7 +130,12 @@ const void *repo_get_commit_buffer(struct repository *r,
  * from an earlier call to get_commit_buffer.  The buffer may or may not be
  * freed by this call; callers should not access the memory afterwards.
  */
-void unuse_commit_buffer(const struct commit *, const void *buffer);
+void repo_unuse_commit_buffer(struct repository *r,
+			      const struct commit *,
+			      const void *buffer);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define unuse_commit_buffer(c, b) repo_unuse_commit_buffer(the_repository, c, b)
+#endif
 
 /*
  * Free any cached object buffer associated with the commit.
diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
index 4018e6eaf7..516f19ffee 100644
--- a/contrib/coccinelle/the_repository.cocci
+++ b/contrib/coccinelle/the_repository.cocci
@@ -115,3 +115,11 @@ expression F;
 - get_commit_buffer(
 + repo_get_commit_buffer(the_repository,
   E, F);
+
+@@
+expression E;
+expression F;
+@@
+- unuse_commit_buffer(
++ repo_unuse_commit_buffer(the_repository,
+  E, F);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 16/19] commit: prepare logmsg_reencode to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (14 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 15/19] commit: prepare repo_unuse_commit_buffer " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 17/19] pretty: prepare format_commit_message " Stefan Beller
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller, Junio C Hamano

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit.h                                |  8 ++++++++
 contrib/coccinelle/the_repository.cocci |  9 +++++++++
 pretty.c                                | 13 +++++++------
 3 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/commit.h b/commit.h
index f3ef497723..b69a4b140d 100644
--- a/commit.h
+++ b/commit.h
@@ -180,6 +180,14 @@ extern int has_non_ascii(const char *text);
 extern const char *logmsg_reencode(const struct commit *commit,
 				   char **commit_encoding,
 				   const char *output_encoding);
+const char *repo_logmsg_reencode(struct repository *r,
+				 const struct commit *commit,
+				 char **commit_encoding,
+				 const char *output_encoding);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define logmsg_reencode(c, enc, out) repo_logmsg_reencode(the_repository, c, enc, out)
+#endif
+
 extern const char *skip_blank_lines(const char *msg);
 
 /** Removes the first commit from a list sorted by date, and adds all
diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
index 516f19ffee..f5b42cfc62 100644
--- a/contrib/coccinelle/the_repository.cocci
+++ b/contrib/coccinelle/the_repository.cocci
@@ -123,3 +123,12 @@ expression F;
 - unuse_commit_buffer(
 + repo_unuse_commit_buffer(the_repository,
   E, F);
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- logmsg_reencode(
++ repo_logmsg_reencode(the_repository,
+  E, F, G);
diff --git a/pretty.c b/pretty.c
index 98cf5228f9..26e44472bb 100644
--- a/pretty.c
+++ b/pretty.c
@@ -595,14 +595,15 @@ static char *replace_encoding_header(char *buf, const char *encoding)
 	return strbuf_detach(&tmp, NULL);
 }
 
-const char *logmsg_reencode(const struct commit *commit,
-			    char **commit_encoding,
-			    const char *output_encoding)
+const char *repo_logmsg_reencode(struct repository *r,
+				 const struct commit *commit,
+				 char **commit_encoding,
+				 const char *output_encoding)
 {
 	static const char *utf8 = "UTF-8";
 	const char *use_encoding;
 	char *encoding;
-	const char *msg = get_commit_buffer(commit, NULL);
+	const char *msg = repo_get_commit_buffer(r, commit, NULL);
 	char *out;
 
 	if (!output_encoding || !*output_encoding) {
@@ -630,7 +631,7 @@ const char *logmsg_reencode(const struct commit *commit,
 		 * the cached copy from get_commit_buffer, we need to duplicate it
 		 * to avoid munging the cached copy.
 		 */
-		if (msg == get_cached_commit_buffer(the_repository, commit, NULL))
+		if (msg == get_cached_commit_buffer(r, commit, NULL))
 			out = xstrdup(msg);
 		else
 			out = (char *)msg;
@@ -644,7 +645,7 @@ const char *logmsg_reencode(const struct commit *commit,
 		 */
 		out = reencode_string(msg, output_encoding, use_encoding);
 		if (out)
-			unuse_commit_buffer(commit, msg);
+			repo_unuse_commit_buffer(r, commit, msg);
 	}
 
 	/*
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 17/19] pretty: prepare format_commit_message to handle arbitrary repositories
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (15 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 16/19] commit: prepare logmsg_reencode " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-16 23:35 ` [PATCH 18/19] submodule: use submodule repos for object lookup Stefan Beller
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller, Junio C Hamano

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/coccinelle/the_repository.cocci | 10 ++++++++++
 pretty.c                                | 15 ++++++++-------
 pretty.h                                |  7 ++++++-
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
index f5b42cfc62..2ee702ecf7 100644
--- a/contrib/coccinelle/the_repository.cocci
+++ b/contrib/coccinelle/the_repository.cocci
@@ -132,3 +132,13 @@ expression G;
 - logmsg_reencode(
 + repo_logmsg_reencode(the_repository,
   E, F, G);
+
+@@
+expression E;
+expression F;
+expression G;
+expression H;
+@@
+- format_commit_message(
++ repo_format_commit_message(the_repository,
+  E, F, G, H);
diff --git a/pretty.c b/pretty.c
index 26e44472bb..948f5346cf 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1505,9 +1505,10 @@ void userformat_find_requirements(const char *fmt, struct userformat_want *w)
 	strbuf_release(&dummy);
 }
 
-void format_commit_message(const struct commit *commit,
-			   const char *format, struct strbuf *sb,
-			   const struct pretty_print_context *pretty_ctx)
+void repo_format_commit_message(struct repository *r,
+				const struct commit *commit,
+				const char *format, struct strbuf *sb,
+				const struct pretty_print_context *pretty_ctx)
 {
 	struct format_commit_context context;
 	const char *output_enc = pretty_ctx->output_encoding;
@@ -1521,9 +1522,9 @@ void format_commit_message(const struct commit *commit,
 	 * convert a commit message to UTF-8 first
 	 * as far as 'format_commit_item' assumes it in UTF-8
 	 */
-	context.message = logmsg_reencode(commit,
-					  &context.commit_encoding,
-					  utf8);
+	context.message = repo_logmsg_reencode(r, commit,
+					       &context.commit_encoding,
+					       utf8);
 
 	strbuf_expand(sb, format, format_commit_item, &context);
 	rewrap_message_tail(sb, &context, 0, 0, 0);
@@ -1547,7 +1548,7 @@ void format_commit_message(const struct commit *commit,
 	}
 
 	free(context.commit_encoding);
-	unuse_commit_buffer(commit, context.message);
+	repo_unuse_commit_buffer(r, commit, context.message);
 }
 
 static void pp_header(struct pretty_print_context *pp,
diff --git a/pretty.h b/pretty.h
index 7359d318a9..e6625269cf 100644
--- a/pretty.h
+++ b/pretty.h
@@ -103,9 +103,14 @@ void pp_remainder(struct pretty_print_context *pp, const char **msg_p,
  * Put the result to "sb".
  * Please use this function for custom formats.
  */
-void format_commit_message(const struct commit *commit,
+void repo_format_commit_message(struct repository *r,
+			const struct commit *commit,
 			const char *format, struct strbuf *sb,
 			const struct pretty_print_context *context);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define format_commit_message(c, f, s, con) \
+	repo_format_commit_message(the_repository, c, f, s, con)
+#endif
 
 /*
  * Parse given arguments from "arg", check it for correctness and
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 18/19] submodule: use submodule repos for object lookup
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (16 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 17/19] pretty: prepare format_commit_message " Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-19 20:37   ` Jonathan Tan
                     ` (2 more replies)
  2018-10-16 23:35 ` [PATCH 19/19] submodule: don't add submodule as odb for push Stefan Beller
                   ` (3 subsequent siblings)
  21 siblings, 3 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

This converts the 'show_submodule_header' function to use
the repository API properly, such that the submodule objects
are not added to the main object store.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 75 ++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 60 insertions(+), 15 deletions(-)

diff --git a/submodule.c b/submodule.c
index 4ee69cc014..7305ae2e10 100644
--- a/submodule.c
+++ b/submodule.c
@@ -444,7 +444,7 @@ static int prepare_submodule_summary(struct rev_info *rev, const char *path,
 	return prepare_revision_walk(rev);
 }
 
-static void print_submodule_summary(struct rev_info *rev, struct diff_options *o)
+static void print_submodule_summary(struct repository *r, struct rev_info *rev, struct diff_options *o)
 {
 	static const char format[] = "  %m %s";
 	struct strbuf sb = STRBUF_INIT;
@@ -455,7 +455,8 @@ static void print_submodule_summary(struct rev_info *rev, struct diff_options *o
 		ctx.date_mode = rev->date_mode;
 		ctx.output_encoding = get_log_output_encoding();
 		strbuf_setlen(&sb, 0);
-		format_commit_message(commit, format, &sb, &ctx);
+		repo_format_commit_message(r, commit, format, &sb,
+				      &ctx);
 		strbuf_addch(&sb, '\n');
 		if (commit->object.flags & SYMMETRIC_LEFT)
 			diff_emit_submodule_del(o, sb.buf);
@@ -482,14 +483,46 @@ void prepare_submodule_repo_env(struct argv_array *out)
 			 DEFAULT_GIT_DIR_ENVIRONMENT);
 }
 
-/* Helper function to display the submodule header line prior to the full
- * summary output. If it can locate the submodule objects directory it will
- * attempt to lookup both the left and right commits and put them into the
- * left and right pointers.
+/*
+ * Initialize 'out' based on the provided submodule path.
+ *
+ * Unlike repo_submodule_init, this tolerates submodules not present
+ * in .gitmodules. This function exists only to preserve historical behavior,
+ *
+ * Returns 0 on success, -1 when the submodule is not present.
+ */
+static int open_submodule(struct repository *out, const char *path)
+{
+	struct strbuf sb = STRBUF_INIT;
+
+	if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) {
+		strbuf_release(&sb);
+		return -1;
+	}
+
+	out->submodule_prefix = xstrdup(path);
+	out->submodule_prefix = xstrfmt("%s%s/",
+					the_repository->submodule_prefix ?
+					the_repository->submodule_prefix :
+					"", path);
+
+	strbuf_release(&sb);
+	return 0;
+}
+
+/*
+ * Helper function to display the submodule header line prior to the full
+ * summary output.
+ *
+ * If it can locate the submodule git directory it will create a repository
+ * handle for the submodule and lookup both the left and right commits and
+ * put them into the left and right pointers.
  */
-static void show_submodule_header(struct diff_options *o, const char *path,
+static void show_submodule_header(struct diff_options *o,
+		const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule,
+		struct repository *sub,
 		struct commit **left, struct commit **right,
 		struct commit_list **merge_bases)
 {
@@ -508,7 +541,7 @@ static void show_submodule_header(struct diff_options *o, const char *path,
 	else if (is_null_oid(two))
 		message = "(submodule deleted)";
 
-	if (add_submodule_odb(path)) {
+	if (!sub) {
 		if (!message)
 			message = "(commits not present)";
 		goto output_header;
@@ -518,8 +551,8 @@ static void show_submodule_header(struct diff_options *o, const char *path,
 	 * Attempt to lookup the commit references, and determine if this is
 	 * a fast forward or fast backwards update.
 	 */
-	*left = lookup_commit_reference(the_repository, one);
-	*right = lookup_commit_reference(the_repository, two);
+	*left = lookup_commit_reference(sub, one);
+	*right = lookup_commit_reference(sub, two);
 
 	/*
 	 * Warn about missing commits in the submodule project, but only if
@@ -529,7 +562,7 @@ static void show_submodule_header(struct diff_options *o, const char *path,
 	     (!is_null_oid(two) && !*right))
 		message = "(commits not present)";
 
-	*merge_bases = get_merge_bases(*left, *right);
+	*merge_bases = repo_get_merge_bases(sub, *left, *right);
 	if (*merge_bases) {
 		if ((*merge_bases)->item == *left)
 			fast_forward = 1;
@@ -563,16 +596,20 @@ void show_submodule_summary(struct diff_options *o, const char *path,
 	struct rev_info rev;
 	struct commit *left = NULL, *right = NULL;
 	struct commit_list *merge_bases = NULL;
+	struct repository subrepo, *sub = &subrepo;
+
+	if (open_submodule(&subrepo, path) < 0)
+		sub = NULL;
 
 	show_submodule_header(o, path, one, two, dirty_submodule,
-			      &left, &right, &merge_bases);
+			      sub, &left, &right, &merge_bases);
 
 	/*
 	 * If we don't have both a left and a right pointer, there is no
 	 * reason to try and display a summary. The header line should contain
 	 * all the information the user needs.
 	 */
-	if (!left || !right)
+	if (!left || !right || !sub)
 		goto out;
 
 	/* Treat revision walker failure the same as missing commits */
@@ -581,13 +618,15 @@ void show_submodule_summary(struct diff_options *o, const char *path,
 		goto out;
 	}
 
-	print_submodule_summary(&rev, o);
+	print_submodule_summary(sub, &rev, o);
 
 out:
 	if (merge_bases)
 		free_commit_list(merge_bases);
 	clear_commit_marks(left, ~0);
 	clear_commit_marks(right, ~0);
+	if (sub)
+		repo_clear(sub);
 }
 
 void show_submodule_inline_diff(struct diff_options *o, const char *path,
@@ -599,9 +638,13 @@ void show_submodule_inline_diff(struct diff_options *o, const char *path,
 	struct commit_list *merge_bases = NULL;
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf sb = STRBUF_INIT;
+	struct repository subrepo, *sub = &subrepo;
+
+	if (open_submodule(&subrepo, path) < 0)
+		sub = NULL;
 
 	show_submodule_header(o, path, one, two, dirty_submodule,
-			      &left, &right, &merge_bases);
+			      sub, &left, &right, &merge_bases);
 
 	/* We need a valid left and right commit to display a difference */
 	if (!(left || is_null_oid(one)) ||
@@ -662,6 +705,8 @@ void show_submodule_inline_diff(struct diff_options *o, const char *path,
 		clear_commit_marks(left, ~0);
 	if (right)
 		clear_commit_marks(right, ~0);
+	if (sub)
+		repo_clear(sub);
 }
 
 int should_update_submodules(void)
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 19/19] submodule: don't add submodule as odb for push
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (17 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 18/19] submodule: use submodule repos for object lookup Stefan Beller
@ 2018-10-16 23:35 ` Stefan Beller
  2018-10-19 20:39   ` Jonathan Tan
  2018-10-17 12:41 ` [PATCH 00/19] Bring more repository handles into our code base Derrick Stolee
                   ` (2 subsequent siblings)
  21 siblings, 1 reply; 49+ messages in thread
From: Stefan Beller @ 2018-10-16 23:35 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Stefan Beller

In push_submodule(), because we do not actually need access to objects
in the submodule, do not invoke add_submodule_odb().
(for_each_remote_ref_submodule() does not require access to those
objects, and the actual push is done by spawning another process,
which handles object access by itself.)

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index 7305ae2e10..e623e6bf7f 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1024,9 +1024,6 @@ static int push_submodule(const char *path,
 			  const struct string_list *push_options,
 			  int dry_run)
 {
-	if (add_submodule_odb(path))
-		return 1;
-
 	if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
 		struct child_process cp = CHILD_PROCESS_INIT;
 		argv_array_push(&cp.args, "push");
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/19] Bring more repository handles into our code base
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (18 preceding siblings ...)
  2018-10-16 23:35 ` [PATCH 19/19] submodule: don't add submodule as odb for push Stefan Beller
@ 2018-10-17 12:41 ` Derrick Stolee
  2018-10-17 17:53   ` Stefan Beller
  2018-10-19  7:23 ` [PATCH 00/19] Bring more repository handles into our code base Junio C Hamano
  2018-10-22 17:39 ` New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base] SZEDER Gábor
  21 siblings, 1 reply; 49+ messages in thread
From: Derrick Stolee @ 2018-10-17 12:41 UTC (permalink / raw)
  To: Stefan Beller, git; +Cc: jonathantanmy

On 10/16/2018 7:35 PM, Stefan Beller wrote:
>              
>      This series takes another approach as it doesn't change the signature of
>      functions, but introduces new functions that can deal with arbitrary
>      repositories, keeping the old function signature around using a shallow wrapper.
I think this is a good direction, and the changes look good to me.

>      Additionally each patch adds a semantic patch, that would port from the old to
>      the new function. These semantic patches are all applied in the very last patch,
>      but we could omit applying the last patch if it causes too many merge conflicts
>      and trickl in the semantic patches over time when there are no merge conflicts.

The semantic patches are a good idea. At some point in the future, we 
can submit a series that applies the patches to any leftover calls and 
removes the old callers. We can hopefully rely on review (and the 
semantic patches warning that there is work to do) to prevent new 
callers from being introduced in future topics. That doesn't count 
topics that come around while this one isn't merged down.

I had one high-level question: How are we testing that these "arbitrary 
repository" changes are safe? I just remember the issue we had with the 
commit-graph code relying on arbitrary repositories, but then adding a 
reference to the replace objects broke the code (because replace-objects 
wasn't using arbitrary repos correctly, but the commit-graph was tested 
with arbitrary repositories using "test-tool repository"). It would be 
nice to introduce more method calls in t/helper/test-repository.c that 
help us know these are safe conversions. Otherwise, we are essentially 
waiting until we try to take submodule things in-process and find out 
what breaks. As we discovered with the refstore, we can't just ensure 
that all references to the_repository are removed.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/19] Bring more repository handles into our code base
  2018-10-17 12:41 ` [PATCH 00/19] Bring more repository handles into our code base Derrick Stolee
@ 2018-10-17 17:53   ` Stefan Beller
  2018-10-18 18:37     ` [RFC PATCH 0/2] Bring the_repository into cmd_foo Stefan Beller
  0 siblings, 1 reply; 49+ messages in thread
From: Stefan Beller @ 2018-10-17 17:53 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, Jonathan Tan

On Wed, Oct 17, 2018 at 5:41 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 10/16/2018 7:35 PM, Stefan Beller wrote:
> >
> >      This series takes another approach as it doesn't change the signature of
> >      functions, but introduces new functions that can deal with arbitrary
> >      repositories, keeping the old function signature around using a shallow wrapper.
> I think this is a good direction, and the changes look good to me.
>
> >      Additionally each patch adds a semantic patch, that would port from the old to
> >      the new function. These semantic patches are all applied in the very last patch,
> >      but we could omit applying the last patch if it causes too many merge conflicts
> >      and trickl in the semantic patches over time when there are no merge conflicts.
>
> The semantic patches are a good idea. At some point in the future, we
> can submit a series that applies the patches to any leftover calls and
> removes the old callers. We can hopefully rely on review (and the
> semantic patches warning that there is work to do) to prevent new
> callers from being introduced in future topics. That doesn't count
> topics that come around while this one isn't merged down.

For those topics still in flight, I added re-defines, e.g.

#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
#define get_merge_bases(r1, r2)
repo_get_merge_bases(the_repository, r1, r2)
#endif

so the base function still keeps working, and we can cleanup
multiple times, until eventually, we can get rid of the base function.

> I had one high-level question: How are we testing that these "arbitrary
> repository" changes are safe?

I did the bare minimum in conversions in this series, such that the
submodule code tests successfully. So if we'd revert some parts,
the submodule tests would break.



> I just remember the issue we had with the
> commit-graph code relying on arbitrary repositories, but then adding a
> reference to the replace objects broke the code (because replace-objects
> wasn't using arbitrary repos correctly, but the commit-graph was tested
> with arbitrary repositories using "test-tool repository"). It would be
> nice to introduce more method calls in t/helper/test-repository.c that
> help us know these are safe conversions.

Or instead we could accelerate the long term plan of removing a
hard coded the_repository and have each cmd builtin take an additional
repository pointer from the init code, such that we'd bring all of Git to
work on arbitrary repositories. Then the standard test suite should be
okay, as there is no special case for the_repository any more.

> Otherwise, we are essentially
> waiting until we try to take submodule things in-process and find out
> what breaks. As we discovered with the refstore, we can't just ensure
> that all references to the_repository are removed.

Yes, that is correct. We had a similar case with partial clone,
as laid out in the cover letter for the RFC.

I'll explore both the test tool approach as well as
repository-fication of the code base.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [RFC PATCH 0/2] Bring the_repository into cmd_foo
  2018-10-17 17:53   ` Stefan Beller
@ 2018-10-18 18:37     ` Stefan Beller
  2018-10-18 18:37       ` [RFC PATCH 1/2] repository: have get_the_repository() to remove the_repository dependency Stefan Beller
                         ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-18 18:37 UTC (permalink / raw)
  To: sbeller; +Cc: git, jonathantanmy, stolee

> On Wed, Oct 17, 2018 at 5:41 AM Derrick Stolee <stolee@gmail.com> wrote:
>> I had one high-level question: How are we testing that these "arbitrary
>> repository" changes are safe?
> [...]
> Or instead we could accelerate the long term plan of removing a
> hard coded the_repository and have each cmd builtin take an additional
> repository pointer from the init code, such that we'd bring all of Git to
> work on arbitrary repositories. Then the standard test suite should be
> okay, as there is no special case for the_repository any more.

Demo'd in this RFC series for git-merge-base.

The core idea is found in patch 1,
and the proof of concept is found in patch 2.

What do you think?

Thanks,
Stefan

Stefan Beller (3):
  repository: have get_the_repository() to remove the_repository
    dependency
  builtin/merge-base.c: do not rely on the_repository any more

 builtin/merge-base.c  | 67 ++++++++++++++++++++++++++-----------------
 repository.c          | 10 +++++++
 repository.h          | 13 ++++++++-
 t/t6010-merge-base.sh |  3 +-

-- 
2.19.0


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [RFC PATCH 1/2] repository: have get_the_repository() to remove the_repository dependency
  2018-10-18 18:37     ` [RFC PATCH 0/2] Bring the_repository into cmd_foo Stefan Beller
@ 2018-10-18 18:37       ` Stefan Beller
  2018-10-18 18:37       ` [RFC PATCH 2/2 (BREAKS BUILD)] builtin/merge-base.c: do not rely on the_repository any more Stefan Beller
  2018-10-18 21:01       ` [RFC PATCH 0/2] Bring the_repository into cmd_foo Jonathan Tan
  2 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-18 18:37 UTC (permalink / raw)
  To: sbeller; +Cc: git, jonathantanmy, stolee

The struct 'the_repo' contains all data that of the main repository.
As we move more and more globals into this struct, the usual way of
accessing these is using 'the_repository', which can be used as a drop in
replacement for accessing the migrated globals.

However during the migration of globals into the repository object, it is
not clear if some code path rely on the_repository or can work on an
arbitrary repository (as we'd eventually want for submodules) due to the
excessive use of the_repository throughout the code base.

To address this, introduce a function 'get_the_repository(void)' which
will return the main repository and set the_repository to NULL when the
environment variable GIT_NO_THE_REPOSITORY is set.

This function is to be strictly used only at the beginning of builtin
command to assign it to a local repository pointer that we'll use to
pass through the code base.

By having the possibility to set the_repository to NULL, we'll get
a segfault when we try to access the_repository instead of using the
handle that we pass around.

This approach let's us have the_repository in the setup code, which
in its current form is not yet able to transition into a world where
the repository handle is passed around and only test the passing around
of the repository handle for later stage code.

Eventually in the future the setup code will produce the repository
handle and each 'cmd_foo(int argc, char **argv)' builtin would get the
repository via an additional parameter.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 repository.c | 10 ++++++++++
 repository.h | 13 ++++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index 5dd1486718..aaaababb88 100644
--- a/repository.c
+++ b/repository.c
@@ -20,6 +20,16 @@ void initialize_the_repository(void)
 	repo_set_hash_algo(&the_repo, GIT_HASH_SHA1);
 }

+struct repository *get_the_repository(void)
+{
+	struct repository *r = the_repository;
+
+	if (getenv("GIT_NO_THE_REPOSITORY"))
+		the_repository = NULL;
+
+	return r;
+}
+
 static void expand_base_dir(char **out, const char *in,
 			    const char *base_dir, const char *def_in)
 {
diff --git a/repository.h b/repository.h
index 9f16c42c1e..26f5d64f68 100644
--- a/repository.h
+++ b/repository.h
@@ -114,13 +114,24 @@ void repo_set_gitdir(struct repository *repo, const char *root,
 		     const struct set_gitdir_args *extra_args);
 void repo_set_worktree(struct repository *repo, const char *path);
 void repo_set_hash_algo(struct repository *repo, int algo);
-void initialize_the_repository(void);
 int repo_init(struct repository *r, const char *gitdir, const char *worktree);
 int repo_submodule_init(struct repository *submodule,
 			struct repository *superproject,
 			const char *path);
 void repo_clear(struct repository *repo);

+/*
+ * Initializes the repository 'the_repository', which is used in the transition
+ * phase of moving globals into the repository struct.
+ */
+void initialize_the_repository(void);
+
+/*
+ * To be called once; after the call use only returned repository, and do not
+ * use the_repository any more
+ */
+struct repository *get_the_repository(void);
+
 /*
  * Populates the repository's index from its index_file, an index struct will
  * be allocated if needed.
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [RFC PATCH 2/2 (BREAKS BUILD)] builtin/merge-base.c: do not rely on the_repository any more
  2018-10-18 18:37     ` [RFC PATCH 0/2] Bring the_repository into cmd_foo Stefan Beller
  2018-10-18 18:37       ` [RFC PATCH 1/2] repository: have get_the_repository() to remove the_repository dependency Stefan Beller
@ 2018-10-18 18:37       ` Stefan Beller
  2018-10-18 21:01       ` [RFC PATCH 0/2] Bring the_repository into cmd_foo Jonathan Tan
  2 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-18 18:37 UTC (permalink / raw)
  To: sbeller; +Cc: git, jonathantanmy, stolee

To avoid creeping in the dependency of the_repository,
use GIT_NO_THE_REPOSITORY in the test to prove it still works.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

This doesn't work yet, as we have not converted get_oid, yet.
It proves that GIT_NO_THE_REPOSITORY works, though.

Stefan

 builtin/merge-base.c  | 67 ++++++++++++++++++++++++++-----------------
 t/t6010-merge-base.sh |  3 +-
 2 files changed, 43 insertions(+), 27 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 1c92099070..29341f8839 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -1,3 +1,4 @@
+#define NO_THE_REPOSITORY_COMPATIBILITY_MACROS
 #include "builtin.h"
 #include "cache.h"
 #include "config.h"
@@ -37,27 +38,28 @@ static const char * const merge_base_usage[] = {
 	NULL
 };
 
-static struct commit *get_commit_reference(const char *arg)
+static struct commit *get_commit_reference(struct repository *r,
+					   const char *arg)
 {
 	struct object_id revkey;
-	struct commit *r;
+	struct commit *ref;
 
 	if (get_oid(arg, &revkey))
 		die("Not a valid object name %s", arg);
-	r = lookup_commit_reference(the_repository, &revkey);
-	if (!r)
+	ref = lookup_commit_reference(r, &revkey);
+	if (!ref)
 		die("Not a valid commit name %s", arg);
 
-	return r;
+	return ref;
 }
 
-static int handle_independent(int count, const char **args)
+static int handle_independent(struct repository *r, int count, const char **args)
 {
 	struct commit_list *revs = NULL, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
-		commit_list_insert(get_commit_reference(args[i]), &revs);
+		commit_list_insert(get_commit_reference(r, args[i]), &revs);
 
 	reduce_heads_replace(&revs);
 
@@ -71,14 +73,16 @@ static int handle_independent(int count, const char **args)
 	return 0;
 }
 
-static int handle_octopus(int count, const char **args, int show_all)
+static int handle_octopus(struct repository *r,
+			  int count, const char **args,
+			  int show_all)
 {
 	struct commit_list *revs = NULL;
 	struct commit_list *result, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
-		commit_list_insert(get_commit_reference(args[i]), &revs);
+		commit_list_insert(get_commit_reference(r, args[i]), &revs);
 
 	result = get_octopus_merge_bases(revs);
 	free_commit_list(revs);
@@ -97,15 +101,15 @@ static int handle_octopus(int count, const char **args, int show_all)
 	return 0;
 }
 
-static int handle_is_ancestor(int argc, const char **argv)
+static int handle_is_ancestor(struct repository *r, int argc, const char **argv)
 {
 	struct commit *one, *two;
 
 	if (argc != 2)
 		die("--is-ancestor takes exactly two commits");
-	one = get_commit_reference(argv[0]);
-	two = get_commit_reference(argv[1]);
-	if (in_merge_bases(one, two))
+	one = get_commit_reference(r, argv[0]);
+	two = get_commit_reference(r, argv[1]);
+	if (repo_in_merge_bases(r, one, two))
 		return 0;
 	else
 		return 1;
@@ -116,19 +120,22 @@ struct rev_collect {
 	int nr;
 	int alloc;
 	unsigned int initial : 1;
+	struct repository *repo;
 };
 
-static void add_one_commit(struct object_id *oid, struct rev_collect *revs)
+static void add_one_commit(struct repository *r,
+			   struct object_id *oid,
+			   struct rev_collect *revs)
 {
 	struct commit *commit;
 
 	if (is_null_oid(oid))
 		return;
 
-	commit = lookup_commit(the_repository, oid);
+	commit = lookup_commit(r, oid);
 	if (!commit ||
 	    (commit->object.flags & TMP_MARK) ||
-	    parse_commit(commit))
+	    repo_parse_commit(r, commit))
 		return;
 
 	ALLOC_GROW(revs->commit, revs->nr + 1, revs->alloc);
@@ -144,13 +151,13 @@ static int collect_one_reflog_ent(struct object_id *ooid, struct object_id *noid
 
 	if (revs->initial) {
 		revs->initial = 0;
-		add_one_commit(ooid, revs);
+		add_one_commit(revs->repo, ooid, revs);
 	}
-	add_one_commit(noid, revs);
+	add_one_commit(revs->repo, noid, revs);
 	return 0;
 }
 
-static int handle_fork_point(int argc, const char **argv)
+static int handle_fork_point(struct repository *r, int argc, const char **argv)
 {
 	struct object_id oid;
 	char *refname;
@@ -173,13 +180,14 @@ static int handle_fork_point(int argc, const char **argv)
 	if (get_oid(commitname, &oid))
 		die("Not a valid object name: '%s'", commitname);
 
-	derived = lookup_commit_reference(the_repository, &oid);
+	derived = lookup_commit_reference(r, &oid);
 	memset(&revs, 0, sizeof(revs));
 	revs.initial = 1;
+	revs.repo = r;
 	for_each_reflog_ent(refname, collect_one_reflog_ent, &revs);
 
 	if (!revs.nr && !get_oid(refname, &oid))
-		add_one_commit(&oid, &revs);
+		add_one_commit(r, &oid, &revs);
 
 	for (i = 0; i < revs.nr; i++)
 		revs.commit[i]->object.flags &= ~TMP_MARK;
@@ -217,6 +225,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix)
 	int rev_nr = 0;
 	int show_all = 0;
 	int cmdmode = 0;
+	struct repository *r;
 
 	struct option options[] = {
 		OPT_BOOL('a', "all", &show_all, N_("output all common ancestors")),
@@ -234,27 +243,33 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix)
 	git_config(git_default_config, NULL);
 	argc = parse_options(argc, argv, prefix, options, merge_base_usage, 0);
 
+	/*
+	 * TODO: once the config machinery can cope without proper setup of
+	 * the_repository, move this call up
+	 */
+	r = get_the_repository();
+
 	if (cmdmode == 'a') {
 		if (argc < 2)
 			usage_with_options(merge_base_usage, options);
 		if (show_all)
 			die("--is-ancestor cannot be used with --all");
-		return handle_is_ancestor(argc, argv);
+		return handle_is_ancestor(r, argc, argv);
 	}
 
 	if (cmdmode == 'r' && show_all)
 		die("--independent cannot be used with --all");
 
 	if (cmdmode == 'o')
-		return handle_octopus(argc, argv, show_all);
+		return handle_octopus(r, argc, argv, show_all);
 
 	if (cmdmode == 'r')
-		return handle_independent(argc, argv);
+		return handle_independent(r, argc, argv);
 
 	if (cmdmode == 'f') {
 		if (argc < 1 || 2 < argc)
 			usage_with_options(merge_base_usage, options);
-		return handle_fork_point(argc, argv);
+		return handle_fork_point(r, argc, argv);
 	}
 
 	if (argc < 2)
@@ -262,6 +277,6 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix)
 
 	ALLOC_ARRAY(rev, argc);
 	while (argc-- > 0)
-		rev[rev_nr++] = get_commit_reference(*argv++);
+		rev[rev_nr++] = get_commit_reference(r, *argv++);
 	return show_merge_base(rev, rev_nr, show_all);
 }
diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
index 44c726ea39..ae1c85ee8d 100755
--- a/t/t6010-merge-base.sh
+++ b/t/t6010-merge-base.sh
@@ -15,7 +15,8 @@ GIT_COMMITTER_EMAIL=git@comm.iter.xz
 GIT_COMMITTER_NAME='C O Mmiter'
 GIT_AUTHOR_NAME='A U Thor'
 GIT_AUTHOR_EMAIL=git@au.thor.xz
-export GIT_COMMITTER_EMAIL GIT_COMMITTER_NAME GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL
+GIT_NO_THE_REPOSITORY=1
+export GIT_COMMITTER_EMAIL GIT_COMMITTER_NAME GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_NO_THE_REPOSITORY
 
 doit () {
 	OFFSET=$1 &&
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH 0/2] Bring the_repository into cmd_foo
  2018-10-18 18:37     ` [RFC PATCH 0/2] Bring the_repository into cmd_foo Stefan Beller
  2018-10-18 18:37       ` [RFC PATCH 1/2] repository: have get_the_repository() to remove the_repository dependency Stefan Beller
  2018-10-18 18:37       ` [RFC PATCH 2/2 (BREAKS BUILD)] builtin/merge-base.c: do not rely on the_repository any more Stefan Beller
@ 2018-10-18 21:01       ` Jonathan Tan
  2018-10-18 23:23         ` Stefan Beller
  2 siblings, 1 reply; 49+ messages in thread
From: Jonathan Tan @ 2018-10-18 21:01 UTC (permalink / raw)
  To: sbeller; +Cc: git, jonathantanmy, stolee

> > Or instead we could accelerate the long term plan of removing a
> > hard coded the_repository and have each cmd builtin take an additional
> > repository pointer from the init code, such that we'd bring all of Git to
> > work on arbitrary repositories. Then the standard test suite should be
> > okay, as there is no special case for the_repository any more.
> 
> Demo'd in this RFC series for git-merge-base.
> 
> The core idea is found in patch 1,
> and the proof of concept is found in patch 2.

I don't think working around the_repository is sufficient, as there are
other ways to access the same repository state (the_index, directly
accessing files through file I/O). Instead I would prefer a test like in
t/test-repository.c - each patch set would probably only need one test
for the last function converted, since typically the last function uses
every other function converted.

Also, even if we decided that working around the_repository is
sufficient, I don't think this get_the_repository() is a good approach.
If (or when) we decide to convert all builtins to not use
the_repository, we would have to clean up all such calls.

Better would be to pass the_repository from the invoker of the cmd
builtin, and reuse NO_THE_REPOSITORY_COMPATIBILITY_MACROS in the
builtin. (I haven't thought much about how to transition to this, but
one way might be to extend "struct cmd_struct" in git.c to also have a
new-style function pointer, and #define GIT_CMD(c, f, o) {c, NULL, o, f}
or something like that.)

This doesn't directly address the fact that the builtin might call lib
code that indirectly references the_repository, but I think that this
won't be an issue because by the time we're ready to convert builtins to
not use the_repository, most if not all of the lib code would have
NO_THE_REPOSITORY_COMPATIBILITY_MACROS defined anyway.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH 0/2] Bring the_repository into cmd_foo
  2018-10-18 21:01       ` [RFC PATCH 0/2] Bring the_repository into cmd_foo Jonathan Tan
@ 2018-10-18 23:23         ` Stefan Beller
  0 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-18 23:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Derrick Stolee

On Thu, Oct 18, 2018 at 2:01 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > > Or instead we could accelerate the long term plan of removing a
> > > hard coded the_repository and have each cmd builtin take an additional
> > > repository pointer from the init code, such that we'd bring all of Git to
> > > work on arbitrary repositories. Then the standard test suite should be
> > > okay, as there is no special case for the_repository any more.
> >
> > Demo'd in this RFC series for git-merge-base.
> >
> > The core idea is found in patch 1,
> > and the proof of concept is found in patch 2.
>
> I don't think working around the_repository is sufficient, as there are
> other ways to access the same repository state (the_index, directly
> accessing files through file I/O).

Sure, but that would stick out like a sore thumb?

> Instead I would prefer a test like in
> t/test-repository.c - each patch set would probably only need one test
> for the last function converted, since typically the last function uses
> every other function converted.

I'll look into that.

>
> Also, even if we decided that working around the_repository is
> sufficient, I don't think this get_the_repository() is a good approach.
> If (or when) we decide to convert all builtins to not use
> the_repository, we would have to clean up all such calls.

Just like we have to cleanup the calls to the_repository or the_index
in general now (c.f. nd/the-index)

> Better would be to pass the_repository from the invoker of the cmd
> builtin, and reuse NO_THE_REPOSITORY_COMPATIBILITY_MACROS in the
> builtin. (I haven't thought much about how to transition to this, but
> one way might be to extend "struct cmd_struct" in git.c to also have a
> new-style function pointer, and #define GIT_CMD(c, f, o) {c, NULL, o, f}
> or something like that.)

This sounds like the next step to me.

> This doesn't directly address the fact that the builtin might call lib
> code that indirectly references the_repository, but I think that this
> won't be an issue because by the time we're ready to convert builtins to
> not use the_repository, most if not all of the lib code would have
> NO_THE_REPOSITORY_COMPATIBILITY_MACROS defined anyway.

And until then we double up on tests, one time the regular end-to-end tests
and additional tests for repository agnostic units in test-repository.c ?

The whole point of this approach is to keep the testing at the level
that we currently have and make the tests more powerful in doubling
for both (a) testing existing behavior, (b) getting fairly good coverage
of repository-fication of the code base by these 2 simple knobs.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/19] Bring more repository handles into our code base
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (19 preceding siblings ...)
  2018-10-17 12:41 ` [PATCH 00/19] Bring more repository handles into our code base Derrick Stolee
@ 2018-10-19  7:23 ` Junio C Hamano
  2018-10-22 17:39 ` New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base] SZEDER Gábor
  21 siblings, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2018-10-19  7:23 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jonathantanmy

Stefan Beller <sbeller@google.com> writes:

> This rerolls sb/more-repo-in-api.
> It applies on nd/the-index merged with ds/reachable and is available via
>     git fetch https://github.com/stefanbeller/git object-store-final-3

Thanks.  Luckily we have both of these prerequisites in 'master'
now, o hopefully this can be applied to 'master' directly and would
play well when merged to 'next' and to 'pu'.

Will queue and play with it a bit before sending comments.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 18/19] submodule: use submodule repos for object lookup
  2018-10-16 23:35 ` [PATCH 18/19] submodule: use submodule repos for object lookup Stefan Beller
@ 2018-10-19 20:37   ` Jonathan Tan
  2018-10-25  9:14   ` SZEDER Gábor
  2018-10-31 13:38   ` Derrick Stolee
  2 siblings, 0 replies; 49+ messages in thread
From: Jonathan Tan @ 2018-10-19 20:37 UTC (permalink / raw)
  To: sbeller; +Cc: git, jonathantanmy

> This converts the 'show_submodule_header' function to use
> the repository API properly, such that the submodule objects
> are not added to the main object store.

There is also a side effect in that the submodule now needs to pass all
the checks done by repo_init() instead of merely having  the "objects/"
directory. Can you add information about this to the commit message?

Also, which tests exercise this functionality? Mention them in the
commit message.

> +/*
> + * Initialize 'out' based on the provided submodule path.
> + *
> + * Unlike repo_submodule_init, this tolerates submodules not present
> + * in .gitmodules. This function exists only to preserve historical behavior,
> + *
> + * Returns 0 on success, -1 when the submodule is not present.
> + */
> +static int open_submodule(struct repository *out, const char *path)
> +{
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) {
> +		strbuf_release(&sb);
> +		return -1;
> +	}
> +
> +	out->submodule_prefix = xstrdup(path);
> +	out->submodule_prefix = xstrfmt("%s%s/",
> +					the_repository->submodule_prefix ?
> +					the_repository->submodule_prefix :
> +					"", path);

You seem to say here [1] that we don't need submodule_prefix, but you're
instead setting it twice? :-)

[1] https://public-inbox.org/git/CAGZ79kZTxoNMUyX+EHg0q3Ss2M-Etkf0yiw3E40U3VCt4QMwrQ@mail.gmail.com/

> +/*
> + * Helper function to display the submodule header line prior to the full
> + * summary output.
> + *
> + * If it can locate the submodule git directory it will create a repository
> + * handle for the submodule and lookup both the left and right commits and
> + * put them into the left and right pointers.
>   */
> -static void show_submodule_header(struct diff_options *o, const char *path,
> +static void show_submodule_header(struct diff_options *o,
> +		const char *path,
>  		struct object_id *one, struct object_id *two,
>  		unsigned dirty_submodule,
> +		struct repository *sub,
>  		struct commit **left, struct commit **right,
>  		struct commit_list **merge_bases)
>  {

Location of the submodule git directory is done by the caller of this
function, not by this function. Also this function doesn't create any
repository handles. The documentation probably needs to be updated. Also
mention what happens if sub is NULL.

> @@ -563,16 +596,20 @@ void show_submodule_summary(struct diff_options *o, const char *path,
>  	struct rev_info rev;
>  	struct commit *left = NULL, *right = NULL;
>  	struct commit_list *merge_bases = NULL;
> +	struct repository subrepo, *sub = &subrepo;
> +
> +	if (open_submodule(&subrepo, path) < 0)
> +		sub = NULL;

Handling of the subrepo and *sub seems clumsy - it might be better to
just let open_submodule() return a struct repository pointer.

Previous 17 patches look good - most are the same as the previous
version, which I already was happy with, and I see that the patch I
requested in [2] was added.

[2] https://public-inbox.org/git/20181011221114.186377-1-jonathantanmy@google.com/

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 19/19] submodule: don't add submodule as odb for push
  2018-10-16 23:35 ` [PATCH 19/19] submodule: don't add submodule as odb for push Stefan Beller
@ 2018-10-19 20:39   ` Jonathan Tan
  0 siblings, 0 replies; 49+ messages in thread
From: Jonathan Tan @ 2018-10-19 20:39 UTC (permalink / raw)
  To: sbeller; +Cc: git, jonathantanmy

> In push_submodule(), because we do not actually need access to objects
> in the submodule, do not invoke add_submodule_odb().
> (for_each_remote_ref_submodule() does not require access to those
> objects, and the actual push is done by spawning another process,
> which handles object access by itself.)

The code looks good - my analysis is the same as that in my review of
the previous version [1].

Can you mention, in the commit message, the tests that exercise the
functionality here (and say that they still pass)?

[1] https://public-inbox.org/git/20181011230028.200488-1-jonathantanmy@google.com/
> 
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  submodule.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/submodule.c b/submodule.c
> index 7305ae2e10..e623e6bf7f 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1024,9 +1024,6 @@ static int push_submodule(const char *path,
>  			  const struct string_list *push_options,
>  			  int dry_run)
>  {
> -	if (add_submodule_odb(path))
> -		return 1;
> -
>  	if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
>  		struct child_process cp = CHILD_PROCESS_INIT;
>  		argv_array_push(&cp.args, "push");
> -- 
> 2.19.0
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
                   ` (20 preceding siblings ...)
  2018-10-19  7:23 ` [PATCH 00/19] Bring more repository handles into our code base Junio C Hamano
@ 2018-10-22 17:39 ` SZEDER Gábor
  2018-10-22 18:54   ` Stefan Beller
                     ` (2 more replies)
  21 siblings, 3 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 17:39 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, git, jonathantanmy

On Tue, Oct 16, 2018 at 04:35:31PM -0700, Stefan Beller wrote:
> the last patch (applying the semantic patches) has been omitted as that
> would produce a lot of merge conflicts. Without that patch, this merges
> cleanly to next.
> 
> As for when to apply the semantic patches, I wondered if we would prefer a dirty merge
> (created via "make coccicheck && git apply contrib/coccinelle/the_repository.cocci.patch")
> of the semantic patches, or if we'd actually trickle in the changes over some time?

>     This series takes another approach as it doesn't change the signature of
>     functions, but introduces new functions that can deal with arbitrary 
>     repositories, keeping the old function signature around using a shallow wrapper.
>     
>     Additionally each patch adds a semantic patch, that would port from the old to
>     the new function. These semantic patches are all applied in the very last patch,
>     but we could omit applying the last patch if it causes too many merge conflicts
>     and trickl in the semantic patches over time when there are no merge conflicts.

I don't really like how this or the previous RFC patch series deal
with semantic patches (or how some past patch series dealt with them,
for that matter), for various reasons:

  - Applying the transformations from several semantic patches in one
    single patch makes it harder to review it, because we won't know
    which change came from which semantic patch.
    
    For comparison, see the patch series adding hasheq()/oideq(),
    merged in 769af0fd9e (Merge branch 'jk/cocci', 2018-09-17), in
    particular the four "convert <this> to <that>" patches.

  - 'make coccicheck' won't run clean (and the static analysis build
    job on Travis CI will fail) for all commits following adding the
    new semantic patches but before applying the resulting
    transformations.

  - These semantic patches interact badly with 'pu' and 'next',
    because those integration branches can contain topic branches
    adding new code that should be transformed by these semanic
    patches.  Consequently, 'make coccicheck' won't run clean and the
    static analysis build job will fail until all those topics reach
    'master', and the remaining transformations are applied on top.
    
    This was (and still is!) an issue with the hasheq()/oideq() series
    as well: that series was added on 2018-08-28, and the static
    analysis build job is red on 'pu' ever since.  See the follow-up
    patch e43d2dcce1 (more oideq/hasheq conversions, 2018-10-02), and
    one more follow-up will be necessary after the builtin stash topic
    is merged to 'master'.

    This makes it harder to review other patch series.

  - Is it really necessary to carry these semantic patches in-tree?
    Let me ellaborate.  There are basically two main use cases for
    semantic patches:

      - To avoid undesirable code patterns, e.g. we should not use
        sha1_to_hex(oid.hash) or strbuf_addf(&sb, "fixed string"), but
        use oid_to_hex(&oid) or strbuf_addstr(&sb, "fixed string")
        instead.  Note that in these cases we don't remove the
        functions sha1_to_hex() or strbuf_addf(), because there are
        good reasons to use them in other scenarios.

        Our semantic patches under 'contrib/coccinelle/' fall into
        this category, and we have 'make coccicheck' and the static
        analysis build job on Travis CI to catch these undesirable
        code patterns preferably early, and to prevent them from
        entering our codebase.

      - To perform one-off code transformations, e.g. to modify a
        function's name and/or signature and convert all its
        callsites; see e.g. commits abef9020e3 (sha1_file: convert
        sha1_object_info* to object_id, 2018-03-12) and b4f5aca40e
        (sha1_file: convert read_sha1_file to struct object_id,
        2018-03-12).

        As far as I understand this patch series falls into this
        category: once the conversion is complete the old functions
        will be removed.  After that there will be no use for these
        semanic patches.

        Having said that, it's certainly easier to double-check the
        resulting transformations when one can apply the semantic
        patches locally, and doing so is easier when the semantic
        patches are in tree than when they must be copy-pasted from a
        commit message.

OK, that was already long.  Now, can we do better?

How about introducing the concept of "pending" semantic patches,
stored in 'contrib/coccinelle/<name>.pending.cocci' files, modifying
'make coccicheck' to skip them, and adding the new 'make
coccicheck-pending' target to make it convenient to apply them, e.g.
something like the simple patch at the end.

So the process would go something like this:

  - A new semantic patch should be added as "pending", e.g. to the
    file 'the_repository.pending.cocci', together with the resulting
    transformations in the same commit.

    This way neither 'make coccicheck' nor the static analysis build
    job would complain in the topic branch or in the two integration
    branches.  And if they do complain, then we would know right away
    that they complain because of a well-established semantic patch.
    Yet, anyone interested could run 'make coccicheck-pending' to see
    where are we heading.

  - The author of the "pending" semanting patch should then keep an
    eye on already cooking topics: whether any of them contain new
    code that should be transformed, and how they progress to
    'master', and sending followup patch(es) with the remaining
    transformations when applicable.
    
    Futhermore, the author should also pay attention to any new topics
    that branch off after the "pending" semantic patch, and whether
    any of them introduce code to be transformed, warning their
    authors as necessary.

  - Finally, after all the dust settled, the dev should follow up with
    a patch to:
    
      - promote the "penging" patch to '<name>.cocci', if its purpose
        is to avoid undesirable code patterns in the future, or
    
      - remove the semantic patch, if it was used in a one-off
        transformation.

Thoughts?


  -- >8 --

Subject: [PATCH] coccinelle: add support for "pending" semantic patches

(should add a 'contrib/coccinelle/README' to explain what
'.pending.cocci' means)

---
 Makefile | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index d18ab0fe78..67a4441f39 100644
--- a/Makefile
+++ b/Makefile
@@ -2728,9 +2728,11 @@ endif
 	then \
 		echo '    ' SPATCH result: $@; \
 	fi
-coccicheck: $(addsuffix .patch,$(wildcard contrib/coccinelle/*.cocci))
+coccicheck: $(addsuffix .patch,$(filter-out %.pending.cocci,$(wildcard contrib/coccinelle/*.cocci)))
 
-.PHONY: coccicheck
+coccicheck-pending: $(addsuffix .patch,$(wildcard contrib/coccinelle/*.pending.cocci))
+
+.PHONY: coccicheck coccicheck-pending
 
 ### Installation rules
 
-- 
2.19.1.681.g6bd79da3f5




^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-22 17:39 ` New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base] SZEDER Gábor
@ 2018-10-22 18:54   ` Stefan Beller
  2018-10-25  1:59     ` SZEDER Gábor
  2018-10-22 22:49   ` Junio C Hamano
  2018-10-25  5:39   ` Jeff King
  2 siblings, 1 reply; 49+ messages in thread
From: Stefan Beller @ 2018-10-22 18:54 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Jeff King, git, Jonathan Tan

On Mon, Oct 22, 2018 at 10:39 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Tue, Oct 16, 2018 at 04:35:31PM -0700, Stefan Beller wrote:
> > the last patch (applying the semantic patches) has been omitted as that
> > would produce a lot of merge conflicts. Without that patch, this merges
> > cleanly to next.
> >
> > As for when to apply the semantic patches, I wondered if we would prefer a dirty merge
> > (created via "make coccicheck && git apply contrib/coccinelle/the_repository.cocci.patch")
> > of the semantic patches, or if we'd actually trickle in the changes over some time?
>
> >     This series takes another approach as it doesn't change the signature of
> >     functions, but introduces new functions that can deal with arbitrary
> >     repositories, keeping the old function signature around using a shallow wrapper.
> >
> >     Additionally each patch adds a semantic patch, that would port from the old to
> >     the new function. These semantic patches are all applied in the very last patch,
> >     but we could omit applying the last patch if it causes too many merge conflicts
> >     and trickl in the semantic patches over time when there are no merge conflicts.
>
> I don't really like how this or the previous RFC patch series deal
> with semantic patches (or how some past patch series dealt with them,
> for that matter), for various reasons:
>
>   - Applying the transformations from several semantic patches in one
>     single patch makes it harder to review it, because we won't know
>     which change came from which semantic patch.

Good point, so to improve the series sb/more-repo-in-api, I could
send the application of the semantic patch just after each patch
that adds another semantic patching rule? I personally dislike
applying the semantic patch in the same patch as where the
semantic rule was introduced, as then the mechanical conversions
(from the semantic patch) drown out reviewers attention to the
manual changes.

>     For comparison, see the patch series adding hasheq()/oideq(),
>     merged in 769af0fd9e (Merge branch 'jk/cocci', 2018-09-17), in
>     particular the four "convert <this> to <that>" patches.

The four patches "convert <this> to <that> only contain the semantic
patch as well as its effects, the manual changes are separated out
to later patches, which is quite nice.

>   - 'make coccicheck' won't run clean (and the static analysis build
>     job on Travis CI will fail) for all commits following adding the
>     new semantic patches but before applying the resulting
>     transformations.
>
>   - These semantic patches interact badly with 'pu' and 'next',
>     because those integration branches can contain topic branches
>     adding new code that should be transformed by these semanic
>     patches.

And I thought of this as a feature. There is no merge conflict and it
still compiles, which makes Junios work easy.

Of course it put the load elsewhere. :/

For the sake of a good history, I would think running 'make coccicheck'
and applying the resulting patches would be best as part of the (dirty)
merge of any topic that proposes new semantic patches, but that would
add load to Junio as it would be an extra step during the merge.

One could argue that the step of applying such transformations into
the dirty merge is cheaper than resolving merge conflicts that are
had when the topic includes the transformation.

>     Consequently, 'make coccicheck' won't run clean and the
>     static analysis build job will fail until all those topics reach
>     'master', and the remaining transformations are applied on top.
>
>     This was (and still is!) an issue with the hasheq()/oideq() series
>     as well: that series was added on 2018-08-28, and the static
>     analysis build job is red on 'pu' ever since.  See the follow-up
>     patch e43d2dcce1 (more oideq/hasheq conversions, 2018-10-02), and
>     one more follow-up will be necessary after the builtin stash topic
>     is merged to 'master'.

In my understanding this follow up is a feature, as it helps to avoid
merge conflicts with other topics in flight.

>     This makes it harder to review other patch series.

as 'make coccicheck' is an integral part of your review?

>   - Is it really necessary to carry these semantic patches in-tree?
>     Let me ellaborate.  There are basically two main use cases for
>     semantic patches:
>
>       - To avoid undesirable code patterns, e.g. we should not use
>         sha1_to_hex(oid.hash) or strbuf_addf(&sb, "fixed string"), but
>         use oid_to_hex(&oid) or strbuf_addstr(&sb, "fixed string")
>         instead.  Note that in these cases we don't remove the
>         functions sha1_to_hex() or strbuf_addf(), because there are
>         good reasons to use them in other scenarios.
>
>         Our semantic patches under 'contrib/coccinelle/' fall into
>         this category, and we have 'make coccicheck' and the static
>         analysis build job on Travis CI to catch these undesirable
>         code patterns preferably early, and to prevent them from
>         entering our codebase.
>
>       - To perform one-off code transformations, e.g. to modify a
>         function's name and/or signature and convert all its
>         callsites; see e.g. commits abef9020e3 (sha1_file: convert
>         sha1_object_info* to object_id, 2018-03-12) and b4f5aca40e
>         (sha1_file: convert read_sha1_file to struct object_id,
>         2018-03-12).
>
>         As far as I understand this patch series falls into this
>         category: once the conversion is complete the old functions
>         will be removed.  After that there will be no use for these
>         semanic patches.
>
>         Having said that, it's certainly easier to double-check the
>         resulting transformations when one can apply the semantic
>         patches locally, and doing so is easier when the semantic
>         patches are in tree than when they must be copy-pasted from a
>         commit message.
>
> OK, that was already long.  Now, can we do better?
>
> How about introducing the concept of "pending" semantic patches,
> stored in 'contrib/coccinelle/<name>.pending.cocci' files, modifying
> 'make coccicheck' to skip them, and adding the new 'make
> coccicheck-pending' target to make it convenient to apply them, e.g.
> something like the simple patch at the end.
>
> So the process would go something like this:
>
>   - A new semantic patch should be added as "pending", e.g. to the
>     file 'the_repository.pending.cocci', together with the resulting
>     transformations in the same commit.
>
>     This way neither 'make coccicheck' nor the static analysis build
>     job would complain in the topic branch or in the two integration
>     branches.  And if they do complain, then we would know right away
>     that they complain because of a well-established semantic patch.
>     Yet, anyone interested could run 'make coccicheck-pending' to see
>     where are we heading.
>
>   - The author of the "pending" semanting patch should then keep an
>     eye on already cooking topics: whether any of them contain new
>     code that should be transformed, and how they progress to
>     'master', and sending followup patch(es) with the remaining
>     transformations when applicable.
>
>     Futhermore, the author should also pay attention to any new topics
>     that branch off after the "pending" semantic patch, and whether
>     any of them introduce code to be transformed, warning their
>     authors as necessary.
>
>   - Finally, after all the dust settled, the dev should follow up with
>     a patch to:
>
>       - promote the "penging" patch to '<name>.cocci', if its purpose
>         is to avoid undesirable code patterns in the future, or
>
>       - remove the semantic patch, if it was used in a one-off
>         transformation.
>
> Thoughts?

I like the approach of having separate classes of semantic patches:
(a) the regular "we need to keep checking these" as they address
    undesirable code patterns, which is what we currently have,
    and what 'make coccicheck' would complain about.
(b) The pending patches as you propose. However I would
    argue that we'd not want to include the transformation into
    the same patch as then the patch will have merge conflicts.
    Ideally we'd have an automated process/bot that would apply
    all pending semantic patches onto master and then checks for
    conflicts in HEAD..pu, and only sends off the non-conflicting
    diffs as a topic.
    Then after a couple integration cycles we'd have all pending
    changes in, with no conflicts on Junios side.

So I think we should add a patch like you post, but we would
need to discuss the exact approach how to deal with pending
patches. Is it the original dev who should push forward on their
own pending patches, or does it become a pooled effort?

Thanks for starting this discussion!
Stefan

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-22 17:39 ` New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base] SZEDER Gábor
  2018-10-22 18:54   ` Stefan Beller
@ 2018-10-22 22:49   ` Junio C Hamano
  2018-10-23  0:26     ` Stefan Beller
  2018-10-23  9:38     ` Junio C Hamano
  2018-10-25  5:39   ` Jeff King
  2 siblings, 2 replies; 49+ messages in thread
From: Junio C Hamano @ 2018-10-22 22:49 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Stefan Beller, Jeff King, git, jonathantanmy

SZEDER Gábor <szeder.dev@gmail.com> writes:

> I don't really like how this or the previous RFC patch series deal
> with semantic patches (or how some past patch series dealt with them,
> for that matter), for various reasons:
> ...
> How about introducing the concept of "pending" semantic patches,
> stored in 'contrib/coccinelle/<name>.pending.cocci' files, modifying
> 'make coccicheck' to skip them, and adding the new 'make
> coccicheck-pending' target to make it convenient to apply them, e.g.
> something like the simple patch at the end.
>
> So the process would go something like this:
>
>   - A new semantic patch should be added as "pending", e.g. to the
>     file 'the_repository.pending.cocci', together with the resulting
>     transformations in the same commit.
>
>     This way neither 'make coccicheck' nor the static analysis build
>     job would complain in the topic branch or in the two integration
>     branches.  And if they do complain, then we would know right away
>     that they complain because of a well-established semantic patch.
>     Yet, anyone interested could run 'make coccicheck-pending' to see
>     where are we heading.
>
>   - The author of the "pending" semanting patch should then keep an
>     eye on already cooking topics: whether any of them contain new
>     code that should be transformed, and how they progress to
>     'master', and sending followup patch(es) with the remaining
>     transformations when applicable.
>     
>     Futhermore, the author should also pay attention to any new topics
>     that branch off after the "pending" semantic patch, and whether
>     any of them introduce code to be transformed, warning their
>     authors as necessary.
>
>   - Finally, after all the dust settled, the dev should follow up with
>     a patch to:
>     
>       - promote the "penging" patch to '<name>.cocci', if its purpose
>         is to avoid undesirable code patterns in the future, or
>     
>       - remove the semantic patch, if it was used in a one-off
>         transformation.
> ...
> Thoughts?

I actually think this round does a far nicer job playing well with
other topics than any earlier series.  The pain you are observing I
think come primarily from my not making the best use of these
patches.

Steppng back a bit, I'd imagine in an ideal world where "make
coccicheck" can be done instantaneously _and_ the spatch machinery
is just as reliable as C compilers.  In such a world, I may rename
all the *.c files in my tree to *.c.in, make the very first step of
the "make all" to run coccicheck and transform *.c.in to *.c before
starting the compilation.  There is no need to have changes to
*.c.in that spatch would fix.  Such an idealized set-up has a few
nice property.

 - Mechanical conflict resolutions between topics in flight and a
   series that modifies or adds new .cocci rules would greatly be
   reduced.

 - Still, *.c files that are "compiled" from *.c.in file used by
   each build will by definition cocci-clean.

 - Bugs like the one that required 6afedba8 ("object_id.cocci: match
   only expressions of type 'struct object_id'", 2018-10-15) are
   still possible, but some of them may be caught by C compilers
   that inspect the result of spatch compilation from *.c.in to *.c

Now we do not live in that ideal world and there is no separate
"turn *.c.in into *.c" step in our build procedure.  In such a
"real" world, if we had a rule like "each individual commit must
pass 'make coccicheck' cleanly", anybody who does such a merge and
resolve huge conflics would essentially be wasting time on something
that the tool could do, and then the result of the merge would
further need to be amended for semantic conflicts (i.e. the other
branch may have introduced new instances of old pattern .cocci
patches in our branch would want to transform)).  By not insisting
on cocci-cleanness at each commit level, we can gain (or at least
sumulate gaining) some benefit that we would reap in the ideal
world, and Stefan's latest series already does so for the former.
If we insisted that these patches must be accompanied with the
result of the cocci transformations, such a series will have zero
chance of landing in 'pu', unless we freeze the world.

What I _could_ do (and what I did do only for one round of pushing
out 'pu') is to queue a coccinelle transformation at the tip of
integration branches.  If "make coccicheck" ran in subsecond, I
could even automate it in the script that is used to rebuild 'pu'
every day, so that after merging each and every topic, do the "make
coccicheck" and apply resulting .cocci.patch files and squash that
into the merge commit.

But with the current "make coccicheck" performance, that is not
realistic.

I am wondering if it is feasible to do it at the tip of 'pu' (which
is rebuilt two to three times a day), 'next' (which is updated once
or twice a week) and 'master'.

I find that your "pending" idea may be nicer, as it distributes the
load.  Whoever wants to change the world order by updating the .cocci
rules is primarily responsible for making it happen without breaking
the world during the transition.  That's more scalable.

But at some point in the ideal future, it may turn out that it is
making the process scalable by distributing a task that does not
have to be done by any human, not by the integrator or by individual
developer---but the new compilation step from *.c.in to *.c should
be able to do.  We obviously are not there yet, so...

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-22 22:49   ` Junio C Hamano
@ 2018-10-23  0:26     ` Stefan Beller
  2018-10-23  4:24       ` Junio C Hamano
  2018-10-23  9:38     ` Junio C Hamano
  1 sibling, 1 reply; 49+ messages in thread
From: Stefan Beller @ 2018-10-23  0:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, Jeff King, git, Jonathan Tan

> Stepping back a bit, I'd imagine in an ideal world where "make
> coccicheck" can be done instantaneously _and_ the spatch machinery
> is just as reliable as C compilers.
> [...]
> Now we do not live in that ideal world and [...]
>  such a series will have zero
> chance of landing in 'pu', unless we freeze the world.

I wonder if we could approximate the ideal world with
rerere+spatch a bit more:

1) I resend the series that includes "apply cocci patches"
    as the last patch and you queue it as usual

2) The first time such a series is merged, you'd merge
    HEAD^ (i.e. excluding the "apply the semantic patch)
    to pu instead. I view this as a yet-to-be invented mode
    '--theirs-is-stale-use-tree-instead=THEIRS~1^{tree}',
    then run spatch to reproduce the last patch into the
    dirty merge (which has pu and the last patch as parent)

    This step is done to 'pre-heat' the rerere cache.

3) Any further integration (e.g. rebuilding pu) would
    benefit from the hot rerere cache and very little work
    is actually required as the conflicts are resolved by rerere.

Am I overestimating or misunderstanding rerere here?

> What I _could_ do (and what I did do only for one round of pushing
> out 'pu') is to queue a coccinelle transformation at the tip of
> integration branches.  If "make coccicheck" ran in subsecond, I
> could even automate it in the script that is used to rebuild 'pu'
> every day, so that after merging each and every topic, do the "make
> coccicheck" and apply resulting .cocci.patch files and squash that
> into the merge commit.
>
> But with the current "make coccicheck" performance, that is not
> realistic.

Would it be realistic for next and master branch instead of pu?

I'd be wary for the master branch, as we may not want to rely on
spatch without review. (It can produce funny white space issues,
but seems to produce working/correct code)

> I am wondering if it is feasible to do it at the tip of 'pu' (which
> is rebuilt two to three times a day), 'next' (which is updated once
> or twice a week) and 'master'.

We could even optimize that, by checking if contrib/cocci/ has
changes for the new tip of next/master respectively.

Another thing I wonder is if we care about the distinction between
the (a) pending changes as described by SZEDER, as we introduce
these deliberately, whereas (b) undesirable code patterns
(e.g. free and null instead of FREE_AND_NULL macro) might be
caught and reported in pu/next and then someone learns from it.
Automatic rewriting the (b) cases seems not just as desirable as
(a), where we do it purely to avoid resolving merge conflicts by
hand.

> I find that your "pending" idea may be nicer, as it distributes the
> load.  Whoever wants to change the world order by updating the .cocci
> rules is primarily responsible for making it happen without breaking
> the world during the transition.  That's more scalable.

... and I think SZEDER considers the current world broken as
'make coccicheck' returns non-empty, so it sounds to me as if
the current transition is thought less-than-optimal.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-23  0:26     ` Stefan Beller
@ 2018-10-23  4:24       ` Junio C Hamano
  0 siblings, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2018-10-23  4:24 UTC (permalink / raw)
  To: Stefan Beller; +Cc: SZEDER Gábor, Jeff King, git, Jonathan Tan

Stefan Beller <sbeller@google.com> writes:

> Am I overestimating or misunderstanding rerere here?

Yes.

> Would it be realistic for next and master branch instead of pu?
>
> I'd be wary for the master branch, as we may not want to rely on
> spatch without review. (It can produce funny white space issues,
> but seems to produce working/correct code)

Yes, that is why I think it is a bit too late to do the "fixup with
spatch" after merging to 'next' and 'master'.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-22 22:49   ` Junio C Hamano
  2018-10-23  0:26     ` Stefan Beller
@ 2018-10-23  9:38     ` Junio C Hamano
  2018-10-23 10:15       ` Carlo Arenas
  2018-10-23 17:30       ` Stefan Beller
  1 sibling, 2 replies; 49+ messages in thread
From: Junio C Hamano @ 2018-10-23  9:38 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Stefan Beller, Jeff King, git, jonathantanmy

Junio C Hamano <gitster@pobox.com> writes:

> I actually think this round does a far nicer job playing well with
> other topics than any earlier series.  The pain you are observing I
> think come primarily from my not making the best use of these
> patches.
>
> Steppng back a bit, I'd imagine in an ideal world where "make
> coccicheck" can be done instantaneously _and_ the spatch machinery
> is just as reliable as C compilers....
>
> What I _could_ do (and what I did do only for one round of pushing
> out 'pu') is to queue a coccinelle transformation at the tip of
> integration branches.  If "make coccicheck" ran in subsecond, I
> could even automate it in the script that is used to rebuild 'pu'
> every day, ...

Anyway, even though "make coccicheck" does not run in subsecond,
I've updated my machinery to rebuild the integration branches so
that I can optionally queue generated coccicheck patches, and what I
pushed out tonight has one at the tip of 'pu' and also another at
the tip of 'next'.  The latter seems to be passing all archs and
executing Windows run.

The tip of 'pu' has trouble with -Wunused on Apple around the
delta-islands series.


next: https://travis-ci.org/git/git/builds/444969406
pu:   https://travis-ci.org/git/git/builds/444969342



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-23  9:38     ` Junio C Hamano
@ 2018-10-23 10:15       ` Carlo Arenas
  2018-10-23 10:21         ` Junio C Hamano
  2018-10-23 17:30       ` Stefan Beller
  1 sibling, 1 reply; 49+ messages in thread
From: Carlo Arenas @ 2018-10-23 10:15 UTC (permalink / raw)
  To: gitster; +Cc: szeder.dev, sbeller, peff, git, jonathantanmy

On Tue, Oct 23, 2018 at 2:40 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> The tip of 'pu' has trouble with -Wunused on Apple around the
> delta-islands series.

FWIW the "problem" is actually with -Wunused-function and is AFAIK not
related to the semantic changes or Apple (AKA macOS)

Indeed, I saw this issue before also in Linux (Fedora Rawhide) when
using the latest clang and presumed it was just expected because the
topic branch was most likely incomplete.

Carlo

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-23 10:15       ` Carlo Arenas
@ 2018-10-23 10:21         ` Junio C Hamano
  0 siblings, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2018-10-23 10:21 UTC (permalink / raw)
  To: Carlo Arenas; +Cc: szeder.dev, sbeller, peff, git, jonathantanmy

Carlo Arenas <carenas@gmail.com> writes:

> On Tue, Oct 23, 2018 at 2:40 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> The tip of 'pu' has trouble with -Wunused on Apple around the
>> delta-islands series.
>
> FWIW the "problem" is actually with -Wunused-function and is AFAIK not
> related to the semantic changes or Apple (AKA macOS)

Oh, don't get me wrong.  By Apple, I meant "the versions of compiler
used on the Apple build at TravisCI site".  I could have sent the
above two lines in a separate topic, as the issue does not have
anything to do with "new semantic patches" discussion, but I thought
it would be obvious to everybody who is reading the thread---it
seems I thought wrong ;-)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-23  9:38     ` Junio C Hamano
  2018-10-23 10:15       ` Carlo Arenas
@ 2018-10-23 17:30       ` Stefan Beller
  2018-10-24  1:22         ` Junio C Hamano
  1 sibling, 1 reply; 49+ messages in thread
From: Stefan Beller @ 2018-10-23 17:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, Jeff King, git, Jonathan Tan

On Tue, Oct 23, 2018 at 2:38 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > I actually think this round does a far nicer job playing well with
> > other topics than any earlier series.  The pain you are observing I
> > think come primarily from my not making the best use of these
> > patches.
> >
> > Steppng back a bit, I'd imagine in an ideal world where "make
> > coccicheck" can be done instantaneously _and_ the spatch machinery
> > is just as reliable as C compilers....
> >
> > What I _could_ do (and what I did do only for one round of pushing
> > out 'pu') is to queue a coccinelle transformation at the tip of
> > integration branches.  If "make coccicheck" ran in subsecond, I
> > could even automate it in the script that is used to rebuild 'pu'
> > every day, ...
>
> Anyway, even though "make coccicheck" does not run in subsecond,
> I've updated my machinery to rebuild the integration branches so
> that I can optionally queue generated coccicheck patches, and what I
> pushed out tonight has one at the tip of 'pu' and also another at
> the tip of 'next'.  The latter seems to be passing all archs and
> executing Windows run.

That is pretty exciting!

Looking at the commit in next, you also included the suggestion
from [1] to use a postincrement instead of a preincrement and I got
excited to see how we express such a thing in coccinelle,
but it turns out that it slipped in unrelated to the coccinelle patches.

How would we go from here?
It is not obvious to me how such changes would be integrated,
as regenerating them on top of pu will not help getting these changes
merged down, and applying the semantic patch on next (once
sb/more-repo-in-api lands in next) would created the merge conflicts for
all topics that are merged to next after that series.

[1] https://public-inbox.org/git/xmqqin1wyxvz.fsf@gitster-ct.c.googlers.com/

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-23 17:30       ` Stefan Beller
@ 2018-10-24  1:22         ` Junio C Hamano
  0 siblings, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2018-10-24  1:22 UTC (permalink / raw)
  To: Stefan Beller; +Cc: SZEDER Gábor, Jeff King, git, Jonathan Tan

Stefan Beller <sbeller@google.com> writes:

>> Anyway, even though "make coccicheck" does not run in subsecond,
>> I've updated my machinery to rebuild the integration branches so
>> that I can optionally queue generated coccicheck patches, and what I
>> pushed out tonight has one at the tip of 'pu' and also another at
>> the tip of 'next'.  The latter seems to be passing all archs and
>> executing Windows run.
>
> That is pretty exciting!
>
> Looking at the commit in next, you also included the suggestion
> from [1] to use a postincrement instead of a preincrement and I got
> excited to see how we express such a thing in coccinelle,
> but it turns out that it slipped in unrelated to the coccinelle patches.

See below, which was sitting in my working tree.

> How would we go from here?
> It is not obvious to me how such changes would be integrated,
> as regenerating them on top of pu will not help getting these changes
> merged down, and applying the semantic patch on next (once
> sb/more-repo-in-api lands in next) would created the merge conflicts for
> all topics that are merged to next after that series.

Conflicts with later topics is indeed worrysome.  That is why I did
it as an experiment.  If it becomes too painful, I'd probably stop
doing it while merging to anything other than 'pu', and then we can
follow the more distributed approach along the lines of what Szeder
suggested, to see how smoothly it goes.

-- >8 --
Subject: [PATCH] cocci: simplify "if (++u > 1)" to "if (u++)"

It is more common to use post-increment than pre-increment when the
side effect is the primary thing we want in our code and in C in
general (unlike C++).

Initializing a variable to 0, incrementing it every time we do
something, and checking if we have already done that thing to guard
the code to do that thing, is easier to understand when written

	if (u++)
		; /* we've done that! */
	else
		do_it(); /* just once. */

but if you try to use pre-increment, you end up with a less natural
looking

	if (++u > 1)

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/coccinelle/preincr.cocci | 5 +++++
 1 file changed, 5 insertions(+)
 create mode 100644 contrib/coccinelle/preincr.cocci

diff --git a/contrib/coccinelle/preincr.cocci b/contrib/coccinelle/preincr.cocci
new file mode 100644
index 0000000000..7fe1e8d2d9
--- /dev/null
+++ b/contrib/coccinelle/preincr.cocci
@@ -0,0 +1,5 @@
+@ preincrement @
+identifier i;
+@@
+- ++i > 1
++ i++
-- 
2.19.1-542-gc4df23f792

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-22 18:54   ` Stefan Beller
@ 2018-10-25  1:59     ` SZEDER Gábor
  2018-10-25 19:25       ` Stefan Beller
  0 siblings, 1 reply; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-25  1:59 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, git, Jonathan Tan

On Mon, Oct 22, 2018 at 11:54:06AM -0700, Stefan Beller wrote:

> For the sake of a good history, I would think running 'make coccicheck'
> and applying the resulting patches would be best as part of the (dirty)
> merge of any topic that proposes new semantic patches, but that would
> add load to Junio as it would be an extra step during the merge.
> 
> One could argue that the step of applying such transformations into
> the dirty merge is cheaper than resolving merge conflicts that are
> had when the topic includes the transformation.

Please consider that merge commits' have uglier diffs than regular
commits, and that merge commits cause additional complications when
'git bisect' points the finger at them, both of which are exacerbated
by additional changes squeezed into evil merges.

> >     Consequently, 'make coccicheck' won't run clean and the
> >     static analysis build job will fail until all those topics reach
> >     'master', and the remaining transformations are applied on top.
> >
> >     This was (and still is!) an issue with the hasheq()/oideq() series
> >     as well: that series was added on 2018-08-28, and the static
> >     analysis build job is red on 'pu' ever since.  See the follow-up
> >     patch e43d2dcce1 (more oideq/hasheq conversions, 2018-10-02), and
> >     one more follow-up will be necessary after the builtin stash topic
> >     is merged to 'master'.
> 
> In my understanding this follow up is a feature, as it helps to avoid
> merge conflicts with other topics in flight.

I don't see how such a follow up patch helps to avoid merge conflicts.

There were topics that branched off before the introduction of oideq()
into 'master', therefore they couldn't make use of this new function
until they were merged to 'master' as well, so they added their own
!oidcmp() calls.  That follow up patch was necessary to transform
these new !oidcmp() calls after those topics reached 'master'.  Merge
conflicts had nothing to do with it.

So this follow up patch is not a feature, but rather an inherent
consequence of the project's branching model, with lots of parallel
running topics branching off at different points and progressing at
different speeds.

> >     This makes it harder to review other patch series.
> 
> as 'make coccicheck' is an integral part of your review?

Erm, right, "review" was not the right word here.  Anyway, as it is,
'make coccicheck' is an integral part of our automated tests, not only
on Travis CI but on the upcoming Azure thing as well.  I just try to
pay attention to its results and the results of a bunch of my
additional builds, and complain or even send a fix when something goes
reproducibly wrong.  This has certainly became more cumbersome with
the permanently failing static analysis build job in the last couple
of weeks.

> > How about introducing the concept of "pending" semantic patches,
> > stored in 'contrib/coccinelle/<name>.pending.cocci' files, modifying
> > 'make coccicheck' to skip them, and adding the new 'make
> > coccicheck-pending' target to make it convenient to apply them, e.g.
> > something like the simple patch at the end.
> >
> > So the process would go something like this:
> >
> >   - A new semantic patch should be added as "pending", e.g. to the
> >     file 'the_repository.pending.cocci', together with the resulting
> >     transformations in the same commit.
> >
> >     This way neither 'make coccicheck' nor the static analysis build
> >     job would complain in the topic branch or in the two integration
> >     branches.  And if they do complain, then we would know right away
> >     that they complain because of a well-established semantic patch.
> >     Yet, anyone interested could run 'make coccicheck-pending' to see
> >     where are we heading.
> >
> >   - The author of the "pending" semanting patch should then keep an
> >     eye on already cooking topics: whether any of them contain new
> >     code that should be transformed, and how they progress to
> >     'master', and sending followup patch(es) with the remaining
> >     transformations when applicable.
> >
> >     Futhermore, the author should also pay attention to any new topics
> >     that branch off after the "pending" semantic patch, and whether
> >     any of them introduce code to be transformed, warning their
> >     authors as necessary.
> >
> >   - Finally, after all the dust settled, the dev should follow up with
> >     a patch to:
> >
> >       - promote the "penging" patch to '<name>.cocci', if its purpose
> >         is to avoid undesirable code patterns in the future, or
> >
> >       - remove the semantic patch, if it was used in a one-off
> >         transformation.
> >
> > Thoughts?
> 
> I like the approach of having separate classes of semantic patches:
> (a) the regular "we need to keep checking these" as they address
>     undesirable code patterns, which is what we currently have,
>     and what 'make coccicheck' would complain about.
> (b) The pending patches as you propose. However I would
>     argue that we'd not want to include the transformation into
>     the same patch as then the patch will have merge conflicts.

Since we have a lot of parallel running topics, merge conflicts are
basically unavoidable anyway.  If the conflicts from the
transformation are really that severe, then perhaps the whole series
should be postponed to a calmer, more suitable time.

In the case of 'the_repository.cocci', merging its transformations
into 'pu' resulted in only four conflicts, and I found all four on the
easy side to resolve.  I don't think it's worth waiting with the
transformations in this particular case.

>     Ideally we'd have an automated process/bot that would apply
>     all pending semantic patches onto master and then checks for
>     conflicts in HEAD..pu, and only sends off the non-conflicting
>     diffs as a topic.

New semantic patches didn't pop up all that frequently in the past, so
I'm not sure it's worth investing in such an automation.  Of course
they can become more frequent in the future, and in that case we might
want to reconsider it.  Unfortunately, however, Coccinelle's results
can't be completely trusted, either because our semantic patches or
because Coccinelle itself are buggy...

>     Then after a couple integration cycles we'd have all pending
>     changes in, with no conflicts on Junios side.
> 
> So I think we should add a patch like you post, but we would
> need to discuss the exact approach how to deal with pending
> patches. Is it the original dev who should push forward on their
> own pending patches, or does it become a pooled effort?

Well, it makes sense to me that whoever proposes a change with an
accompanying new semantic patch should also deal with the necessary
followups.  However, it doesn't really matter who deals with them, as
long as somebody deals with them.  I don't think it's much different
from e.g. sending a followup bugfix to someone else's patch series.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-22 17:39 ` New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base] SZEDER Gábor
  2018-10-22 18:54   ` Stefan Beller
  2018-10-22 22:49   ` Junio C Hamano
@ 2018-10-25  5:39   ` Jeff King
  2 siblings, 0 replies; 49+ messages in thread
From: Jeff King @ 2018-10-25  5:39 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Stefan Beller, git, jonathantanmy

On Mon, Oct 22, 2018 at 07:39:35PM +0200, SZEDER Gábor wrote:

> I don't really like how this or the previous RFC patch series deal
> with semantic patches (or how some past patch series dealt with them,
> for that matter), for various reasons:
> [..]

I am a little late to this thread, but it really seems to me that the
crux of the issue here:

>   - 'make coccicheck' won't run clean (and the static analysis build
>     job on Travis CI will fail) for all commits following adding the
>     new semantic patches but before applying the resulting
>     transformations.
> 
>   - These semantic patches interact badly with 'pu' and 'next',
>     because those integration branches can contain topic branches
>     adding new code that should be transformed by these semanic
>     patches.  Consequently, 'make coccicheck' won't run clean and the
>     static analysis build job will fail until all those topics reach
>     'master', and the remaining transformations are applied on top.

Is that we are considering it a failure for "pu" to have pending
coccinelle patches. Are there any coccicheck transformations that aren't
just "we prefer to do it like X instead of Y"?

Changes that actually break things should be caught by the compiler
(either by their nature, or because we take care to rename or change
interfaces when making a semantic change to a function).

I think that's getting at what you're saying here:

>   - Is it really necessary to carry these semantic patches in-tree?
>     Let me ellaborate.  There are basically two main use cases for
>     semantic patches:
> 
>       - To avoid undesirable code patterns, e.g. we should not use
> [...]
>       - To perform one-off code transformations, e.g. to modify a

I am mostly thinking of your first type here. And I think it makes sense
for people to avoid submitting code that does not pass "make coccicheck"
_as it was at the state of their branch_. But for cocci changes that are
in flight at the same time as their branch, I do not see any need to
strictly adhere to them. The code is "undesirable", not "wrong".

For the second type, I agree that we do not necessarily need to carry
them in-tree. Eventually the transformation happens, and nobody would
use the "old" way because it doesn't work anymore. Problem solved.

I do not mind carrying them for a while as a convenience to people who
will need to fix up their topics in flight (or more likely to the
maintainer, who will need to fixup the merge). But we should make sure
not to forget to remove them when their usefulness has passed.

Likewise, we should not forget to occasionally run "make coccicheck" on
master to make sure people have a clean base to build on. If Junio is
able to do that, then great. But other people can also do so and submit
patches (to go on their respective topics, or to be a new mass-cleanup
topic).

I guess there is some lag there if Junio pushes out a master branch that
fails coccicheck, because contributors may build on it before somebody
gets around to fixing it.

>         Having said that, it's certainly easier to double-check the
>         resulting transformations when one can apply the semantic
>         patches locally, and doing so is easier when the semantic
>         patches are in tree than when they must be copy-pasted from a
>         commit message.

I've wondered if we could have a script that pulls a coccinelle snippet
from a commit message and runs it. It may be a hassle to find the right
commit, though (you'd start the procedure from "oops, my compile now
seems broken; what was the change that I need to apply to adapt?").

>   - A new semantic patch should be added as "pending", e.g. to the
>     file 'the_repository.pending.cocci', together with the resulting
>     transformations in the same commit.
> 
>     This way neither 'make coccicheck' nor the static analysis build
>     job would complain in the topic branch or in the two integration
>     branches.  And if they do complain, then we would know right away
>     that they complain because of a well-established semantic patch.
>     Yet, anyone interested could run 'make coccicheck-pending' to see
>     where are we heading.

OK, makes sense.

>   - The author of the "pending" semanting patch should then keep an
>     eye on already cooking topics: whether any of them contain new
>     code that should be transformed, and how they progress to
>     'master', and sending followup patch(es) with the remaining
>     transformations when applicable.
>     
>     Futhermore, the author should also pay attention to any new topics
>     that branch off after the "pending" semantic patch, and whether
>     any of them introduce code to be transformed, warning their
>     authors as necessary.

This part seems tricky, though. There's a race condition between
promoting the patch from pending to not-pending and other topics
branching off. And remember that we do not always see other people's
branches, which they may work on in private for a long time (though I
suppose "when Junio applies it" is one effective cutoff).

>   - Finally, after all the dust settled, the dev should follow up with
>     a patch to:
>     
>       - promote the "penging" patch to '<name>.cocci', if its purpose
>         is to avoid undesirable code patterns in the future, or
>     
>       - remove the semantic patch, if it was used in a one-off
>         transformation.

I'm not wild about leaving these loose ends that need to be tied up
manually. It adds a lot of friction and effort.

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 18/19] submodule: use submodule repos for object lookup
  2018-10-16 23:35 ` [PATCH 18/19] submodule: use submodule repos for object lookup Stefan Beller
  2018-10-19 20:37   ` Jonathan Tan
@ 2018-10-25  9:14   ` SZEDER Gábor
  2018-10-31 13:38   ` Derrick Stolee
  2 siblings, 0 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-25  9:14 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Derrick Stolee, git, jonathantanmy

On Tue, Oct 16, 2018 at 04:35:49PM -0700, Stefan Beller wrote:
> This converts the 'show_submodule_header' function to use
> the repository API properly, such that the submodule objects
> are not added to the main object store.

This patch breaks the test suite with 'GIT_TEST_COMMIT_GRAPH=1', in
particular 't4041-diff-submodule-option.sh' fails with:

  expecting success:
          git diff-index -p --submodule=log HEAD >actual &&
          cat >expected <<-EOF &&
          Submodule sm1 $head2..$head3 (rewind):
            < Add foo3 ($added foo3)
            < Add foo2 ($added foo2)
          EOF
          test_cmp expected actual
  
  + git diff-index -p --submodule=log HEAD
  + cat
  + test_cmp expected actual
  + diff -u expected actual
  --- expected    2018-10-25 09:10:00.541610016 +0000
  +++ actual      2018-10-25 09:10:00.537609885 +0000
  @@ -1,3 +1,5 @@
  -Submodule sm1 30b9670..dafb207 (rewind):
  +Submodule sm1 30b9670...dafb207:
     < Add foo3 (hinzugefügt foo3)
     < Add foo2 (hinzugefügt foo2)
  +  > Add foo1 (hinzugefügt foo1)
  +  < Add foo1 (hinzugefügt foo1)
  error: last command exited with $?=1
  not ok 9 - modified submodule(backward)

and 't4060-diff-submodule-option-diff-format.sh' fails with:

  expecting success:
          git diff-index -p --submodule=diff HEAD >actual &&
          cat >expected <<-EOF &&
          Submodule sm1 $head2..$head3 (rewind):
          diff --git a/sm1/foo2 b/sm1/foo2
          deleted file mode 100644
          index 54b060e..0000000
          --- a/sm1/foo2
          +++ /dev/null
          @@ -1 +0,0 @@
          -foo2
          diff --git a/sm1/foo3 b/sm1/foo3
          deleted file mode 100644
          index c1ec6c6..0000000
          --- a/sm1/foo3
          +++ /dev/null
          @@ -1 +0,0 @@
          -foo3
          EOF
          test_cmp expected actual
  
  + git diff-index -p --submodule=diff HEAD
  + cat
  + test_cmp expected actual
  + diff -u expected actual
  --- expected    2018-10-25 09:10:38.854868800 +0000
  +++ actual      2018-10-25 09:10:38.854868800 +0000
  @@ -1,4 +1,4 @@
  -Submodule sm1 30b9670..dafb207 (rewind):
  +Submodule sm1 30b9670...dafb207:
   diff --git a/sm1/foo2 b/sm1/foo2
   deleted file mode 100644
   index 54b060e..0000000
  error: last command exited with $?=1
  not ok 10 - modified submodule(backward)



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base]
  2018-10-25  1:59     ` SZEDER Gábor
@ 2018-10-25 19:25       ` Stefan Beller
  0 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-10-25 19:25 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Jeff King, git, Jonathan Tan

On Wed, Oct 24, 2018 at 6:59 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Mon, Oct 22, 2018 at 11:54:06AM -0700, Stefan Beller wrote:
>
> > For the sake of a good history, I would think running 'make coccicheck'
> > and applying the resulting patches would be best as part of the (dirty)
> > merge of any topic that proposes new semantic patches, but that would
> > add load to Junio as it would be an extra step during the merge.
> >
> > One could argue that the step of applying such transformations into
> > the dirty merge is cheaper than resolving merge conflicts that are
> > had when the topic includes the transformation.
>
> Please consider that merge commits' have uglier diffs than regular
> commits, and that merge commits cause additional complications when
> 'git bisect' points the finger at them, both of which are exacerbated
> by additional changes squeezed into evil merges.
>
> > >     Consequently, 'make coccicheck' won't run clean and the
> > >     static analysis build job will fail until all those topics reach
> > >     'master', and the remaining transformations are applied on top.
> > >
> > >     This was (and still is!) an issue with the hasheq()/oideq() series
> > >     as well: that series was added on 2018-08-28, and the static
> > >     analysis build job is red on 'pu' ever since.  See the follow-up
> > >     patch e43d2dcce1 (more oideq/hasheq conversions, 2018-10-02), and
> > >     one more follow-up will be necessary after the builtin stash topic
> > >     is merged to 'master'.
> >
> > In my understanding this follow up is a feature, as it helps to avoid
> > merge conflicts with other topics in flight.
>
> I don't see how such a follow up patch helps to avoid merge conflicts.

Well, you merge first (the new topic and the cocci patches), and then
do the transformation. But that is putting a lot more work on Junio
as the way to integrate is not just merge a new topic into the pile of
topics (whether it is pu/next/master), but to first perform a merge
of the topic and the coccinelle patches, apply the transformation
and then merge to the pile, assuming the pile is already transformed
(as it happened in "treewide: apply cocci patch" in next/pu).

> > as 'make coccicheck' is an integral part of your review?
>
> Erm, right, "review" was not the right word here.  Anyway, as it is,
> 'make coccicheck' is an integral part of our automated tests, not only
> on Travis CI but on the upcoming Azure thing as well.  I just try to
> pay attention to its results and the results of a bunch of my
> additional builds, and complain or even send a fix when something goes
> reproducibly wrong.  This has certainly became more cumbersome with
> the permanently failing static analysis build job in the last couple
> of weeks.

I seem to not pay as much attention as I should to these,
mostly because they are unreliable  on the aggregate level (a failure
there most likely means another topic than the one I am interested
broke; except in this case, where we discuss the fallout there via
this topic.)

> > I like the approach of having separate classes of semantic patches:
> > (a) the regular "we need to keep checking these" as they address
> >     undesirable code patterns, which is what we currently have,
> >     and what 'make coccicheck' would complain about.
> > (b) The pending patches as you propose. However I would
> >     argue that we'd not want to include the transformation into
> >     the same patch as then the patch will have merge conflicts.
>
> Since we have a lot of parallel running topics, merge conflicts are
> basically unavoidable anyway.  If the conflicts from the
> transformation are really that severe, then perhaps the whole series
> should be postponed to a calmer, more suitable time.

Merge conflicts of this kind could be avoided, by running the
transformation on both sides before merging (or not running them
on both sides, but only after merging).

So maybe for these larger 'the_repository.pending.cocci' patches
we could get them into the tree and wait until all (most) of the topics
are including that semantic patch in their base, such that application
of the patch is easy whether before or after writing a series
(as the semantic patch is in its base).

Another short term plan would be renaming the_repository.cocci
such that it would break the 'make coccicheck'

> In the case of 'the_repository.cocci', merging its transformations
> into 'pu' resulted in only four conflicts, and I found all four on the
> easy side to resolve.  I don't think it's worth waiting with the
> transformations in this particular case.

I am not worried about the current conflicts, but about those to come
in new series.

>
> >     Ideally we'd have an automated process/bot that would apply
> >     all pending semantic patches onto master and then checks for
> >     conflicts in HEAD..pu, and only sends off the non-conflicting
> >     diffs as a topic.
>
> New semantic patches didn't pop up all that frequently in the past, so
> I'm not sure it's worth investing in such an automation.  Of course
> they can become more frequent in the future, and in that case we might
> want to reconsider it.  Unfortunately, however, Coccinelle's results
> can't be completely trusted, either because our semantic patches or
> because Coccinelle itself are buggy...

For the first, we could just be become better at reviewing the Cocci
patch. ;-) (Well, they are harder to review than usual patches, so
this doesn't surprise me. Also not fully understanding the whole
tool may have impact on reviewability)

>
> >     Then after a couple integration cycles we'd have all pending
> >     changes in, with no conflicts on Junios side.
> >
> > So I think we should add a patch like you post, but we would
> > need to discuss the exact approach how to deal with pending
> > patches. Is it the original dev who should push forward on their
> > own pending patches, or does it become a pooled effort?
>
> Well, it makes sense to me that whoever proposes a change with an
> accompanying new semantic patch should also deal with the necessary
> followups.  However, it doesn't really matter who deals with them, as
> long as somebody deals with them.  I don't think it's much different
> from e.g. sending a followup bugfix to someone else's patch series.

Ok, I plan on resending these patches after I get
origin/sb/submodule-recursive-fetch-gets-the-tip going
resent.

Stefan

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 18/19] submodule: use submodule repos for object lookup
  2018-10-16 23:35 ` [PATCH 18/19] submodule: use submodule repos for object lookup Stefan Beller
  2018-10-19 20:37   ` Jonathan Tan
  2018-10-25  9:14   ` SZEDER Gábor
@ 2018-10-31 13:38   ` Derrick Stolee
  2018-11-01 19:13     ` Stefan Beller
  2 siblings, 1 reply; 49+ messages in thread
From: Derrick Stolee @ 2018-10-31 13:38 UTC (permalink / raw)
  To: Stefan Beller, git; +Cc: jonathantanmy

On 10/16/2018 7:35 PM, Stefan Beller wrote:
> @@ -482,14 +483,46 @@ void prepare_submodule_repo_env(struct argv_array *out)
>   			 DEFAULT_GIT_DIR_ENVIRONMENT);
>   }
>   
> -/* Helper function to display the submodule header line prior to the full
> - * summary output. If it can locate the submodule objects directory it will
> - * attempt to lookup both the left and right commits and put them into the
> - * left and right pointers.
> +/*
> + * Initialize 'out' based on the provided submodule path.
> + *
> + * Unlike repo_submodule_init, this tolerates submodules not present
> + * in .gitmodules. This function exists only to preserve historical behavior,
> + *
> + * Returns 0 on success, -1 when the submodule is not present.
> + */
> +static int open_submodule(struct repository *out, const char *path)
> +{
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) {
> +		strbuf_release(&sb);
> +		return -1;
> +	}
> +
> +	out->submodule_prefix = xstrdup(path);
> +	out->submodule_prefix = xstrfmt("%s%s/",
> +					the_repository->submodule_prefix ?
> +					the_repository->submodule_prefix :
> +					"", path);
> +
> +	strbuf_release(&sb);
> +	return 0;
> +}

Based on the recent test coverage report [1], this xstrfmt() call is never
run witha non-null the_repository->submodule_prefix. Is there a way we can
exercise that branch?

Thanks,
-Stolee

[1] 
https://public-inbox.org/git/62f0bcf6-aa73-c192-d804-e6d69cac146f@gmail.com/

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 18/19] submodule: use submodule repos for object lookup
  2018-10-31 13:38   ` Derrick Stolee
@ 2018-11-01 19:13     ` Stefan Beller
  0 siblings, 0 replies; 49+ messages in thread
From: Stefan Beller @ 2018-11-01 19:13 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, Jonathan Tan

On Wed, Oct 31, 2018 at 6:38 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 10/16/2018 7:35 PM, Stefan Beller wrote:
> > @@ -482,14 +483,46 @@ void prepare_submodule_repo_env(struct argv_array *out)
> >                        DEFAULT_GIT_DIR_ENVIRONMENT);
> >   }
> >
> > -/* Helper function to display the submodule header line prior to the full
> > - * summary output. If it can locate the submodule objects directory it will
> > - * attempt to lookup both the left and right commits and put them into the
> > - * left and right pointers.
> > +/*
> > + * Initialize 'out' based on the provided submodule path.
> > + *
> > + * Unlike repo_submodule_init, this tolerates submodules not present
> > + * in .gitmodules. This function exists only to preserve historical behavior,
> > + *
> > + * Returns 0 on success, -1 when the submodule is not present.
> > + */
> > +static int open_submodule(struct repository *out, const char *path)
> > +{
> > +     struct strbuf sb = STRBUF_INIT;
> > +
> > +     if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) {
> > +             strbuf_release(&sb);
> > +             return -1;
> > +     }
> > +
> > +     out->submodule_prefix = xstrdup(path);
> > +     out->submodule_prefix = xstrfmt("%s%s/",
> > +                                     the_repository->submodule_prefix ?
> > +                                     the_repository->submodule_prefix :
> > +                                     "", path);
> > +
> > +     strbuf_release(&sb);
> > +     return 0;
> > +}
>
> Based on the recent test coverage report [1], this xstrfmt() call is never
> run witha non-null the_repository->submodule_prefix. Is there a way we can
> exercise that branch?

No it's dead code, actually. the_repository never has submodule_prefix set
as it is the main repository. So this is overly cautious to enable the
'any repo'
case.
In a resend we'll go with xstrdup(path);

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2018-11-01 19:13 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-16 23:35 [PATCH 00/19] Bring more repository handles into our code base Stefan Beller
2018-10-16 23:35 ` [PATCH 01/19] sha1_file: allow read_object to read objects in arbitrary repositories Stefan Beller
2018-10-16 23:35 ` [PATCH 02/19] packfile: allow has_packed_and_bad to handle " Stefan Beller
2018-10-16 23:35 ` [PATCH 03/19] object-store: allow read_object_file_extended to read from " Stefan Beller
2018-10-16 23:35 ` [PATCH 04/19] object-store: prepare read_object_file to deal with " Stefan Beller
2018-10-16 23:35 ` [PATCH 05/19] object-store: prepare has_{sha1, object}_file[_with_flags] to handle " Stefan Beller
2018-10-16 23:35 ` [PATCH 06/19] object: parse_object to honor its repository argument Stefan Beller
2018-10-16 23:35 ` [PATCH 07/19] commit: allow parse_commit* to handle arbitrary repositories Stefan Beller
2018-10-16 23:35 ` [PATCH 08/19] commit-reach.c: allow paint_down_to_common " Stefan Beller
2018-10-16 23:35 ` [PATCH 09/19] commit-reach.c: allow merge_bases_many " Stefan Beller
2018-10-16 23:35 ` [PATCH 10/19] commit-reach.c: allow remove_redundant " Stefan Beller
2018-10-16 23:35 ` [PATCH 11/19] commit-reach.c: allow get_merge_bases_many_0 " Stefan Beller
2018-10-16 23:35 ` [PATCH 12/19] commit-reach: prepare get_merge_bases " Stefan Beller
2018-10-16 23:35 ` [PATCH 13/19] commit-reach: prepare in_merge_bases[_many] " Stefan Beller
2018-10-16 23:35 ` [PATCH 14/19] commit: prepare get_commit_buffer " Stefan Beller
2018-10-16 23:35 ` [PATCH 15/19] commit: prepare repo_unuse_commit_buffer " Stefan Beller
2018-10-16 23:35 ` [PATCH 16/19] commit: prepare logmsg_reencode " Stefan Beller
2018-10-16 23:35 ` [PATCH 17/19] pretty: prepare format_commit_message " Stefan Beller
2018-10-16 23:35 ` [PATCH 18/19] submodule: use submodule repos for object lookup Stefan Beller
2018-10-19 20:37   ` Jonathan Tan
2018-10-25  9:14   ` SZEDER Gábor
2018-10-31 13:38   ` Derrick Stolee
2018-11-01 19:13     ` Stefan Beller
2018-10-16 23:35 ` [PATCH 19/19] submodule: don't add submodule as odb for push Stefan Beller
2018-10-19 20:39   ` Jonathan Tan
2018-10-17 12:41 ` [PATCH 00/19] Bring more repository handles into our code base Derrick Stolee
2018-10-17 17:53   ` Stefan Beller
2018-10-18 18:37     ` [RFC PATCH 0/2] Bring the_repository into cmd_foo Stefan Beller
2018-10-18 18:37       ` [RFC PATCH 1/2] repository: have get_the_repository() to remove the_repository dependency Stefan Beller
2018-10-18 18:37       ` [RFC PATCH 2/2 (BREAKS BUILD)] builtin/merge-base.c: do not rely on the_repository any more Stefan Beller
2018-10-18 21:01       ` [RFC PATCH 0/2] Bring the_repository into cmd_foo Jonathan Tan
2018-10-18 23:23         ` Stefan Beller
2018-10-19  7:23 ` [PATCH 00/19] Bring more repository handles into our code base Junio C Hamano
2018-10-22 17:39 ` New semantic patches vs. in-flight topics [was: Re: [PATCH 00/19] Bring more repository handles into our code base] SZEDER Gábor
2018-10-22 18:54   ` Stefan Beller
2018-10-25  1:59     ` SZEDER Gábor
2018-10-25 19:25       ` Stefan Beller
2018-10-22 22:49   ` Junio C Hamano
2018-10-23  0:26     ` Stefan Beller
2018-10-23  4:24       ` Junio C Hamano
2018-10-23  9:38     ` Junio C Hamano
2018-10-23 10:15       ` Carlo Arenas
2018-10-23 10:21         ` Junio C Hamano
2018-10-23 17:30       ` Stefan Beller
2018-10-24  1:22         ` Junio C Hamano
2018-10-25  5:39   ` Jeff King
  -- strict thread matches above, loose matches on Subject: below --
2018-10-11 21:17 [RFC PATCH 00/19] Bring more repository handles into our code base Stefan Beller
2018-10-11 21:17 ` [PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories Stefan Beller
2018-10-11 22:01   ` Jonathan Tan
2018-10-11 22:02     ` Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).