git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCHv2 00/24] Bring more repository handles into our code base]
@ 2018-10-30 22:07 Stefan Beller
  2018-10-30 22:07 ` [PATCH 01/24] Makefile: add pending semantic patches Stefan Beller
                   ` (24 more replies)
  0 siblings, 25 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:07 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

This resends sb/more-repo-in-api

On Thu, Oct 25, 2018 at 2:14 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> On Tue, Oct 16, 2018 at 04:35:49PM -0700, Stefan Beller wrote:
> > This converts the 'show_submodule_header' function to use
> > the repository API properly, such that the submodule objects
> > are not added to the main object store.
>
> This patch breaks the test suite with 'GIT_TEST_COMMIT_GRAPH=1', in
> particular 't4041-diff-submodule-option.sh' fails with:
> [...]

I debugged into this and the last four patches will fix it.

I also picked up the patch for pending semantic patches, as the
first patch, can I have your sign off, please?

This also addresses Jonathans feedback in
https://public-inbox.org/git/20181019203750.110741-1-jonathantanmy@google.com/

A range-diff is below.

Thanks,
Stefan

SZEDER Gábor (1):
  Makefile: add pending semantic patches

Stefan Beller (23):
  sha1_file: allow read_object to read objects in arbitrary repositories
  packfile: allow has_packed_and_bad to handle arbitrary repositories
  object-store: allow read_object_file_extended to read from arbitrary
    repositories
  object-store: prepare read_object_file to deal with arbitrary
    repositories
  object-store: prepare has_{sha1, object}_file[_with_flags] to handle
    arbitrary repositories
  object: parse_object to honor its repository argument
  commit: allow parse_commit* to handle arbitrary repositories
  commit-reach.c: allow paint_down_to_common to handle arbitrary
    repositories
  commit-reach.c: allow merge_bases_many to handle arbitrary
    repositories
  commit-reach.c: allow remove_redundant to handle arbitrary
    repositories
  commit-reach.c: allow get_merge_bases_many_0 to handle arbitrary
    repositories
  commit-reach: prepare get_merge_bases to handle arbitrary repositories
  commit-reach: prepare in_merge_bases[_many] to handle arbitrary
    repositories
  commit: prepare get_commit_buffer to handle arbitrary repositories
  commit: prepare repo_unuse_commit_buffer to handle arbitrary
    repositories
  commit: prepare logmsg_reencode to handle arbitrary repositories
  pretty: prepare format_commit_message to handle arbitrary repositories
  submodule: use submodule repos for object lookup
  submodule: don't add submodule as odb for push
  commit-graph: convert remaining function to handle arbitrary
    repositories
  commit: make free_commit_buffer and release_commit_memory repository
    agnostic
  path.h: make REPO_GIT_PATH_FUNC repository agnostic
  t/helper/test-repository: celebrate independence from the_repository

 Makefile                                      |   6 +-
 builtin/fsck.c                                |   3 +-
 builtin/log.c                                 |   6 +-
 builtin/rev-list.c                            |   3 +-
 cache.h                                       |   2 +
 commit-graph.c                                |  40 +++--
 commit-reach.c                                |  73 +++++----
 commit-reach.h                                |  38 +++--
 commit.c                                      |  41 ++---
 commit.h                                      |  43 +++++-
 .../coccinelle/the_repository.pending.cocci   | 144 ++++++++++++++++++
 object-store.h                                |  35 ++++-
 object.c                                      |   8 +-
 packfile.c                                    |   5 +-
 packfile.h                                    |   2 +-
 path.h                                        |   2 +-
 pretty.c                                      |  28 ++--
 pretty.h                                      |   7 +-
 sha1-file.c                                   |  34 +++--
 streaming.c                                   |   2 +-
 submodule.c                                   |  79 +++++++---
 t/helper/test-repository.c                    |  10 ++
 22 files changed, 459 insertions(+), 152 deletions(-)
 create mode 100644 contrib/coccinelle/the_repository.pending.cocci

git range-diff origin/sb/more-repo-in-api...
[...] # rebased to origin/master
  -:  ---------- > 116:  4ede3d42df Seventh batch for 2.20
  -:  ---------- > 117:  b1de196491 Makefile: add pending semantic patches
  1:  1b9b5c695e = 118:  f1be5eb487 sha1_file: allow read_object to read objects in arbitrary repositories
  2:  33b94066f2 = 119:  9100a5705d packfile: allow has_packed_and_bad to handle arbitrary repositories
  3:  5217b6b1e1 = 120:  a4ad7791da object-store: allow read_object_file_extended to read from arbitrary repositories
  4:  2b7239b55b ! 121:  5d7b3a6dd9 object-store: prepare read_object_file to deal with arbitrary repositories
    @@ -19,10 +19,10 @@
         Signed-off-by: Stefan Beller <sbeller@google.com>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
    + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
      new file mode 100644
      --- /dev/null
    - +++ b/contrib/coccinelle/the_repository.cocci
    + +++ b/contrib/coccinelle/the_repository.pending.cocci
     @@
     +// This file is used for the ongoing refactoring of
     +// bringing the index or repository struct in all of
    @@ -36,6 +36,7 @@
     +- read_object_file(
     ++ repo_read_object_file(the_repository,
     +  E, F, G)
    ++
     
      diff --git a/object-store.h b/object-store.h
      --- a/object-store.h
  5:  24291f4d84 ! 122:  77d406fc51 object-store: prepare has_{sha1, object}_file[_with_flags] to handle arbitrary repositories
    @@ -3,16 +3,14 @@
         object-store: prepare has_{sha1, object}_file[_with_flags] to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
    - --- a/contrib/coccinelle/the_repository.cocci
    - +++ b/contrib/coccinelle/the_repository.cocci
    + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
    + --- a/contrib/coccinelle/the_repository.pending.cocci
    + +++ b/contrib/coccinelle/the_repository.pending.cocci
     @@
    - - read_object_file(
      + repo_read_object_file(the_repository,
        E, F, G)
    -+
    + 
     +@@
     +expression E;
     +@@
  6:  6aa209978e = 123:  7224b378d8 object: parse_object to honor its repository argument
  7:  9b2d6aa7c3 ! 124:  e4d4ae08ca commit: allow parse_commit* to handle arbitrary repositories
    @@ -93,9 +93,9 @@
      
      struct buffer_slab;
     
    - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
    - --- a/contrib/coccinelle/the_repository.cocci
    - +++ b/contrib/coccinelle/the_repository.cocci
    + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
    + --- a/contrib/coccinelle/the_repository.pending.cocci
    + +++ b/contrib/coccinelle/the_repository.pending.cocci
     @@
      - has_object_file_with_flags(
      + repo_has_object_file_with_flags(the_repository,
  8:  0e7e681118 ! 125:  f739ef6078 commit-reach.c: allow paint_down_to_common to handle arbitrary repositories
    @@ -5,7 +5,6 @@
         As the function is file local and not widely used, migrate it all at once.
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/commit-reach.c b/commit-reach.c
      --- a/commit-reach.c
  9:  e83b26f5b3 ! 126:  c054a80564 commit-reach.c: allow merge_bases_many to handle arbitrary repositories
    @@ -3,7 +3,6 @@
         commit-reach.c: allow merge_bases_many to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/commit-reach.c b/commit-reach.c
      --- a/commit-reach.c
 10:  d80b9de832 ! 127:  3434a5a262 commit-reach.c: allow remove_redundant to handle arbitrary repositories
    @@ -3,7 +3,6 @@
         commit-reach.c: allow remove_redundant to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/commit-reach.c b/commit-reach.c
      --- a/commit-reach.c
 11:  3ec21ad503 ! 128:  82f6e797bf commit-reach.c: allow get_merge_bases_many_0 to handle arbitrary repositories
    @@ -3,7 +3,6 @@
         commit-reach.c: allow get_merge_bases_many_0 to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/commit-reach.c b/commit-reach.c
      --- a/commit-reach.c
 12:  3f21279f50 ! 129:  b66b636a97 commit-reach: prepare get_merge_bases to handle arbitrary repositories
    @@ -9,7 +9,6 @@
         functions behind a wrapper macro.
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/commit-reach.c b/commit-reach.c
      --- a/commit-reach.c
    @@ -91,9 +90,9 @@
      int is_descendant_of(struct commit *commit, struct commit_list *with_commit);
      int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference);
     
    - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
    - --- a/contrib/coccinelle/the_repository.cocci
    - +++ b/contrib/coccinelle/the_repository.cocci
    + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
    + --- a/contrib/coccinelle/the_repository.pending.cocci
    + +++ b/contrib/coccinelle/the_repository.pending.cocci
     @@
      - parse_commit(
      + repo_parse_commit(the_repository,
    @@ -124,3 +123,4 @@
     +- get_merge_bases_many_dirty(
     ++ repo_get_merge_bases_many_dirty(the_repository,
     +  E, F, G);
    ++
 13:  4decc3acc1 ! 130:  e68e48ca91 commit-reach: prepare in_merge_bases[_many] to handle arbitrary repositories
    @@ -3,7 +3,6 @@
         commit-reach: prepare in_merge_bases[_many] to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/commit-reach.c b/commit-reach.c
      --- a/commit-reach.c
    @@ -76,14 +75,13 @@
      /*
       * Takes a list of commits and returns a new list where those
     
    - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
    - --- a/contrib/coccinelle/the_repository.cocci
    - +++ b/contrib/coccinelle/the_repository.cocci
    + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
    + --- a/contrib/coccinelle/the_repository.pending.cocci
    + +++ b/contrib/coccinelle/the_repository.pending.cocci
     @@
    - - get_merge_bases_many_dirty(
      + repo_get_merge_bases_many_dirty(the_repository,
        E, F, G);
    -+
    + 
     +@@
     +expression E;
     +expression F;
 14:  141b86ea89 ! 131:  aac401ca98 commit: prepare get_commit_buffer to handle arbitrary repositories
    @@ -3,7 +3,6 @@
         commit: prepare get_commit_buffer to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/commit.c b/commit.c
      --- a/commit.c
    @@ -46,9 +45,9 @@
      /*
       * Tell the commit subsytem that we are done with a particular commit buffer.
     
    - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
    - --- a/contrib/coccinelle/the_repository.cocci
    - +++ b/contrib/coccinelle/the_repository.cocci
    + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
    + --- a/contrib/coccinelle/the_repository.pending.cocci
    + +++ b/contrib/coccinelle/the_repository.pending.cocci
     @@
      - in_merge_bases_many(
      + repo_in_merge_bases_many(the_repository,
 15:  2cc415c1db ! 132:  c45e6d2c0c commit: prepare repo_unuse_commit_buffer to handle arbitrary repositories
    @@ -3,7 +3,6 @@
         commit: prepare repo_unuse_commit_buffer to handle arbitrary repositories
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/commit.c b/commit.c
      --- a/commit.c
    @@ -42,9 +41,9 @@
      /*
       * Free any cached object buffer associated with the commit.
     
    - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
    - --- a/contrib/coccinelle/the_repository.cocci
    - +++ b/contrib/coccinelle/the_repository.cocci
    + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
    + --- a/contrib/coccinelle/the_repository.pending.cocci
    + +++ b/contrib/coccinelle/the_repository.pending.cocci
     @@
      - get_commit_buffer(
      + repo_get_commit_buffer(the_repository,
 16:  554daa8cfc ! 133:  25e8f5dec4 commit: prepare logmsg_reencode to handle arbitrary repositories
    @@ -24,9 +24,9 @@
      
      /** Removes the first commit from a list sorted by date, and adds all
     
    - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
    - --- a/contrib/coccinelle/the_repository.cocci
    - +++ b/contrib/coccinelle/the_repository.cocci
    + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
    + --- a/contrib/coccinelle/the_repository.pending.cocci
    + +++ b/contrib/coccinelle/the_repository.pending.cocci
     @@
      - unuse_commit_buffer(
      + repo_unuse_commit_buffer(the_repository,
 17:  bd8737176b ! 134:  2cab6c18b6 pretty: prepare format_commit_message to handle arbitrary repositories
    @@ -5,9 +5,9 @@
         Signed-off-by: Stefan Beller <sbeller@google.com>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
    - --- a/contrib/coccinelle/the_repository.cocci
    - +++ b/contrib/coccinelle/the_repository.cocci
    + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
    + --- a/contrib/coccinelle/the_repository.pending.cocci
    + +++ b/contrib/coccinelle/the_repository.pending.cocci
     @@
      - logmsg_reencode(
      + repo_logmsg_reencode(the_repository,
 18:  b303ef65e7 ! 135:  794592c573 submodule: use submodule repos for object lookup
    @@ -7,7 +7,6 @@
         are not added to the main object store.
     
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/submodule.c b/submodule.c
      --- a/submodule.c
    @@ -46,24 +45,26 @@
     + * in .gitmodules. This function exists only to preserve historical behavior,
     + *
     + * Returns 0 on success, -1 when the submodule is not present.
    -+ */
    -+static int open_submodule(struct repository *out, const char *path)
    +  */
    +-static void show_submodule_header(struct diff_options *o, const char *path,
    ++static struct repository *open_submodule(const char *path)
     +{
     +	struct strbuf sb = STRBUF_INIT;
    ++	struct repository *out = xmalloc(sizeof(*out));
     +
     +	if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) {
     +		strbuf_release(&sb);
    -+		return -1;
    ++		free(out);
    ++		return NULL;
     +	}
     +
    -+	out->submodule_prefix = xstrdup(path);
     +	out->submodule_prefix = xstrfmt("%s%s/",
     +					the_repository->submodule_prefix ?
     +					the_repository->submodule_prefix :
     +					"", path);
     +
     +	strbuf_release(&sb);
    -+	return 0;
    ++	return out;
     +}
     +
     +/*
    @@ -73,8 +74,7 @@
     + * If it can locate the submodule git directory it will create a repository
     + * handle for the submodule and lookup both the left and right commits and
     + * put them into the left and right pointers.
    -  */
    --static void show_submodule_header(struct diff_options *o, const char *path,
    ++ */
     +static void show_submodule_header(struct diff_options *o,
     +		const char *path,
      		struct object_id *one, struct object_id *two,
    @@ -116,11 +116,9 @@
      	struct rev_info rev;
      	struct commit *left = NULL, *right = NULL;
      	struct commit_list *merge_bases = NULL;
    -+	struct repository subrepo, *sub = &subrepo;
    -+
    -+	if (open_submodule(&subrepo, path) < 0)
    -+		sub = NULL;
    ++	struct repository *sub;
      
    ++	sub = open_submodule(path);
      	show_submodule_header(o, path, one, two, dirty_submodule,
     -			      &left, &right, &merge_bases);
     +			      sub, &left, &right, &merge_bases);
    @@ -147,8 +145,10 @@
      		free_commit_list(merge_bases);
      	clear_commit_marks(left, ~0);
      	clear_commit_marks(right, ~0);
    -+	if (sub)
    ++	if (sub) {
     +		repo_clear(sub);
    ++		free(sub);
    ++	}
      }
      
      void show_submodule_inline_diff(struct diff_options *o, const char *path,
    @@ -156,11 +156,9 @@
      	struct commit_list *merge_bases = NULL;
      	struct child_process cp = CHILD_PROCESS_INIT;
      	struct strbuf sb = STRBUF_INIT;
    -+	struct repository subrepo, *sub = &subrepo;
    -+
    -+	if (open_submodule(&subrepo, path) < 0)
    -+		sub = NULL;
    ++	struct repository *sub;
      
    ++	sub = open_submodule(path);
      	show_submodule_header(o, path, one, two, dirty_submodule,
     -			      &left, &right, &merge_bases);
     +			      sub, &left, &right, &merge_bases);
    @@ -171,8 +169,10 @@
      		clear_commit_marks(left, ~0);
      	if (right)
      		clear_commit_marks(right, ~0);
    -+	if (sub)
    ++	if (sub) {
     +		repo_clear(sub);
    ++		free(sub);
    ++	}
      }
      
      int should_update_submodules(void)
 19:  dbb0dd9322 ! 136:  df748a859f submodule: don't add submodule as odb for push
    @@ -8,8 +8,10 @@
         objects, and the actual push is done by spawning another process,
         which handles object access by itself.)
     
    +    This code of push_submodule() is exercised in t5531 and continues
    +    to work, showing that the submodule odbc is not needed.
    +
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      diff --git a/submodule.c b/submodule.c
      --- a/submodule.c
  -:  ---------- > 137:  c9e990afac commit-graph: convert remaining function to handle arbitrary repositories
  -:  ---------- > 138:  a086f3dd11 commit: make free_commit_buffer and release_commit_memory repository agnostic
  -:  ---------- > 139:  878bd34799 path.h: make REPO_GIT_PATH_FUNC repository agnostic
  -:  ---------- > 140:  641824bbeb t/helper/test-repository: celebrate independence from the_repository

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 01/24] Makefile: add pending semantic patches
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
@ 2018-10-30 22:07 ` Stefan Beller
  2018-10-31  6:42   ` Junio C Hamano
  2018-11-08 20:52   ` [PATCH] " Stefan Beller
  2018-10-30 22:07 ` [PATCH 02/24] sha1_file: allow read_object to read objects in arbitrary repositories Stefan Beller
                   ` (23 subsequent siblings)
  24 siblings, 2 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:07 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy

From: SZEDER Gábor <szeder.dev@gmail.com>

There are basically two main use cases for semantic patches:

      - To avoid undesirable code patterns, e.g. we should not use
        sha1_to_hex(oid.hash) or strbuf_addf(&sb, "fixed string"), but
        use oid_to_hex(&oid) or strbuf_addstr(&sb, "fixed string")
        instead.  Note that in these cases we don't remove the
        functions sha1_to_hex() or strbuf_addf(), because there are
        good reasons to use them in other scenarios.

        Our semantic patches under 'contrib/coccinelle/' fall into
        this category, and we have 'make coccicheck' and the static
        analysis build job on Travis CI to catch these undesirable
        code patterns preferably early, and to prevent them from
        entering our codebase.

      - To perform one-off code transformations, e.g. to modify a
        function's name and/or signature and convert all its
        callsites; see e.g. commits abef9020e3 (sha1_file: convert
        sha1_object_info* to object_id, 2018-03-12) and b4f5aca40e
        (sha1_file: convert read_sha1_file to struct object_id,
        2018-03-12).

To allows semantic patches of the second category, we'll introduce
the concept of "pending" semantic patches, stored in
'contrib/coccinelle/<name>.pending.cocci' files, modifying
'make coccicheck' to skip them, and adding the new 'make
coccicheck-pending' target to make it convenient to apply them.

[Missing: SZEDERs sign off, so I also do not sign off]
---
 Makefile | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index b08d5ea258..6ea2bbd7f5 100644
--- a/Makefile
+++ b/Makefile
@@ -2739,9 +2739,11 @@ endif
 	then \
 		echo '    ' SPATCH result: $@; \
 	fi
-coccicheck: $(addsuffix .patch,$(wildcard contrib/coccinelle/*.cocci))
+coccicheck: $(addsuffix .patch,$(filter-out %.pending.cocci,$(wildcard contrib/coccinelle/*.cocci)))
 
-.PHONY: coccicheck
+coccicheck-pending: $(addsuffix .patch,$(wildcard contrib/coccinelle/*.pending.cocci))
+
+.PHONY: coccicheck coccicheck-pending
 
 ### Installation rules
 
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 02/24] sha1_file: allow read_object to read objects in arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
  2018-10-30 22:07 ` [PATCH 01/24] Makefile: add pending semantic patches Stefan Beller
@ 2018-10-30 22:07 ` Stefan Beller
  2018-10-30 22:07 ` [PATCH 03/24] packfile: allow has_packed_and_bad to handle " Stefan Beller
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:07 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller, Junio C Hamano

Allow read_object (a file local functon in sha1_file) to
handle arbitrary repositories by passing the repository down
to oid_object_info_extended.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sha1-file.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/sha1-file.c b/sha1-file.c
index dd0b6aa873..b8ce21cbaf 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1361,7 +1361,9 @@ int oid_object_info(struct repository *r,
 	return type;
 }
 
-static void *read_object(const unsigned char *sha1, enum object_type *type,
+static void *read_object(struct repository *r,
+			 const unsigned char *sha1,
+			 enum object_type *type,
 			 unsigned long *size)
 {
 	struct object_id oid;
@@ -1373,7 +1375,7 @@ static void *read_object(const unsigned char *sha1, enum object_type *type,
 
 	hashcpy(oid.hash, sha1);
 
-	if (oid_object_info_extended(the_repository, &oid, &oi, 0) < 0)
+	if (oid_object_info_extended(r, &oid, &oi, 0) < 0)
 		return NULL;
 	return content;
 }
@@ -1414,7 +1416,7 @@ void *read_object_file_extended(const struct object_id *oid,
 		lookup_replace_object(the_repository, oid) : oid;
 
 	errno = 0;
-	data = read_object(repl->hash, type, size);
+	data = read_object(the_repository, repl->hash, type, size);
 	if (data)
 		return data;
 
@@ -1755,7 +1757,7 @@ int force_object_loose(const struct object_id *oid, time_t mtime)
 
 	if (has_loose_object(oid))
 		return 0;
-	buf = read_object(oid->hash, &type, &len);
+	buf = read_object(the_repository, oid->hash, &type, &len);
 	if (!buf)
 		return error(_("cannot read sha1_file for %s"), oid_to_hex(oid));
 	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", type_name(type), len) + 1;
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 03/24] packfile: allow has_packed_and_bad to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
  2018-10-30 22:07 ` [PATCH 01/24] Makefile: add pending semantic patches Stefan Beller
  2018-10-30 22:07 ` [PATCH 02/24] sha1_file: allow read_object to read objects in arbitrary repositories Stefan Beller
@ 2018-10-30 22:07 ` Stefan Beller
  2018-10-30 22:07 ` [PATCH 04/24] object-store: allow read_object_file_extended to read from " Stefan Beller
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:07 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller, Junio C Hamano

has_packed_and_bad is not widely used, so just migrate it all at once.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 packfile.c  | 5 +++--
 packfile.h  | 2 +-
 sha1-file.c | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/packfile.c b/packfile.c
index f2850a00b5..63f8dc7b98 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1138,12 +1138,13 @@ void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1)
 	p->num_bad_objects++;
 }
 
-const struct packed_git *has_packed_and_bad(const unsigned char *sha1)
+const struct packed_git *has_packed_and_bad(struct repository *r,
+					    const unsigned char *sha1)
 {
 	struct packed_git *p;
 	unsigned i;
 
-	for (p = the_repository->objects->packed_git; p; p = p->next)
+	for (p = r->objects->packed_git; p; p = p->next)
 		for (i = 0; i < p->num_bad_objects; i++)
 			if (hasheq(sha1,
 				   p->bad_object_sha1 + the_hash_algo->rawsz * i))
diff --git a/packfile.h b/packfile.h
index 6c4037605d..d70c6d9afb 100644
--- a/packfile.h
+++ b/packfile.h
@@ -146,7 +146,7 @@ extern int packed_object_info(struct repository *r,
 			      off_t offset, struct object_info *);
 
 extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
-extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1);
+extern const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1);
 
 /*
  * Iff a pack file in the given repository contains the object named by sha1,
diff --git a/sha1-file.c b/sha1-file.c
index b8ce21cbaf..856e000ee1 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1432,7 +1432,7 @@ void *read_object_file_extended(const struct object_id *oid,
 		die(_("loose object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), path);
 
-	if ((p = has_packed_and_bad(repl->hash)) != NULL)
+	if ((p = has_packed_and_bad(the_repository, repl->hash)) != NULL)
 		die(_("packed object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), p->pack_name);
 
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 04/24] object-store: allow read_object_file_extended to read from arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (2 preceding siblings ...)
  2018-10-30 22:07 ` [PATCH 03/24] packfile: allow has_packed_and_bad to handle " Stefan Beller
@ 2018-10-30 22:07 ` Stefan Beller
  2018-10-30 22:07 ` [PATCH 05/24] object-store: prepare read_object_file to deal with " Stefan Beller
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:07 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller, Junio C Hamano

read_object_file_extended is not widely used, so migrate it all at once.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-store.h |  5 +++--
 sha1-file.c    | 11 ++++++-----
 streaming.c    |  2 +-
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/object-store.h b/object-store.h
index 63b7605a3e..3d98a682b2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -161,12 +161,13 @@ void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned cha
 
 void *map_sha1_file(struct repository *r, const unsigned char *sha1, unsigned long *size);
 
-extern void *read_object_file_extended(const struct object_id *oid,
+extern void *read_object_file_extended(struct repository *r,
+				       const struct object_id *oid,
 				       enum object_type *type,
 				       unsigned long *size, int lookup_replace);
 static inline void *read_object_file(const struct object_id *oid, enum object_type *type, unsigned long *size)
 {
-	return read_object_file_extended(oid, type, size, 1);
+	return read_object_file_extended(the_repository, oid, type, size, 1);
 }
 
 /* Read and unpack an object file into memory, write memory to an object file */
diff --git a/sha1-file.c b/sha1-file.c
index 856e000ee1..c5b704aec5 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1403,7 +1403,8 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type,
  * deal with them should arrange to call read_object() and give error
  * messages themselves.
  */
-void *read_object_file_extended(const struct object_id *oid,
+void *read_object_file_extended(struct repository *r,
+				const struct object_id *oid,
 				enum object_type *type,
 				unsigned long *size,
 				int lookup_replace)
@@ -1413,10 +1414,10 @@ void *read_object_file_extended(const struct object_id *oid,
 	const char *path;
 	struct stat st;
 	const struct object_id *repl = lookup_replace ?
-		lookup_replace_object(the_repository, oid) : oid;
+		lookup_replace_object(r, oid) : oid;
 
 	errno = 0;
-	data = read_object(the_repository, repl->hash, type, size);
+	data = read_object(r, repl->hash, type, size);
 	if (data)
 		return data;
 
@@ -1428,11 +1429,11 @@ void *read_object_file_extended(const struct object_id *oid,
 		die(_("replacement %s not found for %s"),
 		    oid_to_hex(repl), oid_to_hex(oid));
 
-	if (!stat_sha1_file(the_repository, repl->hash, &st, &path))
+	if (!stat_sha1_file(r, repl->hash, &st, &path))
 		die(_("loose object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), path);
 
-	if ((p = has_packed_and_bad(the_repository, repl->hash)) != NULL)
+	if ((p = has_packed_and_bad(r, repl->hash)) != NULL)
 		die(_("packed object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), p->pack_name);
 
diff --git a/streaming.c b/streaming.c
index d1e6b2dce6..c843a1230f 100644
--- a/streaming.c
+++ b/streaming.c
@@ -490,7 +490,7 @@ static struct stream_vtbl incore_vtbl = {
 
 static open_method_decl(incore)
 {
-	st->u.incore.buf = read_object_file_extended(oid, type, &st->size, 0);
+	st->u.incore.buf = read_object_file_extended(the_repository, oid, type, &st->size, 0);
 	st->u.incore.read_ptr = 0;
 	st->vtbl = &incore_vtbl;
 
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/24] object-store: prepare read_object_file to deal with arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (3 preceding siblings ...)
  2018-10-30 22:07 ` [PATCH 04/24] object-store: allow read_object_file_extended to read from " Stefan Beller
@ 2018-10-30 22:07 ` Stefan Beller
  2018-10-30 22:07 ` [PATCH 06/24] object-store: prepare has_{sha1, object}_file[_with_flags] to handle " Stefan Beller
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:07 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller, Junio C Hamano

As read_object_file is a widely used function (which is also regularly used
in new code in flight between master..pu), changing its signature is painful
is hard, as other series in flight rely on the original signature. It would
burden the maintainer if we'd just change the signature.

Introduce repo_read_object_file which takes the repository argument, and
hide the original read_object_file as a macro behind
NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to
e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21)

Add a coccinelle patch to convert existing callers, but do not apply
the resulting patch from 'make coccicheck' to keep the diff of this
patch small.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/coccinelle/the_repository.pending.cocci | 13 +++++++++++++
 object-store.h                                  | 10 ++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 contrib/coccinelle/the_repository.pending.cocci

diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
new file mode 100644
index 0000000000..3c7fa70502
--- /dev/null
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -0,0 +1,13 @@
+// This file is used for the ongoing refactoring of
+// bringing the index or repository struct in all of
+// our code base.
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- read_object_file(
++ repo_read_object_file(the_repository,
+  E, F, G)
+
diff --git a/object-store.h b/object-store.h
index 3d98a682b2..00a64622e6 100644
--- a/object-store.h
+++ b/object-store.h
@@ -165,10 +165,16 @@ extern void *read_object_file_extended(struct repository *r,
 				       const struct object_id *oid,
 				       enum object_type *type,
 				       unsigned long *size, int lookup_replace);
-static inline void *read_object_file(const struct object_id *oid, enum object_type *type, unsigned long *size)
+static inline void *repo_read_object_file(struct repository *r,
+					  const struct object_id *oid,
+					  enum object_type *type,
+					  unsigned long *size)
 {
-	return read_object_file_extended(the_repository, oid, type, size, 1);
+	return read_object_file_extended(r, oid, type, size, 1);
 }
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define read_object_file(oid, type, size) repo_read_object_file(the_repository, oid, type, size)
+#endif
 
 /* Read and unpack an object file into memory, write memory to an object file */
 int oid_object_info(struct repository *r, const struct object_id *, unsigned long *);
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/24] object-store: prepare has_{sha1, object}_file[_with_flags] to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (4 preceding siblings ...)
  2018-10-30 22:07 ` [PATCH 05/24] object-store: prepare read_object_file to deal with " Stefan Beller
@ 2018-10-30 22:07 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 07/24] object: parse_object to honor its repository argument Stefan Beller
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:07 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 .../coccinelle/the_repository.pending.cocci   | 29 +++++++++++++++++++
 object-store.h                                | 22 ++++++++++----
 sha1-file.c                                   | 15 ++++++----
 3 files changed, 55 insertions(+), 11 deletions(-)

diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
index 3c7fa70502..46f3a1b23a 100644
--- a/contrib/coccinelle/the_repository.pending.cocci
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -11,3 +11,32 @@ expression G;
 + repo_read_object_file(the_repository,
   E, F, G)
 
+@@
+expression E;
+@@
+- has_sha1_file(
++ repo_has_sha1_file(the_repository,
+  E)
+
+@@
+expression E;
+expression F;
+@@
+- has_sha1_file_with_flags(
++ repo_has_sha1_file_with_flags(the_repository,
+  E)
+
+@@
+expression E;
+@@
+- has_object_file(
++ repo_has_object_file(the_repository,
+  E)
+
+@@
+expression E;
+expression F;
+@@
+- has_object_file_with_flags(
++ repo_has_object_file_with_flags(the_repository,
+  E)
diff --git a/object-store.h b/object-store.h
index 00a64622e6..2b5e6ff1ed 100644
--- a/object-store.h
+++ b/object-store.h
@@ -212,15 +212,27 @@ int read_loose_object(const char *path,
  * object_info. OBJECT_INFO_SKIP_CACHED is automatically set; pass
  * nonzero flags to also set other flags.
  */
-extern int has_sha1_file_with_flags(const unsigned char *sha1, int flags);
-static inline int has_sha1_file(const unsigned char *sha1)
+int repo_has_sha1_file_with_flags(struct repository *r,
+				  const unsigned char *sha1, int flags);
+static inline int repo_has_sha1_file(struct repository *r,
+				     const unsigned char *sha1)
 {
-	return has_sha1_file_with_flags(sha1, 0);
+	return repo_has_sha1_file_with_flags(r, sha1, 0);
 }
 
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define has_sha1_file_with_flags(sha1, flags) repo_has_sha1_file_with_flags(the_repository, sha1, flags)
+#define has_sha1_file(sha1) repo_has_sha1_file(the_repository, sha1)
+#endif
+
 /* Same as the above, except for struct object_id. */
-extern int has_object_file(const struct object_id *oid);
-extern int has_object_file_with_flags(const struct object_id *oid, int flags);
+int repo_has_object_file(struct repository *r, const struct object_id *oid);
+int repo_has_object_file_with_flags(struct repository *r,
+				    const struct object_id *oid, int flags);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define has_object_file(oid) repo_has_object_file(the_repository, oid)
+#define has_object_file_with_flags(oid, flags) repo_has_object_file_with_flags(the_repository, oid, flags)
+#endif
 
 /*
  * Return true iff an alternate object database has a loose object
diff --git a/sha1-file.c b/sha1-file.c
index c5b704aec5..e77273ccfd 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1768,24 +1768,27 @@ int force_object_loose(const struct object_id *oid, time_t mtime)
 	return ret;
 }
 
-int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
+int repo_has_sha1_file_with_flags(struct repository *r,
+				  const unsigned char *sha1, int flags)
 {
 	struct object_id oid;
 	if (!startup_info->have_repository)
 		return 0;
 	hashcpy(oid.hash, sha1);
-	return oid_object_info_extended(the_repository, &oid, NULL,
+	return oid_object_info_extended(r, &oid, NULL,
 					flags | OBJECT_INFO_SKIP_CACHED) >= 0;
 }
 
-int has_object_file(const struct object_id *oid)
+int repo_has_object_file(struct repository *r,
+			 const struct object_id *oid)
 {
-	return has_sha1_file(oid->hash);
+	return repo_has_sha1_file(r, oid->hash);
 }
 
-int has_object_file_with_flags(const struct object_id *oid, int flags)
+int repo_has_object_file_with_flags(struct repository *r,
+				    const struct object_id *oid, int flags)
 {
-	return has_sha1_file_with_flags(oid->hash, flags);
+	return repo_has_sha1_file_with_flags(r, oid->hash, flags);
 }
 
 static void check_tree(const void *buf, size_t size)
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/24] object: parse_object to honor its repository argument
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (5 preceding siblings ...)
  2018-10-30 22:07 ` [PATCH 06/24] object-store: prepare has_{sha1, object}_file[_with_flags] to handle " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 08/24] commit: allow parse_commit* to handle arbitrary repositories Stefan Beller
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller, Junio C Hamano

In 8e4b0b6047 (object.c: allow parse_object to handle
arbitrary repositories, 2018-06-28), we forgot to pass the
repository down to the read_object_file.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/object.c b/object.c
index e54160550c..003f870484 100644
--- a/object.c
+++ b/object.c
@@ -259,8 +259,8 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 	if (obj && obj->parsed)
 		return obj;
 
-	if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
-	    (!obj && has_object_file(oid) &&
+	if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
+	    (!obj && repo_has_object_file(r, oid) &&
 	     oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
 		if (check_object_signature(repl, NULL, 0, NULL) < 0) {
 			error(_("sha1 mismatch %s"), oid_to_hex(oid));
@@ -270,7 +270,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 		return lookup_object(r, oid->hash);
 	}
 
-	buffer = read_object_file(oid, &type, &size);
+	buffer = repo_read_object_file(r, oid, &type, &size);
 	if (buffer) {
 		if (check_object_signature(repl, buffer, size, type_name(type)) < 0) {
 			free(buffer);
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/24] commit: allow parse_commit* to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (6 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 07/24] object: parse_object to honor its repository argument Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 09/24] commit-reach.c: allow paint_down_to_common " Stefan Beller
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller, Junio C Hamano

Just like the previous commit, parse_commit and friends are used a lot
and are found in new patches, so we cannot change their signature easily.

Re-introduce these function prefixed with 'repo_' that take a repository
argument and keep the original as a shallow macro.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit.c                                      | 18 ++++++++------
 commit.h                                      | 17 +++++++++----
 .../coccinelle/the_repository.pending.cocci   | 24 +++++++++++++++++++
 3 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/commit.c b/commit.c
index dc8a39d52a..7a931d7fd4 100644
--- a/commit.c
+++ b/commit.c
@@ -443,7 +443,10 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
 	return 0;
 }
 
-int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph)
+int repo_parse_commit_internal(struct repository *r,
+			       struct commit *item,
+			       int quiet_on_missing,
+			       int use_commit_graph)
 {
 	enum object_type type;
 	void *buffer;
@@ -454,9 +457,9 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	if (use_commit_graph && parse_commit_in_graph(the_repository, item))
+	if (use_commit_graph && parse_commit_in_graph(r, item))
 		return 0;
-	buffer = read_object_file(&item->object.oid, &type, &size);
+	buffer = repo_read_object_file(r, &item->object.oid, &type, &size);
 	if (!buffer)
 		return quiet_on_missing ? -1 :
 			error("Could not read %s",
@@ -467,18 +470,19 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
 			     oid_to_hex(&item->object.oid));
 	}
 
-	ret = parse_commit_buffer(the_repository, item, buffer, size, 0);
+	ret = parse_commit_buffer(r, item, buffer, size, 0);
 	if (save_commit_buffer && !ret) {
-		set_commit_buffer(the_repository, item, buffer, size);
+		set_commit_buffer(r, item, buffer, size);
 		return 0;
 	}
 	free(buffer);
 	return ret;
 }
 
-int parse_commit_gently(struct commit *item, int quiet_on_missing)
+int repo_parse_commit_gently(struct repository *r,
+			     struct commit *item, int quiet_on_missing)
 {
-	return parse_commit_internal(item, quiet_on_missing, 1);
+	return repo_parse_commit_internal(r, item, quiet_on_missing, 1);
 }
 
 void parse_commit_or_die(struct commit *item)
diff --git a/commit.h b/commit.h
index 1d260d62f5..08935f9a19 100644
--- a/commit.h
+++ b/commit.h
@@ -79,12 +79,21 @@ struct commit *lookup_commit_reference_by_name(const char *name);
 struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref_name);
 
 int parse_commit_buffer(struct repository *r, struct commit *item, const void *buffer, unsigned long size, int check_graph);
-int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph);
-int parse_commit_gently(struct commit *item, int quiet_on_missing);
-static inline int parse_commit(struct commit *item)
+int repo_parse_commit_internal(struct repository *r, struct commit *item,
+			       int quiet_on_missing, int use_commit_graph);
+int repo_parse_commit_gently(struct repository *r,
+			     struct commit *item,
+			     int quiet_on_missing);
+static inline int repo_parse_commit(struct repository *r, struct commit *item)
 {
-	return parse_commit_gently(item, 0);
+	return repo_parse_commit_gently(r, item, 0);
 }
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define parse_commit_internal(item, quiet, use) repo_parse_commit_internal(the_repository, item, quiet, use)
+#define parse_commit_gently(item, quiet) repo_parse_commit_gently(the_repository, item, quiet)
+#define parse_commit(item) repo_parse_commit(the_repository, item)
+#endif
+
 void parse_commit_or_die(struct commit *item);
 
 struct buffer_slab;
diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
index 46f3a1b23a..b185fe0a1d 100644
--- a/contrib/coccinelle/the_repository.pending.cocci
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -40,3 +40,27 @@ expression F;
 - has_object_file_with_flags(
 + repo_has_object_file_with_flags(the_repository,
   E)
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- parse_commit_internal(
++ repo_parse_commit_internal(the_repository,
+  E, F, G)
+
+@@
+expression E;
+expression F;
+@@
+- parse_commit_gently(
++ repo_parse_commit_gently(the_repository,
+  E, F)
+
+@@
+expression E;
+@@
+- parse_commit(
++ repo_parse_commit(the_repository,
+  E)
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/24] commit-reach.c: allow paint_down_to_common to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (7 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 08/24] commit: allow parse_commit* to handle arbitrary repositories Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 10/24] commit-reach.c: allow merge_bases_many " Stefan Beller
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

As the function is file local and not widely used, migrate it all at once.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index a9da65c462..080ae0a758 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -30,7 +30,8 @@ static int queue_has_nonstale(struct prio_queue *queue)
 }
 
 /* all input commits in one and twos[] must have been parsed! */
-static struct commit_list *paint_down_to_common(struct commit *one, int n,
+static struct commit_list *paint_down_to_common(struct repository *r,
+						struct commit *one, int n,
 						struct commit **twos,
 						int min_generation)
 {
@@ -83,7 +84,7 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n,
 			parents = parents->next;
 			if ((p->object.flags & flags) == flags)
 				continue;
-			if (parse_commit(p))
+			if (repo_parse_commit(r, p))
 				return NULL;
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
@@ -116,7 +117,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co
 			return NULL;
 	}
 
-	list = paint_down_to_common(one, n, twos, 0);
+	list = paint_down_to_common(the_repository, one, n, twos, 0);
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -187,8 +188,8 @@ static int remove_redundant(struct commit **array, int cnt)
 			if (array[j]->generation < min_generation)
 				min_generation = array[j]->generation;
 		}
-		common = paint_down_to_common(array[i], filled, work,
-					      min_generation);
+		common = paint_down_to_common(the_repository, array[i], filled,
+					      work, min_generation);
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -322,7 +323,9 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit *
 	if (commit->generation > min_generation)
 		return ret;
 
-	bases = paint_down_to_common(commit, nr_reference, reference, commit->generation);
+	bases = paint_down_to_common(the_repository, commit,
+				     nr_reference, reference,
+				     commit->generation);
 	if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/24] commit-reach.c: allow merge_bases_many to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (8 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 09/24] commit-reach.c: allow paint_down_to_common " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 11/24] commit-reach.c: allow remove_redundant " Stefan Beller
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 080ae0a758..4cf471bfaf 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -95,7 +95,9 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 	return result;
 }
 
-static struct commit_list *merge_bases_many(struct commit *one, int n, struct commit **twos)
+static struct commit_list *merge_bases_many(struct repository *r,
+					    struct commit *one, int n,
+					    struct commit **twos)
 {
 	struct commit_list *list = NULL;
 	struct commit_list *result = NULL;
@@ -110,14 +112,14 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co
 			return commit_list_insert(one, &result);
 	}
 
-	if (parse_commit(one))
+	if (repo_parse_commit(r, one))
 		return NULL;
 	for (i = 0; i < n; i++) {
-		if (parse_commit(twos[i]))
+		if (repo_parse_commit(r, twos[i]))
 			return NULL;
 	}
 
-	list = paint_down_to_common(the_repository, one, n, twos, 0);
+	list = paint_down_to_common(r, one, n, twos, 0);
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -224,7 +226,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one,
 	struct commit_list *result;
 	int cnt, i;
 
-	result = merge_bases_many(one, n, twos);
+	result = merge_bases_many(the_repository, one, n, twos);
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
 			return result;
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/24] commit-reach.c: allow remove_redundant to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (9 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 10/24] commit-reach.c: allow merge_bases_many " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 12/24] commit-reach.c: allow get_merge_bases_many_0 " Stefan Beller
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 4cf471bfaf..ab2bb1e5d5 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -156,7 +156,7 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 	return ret;
 }
 
-static int remove_redundant(struct commit **array, int cnt)
+static int remove_redundant(struct repository *r, struct commit **array, int cnt)
 {
 	/*
 	 * Some commit in the array may be an ancestor of
@@ -174,7 +174,7 @@ static int remove_redundant(struct commit **array, int cnt)
 	ALLOC_ARRAY(filled_index, cnt - 1);
 
 	for (i = 0; i < cnt; i++)
-		parse_commit(array[i]);
+		repo_parse_commit(r, array[i]);
 	for (i = 0; i < cnt; i++) {
 		struct commit_list *common;
 		uint32_t min_generation = array[i]->generation;
@@ -190,7 +190,7 @@ static int remove_redundant(struct commit **array, int cnt)
 			if (array[j]->generation < min_generation)
 				min_generation = array[j]->generation;
 		}
-		common = paint_down_to_common(the_repository, array[i], filled,
+		common = paint_down_to_common(r, array[i], filled,
 					      work, min_generation);
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
@@ -249,7 +249,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one,
 	clear_commit_marks(one, all_flags);
 	clear_commit_marks_many(n, twos, all_flags);
 
-	cnt = remove_redundant(rslt, cnt);
+	cnt = remove_redundant(the_repository, rslt, cnt);
 	result = NULL;
 	for (i = 0; i < cnt; i++)
 		commit_list_insert_by_date(rslt[i], &result);
@@ -370,7 +370,7 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 			p->item->object.flags &= ~STALE;
 		}
 	}
-	num_head = remove_redundant(array, num_head);
+	num_head = remove_redundant(the_repository, array, num_head);
 	for (i = 0; i < num_head; i++)
 		tail = &commit_list_insert(array[i], tail)->next;
 	free(array);
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/24] commit-reach.c: allow get_merge_bases_many_0 to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (10 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 11/24] commit-reach.c: allow remove_redundant " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 13/24] commit-reach: prepare get_merge_bases " Stefan Beller
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index ab2bb1e5d5..bf7a513991 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -216,7 +216,8 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt
 	return filled;
 }
 
-static struct commit_list *get_merge_bases_many_0(struct commit *one,
+static struct commit_list *get_merge_bases_many_0(struct repository *r,
+						  struct commit *one,
 						  int n,
 						  struct commit **twos,
 						  int cleanup)
@@ -226,7 +227,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one,
 	struct commit_list *result;
 	int cnt, i;
 
-	result = merge_bases_many(the_repository, one, n, twos);
+	result = merge_bases_many(r, one, n, twos);
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
 			return result;
@@ -249,7 +250,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one,
 	clear_commit_marks(one, all_flags);
 	clear_commit_marks_many(n, twos, all_flags);
 
-	cnt = remove_redundant(the_repository, rslt, cnt);
+	cnt = remove_redundant(r, rslt, cnt);
 	result = NULL;
 	for (i = 0; i < cnt; i++)
 		commit_list_insert_by_date(rslt[i], &result);
@@ -261,19 +262,19 @@ struct commit_list *get_merge_bases_many(struct commit *one,
 					 int n,
 					 struct commit **twos)
 {
-	return get_merge_bases_many_0(one, n, twos, 1);
+	return get_merge_bases_many_0(the_repository, one, n, twos, 1);
 }
 
 struct commit_list *get_merge_bases_many_dirty(struct commit *one,
 					       int n,
 					       struct commit **twos)
 {
-	return get_merge_bases_many_0(one, n, twos, 0);
+	return get_merge_bases_many_0(the_repository, one, n, twos, 0);
 }
 
 struct commit_list *get_merge_bases(struct commit *one, struct commit *two)
 {
-	return get_merge_bases_many_0(one, 1, &two, 1);
+	return get_merge_bases_many_0(the_repository, one, 1, &two, 1);
 }
 
 /*
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 13/24] commit-reach: prepare get_merge_bases to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (11 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 12/24] commit-reach.c: allow get_merge_bases_many_0 " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 14/24] commit-reach: prepare in_merge_bases[_many] " Stefan Beller
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Similarly to previous patches, the get_merge_base functions are used
often in the code base, which makes migrating them hard.

Implement the new functions, prefixed with 'repo_' and hide the old
functions behind a wrapper macro.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c                                | 24 ++++++++++-------
 commit-reach.h                                | 26 +++++++++++-------
 .../coccinelle/the_repository.pending.cocci   | 27 +++++++++++++++++++
 3 files changed, 57 insertions(+), 20 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index bf7a513991..3be5526957 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -258,23 +258,27 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	return result;
 }
 
-struct commit_list *get_merge_bases_many(struct commit *one,
-					 int n,
-					 struct commit **twos)
+struct commit_list *repo_get_merge_bases_many(struct repository *r,
+					      struct commit *one,
+					      int n,
+					      struct commit **twos)
 {
-	return get_merge_bases_many_0(the_repository, one, n, twos, 1);
+	return get_merge_bases_many_0(r, one, n, twos, 1);
 }
 
-struct commit_list *get_merge_bases_many_dirty(struct commit *one,
-					       int n,
-					       struct commit **twos)
+struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
+						    struct commit *one,
+						    int n,
+						    struct commit **twos)
 {
-	return get_merge_bases_many_0(the_repository, one, n, twos, 0);
+	return get_merge_bases_many_0(r, one, n, twos, 0);
 }
 
-struct commit_list *get_merge_bases(struct commit *one, struct commit *two)
+struct commit_list *repo_get_merge_bases(struct repository *r,
+					 struct commit *one,
+					 struct commit *two)
 {
-	return get_merge_bases_many_0(the_repository, one, 1, &two, 1);
+	return get_merge_bases_many_0(r, one, 1, &two, 1);
 }
 
 /*
diff --git a/commit-reach.h b/commit-reach.h
index 122a23a24d..c647447263 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -8,17 +8,23 @@ struct commit_list;
 struct contains_cache;
 struct ref_filter;
 
-struct commit_list *get_merge_bases_many(struct commit *one,
-					 int n,
-					 struct commit **twos);
-struct commit_list *get_merge_bases_many_dirty(struct commit *one,
-					       int n,
-					       struct commit **twos);
-struct commit_list *get_merge_bases(struct commit *one, struct commit *two);
-struct commit_list *get_octopus_merge_bases(struct commit_list *in);
-
+struct commit_list *repo_get_merge_bases(struct repository *r,
+					 struct commit *rev1,
+					 struct commit *rev2);
+struct commit_list *repo_get_merge_bases_many(struct repository *r,
+					      struct commit *one, int n,
+					      struct commit **twos);
 /* To be used only when object flags after this call no longer matter */
-struct commit_list *get_merge_bases_many_dirty(struct commit *one, int n, struct commit **twos);
+struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
+						    struct commit *one, int n,
+						    struct commit **twos);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define get_merge_bases(r1, r2)           repo_get_merge_bases(the_repository, r1, r2)
+#define get_merge_bases_many(one, n, two) repo_get_merge_bases_many(the_repository, one, n, two)
+#define get_merge_bases_many_dirty(one, n, twos) repo_get_merge_bases_many_dirty(the_repository, one, n, twos)
+#endif
+
+struct commit_list *get_octopus_merge_bases(struct commit_list *in);
 
 int is_descendant_of(struct commit *commit, struct commit_list *with_commit);
 int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference);
diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
index b185fe0a1d..5e037fe428 100644
--- a/contrib/coccinelle/the_repository.pending.cocci
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -64,3 +64,30 @@ expression E;
 - parse_commit(
 + repo_parse_commit(the_repository,
   E)
+
+@@
+expression E;
+expression F;
+@@
+- get_merge_bases(
++ repo_get_merge_bases(the_repository,
+  E, F);
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- get_merge_bases_many(
++ repo_get_merge_bases_many(the_repository,
+  E, F, G);
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- get_merge_bases_many_dirty(
++ repo_get_merge_bases_many_dirty(the_repository,
+  E, F, G);
+
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 14/24] commit-reach: prepare in_merge_bases[_many] to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (12 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 13/24] commit-reach: prepare get_merge_bases " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 15/24] commit: prepare get_commit_buffer " Stefan Beller
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-reach.c                                  | 15 +++++++++------
 commit-reach.h                                  | 12 ++++++++++--
 contrib/coccinelle/the_repository.pending.cocci | 16 ++++++++++++++++
 3 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 3be5526957..88a65f854f 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -312,16 +312,17 @@ int is_descendant_of(struct commit *commit, struct commit_list *with_commit)
 /*
  * Is "commit" an ancestor of one of the "references"?
  */
-int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference)
+int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
+			     int nr_reference, struct commit **reference)
 {
 	struct commit_list *bases;
 	int ret = 0, i;
 	uint32_t min_generation = GENERATION_NUMBER_INFINITY;
 
-	if (parse_commit(commit))
+	if (repo_parse_commit(r, commit))
 		return ret;
 	for (i = 0; i < nr_reference; i++) {
-		if (parse_commit(reference[i]))
+		if (repo_parse_commit(r, reference[i]))
 			return ret;
 		if (reference[i]->generation < min_generation)
 			min_generation = reference[i]->generation;
@@ -330,7 +331,7 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit *
 	if (commit->generation > min_generation)
 		return ret;
 
-	bases = paint_down_to_common(the_repository, commit,
+	bases = paint_down_to_common(r, commit,
 				     nr_reference, reference,
 				     commit->generation);
 	if (commit->object.flags & PARENT2)
@@ -344,9 +345,11 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit *
 /*
  * Is "commit" an ancestor of (i.e. reachable from) the "reference"?
  */
-int in_merge_bases(struct commit *commit, struct commit *reference)
+int repo_in_merge_bases(struct repository *r,
+			struct commit *commit,
+			struct commit *reference)
 {
-	return in_merge_bases_many(commit, 1, &reference);
+	return repo_in_merge_bases_many(r, commit, 1, &reference);
 }
 
 struct commit_list *reduce_heads(struct commit_list *heads)
diff --git a/commit-reach.h b/commit-reach.h
index c647447263..6b16f12467 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -27,8 +27,16 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 struct commit_list *get_octopus_merge_bases(struct commit_list *in);
 
 int is_descendant_of(struct commit *commit, struct commit_list *with_commit);
-int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference);
-int in_merge_bases(struct commit *commit, struct commit *reference);
+int repo_in_merge_bases(struct repository *r,
+			struct commit *commit,
+			struct commit *reference);
+int repo_in_merge_bases_many(struct repository *r,
+			     struct commit *commit,
+			     int nr_reference, struct commit **reference);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define in_merge_bases(c1, c2) repo_in_merge_bases(the_repository, c1, c2)
+#define in_merge_bases_many(c1, n, cs) repo_in_merge_bases_many(the_repository, c1, n, cs)
+#endif
 
 /*
  * Takes a list of commits and returns a new list where those
diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
index 5e037fe428..8c6a71bf64 100644
--- a/contrib/coccinelle/the_repository.pending.cocci
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -91,3 +91,19 @@ expression G;
 + repo_get_merge_bases_many_dirty(the_repository,
   E, F, G);
 
+@@
+expression E;
+expression F;
+@@
+- in_merge_bases(
++ repo_in_merge_bases(the_repository,
+  E, F);
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- in_merge_bases_many(
++ repo_in_merge_bases_many(the_repository,
+  E, F, G);
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 15/24] commit: prepare get_commit_buffer to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (13 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 14/24] commit-reach: prepare in_merge_bases[_many] " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 16/24] commit: prepare repo_unuse_commit_buffer " Stefan Beller
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit.c                                        | 8 +++++---
 commit.h                                        | 7 ++++++-
 contrib/coccinelle/the_repository.pending.cocci | 8 ++++++++
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/commit.c b/commit.c
index 7a931d7fd4..4034def16c 100644
--- a/commit.c
+++ b/commit.c
@@ -297,13 +297,15 @@ const void *get_cached_commit_buffer(struct repository *r, const struct commit *
 	return v->buffer;
 }
 
-const void *get_commit_buffer(const struct commit *commit, unsigned long *sizep)
+const void *repo_get_commit_buffer(struct repository *r,
+				   const struct commit *commit,
+				   unsigned long *sizep)
 {
-	const void *ret = get_cached_commit_buffer(the_repository, commit, sizep);
+	const void *ret = get_cached_commit_buffer(r, commit, sizep);
 	if (!ret) {
 		enum object_type type;
 		unsigned long size;
-		ret = read_object_file(&commit->object.oid, &type, &size);
+		ret = repo_read_object_file(r, &commit->object.oid, &type, &size);
 		if (!ret)
 			die("cannot read commit object %s",
 			    oid_to_hex(&commit->object.oid));
diff --git a/commit.h b/commit.h
index 08935f9a19..591a77a5bb 100644
--- a/commit.h
+++ b/commit.h
@@ -117,7 +117,12 @@ const void *get_cached_commit_buffer(struct repository *, const struct commit *,
  * from disk. The resulting memory should not be modified, and must be given
  * to unuse_commit_buffer when the caller is done.
  */
-const void *get_commit_buffer(const struct commit *, unsigned long *size);
+const void *repo_get_commit_buffer(struct repository *r,
+				   const struct commit *,
+				   unsigned long *size);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define get_commit_buffer(c, s) repo_get_commit_buffer(the_repository, c, s)
+#endif
 
 /*
  * Tell the commit subsytem that we are done with a particular commit buffer.
diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
index 8c6a71bf64..4018e6eaf7 100644
--- a/contrib/coccinelle/the_repository.pending.cocci
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -107,3 +107,11 @@ expression G;
 - in_merge_bases_many(
 + repo_in_merge_bases_many(the_repository,
   E, F, G);
+
+@@
+expression E;
+expression F;
+@@
+- get_commit_buffer(
++ repo_get_commit_buffer(the_repository,
+  E, F);
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 16/24] commit: prepare repo_unuse_commit_buffer to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (14 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 15/24] commit: prepare get_commit_buffer " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 17/24] commit: prepare logmsg_reencode " Stefan Beller
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit.c                                        | 6 ++++--
 commit.h                                        | 7 ++++++-
 contrib/coccinelle/the_repository.pending.cocci | 8 ++++++++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/commit.c b/commit.c
index 4034def16c..7d2f3a9a93 100644
--- a/commit.c
+++ b/commit.c
@@ -318,10 +318,12 @@ const void *repo_get_commit_buffer(struct repository *r,
 	return ret;
 }
 
-void unuse_commit_buffer(const struct commit *commit, const void *buffer)
+void repo_unuse_commit_buffer(struct repository *r,
+			      const struct commit *commit,
+			      const void *buffer)
 {
 	struct commit_buffer *v = buffer_slab_peek(
-		the_repository->parsed_objects->buffer_slab, commit);
+		r->parsed_objects->buffer_slab, commit);
 	if (!(v && v->buffer == buffer))
 		free((void *)buffer);
 }
diff --git a/commit.h b/commit.h
index 591a77a5bb..57375e3239 100644
--- a/commit.h
+++ b/commit.h
@@ -130,7 +130,12 @@ const void *repo_get_commit_buffer(struct repository *r,
  * from an earlier call to get_commit_buffer.  The buffer may or may not be
  * freed by this call; callers should not access the memory afterwards.
  */
-void unuse_commit_buffer(const struct commit *, const void *buffer);
+void repo_unuse_commit_buffer(struct repository *r,
+			      const struct commit *,
+			      const void *buffer);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define unuse_commit_buffer(c, b) repo_unuse_commit_buffer(the_repository, c, b)
+#endif
 
 /*
  * Free any cached object buffer associated with the commit.
diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
index 4018e6eaf7..516f19ffee 100644
--- a/contrib/coccinelle/the_repository.pending.cocci
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -115,3 +115,11 @@ expression F;
 - get_commit_buffer(
 + repo_get_commit_buffer(the_repository,
   E, F);
+
+@@
+expression E;
+expression F;
+@@
+- unuse_commit_buffer(
++ repo_unuse_commit_buffer(the_repository,
+  E, F);
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 17/24] commit: prepare logmsg_reencode to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (15 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 16/24] commit: prepare repo_unuse_commit_buffer " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 18/24] pretty: prepare format_commit_message " Stefan Beller
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller, Junio C Hamano

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit.h                                        |  8 ++++++++
 contrib/coccinelle/the_repository.pending.cocci |  9 +++++++++
 pretty.c                                        | 13 +++++++------
 3 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/commit.h b/commit.h
index 57375e3239..2e6b799b26 100644
--- a/commit.h
+++ b/commit.h
@@ -180,6 +180,14 @@ extern int has_non_ascii(const char *text);
 extern const char *logmsg_reencode(const struct commit *commit,
 				   char **commit_encoding,
 				   const char *output_encoding);
+const char *repo_logmsg_reencode(struct repository *r,
+				 const struct commit *commit,
+				 char **commit_encoding,
+				 const char *output_encoding);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define logmsg_reencode(c, enc, out) repo_logmsg_reencode(the_repository, c, enc, out)
+#endif
+
 extern const char *skip_blank_lines(const char *msg);
 
 /** Removes the first commit from a list sorted by date, and adds all
diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
index 516f19ffee..f5b42cfc62 100644
--- a/contrib/coccinelle/the_repository.pending.cocci
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -123,3 +123,12 @@ expression F;
 - unuse_commit_buffer(
 + repo_unuse_commit_buffer(the_repository,
   E, F);
+
+@@
+expression E;
+expression F;
+expression G;
+@@
+- logmsg_reencode(
++ repo_logmsg_reencode(the_repository,
+  E, F, G);
diff --git a/pretty.c b/pretty.c
index 8ca29e9281..b359b68750 100644
--- a/pretty.c
+++ b/pretty.c
@@ -595,14 +595,15 @@ static char *replace_encoding_header(char *buf, const char *encoding)
 	return strbuf_detach(&tmp, NULL);
 }
 
-const char *logmsg_reencode(const struct commit *commit,
-			    char **commit_encoding,
-			    const char *output_encoding)
+const char *repo_logmsg_reencode(struct repository *r,
+				 const struct commit *commit,
+				 char **commit_encoding,
+				 const char *output_encoding)
 {
 	static const char *utf8 = "UTF-8";
 	const char *use_encoding;
 	char *encoding;
-	const char *msg = get_commit_buffer(commit, NULL);
+	const char *msg = repo_get_commit_buffer(r, commit, NULL);
 	char *out;
 
 	if (!output_encoding || !*output_encoding) {
@@ -630,7 +631,7 @@ const char *logmsg_reencode(const struct commit *commit,
 		 * the cached copy from get_commit_buffer, we need to duplicate it
 		 * to avoid munging the cached copy.
 		 */
-		if (msg == get_cached_commit_buffer(the_repository, commit, NULL))
+		if (msg == get_cached_commit_buffer(r, commit, NULL))
 			out = xstrdup(msg);
 		else
 			out = (char *)msg;
@@ -644,7 +645,7 @@ const char *logmsg_reencode(const struct commit *commit,
 		 */
 		out = reencode_string(msg, output_encoding, use_encoding);
 		if (out)
-			unuse_commit_buffer(commit, msg);
+			repo_unuse_commit_buffer(r, commit, msg);
 	}
 
 	/*
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 18/24] pretty: prepare format_commit_message to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (16 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 17/24] commit: prepare logmsg_reencode " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 19/24] submodule: use submodule repos for object lookup Stefan Beller
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller, Junio C Hamano

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/coccinelle/the_repository.pending.cocci | 10 ++++++++++
 pretty.c                                        | 15 ++++++++-------
 pretty.h                                        |  7 ++++++-
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
index f5b42cfc62..2ee702ecf7 100644
--- a/contrib/coccinelle/the_repository.pending.cocci
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -132,3 +132,13 @@ expression G;
 - logmsg_reencode(
 + repo_logmsg_reencode(the_repository,
   E, F, G);
+
+@@
+expression E;
+expression F;
+expression G;
+expression H;
+@@
+- format_commit_message(
++ repo_format_commit_message(the_repository,
+  E, F, G, H);
diff --git a/pretty.c b/pretty.c
index b359b68750..3240495308 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1508,9 +1508,10 @@ void userformat_find_requirements(const char *fmt, struct userformat_want *w)
 	strbuf_release(&dummy);
 }
 
-void format_commit_message(const struct commit *commit,
-			   const char *format, struct strbuf *sb,
-			   const struct pretty_print_context *pretty_ctx)
+void repo_format_commit_message(struct repository *r,
+				const struct commit *commit,
+				const char *format, struct strbuf *sb,
+				const struct pretty_print_context *pretty_ctx)
 {
 	struct format_commit_context context;
 	const char *output_enc = pretty_ctx->output_encoding;
@@ -1524,9 +1525,9 @@ void format_commit_message(const struct commit *commit,
 	 * convert a commit message to UTF-8 first
 	 * as far as 'format_commit_item' assumes it in UTF-8
 	 */
-	context.message = logmsg_reencode(commit,
-					  &context.commit_encoding,
-					  utf8);
+	context.message = repo_logmsg_reencode(r, commit,
+					       &context.commit_encoding,
+					       utf8);
 
 	strbuf_expand(sb, format, format_commit_item, &context);
 	rewrap_message_tail(sb, &context, 0, 0, 0);
@@ -1550,7 +1551,7 @@ void format_commit_message(const struct commit *commit,
 	}
 
 	free(context.commit_encoding);
-	unuse_commit_buffer(commit, context.message);
+	repo_unuse_commit_buffer(r, commit, context.message);
 }
 
 static void pp_header(struct pretty_print_context *pp,
diff --git a/pretty.h b/pretty.h
index 7359d318a9..e6625269cf 100644
--- a/pretty.h
+++ b/pretty.h
@@ -103,9 +103,14 @@ void pp_remainder(struct pretty_print_context *pp, const char **msg_p,
  * Put the result to "sb".
  * Please use this function for custom formats.
  */
-void format_commit_message(const struct commit *commit,
+void repo_format_commit_message(struct repository *r,
+			const struct commit *commit,
 			const char *format, struct strbuf *sb,
 			const struct pretty_print_context *context);
+#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
+#define format_commit_message(c, f, s, con) \
+	repo_format_commit_message(the_repository, c, f, s, con)
+#endif
 
 /*
  * Parse given arguments from "arg", check it for correctness and
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 19/24] submodule: use submodule repos for object lookup
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (17 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 18/24] pretty: prepare format_commit_message " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-11-02 13:03   ` Derrick Stolee
  2018-10-30 22:08 ` [PATCH 20/24] submodule: don't add submodule as odb for push Stefan Beller
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

This converts the 'show_submodule_header' function to use
the repository API properly, such that the submodule objects
are not added to the main object store.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 76 ++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 61 insertions(+), 15 deletions(-)

diff --git a/submodule.c b/submodule.c
index d9d3046689..0fdda45252 100644
--- a/submodule.c
+++ b/submodule.c
@@ -443,7 +443,7 @@ static int prepare_submodule_summary(struct rev_info *rev, const char *path,
 	return prepare_revision_walk(rev);
 }
 
-static void print_submodule_summary(struct rev_info *rev, struct diff_options *o)
+static void print_submodule_summary(struct repository *r, struct rev_info *rev, struct diff_options *o)
 {
 	static const char format[] = "  %m %s";
 	struct strbuf sb = STRBUF_INIT;
@@ -454,7 +454,8 @@ static void print_submodule_summary(struct rev_info *rev, struct diff_options *o
 		ctx.date_mode = rev->date_mode;
 		ctx.output_encoding = get_log_output_encoding();
 		strbuf_setlen(&sb, 0);
-		format_commit_message(commit, format, &sb, &ctx);
+		repo_format_commit_message(r, commit, format, &sb,
+				      &ctx);
 		strbuf_addch(&sb, '\n');
 		if (commit->object.flags & SYMMETRIC_LEFT)
 			diff_emit_submodule_del(o, sb.buf);
@@ -481,14 +482,47 @@ void prepare_submodule_repo_env(struct argv_array *out)
 			 DEFAULT_GIT_DIR_ENVIRONMENT);
 }
 
-/* Helper function to display the submodule header line prior to the full
- * summary output. If it can locate the submodule objects directory it will
- * attempt to lookup both the left and right commits and put them into the
- * left and right pointers.
+/*
+ * Initialize 'out' based on the provided submodule path.
+ *
+ * Unlike repo_submodule_init, this tolerates submodules not present
+ * in .gitmodules. This function exists only to preserve historical behavior,
+ *
+ * Returns 0 on success, -1 when the submodule is not present.
  */
-static void show_submodule_header(struct diff_options *o, const char *path,
+static struct repository *open_submodule(const char *path)
+{
+	struct strbuf sb = STRBUF_INIT;
+	struct repository *out = xmalloc(sizeof(*out));
+
+	if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) {
+		strbuf_release(&sb);
+		free(out);
+		return NULL;
+	}
+
+	out->submodule_prefix = xstrfmt("%s%s/",
+					the_repository->submodule_prefix ?
+					the_repository->submodule_prefix :
+					"", path);
+
+	strbuf_release(&sb);
+	return out;
+}
+
+/*
+ * Helper function to display the submodule header line prior to the full
+ * summary output.
+ *
+ * If it can locate the submodule git directory it will create a repository
+ * handle for the submodule and lookup both the left and right commits and
+ * put them into the left and right pointers.
+ */
+static void show_submodule_header(struct diff_options *o,
+		const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule,
+		struct repository *sub,
 		struct commit **left, struct commit **right,
 		struct commit_list **merge_bases)
 {
@@ -507,7 +541,7 @@ static void show_submodule_header(struct diff_options *o, const char *path,
 	else if (is_null_oid(two))
 		message = "(submodule deleted)";
 
-	if (add_submodule_odb(path)) {
+	if (!sub) {
 		if (!message)
 			message = "(commits not present)";
 		goto output_header;
@@ -517,8 +551,8 @@ static void show_submodule_header(struct diff_options *o, const char *path,
 	 * Attempt to lookup the commit references, and determine if this is
 	 * a fast forward or fast backwards update.
 	 */
-	*left = lookup_commit_reference(the_repository, one);
-	*right = lookup_commit_reference(the_repository, two);
+	*left = lookup_commit_reference(sub, one);
+	*right = lookup_commit_reference(sub, two);
 
 	/*
 	 * Warn about missing commits in the submodule project, but only if
@@ -528,7 +562,7 @@ static void show_submodule_header(struct diff_options *o, const char *path,
 	     (!is_null_oid(two) && !*right))
 		message = "(commits not present)";
 
-	*merge_bases = get_merge_bases(*left, *right);
+	*merge_bases = repo_get_merge_bases(sub, *left, *right);
 	if (*merge_bases) {
 		if ((*merge_bases)->item == *left)
 			fast_forward = 1;
@@ -562,16 +596,18 @@ void show_submodule_summary(struct diff_options *o, const char *path,
 	struct rev_info rev;
 	struct commit *left = NULL, *right = NULL;
 	struct commit_list *merge_bases = NULL;
+	struct repository *sub;
 
+	sub = open_submodule(path);
 	show_submodule_header(o, path, one, two, dirty_submodule,
-			      &left, &right, &merge_bases);
+			      sub, &left, &right, &merge_bases);
 
 	/*
 	 * If we don't have both a left and a right pointer, there is no
 	 * reason to try and display a summary. The header line should contain
 	 * all the information the user needs.
 	 */
-	if (!left || !right)
+	if (!left || !right || !sub)
 		goto out;
 
 	/* Treat revision walker failure the same as missing commits */
@@ -580,13 +616,17 @@ void show_submodule_summary(struct diff_options *o, const char *path,
 		goto out;
 	}
 
-	print_submodule_summary(&rev, o);
+	print_submodule_summary(sub, &rev, o);
 
 out:
 	if (merge_bases)
 		free_commit_list(merge_bases);
 	clear_commit_marks(left, ~0);
 	clear_commit_marks(right, ~0);
+	if (sub) {
+		repo_clear(sub);
+		free(sub);
+	}
 }
 
 void show_submodule_inline_diff(struct diff_options *o, const char *path,
@@ -598,9 +638,11 @@ void show_submodule_inline_diff(struct diff_options *o, const char *path,
 	struct commit_list *merge_bases = NULL;
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf sb = STRBUF_INIT;
+	struct repository *sub;
 
+	sub = open_submodule(path);
 	show_submodule_header(o, path, one, two, dirty_submodule,
-			      &left, &right, &merge_bases);
+			      sub, &left, &right, &merge_bases);
 
 	/* We need a valid left and right commit to display a difference */
 	if (!(left || is_null_oid(one)) ||
@@ -661,6 +703,10 @@ void show_submodule_inline_diff(struct diff_options *o, const char *path,
 		clear_commit_marks(left, ~0);
 	if (right)
 		clear_commit_marks(right, ~0);
+	if (sub) {
+		repo_clear(sub);
+		free(sub);
+	}
 }
 
 int should_update_submodules(void)
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 20/24] submodule: don't add submodule as odb for push
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (18 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 19/24] submodule: use submodule repos for object lookup Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 21/24] commit-graph: convert remaining function to handle arbitrary repositories Stefan Beller
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

In push_submodule(), because we do not actually need access to objects
in the submodule, do not invoke add_submodule_odb().
(for_each_remote_ref_submodule() does not require access to those
objects, and the actual push is done by spawning another process,
which handles object access by itself.)

This code of push_submodule() is exercised in t5531 and continues
to work, showing that the submodule odbc is not needed.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index 0fdda45252..da1ac1e6f8 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1024,9 +1024,6 @@ static int push_submodule(const char *path,
 			  const struct string_list *push_options,
 			  int dry_run)
 {
-	if (add_submodule_odb(path))
-		return 1;
-
 	if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
 		struct child_process cp = CHILD_PROCESS_INIT;
 		argv_array_push(&cp.args, "push");
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 21/24] commit-graph: convert remaining function to handle arbitrary repositories
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (19 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 20/24] submodule: don't add submodule as odb for push Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 22/24] commit: make free_commit_buffer and release_commit_memory repository agnostic Stefan Beller
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Convert all functions to handle arbitrary repositories in commit-graph.c
that are used by functions taking a repository argument already.

Notable exclusion is write_commit_graph and its local functions as that
only works on the_repository.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 commit-graph.c | 40 ++++++++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 40c855f185..f78a8e96b5 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -292,7 +292,8 @@ static int bsearch_graph(struct commit_graph *g, struct object_id *oid, uint32_t
 			    g->chunk_oid_lookup, g->hash_len, pos);
 }
 
-static struct commit_list **insert_parent_or_die(struct commit_graph *g,
+static struct commit_list **insert_parent_or_die(struct repository *r,
+						 struct commit_graph *g,
 						 uint64_t pos,
 						 struct commit_list **pptr)
 {
@@ -303,7 +304,7 @@ static struct commit_list **insert_parent_or_die(struct commit_graph *g,
 		die("invalid parent position %"PRIu64, pos);
 
 	hashcpy(oid.hash, g->chunk_oid_lookup + g->hash_len * pos);
-	c = lookup_commit(the_repository, &oid);
+	c = lookup_commit(r, &oid);
 	if (!c)
 		die(_("could not find commit %s"), oid_to_hex(&oid));
 	c->graph_pos = pos;
@@ -317,7 +318,9 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
 	item->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
 }
 
-static int fill_commit_in_graph(struct commit *item, struct commit_graph *g, uint32_t pos)
+static int fill_commit_in_graph(struct repository *r,
+				struct commit *item,
+				struct commit_graph *g, uint32_t pos)
 {
 	uint32_t edge_value;
 	uint32_t *parent_data_ptr;
@@ -341,13 +344,13 @@ static int fill_commit_in_graph(struct commit *item, struct commit_graph *g, uin
 	edge_value = get_be32(commit_data + g->hash_len);
 	if (edge_value == GRAPH_PARENT_NONE)
 		return 1;
-	pptr = insert_parent_or_die(g, edge_value, pptr);
+	pptr = insert_parent_or_die(r, g, edge_value, pptr);
 
 	edge_value = get_be32(commit_data + g->hash_len + 4);
 	if (edge_value == GRAPH_PARENT_NONE)
 		return 1;
 	if (!(edge_value & GRAPH_OCTOPUS_EDGES_NEEDED)) {
-		pptr = insert_parent_or_die(g, edge_value, pptr);
+		pptr = insert_parent_or_die(r, g, edge_value, pptr);
 		return 1;
 	}
 
@@ -355,7 +358,7 @@ static int fill_commit_in_graph(struct commit *item, struct commit_graph *g, uin
 			  4 * (uint64_t)(edge_value & GRAPH_EDGE_LAST_MASK));
 	do {
 		edge_value = get_be32(parent_data_ptr);
-		pptr = insert_parent_or_die(g,
+		pptr = insert_parent_or_die(r, g,
 					    edge_value & GRAPH_EDGE_LAST_MASK,
 					    pptr);
 		parent_data_ptr++;
@@ -374,7 +377,9 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
 	}
 }
 
-static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
+static int parse_commit_in_graph_one(struct repository *r,
+				     struct commit_graph *g,
+				     struct commit *item)
 {
 	uint32_t pos;
 
@@ -382,7 +387,7 @@ static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item
 		return 1;
 
 	if (find_commit_in_graph(item, g, &pos))
-		return fill_commit_in_graph(item, g, pos);
+		return fill_commit_in_graph(r, item, g, pos);
 
 	return 0;
 }
@@ -391,7 +396,7 @@ int parse_commit_in_graph(struct repository *r, struct commit *item)
 {
 	if (!prepare_commit_graph(r))
 		return 0;
-	return parse_commit_in_graph_one(r->objects->commit_graph, item);
+	return parse_commit_in_graph_one(r, r->objects->commit_graph, item);
 }
 
 void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -403,19 +408,22 @@ void load_commit_graph_info(struct repository *r, struct commit *item)
 		fill_commit_graph_info(item, r->objects->commit_graph, pos);
 }
 
-static struct tree *load_tree_for_commit(struct commit_graph *g, struct commit *c)
+static struct tree *load_tree_for_commit(struct repository *r,
+					 struct commit_graph *g,
+					 struct commit *c)
 {
 	struct object_id oid;
 	const unsigned char *commit_data = g->chunk_commit_data +
 					   GRAPH_DATA_WIDTH * (c->graph_pos);
 
 	hashcpy(oid.hash, commit_data);
-	c->maybe_tree = lookup_tree(the_repository, &oid);
+	c->maybe_tree = lookup_tree(r, &oid);
 
 	return c->maybe_tree;
 }
 
-static struct tree *get_commit_tree_in_graph_one(struct commit_graph *g,
+static struct tree *get_commit_tree_in_graph_one(struct repository *r,
+						 struct commit_graph *g,
 						 const struct commit *c)
 {
 	if (c->maybe_tree)
@@ -423,12 +431,12 @@ static struct tree *get_commit_tree_in_graph_one(struct commit_graph *g,
 	if (c->graph_pos == COMMIT_NOT_FROM_GRAPH)
 		BUG("get_commit_tree_in_graph_one called from non-commit-graph commit");
 
-	return load_tree_for_commit(g, (struct commit *)c);
+	return load_tree_for_commit(r, g, (struct commit *)c);
 }
 
 struct tree *get_commit_tree_in_graph(struct repository *r, const struct commit *c)
 {
-	return get_commit_tree_in_graph_one(r->objects->commit_graph, c);
+	return get_commit_tree_in_graph_one(r, r->objects->commit_graph, c);
 }
 
 static void write_graph_chunk_fanout(struct hashfile *f,
@@ -1025,7 +1033,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 		}
 
 		graph_commit = lookup_commit(r, &cur_oid);
-		if (!parse_commit_in_graph_one(g, graph_commit))
+		if (!parse_commit_in_graph_one(r, g, graph_commit))
 			graph_report("failed to parse %s from commit-graph",
 				     oid_to_hex(&cur_oid));
 	}
@@ -1061,7 +1069,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 			continue;
 		}
 
-		if (!oideq(&get_commit_tree_in_graph_one(g, graph_commit)->object.oid,
+		if (!oideq(&get_commit_tree_in_graph_one(r, g, graph_commit)->object.oid,
 			   get_commit_tree_oid(odb_commit)))
 			graph_report("root tree OID for commit %s in commit-graph is %s != %s",
 				     oid_to_hex(&cur_oid),
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 22/24] commit: make free_commit_buffer and release_commit_memory repository agnostic
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (20 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 21/24] commit-graph: convert remaining function to handle arbitrary repositories Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 23/24] path.h: make REPO_GIT_PATH_FUNC " Stefan Beller
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

Pass the object pool to free_commit_buffer and release_commit_memory,
such that we can eliminate access to 'the_repository'.

Also remove the TODO in release_commit_memory, as commit->util was
removed in 9d2c97016f (commit.h: delete 'util' field in struct commit,
2018-05-19)

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/fsck.c     | 3 ++-
 builtin/log.c      | 6 ++++--
 builtin/rev-list.c | 3 ++-
 commit.c           | 9 ++++-----
 commit.h           | 4 ++--
 object.c           | 2 +-
 6 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 06eb421720..c476ac6983 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -382,7 +382,8 @@ static int fsck_obj(struct object *obj, void *buffer, unsigned long size)
 	if (obj->type == OBJ_TREE)
 		free_tree_buffer((struct tree *)obj);
 	if (obj->type == OBJ_COMMIT)
-		free_commit_buffer((struct commit *)obj);
+		free_commit_buffer(the_repository->parsed_objects,
+				   (struct commit *)obj);
 	return err;
 }
 
diff --git a/builtin/log.c b/builtin/log.c
index 061d4fd864..64c2649c7c 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -395,7 +395,8 @@ static int cmd_log_walk(struct rev_info *rev)
 			 * We may show a given commit multiple times when
 			 * walking the reflogs.
 			 */
-			free_commit_buffer(commit);
+			free_commit_buffer(the_repository->parsed_objects,
+					   commit);
 			free_commit_list(commit->parents);
 			commit->parents = NULL;
 		}
@@ -1922,7 +1923,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		    open_next_file(rev.numbered_files ? NULL : commit, NULL, &rev, quiet))
 			die(_("Failed to create output files"));
 		shown = log_tree_commit(&rev, commit);
-		free_commit_buffer(commit);
+		free_commit_buffer(the_repository->parsed_objects,
+				   commit);
 
 		/* We put one extra blank line between formatted
 		 * patches and this flag is used by log-tree code
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 5064d08e1b..a8867f0c4b 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -197,7 +197,8 @@ static void finish_commit(struct commit *commit, void *data)
 		free_commit_list(commit->parents);
 		commit->parents = NULL;
 	}
-	free_commit_buffer(commit);
+	free_commit_buffer(the_repository->parsed_objects,
+			   commit);
 }
 
 static inline void finish_object__ma(struct object *obj)
diff --git a/commit.c b/commit.c
index 7d2f3a9a93..4fe74aa4bc 100644
--- a/commit.c
+++ b/commit.c
@@ -328,10 +328,10 @@ void repo_unuse_commit_buffer(struct repository *r,
 		free((void *)buffer);
 }
 
-void free_commit_buffer(struct commit *commit)
+void free_commit_buffer(struct parsed_object_pool *pool, struct commit *commit)
 {
 	struct commit_buffer *v = buffer_slab_peek(
-		the_repository->parsed_objects->buffer_slab, commit);
+		pool->buffer_slab, commit);
 	if (v) {
 		FREE_AND_NULL(v->buffer);
 		v->size = 0;
@@ -354,13 +354,12 @@ struct object_id *get_commit_tree_oid(const struct commit *commit)
 	return &get_commit_tree(commit)->object.oid;
 }
 
-void release_commit_memory(struct commit *c)
+void release_commit_memory(struct parsed_object_pool *pool, struct commit *c)
 {
 	c->maybe_tree = NULL;
 	c->index = 0;
-	free_commit_buffer(c);
+	free_commit_buffer(pool, c);
 	free_commit_list(c->parents);
-	/* TODO: what about commit->util? */
 
 	c->object.parsed = 0;
 }
diff --git a/commit.h b/commit.h
index 2e6b799b26..d2779a23f6 100644
--- a/commit.h
+++ b/commit.h
@@ -140,7 +140,7 @@ void repo_unuse_commit_buffer(struct repository *r,
 /*
  * Free any cached object buffer associated with the commit.
  */
-void free_commit_buffer(struct commit *);
+void free_commit_buffer(struct parsed_object_pool *pool, struct commit *);
 
 struct tree *get_commit_tree(const struct commit *);
 struct object_id *get_commit_tree_oid(const struct commit *);
@@ -149,7 +149,7 @@ struct object_id *get_commit_tree_oid(const struct commit *);
  * Release memory related to a commit, including the parent list and
  * any cached object buffer.
  */
-void release_commit_memory(struct commit *c);
+void release_commit_memory(struct parsed_object_pool *pool, struct commit *c);
 
 /*
  * Disassociate any cached object buffer from the commit, but do not free it.
diff --git a/object.c b/object.c
index 003f870484..c4170d2d0c 100644
--- a/object.c
+++ b/object.c
@@ -540,7 +540,7 @@ void parsed_object_pool_clear(struct parsed_object_pool *o)
 		if (obj->type == OBJ_TREE)
 			free_tree_buffer((struct tree*)obj);
 		else if (obj->type == OBJ_COMMIT)
-			release_commit_memory((struct commit*)obj);
+			release_commit_memory(o, (struct commit*)obj);
 		else if (obj->type == OBJ_TAG)
 			release_tag_memory((struct tag*)obj);
 	}
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 23/24] path.h: make REPO_GIT_PATH_FUNC repository agnostic
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (21 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 22/24] commit: make free_commit_buffer and release_commit_memory repository agnostic Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-30 22:08 ` [PATCH 24/24] t/helper/test-repository: celebrate independence from the_repository Stefan Beller
  2018-10-31  6:41 ` [PATCHv2 00/24] Bring more repository handles into our code base] Junio C Hamano
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

git_pathdup uses the_repository internally, but the macro
REPO_GIT_PATH_FUNC is specifically made for arbitrary repositories.
Switch to repo_git_path which works on arbitrary repositories.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 path.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/path.h b/path.h
index b654ea8ff5..651e6157fc 100644
--- a/path.h
+++ b/path.h
@@ -165,7 +165,7 @@ extern void report_linked_checkout_garbage(void);
 	const char *git_path_##var(struct repository *r) \
 	{ \
 		if (!r->cached_paths.var) \
-			r->cached_paths.var = git_pathdup(filename); \
+			r->cached_paths.var = repo_git_path(r, filename); \
 		return r->cached_paths.var; \
 	}
 
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 24/24] t/helper/test-repository: celebrate independence from the_repository
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (22 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 23/24] path.h: make REPO_GIT_PATH_FUNC " Stefan Beller
@ 2018-10-30 22:08 ` Stefan Beller
  2018-10-31  6:41 ` [PATCHv2 00/24] Bring more repository handles into our code base] Junio C Hamano
  24 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-10-30 22:08 UTC (permalink / raw)
  To: git; +Cc: szeder.dev, jonathantanmy, Stefan Beller

dade47c06c (commit-graph: add repo arg to graph readers, 2018-07-11)
brought more independence from the_repository to the commit graph, however
it was not completely independent of the_repository, as the previous
patches show.

To ensure we're not accessing the_repository by accident, we'd ideally
assign NULL to the_repository to trigger a segfault on access.

We currently have a temporary hack in cache.h, which relies on
the_hash_algo (which is a short form of the_repository->hash_algo) to
be set, so we cannot do that. The next best thing is to set all fields of
the_repository to 0, so any accidental access is more likely to be found.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 cache.h                    |  2 ++
 t/helper/test-repository.c | 10 ++++++++++
 2 files changed, 12 insertions(+)

diff --git a/cache.h b/cache.h
index f7fabdde8f..9f535040af 100644
--- a/cache.h
+++ b/cache.h
@@ -1035,6 +1035,8 @@ static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
 	 *
 	 * This will need to be extended or ripped out when we learn about
 	 * hashes of different sizes.
+	 *
+	 * When ripping this out, see TODO in test-repository.c.
 	 */
 	if (the_hash_algo->rawsz != 20)
 		BUG("hash size not yet supported by hashcmp");
diff --git a/t/helper/test-repository.c b/t/helper/test-repository.c
index 6a84a53efb..f7f8618445 100644
--- a/t/helper/test-repository.c
+++ b/t/helper/test-repository.c
@@ -17,6 +17,11 @@ static void test_parse_commit_in_graph(const char *gitdir, const char *worktree,
 
 	setup_git_env(gitdir);
 
+	memset(the_repository, 0, sizeof(*the_repository));
+
+	/* TODO: Needed for temporary hack in hashcmp, see 183a638b7da. */
+	repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	if (repo_init(&r, gitdir, worktree))
 		die("Couldn't init repo");
 
@@ -43,6 +48,11 @@ static void test_get_commit_tree_in_graph(const char *gitdir,
 
 	setup_git_env(gitdir);
 
+	memset(the_repository, 0, sizeof(*the_repository));
+
+	/* TODO: Needed for temporary hack in hashcmp, see 183a638b7da. */
+	repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	if (repo_init(&r, gitdir, worktree))
 		die("Couldn't init repo");
 
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCHv2 00/24] Bring more repository handles into our code base]
  2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
                   ` (23 preceding siblings ...)
  2018-10-30 22:08 ` [PATCH 24/24] t/helper/test-repository: celebrate independence from the_repository Stefan Beller
@ 2018-10-31  6:41 ` Junio C Hamano
  2018-11-01 19:45   ` Stefan Beller
  24 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2018-10-31  6:41 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, szeder.dev, jonathantanmy

Stefan Beller <sbeller@google.com> writes:

> I also picked up the patch for pending semantic patches, as the
> first patch, can I have your sign off, please?

I find this step quite lacking.  

What was posted would have been perfectly fine as a "how about doing
it this way" weatherbaloon patch, but as a part of real series, it
needs to explain to the developers what the distinctions between two
classes are, and who is to use the cocci patch at what point in the
development cycle, etc., in an accompanying documentation update.

So can we have both sign-off and write-up (perhaps cut&paste from
the original e-mail discussion)?

> Stefan Beller (23):
>   sha1_file: allow read_object to read objects in arbitrary repositories
>   packfile: allow has_packed_and_bad to handle arbitrary repositories
>   object-store: allow read_object_file_extended to read from arbitrary
>     repositories
>   object-store: prepare read_object_file to deal with arbitrary
>     repositories
>   object-store: prepare has_{sha1, object}_file[_with_flags] to handle
>     arbitrary repositories
>   object: parse_object to honor its repository argument
>   commit: allow parse_commit* to handle arbitrary repositories
>   commit-reach.c: allow paint_down_to_common to handle arbitrary
>     repositories
>   commit-reach.c: allow merge_bases_many to handle arbitrary
>     repositories
>   commit-reach.c: allow remove_redundant to handle arbitrary
>     repositories
>   commit-reach.c: allow get_merge_bases_many_0 to handle arbitrary
>     repositories
>   commit-reach: prepare get_merge_bases to handle arbitrary repositories
>   commit-reach: prepare in_merge_bases[_many] to handle arbitrary
>     repositories
>   commit: prepare get_commit_buffer to handle arbitrary repositories
>   commit: prepare repo_unuse_commit_buffer to handle arbitrary
>     repositories
>   commit: prepare logmsg_reencode to handle arbitrary repositories
>   pretty: prepare format_commit_message to handle arbitrary repositories
>   submodule: use submodule repos for object lookup
>   submodule: don't add submodule as odb for push
>   commit-graph: convert remaining function to handle arbitrary
>     repositories
>   commit: make free_commit_buffer and release_commit_memory repository
>     agnostic
>   path.h: make REPO_GIT_PATH_FUNC repository agnostic
>   t/helper/test-repository: celebrate independence from the_repository

It seems that this topic is full of commits with overly long title.

>
>  Makefile                                      |   6 +-
>  builtin/fsck.c                                |   3 +-
>  builtin/log.c                                 |   6 +-
>  builtin/rev-list.c                            |   3 +-
>  cache.h                                       |   2 +
>  commit-graph.c                                |  40 +++--
>  commit-reach.c                                |  73 +++++----
>  commit-reach.h                                |  38 +++--
>  commit.c                                      |  41 ++---
>  commit.h                                      |  43 +++++-
>  .../coccinelle/the_repository.pending.cocci   | 144 ++++++++++++++++++
>  object-store.h                                |  35 ++++-
>  object.c                                      |   8 +-
>  packfile.c                                    |   5 +-
>  packfile.h                                    |   2 +-
>  path.h                                        |   2 +-
>  pretty.c                                      |  28 ++--
>  pretty.h                                      |   7 +-
>  sha1-file.c                                   |  34 +++--
>  streaming.c                                   |   2 +-
>  submodule.c                                   |  79 +++++++---
>  t/helper/test-repository.c                    |  10 ++
>  22 files changed, 459 insertions(+), 152 deletions(-)
>  create mode 100644 contrib/coccinelle/the_repository.pending.cocci
>
> git range-diff origin/sb/more-repo-in-api...
> [...] # rebased to origin/master

I offhand do not recall what happened during these 100+ pacthes.
DId we acquire something significantly new that would have
conflicted with this new round of the topic?  I do not mind at all
seeing that a series gets adjusted to the updated codebase, but I do
dislike seeing it done without explanation---an unexplained rebase
to a new base is a wasted opportunity to learn what areas of the
codebase are so hot that multiple and separate topics would want to
touch them.

>   -:  ---------- > 116:  4ede3d42df Seventh batch for 2.20
>   -:  ---------- > 117:  b1de196491 Makefile: add pending semantic patches
>   1:  1b9b5c695e = 118:  f1be5eb487 sha1_file: allow read_object to read objects in arbitrary repositories
>   2:  33b94066f2 = 119:  9100a5705d packfile: allow has_packed_and_bad to handle arbitrary repositories
>   3:  5217b6b1e1 = 120:  a4ad7791da object-store: allow read_object_file_extended to read from arbitrary repositories
>   4:  2b7239b55b ! 121:  5d7b3a6dd9 object-store: prepare read_object_file to deal with arbitrary repositories
>     @@ -19,10 +19,10 @@
>          Signed-off-by: Stefan Beller <sbeller@google.com>
>          Signed-off-by: Junio C Hamano <gitster@pobox.com>
>      
>     - diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci
>     + diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
>       new file mode 100644
>       --- /dev/null
>     - +++ b/contrib/coccinelle/the_repository.cocci
>     + +++ b/contrib/coccinelle/the_repository.pending.cocci
>      @@
>      +// This file is used for the ongoing refactoring of
>      +// bringing the index or repository struct in all of
>     @@ -36,6 +36,7 @@
>      +- read_object_file(
>      ++ repo_read_object_file(the_repository,
>      +  E, F, G)
>     ++

Is it necessary for this file to end with a trailing blank line?



Thanks.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/24] Makefile: add pending semantic patches
  2018-10-30 22:07 ` [PATCH 01/24] Makefile: add pending semantic patches Stefan Beller
@ 2018-10-31  6:42   ` Junio C Hamano
  2018-11-08 20:52   ` [PATCH] " Stefan Beller
  1 sibling, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2018-10-31  6:42 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, szeder.dev, jonathantanmy

Stefan Beller <sbeller@google.com> writes:

> [Missing: SZEDERs sign off, so I also do not sign off]

At least to me, based on my reading of DCO in
Documentation/SubmittingPatches, this reasoning does not make much
sense.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv2 00/24] Bring more repository handles into our code base]
  2018-10-31  6:41 ` [PATCHv2 00/24] Bring more repository handles into our code base] Junio C Hamano
@ 2018-11-01 19:45   ` Stefan Beller
  2018-11-02  2:00     ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Stefan Beller @ 2018-11-01 19:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, SZEDER Gábor, Jonathan Tan

On Tue, Oct 30, 2018 at 11:41 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Stefan Beller <sbeller@google.com> writes:
>
> > I also picked up the patch for pending semantic patches, as the
> > first patch, can I have your sign off, please?
>
> I find this step quite lacking.
>
> What was posted would have been perfectly fine as a "how about doing
> it this way" weatherbaloon patch, but as a part of real series, it
> needs to explain to the developers what the distinctions between two
> classes are, and who is to use the cocci patch at what point in the
> development cycle, etc., in an accompanying documentation update.

if only we had documentation [as found via "git grep coccicheck"]
that I could update ... I'd totally do that.
I guess this asks for documentation to begin with, now?

> So can we have both sign-off and write-up (perhaps cut&paste from
> the original e-mail discussion)?

I'll see where to put the docs; I assumed commit messages are enough.
63f0a758a0 (add coccicheck make target, 2016-09-15)
is what I found nice.


> >   t/helper/test-repository: celebrate independence from the_repository
>
> It seems that this topic is full of commits with overly long title.

yep.
> > git range-diff origin/sb/more-repo-in-api...
> > [...] # rebased to origin/master
>
> I offhand do not recall what happened during these 100+ pacthes.
> DId we acquire something significantly new that would have
> conflicted with this new round of the topic?  I do not mind at all
> seeing that a series gets adjusted to the updated codebase, but I do
> dislike seeing it done without explanation---an unexplained rebase
> to a new base is a wasted opportunity to learn what areas of the
> codebase are so hot that multiple and separate topics would want to
> touch them.

From the point of view of these large scale refactorings,
all of the code is hot, e.g. the patch that was present in the RFC
"apply all semantic patches" would clash with nearly any topic.

As I do not carry that patch any more, I do not recall any conflicts
that needed to be resolved.

Thanks.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv2 00/24] Bring more repository handles into our code base]
  2018-11-01 19:45   ` Stefan Beller
@ 2018-11-02  2:00     ` Junio C Hamano
  0 siblings, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2018-11-02  2:00 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, SZEDER Gábor, Jonathan Tan

Stefan Beller <sbeller@google.com> writes:

>> What was posted would have been perfectly fine as a "how about doing
>> it this way" weatherbaloon patch, but as a part of real series, it
>> needs to explain to the developers what the distinctions between two
>> classes are, and who is to use the cocci patch at what point in the
>> development cycle, etc., in an accompanying documentation update.
>
> if only we had documentation [as found via "git grep coccicheck"]
> that I could update ... I'd totally do that.
> I guess this asks for documentation to begin with, now?

So far, we didn't even need any, as there was no "workflow" to speak
of.  It's just "any random developer finds a suggested update,
either by running 'make coccicheck' oneself or by peeking Travis
log, and sends in a patch".

But the "pending" thing has a lot more workflow elements, who is
responsible for noticing update suggested by "pending" ones, for
updating the code, and for promoting "pending" to the normal.  These
are all new, and these are all needed as ongoing basis to help
developers---I'd say the way Documentation/SubmittingPatches helps
developers is very close to the way this new document would help them.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 19/24] submodule: use submodule repos for object lookup
  2018-10-30 22:08 ` [PATCH 19/24] submodule: use submodule repos for object lookup Stefan Beller
@ 2018-11-02 13:03   ` Derrick Stolee
  2018-11-02 17:23     ` Stefan Beller
  0 siblings, 1 reply; 43+ messages in thread
From: Derrick Stolee @ 2018-11-02 13:03 UTC (permalink / raw)
  To: Stefan Beller, git; +Cc: szeder.dev, jonathantanmy

On 10/30/2018 6:08 PM, Stefan Beller wrote:
> This converts the 'show_submodule_header' function to use
> the repository API properly, such that the submodule objects
> are not added to the main object store.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>

A couple tests are broken in 'pu' when run with GIT_TEST_COMMIT_GRAPH=1, 
including t4041-diff-submodule-option.sh. The failure bisects to this patch.

Here is a verbose output of the first failure in that script:;


expecting success:
         git diff-index -p --submodule=log HEAD >actual &&
         cat >expected <<-EOF &&
         Submodule sm1 $head2..$head3 (rewind):
           < Add foo3 ($added foo3)
           < Add foo2 ($added foo2)
         EOF
         test_cmp expected actual

+ git diff-index -p --submodule=log HEAD
+ cat
+ test_cmp expected actual
+ diff -u expected actual
--- expected    2018-11-02 12:58:43.429262380 +0000
+++ actual      2018-11-02 12:58:43.429262380 +0000
@@ -1,3 +1,5 @@
-Submodule sm1 30b9670..dafb207 (rewind):
+Submodule sm1 30b9670...dafb207:
    < Add foo3 (hinzugefügt foo3)
    < Add foo2 (hinzugefügt foo2)
+  > Add foo1 (hinzugefügt foo1)
+  < Add foo1 (hinzugefügt foo1)
error: last command exited with $?=1
not ok 9 - modified submodule(backward)

I've been looking into the patch below to see if there is an obvious 
problem, but the best I can think is that open_submodule() creates an 
alternate 'struct repository' and somehow the commit-graph feature is 
interacting poorly with that struct.

Stefan, do you know what's going on?

Thanks,

-Stolee

> ---
>   submodule.c | 76 ++++++++++++++++++++++++++++++++++++++++++-----------
>   1 file changed, 61 insertions(+), 15 deletions(-)
>
> diff --git a/submodule.c b/submodule.c
> index d9d3046689..0fdda45252 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -443,7 +443,7 @@ static int prepare_submodule_summary(struct rev_info *rev, const char *path,
>   	return prepare_revision_walk(rev);
>   }
>   
> -static void print_submodule_summary(struct rev_info *rev, struct diff_options *o)
> +static void print_submodule_summary(struct repository *r, struct rev_info *rev, struct diff_options *o)
>   {
>   	static const char format[] = "  %m %s";
>   	struct strbuf sb = STRBUF_INIT;
> @@ -454,7 +454,8 @@ static void print_submodule_summary(struct rev_info *rev, struct diff_options *o
>   		ctx.date_mode = rev->date_mode;
>   		ctx.output_encoding = get_log_output_encoding();
>   		strbuf_setlen(&sb, 0);
> -		format_commit_message(commit, format, &sb, &ctx);
> +		repo_format_commit_message(r, commit, format, &sb,
> +				      &ctx);
>   		strbuf_addch(&sb, '\n');
>   		if (commit->object.flags & SYMMETRIC_LEFT)
>   			diff_emit_submodule_del(o, sb.buf);
> @@ -481,14 +482,47 @@ void prepare_submodule_repo_env(struct argv_array *out)
>   			 DEFAULT_GIT_DIR_ENVIRONMENT);
>   }
>   
> -/* Helper function to display the submodule header line prior to the full
> - * summary output. If it can locate the submodule objects directory it will
> - * attempt to lookup both the left and right commits and put them into the
> - * left and right pointers.
> +/*
> + * Initialize 'out' based on the provided submodule path.
> + *
> + * Unlike repo_submodule_init, this tolerates submodules not present
> + * in .gitmodules. This function exists only to preserve historical behavior,
> + *
> + * Returns 0 on success, -1 when the submodule is not present.
>    */
> -static void show_submodule_header(struct diff_options *o, const char *path,
> +static struct repository *open_submodule(const char *path)
> +{
> +	struct strbuf sb = STRBUF_INIT;
> +	struct repository *out = xmalloc(sizeof(*out));
> +
> +	if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) {
> +		strbuf_release(&sb);
> +		free(out);
> +		return NULL;
> +	}
> +
> +	out->submodule_prefix = xstrfmt("%s%s/",
> +					the_repository->submodule_prefix ?
> +					the_repository->submodule_prefix :
> +					"", path);
> +
> +	strbuf_release(&sb);
> +	return out;
> +}
> +
> +/*
> + * Helper function to display the submodule header line prior to the full
> + * summary output.
> + *
> + * If it can locate the submodule git directory it will create a repository
> + * handle for the submodule and lookup both the left and right commits and
> + * put them into the left and right pointers.
> + */
> +static void show_submodule_header(struct diff_options *o,
> +		const char *path,
>   		struct object_id *one, struct object_id *two,
>   		unsigned dirty_submodule,
> +		struct repository *sub,
>   		struct commit **left, struct commit **right,
>   		struct commit_list **merge_bases)
>   {
> @@ -507,7 +541,7 @@ static void show_submodule_header(struct diff_options *o, const char *path,
>   	else if (is_null_oid(two))
>   		message = "(submodule deleted)";
>   
> -	if (add_submodule_odb(path)) {
> +	if (!sub) {
>   		if (!message)
>   			message = "(commits not present)";
>   		goto output_header;
> @@ -517,8 +551,8 @@ static void show_submodule_header(struct diff_options *o, const char *path,
>   	 * Attempt to lookup the commit references, and determine if this is
>   	 * a fast forward or fast backwards update.
>   	 */
> -	*left = lookup_commit_reference(the_repository, one);
> -	*right = lookup_commit_reference(the_repository, two);
> +	*left = lookup_commit_reference(sub, one);
> +	*right = lookup_commit_reference(sub, two);
>   
>   	/*
>   	 * Warn about missing commits in the submodule project, but only if
> @@ -528,7 +562,7 @@ static void show_submodule_header(struct diff_options *o, const char *path,
>   	     (!is_null_oid(two) && !*right))
>   		message = "(commits not present)";
>   
> -	*merge_bases = get_merge_bases(*left, *right);
> +	*merge_bases = repo_get_merge_bases(sub, *left, *right);
>   	if (*merge_bases) {
>   		if ((*merge_bases)->item == *left)
>   			fast_forward = 1;
> @@ -562,16 +596,18 @@ void show_submodule_summary(struct diff_options *o, const char *path,
>   	struct rev_info rev;
>   	struct commit *left = NULL, *right = NULL;
>   	struct commit_list *merge_bases = NULL;
> +	struct repository *sub;
>   
> +	sub = open_submodule(path);
>   	show_submodule_header(o, path, one, two, dirty_submodule,
> -			      &left, &right, &merge_bases);
> +			      sub, &left, &right, &merge_bases);
>   
>   	/*
>   	 * If we don't have both a left and a right pointer, there is no
>   	 * reason to try and display a summary. The header line should contain
>   	 * all the information the user needs.
>   	 */
> -	if (!left || !right)
> +	if (!left || !right || !sub)
>   		goto out;
>   
>   	/* Treat revision walker failure the same as missing commits */
> @@ -580,13 +616,17 @@ void show_submodule_summary(struct diff_options *o, const char *path,
>   		goto out;
>   	}
>   
> -	print_submodule_summary(&rev, o);
> +	print_submodule_summary(sub, &rev, o);
>   
>   out:
>   	if (merge_bases)
>   		free_commit_list(merge_bases);
>   	clear_commit_marks(left, ~0);
>   	clear_commit_marks(right, ~0);
> +	if (sub) {
> +		repo_clear(sub);
> +		free(sub);
> +	}
>   }
>   
>   void show_submodule_inline_diff(struct diff_options *o, const char *path,
> @@ -598,9 +638,11 @@ void show_submodule_inline_diff(struct diff_options *o, const char *path,
>   	struct commit_list *merge_bases = NULL;
>   	struct child_process cp = CHILD_PROCESS_INIT;
>   	struct strbuf sb = STRBUF_INIT;
> +	struct repository *sub;
>   
> +	sub = open_submodule(path);
>   	show_submodule_header(o, path, one, two, dirty_submodule,
> -			      &left, &right, &merge_bases);
> +			      sub, &left, &right, &merge_bases);
>   
>   	/* We need a valid left and right commit to display a difference */
>   	if (!(left || is_null_oid(one)) ||
> @@ -661,6 +703,10 @@ void show_submodule_inline_diff(struct diff_options *o, const char *path,
>   		clear_commit_marks(left, ~0);
>   	if (right)
>   		clear_commit_marks(right, ~0);
> +	if (sub) {
> +		repo_clear(sub);
> +		free(sub);
> +	}
>   }
>   
>   int should_update_submodules(void)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 19/24] submodule: use submodule repos for object lookup
  2018-11-02 13:03   ` Derrick Stolee
@ 2018-11-02 17:23     ` Stefan Beller
  2018-11-02 17:27       ` Derrick Stolee
  0 siblings, 1 reply; 43+ messages in thread
From: Stefan Beller @ 2018-11-02 17:23 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, SZEDER Gábor, Jonathan Tan

On Fri, Nov 2, 2018 at 6:03 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 10/30/2018 6:08 PM, Stefan Beller wrote:
> > This converts the 'show_submodule_header' function to use
> > the repository API properly, such that the submodule objects
> > are not added to the main object store.
> >
> > Signed-off-by: Stefan Beller <sbeller@google.com>
>
> A couple tests are broken in 'pu' when run with GIT_TEST_COMMIT_GRAPH=1,
> including t4041-diff-submodule-option.sh. The failure bisects to this patch.
>
> Here is a verbose output of the first failure in that script:;
>
>
> expecting success:
>          git diff-index -p --submodule=log HEAD >actual &&
>          cat >expected <<-EOF &&
>          Submodule sm1 $head2..$head3 (rewind):
>            < Add foo3 ($added foo3)
>            < Add foo2 ($added foo2)
>          EOF
>          test_cmp expected actual
>
> + git diff-index -p --submodule=log HEAD
> + cat
> + test_cmp expected actual
> + diff -u expected actual
> --- expected    2018-11-02 12:58:43.429262380 +0000
> +++ actual      2018-11-02 12:58:43.429262380 +0000
> @@ -1,3 +1,5 @@
> -Submodule sm1 30b9670..dafb207 (rewind):
> +Submodule sm1 30b9670...dafb207:
>     < Add foo3 (hinzugefügt foo3)
>     < Add foo2 (hinzugefügt foo2)
> +  > Add foo1 (hinzugefügt foo1)
> +  < Add foo1 (hinzugefügt foo1)
> error: last command exited with $?=1
> not ok 9 - modified submodule(backward)
>
> I've been looking into the patch below to see if there is an obvious
> problem, but the best I can think is that open_submodule() creates an
> alternate 'struct repository' and somehow the commit-graph feature is
> interacting poorly with that struct.
>
> Stefan, do you know what's going on?

Sure, see the last four patches of this series
https://public-inbox.org/git/20181030220817.61691-1-sbeller@google.com/
(to which you also reply to? Junio did not queue this one, yet).

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 19/24] submodule: use submodule repos for object lookup
  2018-11-02 17:23     ` Stefan Beller
@ 2018-11-02 17:27       ` Derrick Stolee
  0 siblings, 0 replies; 43+ messages in thread
From: Derrick Stolee @ 2018-11-02 17:27 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, SZEDER Gábor, Jonathan Tan

On 11/2/2018 1:23 PM, Stefan Beller wrote:
> On Fri, Nov 2, 2018 at 6:03 AM Derrick Stolee <stolee@gmail.com> wrote:
>> On 10/30/2018 6:08 PM, Stefan Beller wrote:
>>> This converts the 'show_submodule_header' function to use
>>> the repository API properly, such that the submodule objects
>>> are not added to the main object store.
>>>
>>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> A couple tests are broken in 'pu' when run with GIT_TEST_COMMIT_GRAPH=1,
>> including t4041-diff-submodule-option.sh. The failure bisects to this patch.
>>
>> Here is a verbose output of the first failure in that script:;
>>
>>
>> expecting success:
>>           git diff-index -p --submodule=log HEAD >actual &&
>>           cat >expected <<-EOF &&
>>           Submodule sm1 $head2..$head3 (rewind):
>>             < Add foo3 ($added foo3)
>>             < Add foo2 ($added foo2)
>>           EOF
>>           test_cmp expected actual
>>
>> + git diff-index -p --submodule=log HEAD
>> + cat
>> + test_cmp expected actual
>> + diff -u expected actual
>> --- expected    2018-11-02 12:58:43.429262380 +0000
>> +++ actual      2018-11-02 12:58:43.429262380 +0000
>> @@ -1,3 +1,5 @@
>> -Submodule sm1 30b9670..dafb207 (rewind):
>> +Submodule sm1 30b9670...dafb207:
>>      < Add foo3 (hinzugefügt foo3)
>>      < Add foo2 (hinzugefügt foo2)
>> +  > Add foo1 (hinzugefügt foo1)
>> +  < Add foo1 (hinzugefügt foo1)
>> error: last command exited with $?=1
>> not ok 9 - modified submodule(backward)
>>
>> I've been looking into the patch below to see if there is an obvious
>> problem, but the best I can think is that open_submodule() creates an
>> alternate 'struct repository' and somehow the commit-graph feature is
>> interacting poorly with that struct.
>>
>> Stefan, do you know what's going on?
> Sure, see the last four patches of this series
> https://public-inbox.org/git/20181030220817.61691-1-sbeller@google.com/
> (to which you also reply to? Junio did not queue this one, yet).

Sorry! Got a bit mixed up looking at everything. I didn't realize that 
the current 'pu' didn't have your latest. Thanks!


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH] Makefile: add pending semantic patches
  2018-10-30 22:07 ` [PATCH 01/24] Makefile: add pending semantic patches Stefan Beller
  2018-10-31  6:42   ` Junio C Hamano
@ 2018-11-08 20:52   ` Stefan Beller
  2018-11-09  4:55     ` Martin Ågren
                       ` (2 more replies)
  1 sibling, 3 replies; 43+ messages in thread
From: Stefan Beller @ 2018-11-08 20:52 UTC (permalink / raw)
  To: sbeller; +Cc: git, jonathantanmy, szeder.dev

From: SZEDER Gábor <szeder.dev@gmail.com>

Add a description and place on how to use coccinelle for large refactorings
that happen only once.

Based-on-work-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

I consider including this patch in a resend instead.
It outlays the basics of such a new workflow, which we can refine later.

Thanks,
Stefan

 Makefile                  |  7 +++--
 contrib/coccinelle/README | 60 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index b08d5ea258..e5abfe4cef 100644
--- a/Makefile
+++ b/Makefile
@@ -2739,9 +2739,12 @@ endif
 	then \
 		echo '    ' SPATCH result: $@; \
 	fi
-coccicheck: $(addsuffix .patch,$(wildcard contrib/coccinelle/*.cocci))
+coccicheck: $(addsuffix .patch,$(filter-out %.pending.cocci,$(wildcard contrib/coccinelle/*.cocci)))
 
-.PHONY: coccicheck
+# See contrib/coccinelle/README
+coccicheck-pending: $(addsuffix .patch,$(wildcard contrib/coccinelle/*.pending.cocci))
+
+.PHONY: coccicheck coccicheck-pending
 
 ### Installation rules
 
diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README
index 9c2f8879c2..fa09d1abcc 100644
--- a/contrib/coccinelle/README
+++ b/contrib/coccinelle/README
@@ -1,2 +1,62 @@
 This directory provides examples of Coccinelle (http://coccinelle.lip6.fr/)
 semantic patches that might be useful to developers.
+
+There are two types of semantic patches:
+
+ * Using the semantic transformation to check for bad patterns in the code;
+   This is what the original target 'make coccicheck' is designed to do and
+   it is expected that any resulting patch indicates a regression.
+   The patches resulting from 'make coccicheck' are small and infrequent,
+   so once they are found, they can be sent to the mailing list as per usual.
+
+   Example for introducing new patterns:
+   67947c34ae (convert "hashcmp() != 0" to "!hasheq()", 2018-08-28)
+   b84c783882 (fsck: s/++i > 1/i++/, 2018-10-24)
+
+   Example of fixes using this approach:
+   248f66ed8e (run-command: use strbuf_addstr() for adding a string to
+               a strbuf, 2018-03-25)
+   f919ffebed (Use MOVE_ARRAY, 2018-01-22)
+
+   These types of semantic patches are usually part of testing, c.f.
+   0860a7641b (travis-ci: fail if Coccinelle static analysis found something
+               to transform, 2018-07-23)
+
+ * Using semantic transformations in large scale refactorings throughout
+   the code base.
+
+   When applying the semantic patch into a real patch, sending it to the
+   mailing list in the usual way, such a patch would be expected to have a
+   lot of textual and semantic conflicts as such large scale refactorings
+   change function signatures that are used widely in the code base.
+   A textual conflict would arise if surrounding code near any call of such
+   function changes. A semantic conflict arises when other patch series in
+   flight introduce calls to such functions.
+
+   So to aid these large scale refactorings, semantic patches can be used,
+   using the process as follows:
+
+   1) Figure out what kind of large scale refactoring we need
+      -> This is usually done by another unrelated series.
+   2) Create the sematic patch needed for the large scale refactoring
+      and store it in contrib/coccinelle/*.pending.cocci
+      -> The suffix containing 'pending' is important to differentiate
+      this case from the other use case of checking for bad patterns.
+   3) Apply the semantic patch only partially, where needed for the patch series
+      that motivates the large scale refactoring and then build that series
+      on top of it.
+      By applying the semantic patch only partially where needed, the series
+      is less likely to conflict with other series in flight.
+      To make it possible to apply the semantic patch partially, there needs
+      to be mechanism for backwards compatibility to keep those places working
+      where the semantic patch is not applied. This can be done via a
+      wrapper function that has the exact name and signature as the function
+      to be changed.
+
+   4) Send the series as usual, including only the needed parts of the
+      large scale refactoring
+
+   Later steps (not necessarily by the original author) are to apply the
+   semantic patch in a way that do not produce lots of conflicts, for example
+   by consulting `git diff --numstat origin/master..origin/pu` and the changes
+   of the semantic patch.
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH] Makefile: add pending semantic patches
  2018-11-08 20:52   ` [PATCH] " Stefan Beller
@ 2018-11-09  4:55     ` Martin Ågren
  2018-11-09 20:50       ` Stefan Beller
  2018-11-09  5:18     ` Junio C Hamano
  2018-11-10  0:10     ` [PATCH] coccicheck: introduce 'pending' " Stefan Beller
  2 siblings, 1 reply; 43+ messages in thread
From: Martin Ågren @ 2018-11-09  4:55 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Git Mailing List, Jonathan Tan, SZEDER Gábor

On Thu, 8 Nov 2018 at 21:53, Stefan Beller <sbeller@google.com> wrote:
>
> From: SZEDER Gábor <szeder.dev@gmail.com>
>

I haven't followed the original discussion too carefully, so I'll read
this like someone new to the topic probably would.

A nit, perhaps, but I was genuinely confused at first. The subject is
"Makefile: add pending semantic patches", but this patch doesn't add
any. It adds Makefile-support for such patches though, and it defines
the entire concept of pending semantic patches. How about "coccicheck:
introduce 'pending' semantic patches"?

> Add a description and place on how to use coccinelle for large refactorings
> that happen only once.

A bit confused about "and place". Based on my understanding from reading
the remainder of this patch, maybe:

  Teach `make coccicheck` to avoid patches named "*.pending.cocci" and
  handle them separately in a new `make coccicheck-pending` instead.
  This means that we can separate "critical" patches from "FYI" patches.
  The former target can continue causing Travis to fail its static
  analysis job, while the latter can let us keep an eye on ongoing
  (pending) transitions without them causing too much fallout.

  Document the intended use-cases and processes around these two
  targets.

>  This directory provides examples of Coccinelle (http://coccinelle.lip6.fr/)
>  semantic patches that might be useful to developers.
> +
> +There are two types of semantic patches:
> +
> + * Using the semantic transformation to check for bad patterns in the code;
> +   This is what the original target 'make coccicheck' is designed to do and

Is it relevant that this was the "original" target? (Genuine question.)

> +   it is expected that any resulting patch indicates a regression.
> +   The patches resulting from 'make coccicheck' are small and infrequent,
> +   so once they are found, they can be sent to the mailing list as per usual.
> +
> +   Example for introducing new patterns:
> +   67947c34ae (convert "hashcmp() != 0" to "!hasheq()", 2018-08-28)
> +   b84c783882 (fsck: s/++i > 1/i++/, 2018-10-24)
> +
> +   Example of fixes using this approach:
> +   248f66ed8e (run-command: use strbuf_addstr() for adding a string to
> +               a strbuf, 2018-03-25)
> +   f919ffebed (Use MOVE_ARRAY, 2018-01-22)
> +
> +   These types of semantic patches are usually part of testing, c.f.
> +   0860a7641b (travis-ci: fail if Coccinelle static analysis found something
> +               to transform, 2018-07-23)

Very nicely described, nice with the examples to quickly give a feeling
about how/when to use this.

> + * Using semantic transformations in large scale refactorings throughout
> +   the code base.
> +
> +   When applying the semantic patch into a real patch, sending it to the
> +   mailing list in the usual way, such a patch would be expected to have a
> +   lot of textual and semantic conflicts as such large scale refactorings
> +   change function signatures that are used widely in the code base.
> +   A textual conflict would arise if surrounding code near any call of such
> +   function changes. A semantic conflict arises when other patch series in
> +   flight introduce calls to such functions.

OK, I'm with you.

> +   So to aid these large scale refactorings, semantic patches can be used,
> +   using the process as follows:
> +
> +   1) Figure out what kind of large scale refactoring we need
> +      -> This is usually done by another unrelated series.

"This"? The figuring out, or the refactoring? Also, "unrelated"?

> +   2) Create the sematic patch needed for the large scale refactoring

s/sematic/semantic/

> +      and store it in contrib/coccinelle/*.pending.cocci
> +      -> The suffix containing 'pending' is important to differentiate
> +      this case from the other use case of checking for bad patterns.

Good.

> +   3) Apply the semantic patch only partially, where needed for the patch series
> +      that motivates the large scale refactoring and then build that series
> +      on top of it.
> +      By applying the semantic patch only partially where needed, the series
> +      is less likely to conflict with other series in flight.
> +      To make it possible to apply the semantic patch partially, there needs
> +      to be mechanism for backwards compatibility to keep those places working
> +      where the semantic patch is not applied. This can be done via a
> +      wrapper function that has the exact name and signature as the function
> +      to be changed.
> +
> +   4) Send the series as usual, including only the needed parts of the
> +      large scale refactoring

Trailing period.

OK, I think I get it, but I wonder if this might not work equally well
or better under certain circumstances:

  - introduce new API
  - add pending semantic patch
  - convert quiet areas to use the new API

On the other hand, listing all possible flows might be needlessly
limiting. I guess it boils down to this:

"Create a pending semantic patch. Make sure the old way of doing things
still works. Apply the semantic patch to the quieter areas of the
codebase. If in doubt, convert fewer places in the original series --
remaining spots can always be converted at a later time."

Martin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] Makefile: add pending semantic patches
  2018-11-08 20:52   ` [PATCH] " Stefan Beller
  2018-11-09  4:55     ` Martin Ågren
@ 2018-11-09  5:18     ` Junio C Hamano
  2018-11-09 21:58       ` Stefan Beller
  2018-11-10  0:10     ` [PATCH] coccicheck: introduce 'pending' " Stefan Beller
  2 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2018-11-09  5:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jonathantanmy, szeder.dev

Stefan Beller <sbeller@google.com> writes:

> From: SZEDER Gábor <szeder.dev@gmail.com>
>
> Add a description and place on how to use coccinelle for large refactorings
> that happen only once.
>
> Based-on-work-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>
> I consider including this patch in a resend instead.
> It outlays the basics of such a new workflow, which we can refine later.

Thanks for tying loose ends.

> diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README
> index 9c2f8879c2..fa09d1abcc 100644
> --- a/contrib/coccinelle/README
> +++ b/contrib/coccinelle/README
> @@ -1,2 +1,62 @@
>  This directory provides examples of Coccinelle (http://coccinelle.lip6.fr/)
>  semantic patches that might be useful to developers.
> +
> +There are two types of semantic patches:
> +
> + * Using the semantic transformation to check for bad patterns in the code;
> +   This is what the original target 'make coccicheck' is designed to do and
> +   it is expected that any resulting patch indicates a regression.
> +   The patches resulting from 'make coccicheck' are small and infrequent,
> +   so once they are found, they can be sent to the mailing list as per usual.
> +
> +   Example for introducing new patterns:
> +   67947c34ae (convert "hashcmp() != 0" to "!hasheq()", 2018-08-28)
> +   b84c783882 (fsck: s/++i > 1/i++/, 2018-10-24)
> +
> +   Example of fixes using this approach:
> +   248f66ed8e (run-command: use strbuf_addstr() for adding a string to
> +               a strbuf, 2018-03-25)
> +   f919ffebed (Use MOVE_ARRAY, 2018-01-22)
> +
> +   These types of semantic patches are usually part of testing, c.f.
> +   0860a7641b (travis-ci: fail if Coccinelle static analysis found something
> +               to transform, 2018-07-23)

Yup, and I think what we have in 'pu' (including your the_repository
stuff) falls into this category.

I've been paying attention to "make coccicheck" produced patches for
the past few weeks, and so far, I found it _much_ _much_ more
pleasant, compared to having to worry about merge conflicts with the
topics in flight that changes day to day (not just because we add
new topics or update existing topics, but also the order of the
merge changes as topics mature at different rates and jumps over
each other in master..pu history), that "make coccicheck" after
topics are merged to integration branches fixes these issues up as
needed.

> +   3) Apply the semantic patch only partially, where needed for the patch series
> +      that motivates the large scale refactoring and then build that series
> +      on top of it.

It is not quite clear what "needed for the patch series" really
means in the context of this paragraph.  What are the changes that
are not needed, that is still produced if we ran "make coccicheck"?

Are they wrong changes (e.g. a carelessly written read_cache() to
read_index(&the_index) conversion may munge the implementation of
read_cache(...) { return read_index(&the_index, ...); } and make
inifinite recursion)?  Or are they "correct but not immediately
necessary" (e.g. because calling read_cache() does not hurt until
that function gets removed, so rewriting the callers to call
read_index() with &the_index may be correct but not immediately
necessary)?

> +      By applying the semantic patch only partially where needed, the series
> +      is less likely to conflict with other series in flight.

That is correct.

> +      To make it possible to apply the semantic patch partially, there needs
> +      to be mechanism for backwards compatibility to keep those places working
> +      where the semantic patch is not applied. This can be done via a
> +      wrapper function that has the exact name and signature as the function
> +      to be changed.

OK, so this argues for leaving read_cache()-like things to help
other in-flight topics, while a change to encourage the use of
read_index() takes place.  That also makes sense, and this directly
relates to "less likely to conflict" benefit you mentioned above.

But it is still unclear to me then what are "necessary".

... goes and thinks ...

OK, so a series that allows a codepath to work on an arbitrary
in-core istate, for example, may need to update a function to take
istate and use it to call read_index(istate...), and the old code in
such a call chain must have used read_cache(), always operating on
&the_index.  For the purpose of that series, it does not matter if
other codepaths that are not involved in the callchain the series
wants to update are still only working on &the_index, so a change to
turn read_cache() into read_index(&the_index) is *not* necessary
(but still would be correct) and should be left out of the series.
But any change the series makes to the callchain in question that
turns read_cache() into read_index() with something call-specific
(not &the_index) is a necesary one.  Is that a reasonable example
of what these paragraphs wanted to convey with the distinction
between "needed for the patch series" and other changes?


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] Makefile: add pending semantic patches
  2018-11-09  4:55     ` Martin Ågren
@ 2018-11-09 20:50       ` Stefan Beller
  0 siblings, 0 replies; 43+ messages in thread
From: Stefan Beller @ 2018-11-09 20:50 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Jonathan Tan, SZEDER Gábor

On Thu, Nov 8, 2018 at 8:56 PM Martin Ågren <martin.agren@gmail.com> wrote:
> I haven't followed the original discussion too carefully, so I'll read
> this like someone new to the topic probably would.

Thanks!

> A nit, perhaps, but I was genuinely confused at first. The subject is
> "Makefile: add pending semantic patches", but this patch doesn't add
> any. It adds Makefile-support for such patches though, and it defines
> the entire concept of pending semantic patches. How about "coccicheck:
> introduce 'pending' semantic patches"?
>
> > Add a description and place on how to use coccinelle for large refactorings
> > that happen only once.
>
> A bit confused about "and place". Based on my understanding from reading
> the remainder of this patch, maybe:
>
>   Teach `make coccicheck` to avoid patches named "*.pending.cocci" and
>   handle them separately in a new `make coccicheck-pending` instead.
>   This means that we can separate "critical" patches from "FYI" patches.
>   The former target can continue causing Travis to fail its static
>   analysis job, while the latter can let us keep an eye on ongoing
>   (pending) transitions without them causing too much fallout.
>
>   Document the intended use-cases and processes around these two
>   targets.

Both suggested title and new commit message make sense.

>
> >  This directory provides examples of Coccinelle (http://coccinelle.lip6.fr/)
> >  semantic patches that might be useful to developers.
> > +
> > +There are two types of semantic patches:
> > +
> > + * Using the semantic transformation to check for bad patterns in the code;
> > +   This is what the original target 'make coccicheck' is designed to do and
>
> Is it relevant that this was the "original" target? (Genuine question.)

No. I can omit that part.

>
> > +   it is expected that any resulting patch indicates a regression.
> > +   The patches resulting from 'make coccicheck' are small and infrequent,
> > +   so once they are found, they can be sent to the mailing list as per usual.
> > +
> > +   Example for introducing new patterns:
> > +   67947c34ae (convert "hashcmp() != 0" to "!hasheq()", 2018-08-28)
> > +   b84c783882 (fsck: s/++i > 1/i++/, 2018-10-24)
> > +
> > +   Example of fixes using this approach:
> > +   248f66ed8e (run-command: use strbuf_addstr() for adding a string to
> > +               a strbuf, 2018-03-25)
> > +   f919ffebed (Use MOVE_ARRAY, 2018-01-22)
> > +
> > +   These types of semantic patches are usually part of testing, c.f.
> > +   0860a7641b (travis-ci: fail if Coccinelle static analysis found something
> > +               to transform, 2018-07-23)
>
> Very nicely described, nice with the examples to quickly give a feeling
> about how/when to use this.

Thanks.


>
> > + * Using semantic transformations in large scale refactorings throughout
> > +   the code base.
> > +
> > +   When applying the semantic patch into a real patch, sending it to the
> > +   mailing list in the usual way, such a patch would be expected to have a
> > +   lot of textual and semantic conflicts as such large scale refactorings
> > +   change function signatures that are used widely in the code base.
> > +   A textual conflict would arise if surrounding code near any call of such
> > +   function changes. A semantic conflict arises when other patch series in
> > +   flight introduce calls to such functions.
>
> OK, I'm with you.
>
> > +   So to aid these large scale refactorings, semantic patches can be used,
> > +   using the process as follows:
> > +
> > +   1) Figure out what kind of large scale refactoring we need
> > +      -> This is usually done by another unrelated series.
>
> "This"? The figuring out, or the refactoring? Also, "unrelated"?

The need and type of what kind of large scale refactoring are
usually determined by a patch series, that is not refactoring for the
sake of refactoring, but it wants to achieve a specific goal, unrelated
to large refactorings per se.

The large refactoring is just another tool that a developer can use
to make their original series happen much faster.

So "unrelated" == "not the large scale refactoring, as that may
come as an preparatory series, but to have a preparatory series
it may be good to showcase why we need the preparatory series"

>
> > +   2) Create the sematic patch needed for the large scale refactoring
>
> s/sematic/semantic/

yup.

>
> > +      and store it in contrib/coccinelle/*.pending.cocci
> > +      -> The suffix containing 'pending' is important to differentiate
> > +      this case from the other use case of checking for bad patterns.
>
> Good.
>
> > +   3) Apply the semantic patch only partially, where needed for the patch series
> > +      that motivates the large scale refactoring and then build that series
> > +      on top of it.
> > +      By applying the semantic patch only partially where needed, the series
> > +      is less likely to conflict with other series in flight.
> > +      To make it possible to apply the semantic patch partially, there needs
> > +      to be mechanism for backwards compatibility to keep those places working
> > +      where the semantic patch is not applied. This can be done via a
> > +      wrapper function that has the exact name and signature as the function
> > +      to be changed.
> > +
> > +   4) Send the series as usual, including only the needed parts of the
> > +      large scale refactoring
>
> Trailing period.

ok.

> OK, I think I get it, but I wonder if this might not work equally well
> or better under certain circumstances:
>
>   - introduce new API

The  "new API" is similar enough to the old API and may even be
a superset.

>   - add pending semantic patch
>   - convert quiet areas to use the new API
>
> On the other hand, listing all possible flows might be needlessly
> limiting. I guess it boils down to this:
>
> "Create a pending semantic patch. Make sure the old way of doing things
> still works. Apply the semantic patch to the quieter areas of the
> codebase. If in doubt, convert fewer places in the original series --
> remaining spots can always be converted at a later time."

That's the gist of it. :)

Thanks for the review,
Stefan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] Makefile: add pending semantic patches
  2018-11-09  5:18     ` Junio C Hamano
@ 2018-11-09 21:58       ` Stefan Beller
  2018-11-13 15:48         ` SZEDER Gábor
  0 siblings, 1 reply; 43+ messages in thread
From: Stefan Beller @ 2018-11-09 21:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan, SZEDER Gábor

On Thu, Nov 8, 2018 at 9:18 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Stefan Beller <sbeller@google.com> writes:
>
> > From: SZEDER Gábor <szeder.dev@gmail.com>
> >
> > Add a description and place on how to use coccinelle for large refactorings
> > that happen only once.
> >
> > Based-on-work-by: SZEDER Gábor <szeder.dev@gmail.com>
> > Signed-off-by: Stefan Beller <sbeller@google.com>
> > ---
> >
> > I consider including this patch in a resend instead.
> > It outlays the basics of such a new workflow, which we can refine later.
>
> Thanks for tying loose ends.
>
> > diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README
> > index 9c2f8879c2..fa09d1abcc 100644
> > --- a/contrib/coccinelle/README
> > +++ b/contrib/coccinelle/README
> > @@ -1,2 +1,62 @@
> >  This directory provides examples of Coccinelle (http://coccinelle.lip6.fr/)
> >  semantic patches that might be useful to developers.
> > +
> > +There are two types of semantic patches:
> > +
> > + * Using the semantic transformation to check for bad patterns in the code;
> > +   This is what the original target 'make coccicheck' is designed to do and
> > +   it is expected that any resulting patch indicates a regression.
> > +   The patches resulting from 'make coccicheck' are small and infrequent,
> > +   so once they are found, they can be sent to the mailing list as per usual.
> > +
> > +   Example for introducing new patterns:
> > +   67947c34ae (convert "hashcmp() != 0" to "!hasheq()", 2018-08-28)
> > +   b84c783882 (fsck: s/++i > 1/i++/, 2018-10-24)
> > +
> > +   Example of fixes using this approach:
> > +   248f66ed8e (run-command: use strbuf_addstr() for adding a string to
> > +               a strbuf, 2018-03-25)
> > +   f919ffebed (Use MOVE_ARRAY, 2018-01-22)
> > +
> > +   These types of semantic patches are usually part of testing, c.f.
> > +   0860a7641b (travis-ci: fail if Coccinelle static analysis found something
> > +               to transform, 2018-07-23)
>
> Yup, and I think what we have in 'pu' (including your the_repository
> stuff) falls into this category.

My impression was that the_repository is strongly second category
as the_repository.cocci doesn't fix bad smells of code, but proposes
a refactoring that we agreed on doing.

> I've been paying attention to "make coccicheck" produced patches for
> the past few weeks, and so far, I found it _much_ _much_ more
> pleasant, compared to having to worry about merge conflicts with the
> topics in flight that changes day to day (not just because we add
> new topics or update existing topics, but also the order of the
> merge changes as topics mature at different rates and jumps over
> each other in master..pu history), that "make coccicheck" after
> topics are merged to integration branches fixes these issues up as
> needed.

So from your end we would not need the "pending" category as long
as the follow ups come in a timely manner?

>
> > +   3) Apply the semantic patch only partially, where needed for the patch series
> > +      that motivates the large scale refactoring and then build that series
> > +      on top of it.
>
> It is not quite clear what "needed for the patch series" really
> means in the context of this paragraph.  What are the changes that
> are not needed, that is still produced if we ran "make coccicheck"?

An example for "needed" would be
3f21279f50..bd8737176b
whereas "not needed" is what is in
"treewide: apply cocci patch".

The treewide conversion of e.g. unuse_commit_buffer to
repo_unuse_commit_buffer is nice, but "needed" only in
its followup patch that converts logmsg_reencode as that
calls into the unuse_commit_buffer.

> Are they wrong changes (e.g. a carelessly written read_cache() to
> read_index(&the_index) conversion may munge the implementation of
> read_cache(...) { return read_index(&the_index, ...); } and make
> inifinite recursion)?  Or are they "correct but not immediately
> necessary" (e.g. because calling read_cache() does not hurt until
> that function gets removed, so rewriting the callers to call
> read_index() with &the_index may be correct but not immediately
> necessary)?

the latter. I assume correctness (of the semantic patch) to be a given,
but this is all about timing, i.e. how can I send the series without breaking
others in flight.

>
> > +      By applying the semantic patch only partially where needed, the series
> > +      is less likely to conflict with other series in flight.
>
> That is correct.
>
> > +      To make it possible to apply the semantic patch partially, there needs
> > +      to be mechanism for backwards compatibility to keep those places working
> > +      where the semantic patch is not applied. This can be done via a
> > +      wrapper function that has the exact name and signature as the function
> > +      to be changed.
>
> OK, so this argues for leaving read_cache()-like things to help
> other in-flight topics, while a change to encourage the use of
> read_index() takes place.  That also makes sense, and this directly
> relates to "less likely to conflict" benefit you mentioned above.

ok.

>
> But it is still unclear to me then what are "necessary".
>
> ... goes and thinks ...
>
> OK, so a series that allows a codepath to work on an arbitrary
> in-core istate, for example, may need to update a function to take
> istate and use it to call read_index(istate...), and the old code in
> such a call chain must have used read_cache(), always operating on
> &the_index.  For the purpose of that series, it does not matter if
> other codepaths that are not involved in the callchain the series
> wants to update are still only working on &the_index, so a change to
> turn read_cache() into read_index(&the_index) is *not* necessary
> (but still would be correct) and should be left out of the series.
> But any change the series makes to the callchain in question that
> turns read_cache() into read_index() with something call-specific
> (not &the_index) is a necesary one.  Is that a reasonable example
> of what these paragraphs wanted to convey with the distinction
> between "needed for the patch series" and other changes?

Exactly.

Maybe another way to phrase it is to explain the two series
independently of each other:

 1) Create the semantic patch series containing
 1a) - a *.pending.cocci semantic patch
 1b) - forward/backward compatibility providers
         (wrapper/defines)
 1c) then send the semantic patch series to the list

 2a) Write the other series as if (1) doesn't exist
    This means there will be some upgrades of
    the call chain from read_cache() to read_index()
 2b) Coincidentally these upgrades are the same
   as (1) would have produced. That's the whole
   trick.
 2c) send of this series independently of (1)

This can be done for read_cache / read_index as
they both exist already, but when read_index is
new to the code base, we'd need (2) to rely on (1b).

And that is why this patch sounded complicated.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH] coccicheck: introduce 'pending' semantic patches
  2018-11-08 20:52   ` [PATCH] " Stefan Beller
  2018-11-09  4:55     ` Martin Ågren
  2018-11-09  5:18     ` Junio C Hamano
@ 2018-11-10  0:10     ` Stefan Beller
  2018-11-10 20:37       ` Martin Ågren
                         ` (2 more replies)
  2 siblings, 3 replies; 43+ messages in thread
From: Stefan Beller @ 2018-11-10  0:10 UTC (permalink / raw)
  To: sbeller; +Cc: git, jonathantanmy, szeder.dev, martin.agren, gitster

From: SZEDER Gábor <szeder.dev@gmail.com>

Teach `make coccicheck` to avoid patches named "*.pending.cocci" and
handle them separately in a new `make coccicheck-pending` instead.
This means that we can separate "critical" patches from "FYI" patches.
The former target can continue causing Travis to fail its static
analysis job, while the latter can let us keep an eye on ongoing
(pending) transitions without them causing too much fallout.

Document the intended use-cases around these two targets.
As the process around the pending patches is not yet fully explored,
leave that out.

Based-on-work-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

I dialed back on the workflow, as we may want to explore it first
before writing it down.

Stefan

 Makefile                  |  7 +++++--
 contrib/coccinelle/README | 41 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index bbfbb4292d..6bc4654828 100644
--- a/Makefile
+++ b/Makefile
@@ -2740,9 +2740,12 @@ endif
 	then \
 		echo '    ' SPATCH result: $@; \
 	fi
-coccicheck: $(addsuffix .patch,$(wildcard contrib/coccinelle/*.cocci))
+coccicheck: $(addsuffix .patch,$(filter-out %.pending.cocci,$(wildcard contrib/coccinelle/*.cocci)))
 
-.PHONY: coccicheck
+# See contrib/coccinelle/README
+coccicheck-pending: $(addsuffix .patch,$(wildcard contrib/coccinelle/*.pending.cocci))
+
+.PHONY: coccicheck coccicheck-pending
 
 ### Installation rules
 
diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README
index 9c2f8879c2..f0e80bd7f0 100644
--- a/contrib/coccinelle/README
+++ b/contrib/coccinelle/README
@@ -1,2 +1,43 @@
 This directory provides examples of Coccinelle (http://coccinelle.lip6.fr/)
 semantic patches that might be useful to developers.
+
+There are two types of semantic patches:
+
+ * Using the semantic transformation to check for bad patterns in the code;
+   The target 'make coccicheck' is designed to check for these patterns and
+   it is expected that any resulting patch indicates a regression.
+   The patches resulting from 'make coccicheck' are small and infrequent,
+   so once they are found, they can be sent to the mailing list as per usual.
+
+   Example for introducing new patterns:
+   67947c34ae (convert "hashcmp() != 0" to "!hasheq()", 2018-08-28)
+   b84c783882 (fsck: s/++i > 1/i++/, 2018-10-24)
+
+   Example of fixes using this approach:
+   248f66ed8e (run-command: use strbuf_addstr() for adding a string to
+               a strbuf, 2018-03-25)
+   f919ffebed (Use MOVE_ARRAY, 2018-01-22)
+
+   These types of semantic patches are usually part of testing, c.f.
+   0860a7641b (travis-ci: fail if Coccinelle static analysis found something
+               to transform, 2018-07-23)
+
+ * Using semantic transformations in large scale refactorings throughout
+   the code base.
+
+   When applying the semantic patch into a real patch, sending it to the
+   mailing list in the usual way, such a patch would be expected to have a
+   lot of textual and semantic conflicts as such large scale refactorings
+   change function signatures that are used widely in the code base.
+   A textual conflict would arise if surrounding code near any call of such
+   function changes. A semantic conflict arises when other patch series in
+   flight introduce calls to such functions.
+
+   So to aid these large scale refactorings, semantic patches can be used.
+   However we do not want to store them in the same place as the checks for
+   bad patterns, as then automated builds would fail.
+   That is why semantic patches 'contrib/coccinelle/*.pending.cocci'
+   are ignored for checks, and can be applied using 'make coccicheck-pending'.
+
+   This allows to expose plans of pending large scale refactorings without
+   impacting the bad pattern checks.
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH] coccicheck: introduce 'pending' semantic patches
  2018-11-10  0:10     ` [PATCH] coccicheck: introduce 'pending' " Stefan Beller
@ 2018-11-10 20:37       ` Martin Ågren
  2018-11-12 21:19       ` Josh Steadmon
  2018-11-13 16:02       ` SZEDER Gábor
  2 siblings, 0 replies; 43+ messages in thread
From: Martin Ågren @ 2018-11-10 20:37 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git Mailing List, Jonathan Tan, SZEDER Gábor, Junio C Hamano

On Sat, 10 Nov 2018 at 01:10, Stefan Beller <sbeller@google.com> wrote:
> I dialed back on the workflow, as we may want to explore it first
> before writing it down.

Makes sense.

FWIW, this iteration looks good to me.

Martin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] coccicheck: introduce 'pending' semantic patches
  2018-11-10  0:10     ` [PATCH] coccicheck: introduce 'pending' " Stefan Beller
  2018-11-10 20:37       ` Martin Ågren
@ 2018-11-12 21:19       ` Josh Steadmon
  2018-11-13 16:02       ` SZEDER Gábor
  2 siblings, 0 replies; 43+ messages in thread
From: Josh Steadmon @ 2018-11-12 21:19 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jonathantanmy, szeder.dev, martin.agren, gitster

On 2018.11.09 16:10, Stefan Beller wrote:
> From: SZEDER Gábor <szeder.dev@gmail.com>
> 
> Teach `make coccicheck` to avoid patches named "*.pending.cocci" and
> handle them separately in a new `make coccicheck-pending` instead.
> This means that we can separate "critical" patches from "FYI" patches.
> The former target can continue causing Travis to fail its static
> analysis job, while the latter can let us keep an eye on ongoing
> (pending) transitions without them causing too much fallout.
> 
> Document the intended use-cases around these two targets.
> As the process around the pending patches is not yet fully explored,
> leave that out.
> 
> Based-on-work-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> 
> I dialed back on the workflow, as we may want to explore it first
> before writing it down.
> 
> Stefan

Looks good to me.

Reviewed-by: Josh Steadmon <steadmon@google.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] Makefile: add pending semantic patches
  2018-11-09 21:58       ` Stefan Beller
@ 2018-11-13 15:48         ` SZEDER Gábor
  2018-11-14  4:15           ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: SZEDER Gábor @ 2018-11-13 15:48 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, git, Jonathan Tan

On Fri, Nov 09, 2018 at 01:58:01PM -0800, Stefan Beller wrote:
> On Thu, Nov 8, 2018 at 9:18 PM Junio C Hamano <gitster@pobox.com> wrote:
> > Are they wrong changes (e.g. a carelessly written read_cache() to
> > read_index(&the_index) conversion may munge the implementation of
> > read_cache(...) { return read_index(&the_index, ...); } and make
> > inifinite recursion)?  Or are they "correct but not immediately
> > necessary" (e.g. because calling read_cache() does not hurt until
> > that function gets removed, so rewriting the callers to call
> > read_index() with &the_index may be correct but not immediately
> > necessary)?
> 
> the latter. I assume correctness (of the semantic patch) to be a given,

I'm afraid we can't assume that.  As far as correctness is concerned,
I think semantic patches are not different from any other code we
write, i.e. they are potentially buggy.  Perhaps even more so than the
"usual" Git code, because we have long experience writing C and shell
code (with all their quirks and error-proneness), but none of us is
really an expert in writing semantic patches.

Cases in point:

  - 6afedba8c9 (object_id.cocci: match only expressions of type
    'struct object_id', 2018-10-15)
  - 279ffad17d (coccinelle: avoid wrong transformation suggestions
    from commit.cocci, 2018-04-30)
  - cd9a4b6d93 (cocci: use format keyword instead of a literal string,
    2018-01-19); though this one is probably a bug in Coccinelle
    itself
  - c2bb0c1d1e (cocci: avoid self-references in object_id
    transformations, 2016-11-01)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] coccicheck: introduce 'pending' semantic patches
  2018-11-10  0:10     ` [PATCH] coccicheck: introduce 'pending' " Stefan Beller
  2018-11-10 20:37       ` Martin Ågren
  2018-11-12 21:19       ` Josh Steadmon
@ 2018-11-13 16:02       ` SZEDER Gábor
  2 siblings, 0 replies; 43+ messages in thread
From: SZEDER Gábor @ 2018-11-13 16:02 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jonathantanmy, martin.agren, gitster

On Fri, Nov 09, 2018 at 04:10:52PM -0800, Stefan Beller wrote:
> From: SZEDER Gábor <szeder.dev@gmail.com>
> 
> Teach `make coccicheck` to avoid patches named "*.pending.cocci" and
> handle them separately in a new `make coccicheck-pending` instead.
> This means that we can separate "critical" patches from "FYI" patches.
> The former target can continue causing Travis to fail its static
> analysis job, while the latter can let us keep an eye on ongoing
> (pending) transitions without them causing too much fallout.
> 
> Document the intended use-cases around these two targets.
> As the process around the pending patches is not yet fully explored,
> leave that out.
> 
> Based-on-work-by: SZEDER Gábor <szeder.dev@gmail.com>

Hm, do I need to sign off?
Well, I sign off now anyway, and then Junio can take it if necessary
or just ignore it if not.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>

> Signed-off-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> 
> I dialed back on the workflow, as we may want to explore it first
> before writing it down.

Yeah, I too think that's better to wait with the workflow details
until we gather some experience about how it works out in practice.

Thanks for following it through.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] Makefile: add pending semantic patches
  2018-11-13 15:48         ` SZEDER Gábor
@ 2018-11-14  4:15           ` Junio C Hamano
  0 siblings, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2018-11-14  4:15 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Stefan Beller, git, Jonathan Tan

SZEDER Gábor <szeder.dev@gmail.com> writes:

>> > inifinite recursion)?  Or are they "correct but not immediately
>> > necessary" (e.g. because calling read_cache() does not hurt until
>> > that function gets removed, so rewriting the callers to call
>> > read_index() with &the_index may be correct but not immediately
>> > necessary)?
>> 
>> the latter. I assume correctness (of the semantic patch) to be a given,
>
> I'm afraid we can't assume that.  As far as correctness is concerned,
> I think semantic patches are not different from any other code we
> write, i.e. they are potentially buggy.  Perhaps even more so than the
> "usual" Git code, because we have long experience writing C and shell
> code (with all their quirks and error-proneness), but none of us is
> really an expert in writing semantic patches.

All correct.

And applying semantic patches generated from buggy sources can
produce buggy code, just like merging a buggy topic branch does.

These days, I ran coccicheck at the tip of 'pu' (even though this
cost me quite a lot a few times every day) and feed its findings
back to authors of topics that introduce new ones, so that their
topics next time do not need the fix-up at the tip of 'pu' in the
next integration cycle.  That way, the changes mechanically
suggested by coccicheck can still be reviewed in small bite-sized
chunks and hopefully possible problems caused by buggy sources can
either be found in the review process, or discovered at the tip of
'pu'.



^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2018-11-14  4:15 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-30 22:07 [PATCHv2 00/24] Bring more repository handles into our code base] Stefan Beller
2018-10-30 22:07 ` [PATCH 01/24] Makefile: add pending semantic patches Stefan Beller
2018-10-31  6:42   ` Junio C Hamano
2018-11-08 20:52   ` [PATCH] " Stefan Beller
2018-11-09  4:55     ` Martin Ågren
2018-11-09 20:50       ` Stefan Beller
2018-11-09  5:18     ` Junio C Hamano
2018-11-09 21:58       ` Stefan Beller
2018-11-13 15:48         ` SZEDER Gábor
2018-11-14  4:15           ` Junio C Hamano
2018-11-10  0:10     ` [PATCH] coccicheck: introduce 'pending' " Stefan Beller
2018-11-10 20:37       ` Martin Ågren
2018-11-12 21:19       ` Josh Steadmon
2018-11-13 16:02       ` SZEDER Gábor
2018-10-30 22:07 ` [PATCH 02/24] sha1_file: allow read_object to read objects in arbitrary repositories Stefan Beller
2018-10-30 22:07 ` [PATCH 03/24] packfile: allow has_packed_and_bad to handle " Stefan Beller
2018-10-30 22:07 ` [PATCH 04/24] object-store: allow read_object_file_extended to read from " Stefan Beller
2018-10-30 22:07 ` [PATCH 05/24] object-store: prepare read_object_file to deal with " Stefan Beller
2018-10-30 22:07 ` [PATCH 06/24] object-store: prepare has_{sha1, object}_file[_with_flags] to handle " Stefan Beller
2018-10-30 22:08 ` [PATCH 07/24] object: parse_object to honor its repository argument Stefan Beller
2018-10-30 22:08 ` [PATCH 08/24] commit: allow parse_commit* to handle arbitrary repositories Stefan Beller
2018-10-30 22:08 ` [PATCH 09/24] commit-reach.c: allow paint_down_to_common " Stefan Beller
2018-10-30 22:08 ` [PATCH 10/24] commit-reach.c: allow merge_bases_many " Stefan Beller
2018-10-30 22:08 ` [PATCH 11/24] commit-reach.c: allow remove_redundant " Stefan Beller
2018-10-30 22:08 ` [PATCH 12/24] commit-reach.c: allow get_merge_bases_many_0 " Stefan Beller
2018-10-30 22:08 ` [PATCH 13/24] commit-reach: prepare get_merge_bases " Stefan Beller
2018-10-30 22:08 ` [PATCH 14/24] commit-reach: prepare in_merge_bases[_many] " Stefan Beller
2018-10-30 22:08 ` [PATCH 15/24] commit: prepare get_commit_buffer " Stefan Beller
2018-10-30 22:08 ` [PATCH 16/24] commit: prepare repo_unuse_commit_buffer " Stefan Beller
2018-10-30 22:08 ` [PATCH 17/24] commit: prepare logmsg_reencode " Stefan Beller
2018-10-30 22:08 ` [PATCH 18/24] pretty: prepare format_commit_message " Stefan Beller
2018-10-30 22:08 ` [PATCH 19/24] submodule: use submodule repos for object lookup Stefan Beller
2018-11-02 13:03   ` Derrick Stolee
2018-11-02 17:23     ` Stefan Beller
2018-11-02 17:27       ` Derrick Stolee
2018-10-30 22:08 ` [PATCH 20/24] submodule: don't add submodule as odb for push Stefan Beller
2018-10-30 22:08 ` [PATCH 21/24] commit-graph: convert remaining function to handle arbitrary repositories Stefan Beller
2018-10-30 22:08 ` [PATCH 22/24] commit: make free_commit_buffer and release_commit_memory repository agnostic Stefan Beller
2018-10-30 22:08 ` [PATCH 23/24] path.h: make REPO_GIT_PATH_FUNC " Stefan Beller
2018-10-30 22:08 ` [PATCH 24/24] t/helper/test-repository: celebrate independence from the_repository Stefan Beller
2018-10-31  6:41 ` [PATCHv2 00/24] Bring more repository handles into our code base] Junio C Hamano
2018-11-01 19:45   ` Stefan Beller
2018-11-02  2:00     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).