git@vger.kernel.org mailing list mirror (one of many)
From: Stefan Beller <sbeller@google.com>
To: Heiko Voigt <hvoigt@hvoigt.net>
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	Jens Lehmann <Jens.Lehmann@web.de>,
	Fredrik Gustafsson <iveqy@iveqy.com>,
	Leandro Lucarella <leandro.lucarella@sociomantic.com>
Subject: Re: [PATCH v2 1/3] serialize collection of changed submodules
Date: Fri, 7 Oct 2016 10:59:29 -0700
Message-ID: <CAGZ79kZiY56-84aThH1F02E_HzCTAK3KSYLbyP1D5GUAt892cw@mail.gmail.com>
In-Reply-To: <10cd5be93601bc52388100e80b6c6735a7cacfb4.1475851621.git.hvoigt@hvoigt.net>

On Fri, Oct 7, 2016 at 8:06 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> To check whether a submodule needs to be pushed we need to collect all
> changed submodules. Let's collect them first and then execute the
> possibly expensive test whether certain revisions are already pushed
> only once per submodule.
>
> There is further potential for optimization since we can assemble one
> command and only issue that instead of one call for each remote ref in
> the submodule.
>
> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> ---
>  submodule.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 58 insertions(+), 5 deletions(-)
>
> diff --git a/submodule.c b/submodule.c
> index 2de06a3351..59c9d15905 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -554,19 +554,34 @@ static int submodule_needs_pushing(const char *path, const unsigned char sha1[20
>         return 0;
>  }
>
> +static struct sha1_array *get_sha1s_from_list(struct string_list *submodules,
> +               const char *path)

So this takes the string list `submodules` and inserts the path into it
if it wasn't already in there. If the path is newly inserted, it adds a
sha1_array as util, so each inserted path has its own empty array.

So it both initializes the data structures and retrieves them. I was
initially confused by the name, as I assumed it would give you sha1s out
of a string list (e.g. transform strings to internal sha1 things).
Maybe it's just me having a hard time understanding that, but I feel like
the name could be improved.

    lookup_sha1_list_by_path,
    insert_path_and_return_sha1_list ?
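
To make the "lookup or create" behaviour described above concrete, here is
a minimal standalone sketch. It does not use git's real string_list or
sha1_array; `path_list`, `path_item`, `hash_array`, `path_list_insert` and
`lookup_hashes_by_path` are hypothetical stand-ins, and the list is
searched linearly instead of being kept sorted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for sha1_array: only the element count is kept. */
struct hash_array {
	size_t nr;
};

/* Hypothetical stand-ins for string_list_item and string_list. */
struct path_item {
	char *path;
	void *util;
};

struct path_list {
	struct path_item *items;
	size_t nr, alloc;
};

/* Duplicate a string with malloc, aborting on allocation failure. */
static char *dup_string(const char *s)
{
	char *copy = malloc(strlen(s) + 1);
	if (!copy)
		abort();
	strcpy(copy, s);
	return copy;
}

/*
 * Return the entry for `path`. If no entry exists yet, append one whose
 * util is NULL. Note: the returned pointer is only valid until the next
 * insertion, because the items array may be reallocated.
 */
static struct path_item *path_list_insert(struct path_list *list,
					  const char *path)
{
	size_t i;

	for (i = 0; i < list->nr; i++)
		if (!strcmp(list->items[i].path, path))
			return &list->items[i];

	if (list->nr == list->alloc) {
		list->alloc = list->alloc ? 2 * list->alloc : 4;
		list->items = realloc(list->items,
				      list->alloc * sizeof(*list->items));
		if (!list->items)
			abort();
	}
	list->items[list->nr].path = dup_string(path);
	list->items[list->nr].util = NULL;
	return &list->items[list->nr++];
}

/*
 * Look up the per-path array, creating an empty one on first use.
 * This is both the initialization and the retrieval step.
 */
static struct hash_array *lookup_hashes_by_path(struct path_list *list,
						const char *path)
{
	struct path_item *item = path_list_insert(list, path);

	if (!item->util) {
		item->util = calloc(1, sizeof(struct hash_array));
		if (!item->util)
			abort();
	}
	return item->util;
}
```

Calling `lookup_hashes_by_path()` twice with the same path yields the same
array, which is why a name with "lookup" or "insert" in it seems to
describe the function better than "get ... from list".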

> +{
> +       struct string_list_item *item;
> +
> +       item = string_list_insert(submodules, path);
> +       if (item->util)
> +               return (struct sha1_array *) item->util;
> +
> +       /* NEEDSWORK: should we have sha1_array_init()? */
> +       item->util = xcalloc(1, sizeof(struct sha1_array));
> +       return (struct sha1_array *) item->util;
> +}
> +
>  static void collect_submodules_from_diff(struct diff_queue_struct *q,
>                                          struct diff_options *options,
>                                          void *data)
>  {
>         int i;
> -       struct string_list *needs_pushing = data;
> +       struct string_list *submodules = data;
>
>         for (i = 0; i < q->nr; i++) {
>                 struct diff_filepair *p = q->queue[i];
> +               struct sha1_array *hashes;
>                 if (!S_ISGITLINK(p->two->mode))
>                         continue;
> -               if (submodule_needs_pushing(p->two->path, p->two->oid.hash))
> -                       string_list_insert(needs_pushing, p->two->path);
> +               hashes = get_sha1s_from_list(submodules, p->two->path);
> +               sha1_array_append(hashes, p->two->oid.hash);
>         }
>  }
>
> @@ -582,14 +597,41 @@ static void find_unpushed_submodule_commits(struct commit *commit,
>         diff_tree_combined_merge(commit, 1, &rev);
>  }
>
> +struct collect_submodule_from_sha1s_data {
> +       char *submodule_path;
> +       struct string_list *needs_pushing;
> +};
> +
> +static void collect_submodules_from_sha1s(const unsigned char sha1[20],
> +               void *data)
> +{
> +       struct collect_submodule_from_sha1s_data *me =
> +               (struct collect_submodule_from_sha1s_data *) data;
> +
> +       if (submodule_needs_pushing(me->submodule_path, sha1))
> +               string_list_insert(me->needs_pushing, me->submodule_path);
> +}
> +
> +static void free_submodules_sha1s(struct string_list *submodules)
> +{
> +       int i;
> +       for (i = 0; i < submodules->nr; i++) {
> +               struct string_list_item *item = &submodules->items[i];

You do not seem to make use of `i` explicitly, so
for_each_string_list_item might be more readable here?
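
As a rough illustration of the iteration style suggested above, here is a
standalone sketch. The macro `for_each_demo_item` is only modeled loosely
on for_each_string_list_item; it, `demo_item`, `demo_list` and
`count_items_with_util` are hypothetical names and not git's actual
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for string_list_item and string_list. */
struct demo_item {
	const char *string;
	void *util;
};

struct demo_list {
	struct demo_item *items;
	size_t nr;
};

/*
 * Pointer-based iteration: no index variable is needed in the caller.
 * The `item &&` check avoids pointer arithmetic on a NULL items array
 * when the list is empty.
 */
#define for_each_demo_item(item, list) \
	for (item = (list)->items; \
	     item && item < (list)->items + (list)->nr; \
	     ++item)

/* Count the entries that carry a non-NULL util pointer. */
static size_t count_items_with_util(const struct demo_list *list)
{
	struct demo_item *item;
	size_t count = 0;

	for_each_demo_item(item, list)
		if (item->util)
			count++;
	return count;
}
```

The loop body only ever touches `item`, which is the case in both loops
in this patch, so the index variable can go away entirely.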


> +               struct sha1_array *hashes = (struct sha1_array *) item->util;
> +               sha1_array_clear(hashes);
> +       }
> +       string_list_clear(submodules, 1);
> +}
> +
>  int find_unpushed_submodules(unsigned char new_sha1[20],
>                 const char *remotes_name, struct string_list *needs_pushing)
>  {
>         struct rev_info rev;
>         struct commit *commit;
>         const char *argv[] = {NULL, NULL, "--not", "NULL", NULL};
> -       int argc = ARRAY_SIZE(argv) - 1;
> +       int argc = ARRAY_SIZE(argv) - 1, i;
>         char *sha1_copy;
> +       struct string_list submodules = STRING_LIST_INIT_DUP;
>
>         struct strbuf remotes_arg = STRBUF_INIT;
>
> @@ -603,12 +645,23 @@ int find_unpushed_submodules(unsigned char new_sha1[20],
>                 die("revision walk setup failed");
>
>         while ((commit = get_revision(&rev)) != NULL)
> -               find_unpushed_submodule_commits(commit, needs_pushing);
> +               find_unpushed_submodule_commits(commit, &submodules);
>
>         reset_revision_walk();
>         free(sha1_copy);
>         strbuf_release(&remotes_arg);
>
> +       for (i = 0; i < submodules.nr; i++) {
> +               struct string_list_item *item = &submodules.items[i];

You do not seem to make use of `i` explicitly, so
for_each_string_list_item might be more readable here?


> +               struct collect_submodule_from_sha1s_data data;
> +               data.submodule_path = item->string;
> +               data.needs_pushing = needs_pushing;
> +               sha1_array_for_each_unique((struct sha1_array *) item->util,
> +                               collect_submodules_from_sha1s,
> +                               &data);
> +       }
> +       free_submodules_sha1s(&submodules);
> +
>         return needs_pushing->nr;
>  }
>
> --
> 2.10.1.637.g09b28c5
>

