Date: Wed, 14 Sep 2016 19:31:24 +0200
From: Heiko Voigt
To: Jeff King
Cc: Stefan Beller, Junio C Hamano, "git@vger.kernel.org", Jens Lehmann, Fredrik Gustafsson, Leandro Lucarella
Subject: [PATCH 1/2] serialize collection of changed submodules
Message-ID: <20160914173124.GA7613@sandbox>
References: <20160824173017.24782-1-sbeller@google.com> <20160824183112.ceekegpzavnbybxp@sigill.intra.peff.net> <20160824230115.jhmcr4r7wobj5ejb@sigill.intra.peff.net>
In-Reply-To: <20160824230115.jhmcr4r7wobj5ejb@sigill.intra.peff.net>
X-Mailing-List: git@vger.kernel.org

To check whether a submodule needs to be pushed we need to collect all changed submodules.
Let's collect them first and then run the possibly expensive check of
whether certain revisions are already pushed only once per submodule.

There is further potential for optimization, since we can assemble one
command and issue only that instead of one call for each remote ref in
the submodule.

Signed-off-by: Heiko Voigt
---
Sorry about the late reply. I was not able to process emails until now.
Here are two patches that should help to improve the situation and batch
up some processing.

This one is for repositories with submodules, so that they do not
iterate over the same submodule twice with the same hash.

The second one will be the one people without submodules are interested
in.

Cheers Heiko

 submodule.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 62 insertions(+), 5 deletions(-)

diff --git a/submodule.c b/submodule.c
index 0ef2ff4..b04c066 100644
--- a/submodule.c
+++ b/submodule.c
@@ -554,19 +554,38 @@ static int submodule_needs_pushing(const char *path, const unsigned char sha1[20
 	return 0;
 }
 
+static struct sha1_array *get_sha1s_from_list(struct string_list *submodules,
+		const char *path)
+{
+	struct string_list_item *item;
+	struct sha1_array *hashes;
+
+	item = string_list_insert(submodules, path);
+	if (item->util)
+		return (struct sha1_array *) item->util;
+
+	hashes = (struct sha1_array *) xmalloc(sizeof(struct sha1_array));
+	/* NEEDSWORK: should we add an initializer function for
+	 * sha1_array ? */
+	memset(hashes, 0, sizeof(struct sha1_array));
+	item->util = hashes;
+	return hashes;
+}
+
 static void collect_submodules_from_diff(struct diff_queue_struct *q,
 					 struct diff_options *options,
 					 void *data)
 {
 	int i;
-	struct string_list *needs_pushing = data;
+	struct string_list *submodules = data;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
+		struct sha1_array *hashes;
 		if (!S_ISGITLINK(p->two->mode))
 			continue;
-		if (submodule_needs_pushing(p->two->path, p->two->oid.hash))
-			string_list_insert(needs_pushing, p->two->path);
+		hashes = get_sha1s_from_list(submodules, p->two->path);
+		sha1_array_append(hashes, p->two->oid.hash);
 	}
 }
 
@@ -582,14 +601,41 @@ static void find_unpushed_submodule_commits(struct commit *commit,
 	diff_tree_combined_merge(commit, 1, &rev);
 }
 
+struct collect_submodule_from_sha1s_data {
+	char *submodule_path;
+	struct string_list *needs_pushing;
+};
+
+static void collect_submodules_from_sha1s(const unsigned char sha1[20],
+		void *data)
+{
+	struct collect_submodule_from_sha1s_data *me =
+		(struct collect_submodule_from_sha1s_data *) data;
+
+	if (submodule_needs_pushing(me->submodule_path, sha1))
+		string_list_insert(me->needs_pushing, me->submodule_path);
+}
+
+static void free_submodules_sha1s(struct string_list *submodules)
+{
+	int i;
+	for (i = 0; i < submodules->nr; i++) {
+		struct string_list_item *item = &submodules->items[i];
+		struct sha1_array *hashes = (struct sha1_array *) item->util;
+		sha1_array_clear(hashes);
+	}
+	string_list_clear(submodules, 1);
+}
+
 int find_unpushed_submodules(unsigned char new_sha1[20],
 		const char *remotes_name, struct string_list *needs_pushing)
 {
 	struct rev_info rev;
 	struct commit *commit;
 	const char *argv[] = {NULL, NULL, "--not", "NULL", NULL};
-	int argc = ARRAY_SIZE(argv) - 1;
+	int argc = ARRAY_SIZE(argv) - 1, i;
 	char *sha1_copy;
+	struct string_list submodules = STRING_LIST_INIT_DUP;
 
 	struct strbuf remotes_arg = STRBUF_INIT;
 
@@ -603,12 +649,23 @@ int find_unpushed_submodules(unsigned char new_sha1[20],
 		die("revision walk setup failed");
 
 	while ((commit = get_revision(&rev)) != NULL)
-		find_unpushed_submodule_commits(commit, needs_pushing);
+		find_unpushed_submodule_commits(commit, &submodules);
 
 	reset_revision_walk();
 	free(sha1_copy);
 	strbuf_release(&remotes_arg);
 
+	for (i = 0; i < submodules.nr; i++) {
+		struct string_list_item *item = &submodules.items[i];
+		struct collect_submodule_from_sha1s_data data;
+		data.submodule_path = item->string;
+		data.needs_pushing = needs_pushing;
+		sha1_array_for_each_unique((struct sha1_array *) item->util,
+				collect_submodules_from_sha1s,
+				&data);
+	}
+	free_submodules_sha1s(&submodules);
+
 	return needs_pushing->nr;
 }
 
-- 
2.0.2.832.g083c931
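
For readers who want to see the batching idea from the commit message in isolation, here is a minimal standalone sketch. It is not the code from this patch and does not use git's internal API: hash_list, hash_list_append(), hash_list_check_unique(), hash_list_clear() and count_call() are hypothetical names that only stand in for the sha1_array helpers and for submodule_needs_pushing(). The assumption is that hashes are first collected per submodule and that the expensive check then runs once per distinct hash.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HASH_LEN 20

/* Hypothetical stand-in for git's sha1_array: a growable list of hashes. */
struct hash_list {
	unsigned char (*items)[HASH_LEN];
	size_t nr, alloc;
};

/* Phase 1: only record the hash, do no expensive work yet. */
static void hash_list_append(struct hash_list *l, const unsigned char *hash)
{
	if (l->nr == l->alloc) {
		l->alloc = l->alloc ? 2 * l->alloc : 8;
		l->items = realloc(l->items, l->alloc * sizeof(*l->items));
		assert(l->items); /* sketch only: treat allocation failure as fatal */
	}
	memcpy(l->items[l->nr++], hash, HASH_LEN);
}

static int hash_cmp(const void *a, const void *b)
{
	return memcmp(a, b, HASH_LEN);
}

typedef int (*hash_check_fn)(const unsigned char *hash, void *data);

/*
 * Phase 2: sort the collected hashes and call fn once per distinct
 * hash. Returns how many of those calls returned non-zero.
 */
static size_t hash_list_check_unique(struct hash_list *l,
				     hash_check_fn fn, void *data)
{
	size_t i, hits = 0;

	if (l->nr)
		qsort(l->items, l->nr, sizeof(*l->items), hash_cmp);
	for (i = 0; i < l->nr; i++) {
		/* After sorting, duplicates are adjacent; skip them. */
		if (i && !memcmp(l->items[i], l->items[i - 1], HASH_LEN))
			continue;
		if (fn(l->items[i], data))
			hits++;
	}
	return hits;
}

static void hash_list_clear(struct hash_list *l)
{
	free(l->items);
	memset(l, 0, sizeof(*l));
}

/* Hypothetical "expensive" check: here it only counts its invocations. */
static int count_call(const unsigned char *hash, void *data)
{
	(void)hash;
	++*(int *)data;
	return 1;
}
```

Appending the same hash several times and then calling hash_list_check_unique() with count_call() should invoke the callback once per distinct hash, which is the saving the patch aims for when many commits point at the same submodule revision.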