From: Junio C Hamano <gitster@pobox.com>
To: Heiko Voigt <hvoigt@hvoigt.net>
Cc: Jeff King <peff@peff.net>, Stefan Beller <sbeller@google.com>,
	"git\@vger.kernel.org" <git@vger.kernel.org>,
	Jens Lehmann <Jens.Lehmann@web.de>,
	Fredrik Gustafsson <iveqy@iveqy.com>,
	Leandro Lucarella <leandro.lucarella@sociomantic.com>
Subject: Re: [PATCH 3/2] batch check whether submodule needs pushing into one call
Date: Fri, 16 Sep 2016 10:59:37 -0700
Message-ID: <xmqq8turlo8m.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <20160915121044.GA96648@book.hvoigt.net> (Heiko Voigt's message of "Thu, 15 Sep 2016 14:10:44 +0200")

Heiko Voigt <hvoigt@hvoigt.net> writes:

> +static void append_hash_to_argv(const unsigned char sha1[20], void *data)
>  {
> -	if (add_submodule_odb(path) || !lookup_commit_reference(sha1))
> +	struct argv_array *argv = (struct argv_array *) data;
> +	argv_array_push(argv, sha1_to_hex(sha1));
> +}

Hmph, why do I think I've seen this before in the previous patch?

    ... scans through this patch and finds that a similar one is
    removed ;-)

OK.  This makes sense.

> +static void check_has_hash(const unsigned char sha1[20], void *data)
> +{
> +	int *has_hash = (int *) data;
> +
> +	if (!lookup_commit_reference(sha1))
> +		*has_hash = 0;
> +}
> +
> +static int submodule_has_hashes(const char *path, struct sha1_array *hashes)
> +{
> +	int has_hash = 1;
> +
> +	if (add_submodule_odb(path))
> +		return 0;
> +
> +	sha1_array_for_each_unique(hashes, check_has_hash, &has_hash);
> +	return has_hash;
> +}
> +
> +static int submodule_needs_pushing(const char *path, struct sha1_array *hashes)
> +{
> +	if (!submodule_has_hashes(path, hashes))
>  		return 0;

I think you meant well, but this optimization is wrong.  The mere
presence of an object does not mean that the current tip can reach
that object.  Imagine you pushed commit A earlier to them at the
tip, then pushed commit A~ to them at the tip, which is the current
state of the remote of the submodule, and since then they may have
GC'ed.  They no longer have the commit A.

For that matter, you are doing this check by pretending that all
the submodule objects are in the object store of the superproject
you are working in.  You say "it exists there in the submodule
repository" when the only thing you know is that it exists in the
object store of the submodule repository, the superproject
repository, or one of the other submodule repositories.  So you
really cannot tell much from the mere presence of an object.  Not
just the remote of the submodule repository you are interested in,
but that submodule repository itself, may not have the object.
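
To make the distinction concrete, here is a toy model of the
A / A~ scenario above.  It is only a sketch: the struct and the
helper names (toy_commit, object_exists, reachable_from) are made
up for illustration and are not git's API.  The "store" stands in
for the merged object store that the lookup sees, and "remote_tip"
for what the remote of the submodule currently points at.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy commit: a name and at most one parent. */
struct toy_commit {
	const char *name;
	const struct toy_commit *parent;
};

/* Presence: is an object with this name anywhere in the store? */
static int object_exists(const struct toy_commit *const *store,
			 size_t nr, const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(store[i]->name, name))
			return 1;
	return 0;
}

/* Reachability: walk the parent chain starting at a ref tip. */
static int reachable_from(const struct toy_commit *tip, const char *name)
{
	for (; tip; tip = tip->parent)
		if (!strcmp(tip->name, name))
			return 1;
	return 0;
}

/* The scenario: the store still holds A, but the remote tip is A~. */
static const struct toy_commit toy_a_parent = { "A~", NULL };
static const struct toy_commit toy_a = { "A", &toy_a_parent };
static const struct toy_commit *const store[] = { &toy_a_parent, &toy_a };
static const struct toy_commit *const remote_tip = &toy_a_parent;
```

With this data, object_exists(store, 2, "A") is true while
reachable_from(remote_tip, "A") is false, so a presence check alone
would wrongly conclude that nothing needs pushing.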

Drop the previous two helper functions and this short-cut.

>  	if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
>  		struct child_process cp = CHILD_PROCESS_INIT;
> -		const char *argv[] = {"rev-list", NULL, "--not", "--remotes", "-n", "1" , NULL};
> +
> +		argv_array_push(&cp.args, "rev-list");
> +		sha1_array_for_each_unique(hashes, append_hash_to_argv, &cp.args);
> +		argv_array_pushl(&cp.args, "--not", "--remotes", "-n", "1" , NULL);
> +
>  		struct strbuf buf = STRBUF_INIT;
>  		int needs_pushing = 0;
>  
> -		argv[1] = sha1_to_hex(sha1);
> -		cp.argv = argv;
>  		prepare_submodule_repo_env(&cp.env_array);
>  		cp.git_cmd = 1;
>  		cp.no_stdin = 1;
>  		cp.out = -1;
>  		cp.dir = path;
>  		if (start_command(&cp))
> -			die("Could not run 'git rev-list %s --not --remotes -n 1' command in submodule %s",
> -				sha1_to_hex(sha1), path);
> +			die("Could not run 'git rev-list <hashes> --not --remotes -n 1' command in submodule %s",
> +					path);
>  		if (strbuf_read(&buf, cp.out, 41))
>  			needs_pushing = 1;
>  		finish_command(&cp);
> @@ -601,21 +628,6 @@ static void find_unpushed_submodule_commits(struct commit *commit,
>  	diff_tree_combined_merge(commit, 1, &rev);
>  }

Good.  This is the optimization I alluded to in the review of the
first one in the series.
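
For readers following along without the tree at hand, the shape of
the batched command line can be sketched in plain C as below.  This
is a stand-alone illustration, not the patch itself: the helper
name build_rev_list_argv is hypothetical, and it uses malloc()
where the patch uses argv_array.  Only the "rev-list", "--not",
"--remotes", "-n", "1" arguments are taken from the quoted code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Build a NULL-terminated argument vector of the form
 *   rev-list <hash>... --not --remotes -n 1
 * so that one child process checks all hashes at once.
 * The strings are borrowed, not copied; the caller frees
 * only the returned array.  Returns NULL if allocation fails.
 */
static const char **build_rev_list_argv(const char *const *hashes, size_t nr)
{
	static const char *const tail[] = { "--not", "--remotes", "-n", "1" };
	const size_t nr_tail = sizeof(tail) / sizeof(tail[0]);
	const char **argv;
	size_t n = 0, i;

	/* "rev-list" + hashes + tail + terminating NULL */
	argv = malloc((1 + nr + nr_tail + 1) * sizeof(*argv));
	if (!argv)
		return NULL;

	argv[n++] = "rev-list";
	for (i = 0; i < nr; i++)
		argv[n++] = hashes[i];
	for (i = 0; i < nr_tail; i++)
		argv[n++] = tail[i];
	argv[n] = NULL;
	return argv;
}
```

Any output at all from such a command means at least one of the
given commits is not reachable from a remote-tracking ref, which is
all the caller needs to know.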
