git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Heiko Voigt <hvoigt@hvoigt.net>
To: Stefan Zager <szager@google.com>
Cc: git@vger.kernel.org, gitster@pobox.com, jens.lehmann@web.de
Subject: Re: [PATCH] Enable parallelism in git submodule update.
Date: Sat, 28 Jul 2012 12:22:11 +0200	[thread overview]
Message-ID: <20120728102209.GA13370@book.hvoigt.net> (raw)
In-Reply-To: <20120727185925.793121C0FDC@stefro.sfo.corp.google.com>

Hi Stefan,

neat patch. See below for a few notes.

On Fri, Jul 27, 2012 at 11:37:34AM -0700, Stefan Zager wrote:
> diff --git a/git-submodule.sh b/git-submodule.sh
> index dba4d39..761420a 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -491,6 +492,20 @@ cmd_update()
>  		-r|--rebase)
>  			update="rebase"
>  			;;
> +		-j|--jobs)
> +			case "$2" in
> +			''|-*)
> +				jobs="0"
> +				;;
> +			*)
> +				jobs="$2"
> +				shift
> +				;;
> +			esac
> +			# Don't preserve this arg.
> +			shift
> +			continue
> +			;;
>  		--reference)
>  			case "$2" in '') usage ;; esac
>  			reference="--reference=$2"
> @@ -529,6 +544,12 @@ cmd_update()
>  		cmd_init "--" "$@" || return
>  	fi
>  
> +	if test "$jobs" != "1"
> +	then
> +		module_list "$@" | awk '{print $4}' | xargs -L 1 -P "$jobs" git submodule update $orig_args

I do not see orig_args set anywhere in submodule.sh. It seems the
existing usage of it in cmd_status() is a leftover from commit
98dbe63 when this variable got renamed to orig_flags.

I will follow up with a patch to that location.

Another problem here is the passing of arguments. Have a look at
a7eff1a8 to see how this was solved for other locations.

The next thing I noticed is that the parallelism is not recursive. You
drop the option and only execute the first depth in parallel. How about
using the amount of modules defined by arguments left in $@ as an
indicator whether you need to fork parallel execution or not. If there
is exactly one you do the update if there are more you do the parallel
thing. That way you can just keep passing the --jobs flag to the
subprocesses.

The next question to solve is UI: Since the output lines of the parallel
update jobs will be mixed we need some way to distinguish them. Imagine
one of the update fails somewhere how do we find out which it was?

Two possible solutions come to my mind:

 1. Prefix each line with a job number. This way you can distinguish
    which process outputted what and still have immediate feedback.

 2. Cache the output (to stderr and stdout) of each job and output it
    once one job is done. I imagine this needs some infrastructure which
    we need to implement. We already have some ideas how to collect such
    output in C here[1].

I would prefer solution 2 since the output of 1 will be hard to read but
I guess we could start with 1 and then move over to 2 later on.

Cheers Heiko

[1] http://article.gmane.org/gmane.comp.version-control.git/197747

  parent reply	other threads:[~2012-07-28 10:22 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-27 18:37 [PATCH] Enable parallelism in git submodule update Stefan Zager
2012-07-27 21:38 ` Junio C Hamano
     [not found]   ` <CAHOQ7J_jYAe7r1q6Cg9OJb8f+79UfS=JfRk9NrS4R4a+oLM8LA@mail.gmail.com>
2012-07-27 23:25     ` Junio C Hamano
2012-07-28 10:52       ` Heiko Voigt
2012-07-29 21:59         ` Junio C Hamano
2012-07-28 10:22 ` Heiko Voigt [this message]
2012-07-28 12:19   ` [PATCH] cleanup argument passing in submodule status command Heiko Voigt
2012-07-29  6:22     ` Junio C Hamano
2012-07-29 15:29       ` Jens Lehmann
2012-07-29 21:57         ` Junio C Hamano
2012-07-29 15:37 ` [PATCH] Enable parallelism in git submodule update Jens Lehmann
2012-11-03 19:07   ` Jens Lehmann
  -- strict thread matches above, loose matches on Subject: below --
2012-10-30 18:03 szager
2012-10-30 18:03 szager
2012-10-30 18:11 ` Stefan Zager
2012-11-02 21:49   ` Stefan Zager
2012-11-03 15:42   ` Jens Lehmann
2012-11-03 18:44     ` Phil Hord
2012-11-03 19:13       ` Jens Lehmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120728102209.GA13370@book.hvoigt.net \
    --to=hvoigt@hvoigt.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jens.lehmann@web.de \
    --cc=szager@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).