[PATCH 0/1] ci: split linux-gcc into linux-gcc and linux-gcc-extra

git@vger.kernel.org mailing list mirror (one of many)

* [PATCH 0/1] ci: split linux-gcc into linux-gcc and linux-gcc-extra
@ 2019-06-13 12:53 Johannes Schindelin via GitGitGadget
  2019-06-13 12:53 ` [PATCH 1/1] ci: split the `linux-gcc` job into two jobs Johannes Schindelin via GitGitGadget
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-06-13 12:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

For people like me, who often look at our CI builds, it is hard to tell
whether test suite failures in the linux-gcc job stem from the first make
test run, or from the second one, after setting all kinds of GIT_TEST_* 
variables to non-default values.

Let's make it easier on people like me.

This also helps the problem where the CI builds often finish the other jobs
waaaay before linux-gcc finally finishes, too: linux-gcc and linux-gcc-extra 
can be run in parallel, on different agents.

Johannes Schindelin (1):
  ci: split the `linux-gcc` job into two jobs

 .travis.yml                |  4 ++++
 azure-pipelines.yml        | 39 ++++++++++++++++++++++++++++++++++++++
 ci/install-dependencies.sh |  4 ++--
 ci/lib.sh                  |  4 ++--
 ci/run-build-and-tests.sh  |  5 ++---
 5 files changed, 49 insertions(+), 7 deletions(-)

base-commit: b697d92f56511e804b8ba20ccbe7bdc85dc66810
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-266%2Fdscho%2Fsplit-gcc-ci-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-266/dscho/split-gcc-ci-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/266
-- 
gitgitgadget


* [PATCH 1/1] ci: split the `linux-gcc` job into two jobs
  2019-06-13 12:53 [PATCH 0/1] ci: split linux-gcc into linux-gcc and linux-gcc-extra Johannes Schindelin via GitGitGadget
@ 2019-06-13 12:53 ` Johannes Schindelin via GitGitGadget
  2019-06-13 15:33   ` SZEDER Gábor
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-06-13 12:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

This job was abused to not only run the test suite in a regular way but
also with all kinds of `GIT_TEST_*` options set to non-default values.

Let's split this into two, with the `linux-gcc` job running the default
test suite, and the newly-introduced `linux-gcc-extra` job running the
test suite in the "special" ways.

Technically, we would have to build Git only once, but it would not be
obvious how to teach Travis to transport build artifacts, so we keep it
simple and just build Git in both jobs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .travis.yml                |  4 ++++
 azure-pipelines.yml        | 39 ++++++++++++++++++++++++++++++++++++++
 ci/install-dependencies.sh |  4 ++--
 ci/lib.sh                  |  4 ++--
 ci/run-build-and-tests.sh  |  5 ++---
 5 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index ffb1bc46f2..a6444ee3ab 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -16,6 +16,10 @@ compiler:
 
 matrix:
   include:
+    - env: jobname=linux-gcc-extra
+      os: linux
+      compiler: gcc
+      addons:
     - env: jobname=GIT_TEST_GETTEXT_POISON
       os: linux
       compiler:
diff --git a/azure-pipelines.yml b/azure-pipelines.yml
index c329b7218b..4080aa3c45 100644
--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -206,6 +206,45 @@ jobs:
       PathtoPublish: t/failed-test-artifacts
       ArtifactName: failed-test-artifacts
 
+- job: linux_gcc_extra
+  displayName: linux-gcc-extra
+  condition: succeeded()
+  pool: Hosted Ubuntu 1604
+  steps:
+  - bash: |
+       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
+
+       sudo add-apt-repository ppa:ubuntu-toolchain-r/test &&
+       sudo apt-get update &&
+       sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev libexpat-dev tcl tk gettext git-email zlib1g-dev apache2 language-pack-is git-svn gcc-8 || exit 1
+
+       export jobname=linux-gcc-extra &&
+
+       ci/install-dependencies.sh || exit 1
+       ci/run-build-and-tests.sh || {
+           ci/print-test-failures.sh
+           exit 1
+       }
+
+       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1
+    displayName: 'ci/run-build-and-tests.sh'
+    env:
+      GITFILESHAREPWD: $(gitfileshare.pwd)
+  - task: PublishTestResults@2
+    displayName: 'Publish Test Results **/TEST-*.xml'
+    inputs:
+      mergeTestResults: true
+      testRunTitle: 'linux-gcc-extra'
+      platform: Linux
+      publishRunAttachments: false
+    condition: succeededOrFailed()
+  - task: PublishBuildArtifacts@1
+    displayName: 'Publish trash directories of failed tests'
+    condition: failed()
+    inputs:
+      PathtoPublish: t/failed-test-artifacts
+      ArtifactName: failed-test-artifacts
+
 - job: osx_clang
   displayName: osx-clang
   condition: succeeded()
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 7f6acdd803..c25bdcdf66 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -9,12 +9,12 @@ P4WHENCE=http://filehost.perforce.com/perforce/r$LINUX_P4_VERSION
 LFSWHENCE=https://github.com/github/git-lfs/releases/download/v$LINUX_GIT_LFS_VERSION
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-gcc-extra)
 	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
 	sudo apt-get -q update
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2
 	case "$jobname" in
-	linux-gcc)
+	linux-gcc|linux-gcc-extra)
 		sudo apt-get -q -y install gcc-8
 		;;
 	esac
diff --git a/ci/lib.sh b/ci/lib.sh
index 288a5b3884..b16a1454f1 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -154,8 +154,8 @@ export DEFAULT_TEST_TARGET=prove
 export GIT_TEST_CLONE_2GB=YesPlease
 
 case "$jobname" in
-linux-clang|linux-gcc)
-	if [ "$jobname" = linux-gcc ]
+linux-clang|linux-gcc|linux-gcc-extra)
+	if [ "$jobname" = linux-gcc -o "$jobname" = linux-gcc-extra ]
 	then
 		export CC=gcc-8
 	fi
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index cdd2913440..b252ff859d 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -11,8 +11,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
 esac
 
 make
-make test
-if test "$jobname" = "linux-gcc"
+if test "$jobname" = "linux-gcc-extra"
 then
 	export GIT_TEST_SPLIT_INDEX=yes
 	export GIT_TEST_FULL_IN_PACK_ARRAY=true
@@ -20,8 +19,8 @@ then
 	export GIT_TEST_OE_DELTA_SIZE=5
 	export GIT_TEST_COMMIT_GRAPH=1
 	export GIT_TEST_MULTI_PACK_INDEX=1
-	make test
 fi
+make test
 
 check_unignored_build_artifacts
 
-- 
gitgitgadget


* Re: [PATCH 1/1] ci: split the `linux-gcc` job into two jobs
  2019-06-13 12:53 ` [PATCH 1/1] ci: split the `linux-gcc` job into two jobs Johannes Schindelin via GitGitGadget
@ 2019-06-13 15:33   ` SZEDER Gábor
  2019-06-13 15:56     ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: SZEDER Gábor @ 2019-06-13 15:33 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Junio C Hamano, Johannes Schindelin

On Thu, Jun 13, 2019 at 05:53:51AM -0700, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> This job was abused to not only run the test suite in a regular way but
> also with all kinds of `GIT_TEST_*` options set to non-default values.
> 
> Let's split this into two

Why...?

> with the `linux-gcc` job running the default
> test suite, and the newly-introduced `linux-gcc-extra` job running the
> test suite in the "special" ways.
> 
> Technically, we would have to build Git only once, but it would not be
> obvious how to teach Travis to transport build artifacts, so we keep it
> simple and just build Git in both jobs.
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  .travis.yml                |  4 ++++
>  azure-pipelines.yml        | 39 ++++++++++++++++++++++++++++++++++++++
>  ci/install-dependencies.sh |  4 ++--
>  ci/lib.sh                  |  4 ++--
>  ci/run-build-and-tests.sh  |  5 ++---
>  5 files changed, 49 insertions(+), 7 deletions(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index ffb1bc46f2..a6444ee3ab 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -16,6 +16,10 @@ compiler:
>  
>  matrix:
>    include:
> +    - env: jobname=linux-gcc-extra
> +      os: linux
> +      compiler: gcc
> +      addons:
>      - env: jobname=GIT_TEST_GETTEXT_POISON
>        os: linux
>        compiler:
> diff --git a/azure-pipelines.yml b/azure-pipelines.yml
> index c329b7218b..4080aa3c45 100644
> --- a/azure-pipelines.yml
> +++ b/azure-pipelines.yml
> @@ -206,6 +206,45 @@ jobs:
>        PathtoPublish: t/failed-test-artifacts
>        ArtifactName: failed-test-artifacts
>  
> +- job: linux_gcc_extra
> +  displayName: linux-gcc-extra
> +  condition: succeeded()
> +  pool: Hosted Ubuntu 1604
> +  steps:
> +  - bash: |
> +       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
> +
> +       sudo add-apt-repository ppa:ubuntu-toolchain-r/test &&
> +       sudo apt-get update &&
> +       sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev libexpat-dev tcl tk gettext git-email zlib1g-dev apache2 language-pack-is git-svn gcc-8 || exit 1

This installs packages that will be installed by
'ci/install-dependencies.sh' anyway.

> +
> +       export jobname=linux-gcc-extra &&
> +
> +       ci/install-dependencies.sh || exit 1
> +       ci/run-build-and-tests.sh || {
> +           ci/print-test-failures.sh
> +           exit 1
> +       }
> +
> +       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1
> +    displayName: 'ci/run-build-and-tests.sh'
> +    env:
> +      GITFILESHAREPWD: $(gitfileshare.pwd)
> +  - task: PublishTestResults@2
> +    displayName: 'Publish Test Results **/TEST-*.xml'
> +    inputs:
> +      mergeTestResults: true
> +      testRunTitle: 'linux-gcc-extra'
> +      platform: Linux
> +      publishRunAttachments: false
> +    condition: succeededOrFailed()
> +  - task: PublishBuildArtifacts@1
> +    displayName: 'Publish trash directories of failed tests'
> +    condition: failed()
> +    inputs:
> +      PathtoPublish: t/failed-test-artifacts
> +      ArtifactName: failed-test-artifacts
> +
>  - job: osx_clang
>    displayName: osx-clang
>    condition: succeeded()
> diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
> index 7f6acdd803..c25bdcdf66 100755
> --- a/ci/install-dependencies.sh
> +++ b/ci/install-dependencies.sh
> @@ -9,12 +9,12 @@ P4WHENCE=http://filehost.perforce.com/perforce/r$LINUX_P4_VERSION
>  LFSWHENCE=https://github.com/github/git-lfs/releases/download/v$LINUX_GIT_LFS_VERSION
>  
>  case "$jobname" in
> -linux-clang|linux-gcc)
> +linux-clang|linux-gcc|linux-gcc-extra)
>  	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
>  	sudo apt-get -q update
>  	sudo apt-get -q -y install language-pack-is libsvn-perl apache2
>  	case "$jobname" in
> -	linux-gcc)
> +	linux-gcc|linux-gcc-extra)
>  		sudo apt-get -q -y install gcc-8
>  		;;
>  	esac
> diff --git a/ci/lib.sh b/ci/lib.sh
> index 288a5b3884..b16a1454f1 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -154,8 +154,8 @@ export DEFAULT_TEST_TARGET=prove
>  export GIT_TEST_CLONE_2GB=YesPlease
>  
>  case "$jobname" in
> -linux-clang|linux-gcc)
> -	if [ "$jobname" = linux-gcc ]
> +linux-clang|linux-gcc|linux-gcc-extra)
> +	if [ "$jobname" = linux-gcc -o "$jobname" = linux-gcc-extra ]
>  	then
>  		export CC=gcc-8
>  	fi
> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> index cdd2913440..b252ff859d 100755
> --- a/ci/run-build-and-tests.sh
> +++ b/ci/run-build-and-tests.sh
> @@ -11,8 +11,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
>  esac
>  
>  make
> -make test
> -if test "$jobname" = "linux-gcc"
> +if test "$jobname" = "linux-gcc-extra"
>  then
>  	export GIT_TEST_SPLIT_INDEX=yes
>  	export GIT_TEST_FULL_IN_PACK_ARRAY=true
> @@ -20,8 +19,8 @@ then
>  	export GIT_TEST_OE_DELTA_SIZE=5
>  	export GIT_TEST_COMMIT_GRAPH=1
>  	export GIT_TEST_MULTI_PACK_INDEX=1
> -	make test
>  fi
> +make test
>  
>  check_unignored_build_artifacts
>  
> -- 
> gitgitgadget


* Re: [PATCH 1/1] ci: split the `linux-gcc` job into two jobs
  2019-06-13 15:33   ` SZEDER Gábor
@ 2019-06-13 15:56     ` Junio C Hamano
  2019-06-13 16:51       ` Johannes Schindelin
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2019-06-13 15:56 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Johannes Schindelin via GitGitGadget, git, Johannes Schindelin

SZEDER Gábor <szeder.dev@gmail.com> writes:

> On Thu, Jun 13, 2019 at 05:53:51AM -0700, Johannes Schindelin via GitGitGadget wrote:
>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>> 
>> This job was abused to not only run the test suite in a regular way but
>> also with all kinds of `GIT_TEST_*` options set to non-default values.
>> 
>> Let's split this into two
>
> Why...?
>
>> with the `linux-gcc` job running the default
>> test suite, and the newly-introduced `linux-gcc-extra` job running the
>> test suite in the "special" ways.
>> 
>> Technically, we would have to build Git only once, but it would not be
>> obvious how to teach Travis to transport build artifacts, so we keep it
>> simple and just build Git in both jobs.

I had the same reaction.

If it said something like:

    There is no logical reason why these extras need to be tied to
    the linux-gcc platform.  Instead of tying the extra
    configuration tests only to linux-gcc, split it so that they are
    also run on all other combinations, and this is merely a first
    step of doing so

it might at least have been possible to judge if the motivation is
sane, but the proposed log message, while it is quite clear what is
being done and what its shortcomings are, is silent on why we would
want to do this in the first place, and that makes it even harder to
swallow the shortcomings.

Thanks.


* Re: [PATCH 1/1] ci: split the `linux-gcc` job into two jobs
  2019-06-13 15:56     ` Junio C Hamano
@ 2019-06-13 16:51       ` Johannes Schindelin
  2019-06-13 17:43         ` SZEDER Gábor
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin @ 2019-06-13 16:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Johannes Schindelin via GitGitGadget, git


Hi,

On Thu, 13 Jun 2019, Junio C Hamano wrote:

> SZEDER Gábor <szeder.dev@gmail.com> writes:
>
> > On Thu, Jun 13, 2019 at 05:53:51AM -0700, Johannes Schindelin via GitGitGadget wrote:
> >> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >>
> >> This job was abused to not only run the test suite in a regular way but
> >> also with all kinds of `GIT_TEST_*` options set to non-default values.
> >>
> >> Let's split this into two
> >
> > Why...?
> >
> >> with the `linux-gcc` job running the default
> >> test suite, and the newly-introduced `linux-gcc-extra` job running the
> >> test suite in the "special" ways.
> >>
> >> Technically, we would have to build Git only once, but it would not be
> >> obvious how to teach Travis to transport build artifacts, so we keep it
> >> simple and just build Git in both jobs.
>
> I had the same reaction.

So basically you are saying that the cover letter was the wrong location
for this:

	For people like me, who often look at our CI builds, it is hard to
	tell whether test suite failures in the linux-gcc job stem from
	the first make test run, or from the second one, after setting all
	kinds of GIT_TEST_* variables to non-default values.

	Let's make it easier on people like me.

	This also helps the problem where the CI builds often finish the other
	jobs waaaay before linux-gcc finally finishes, too: linux-gcc and
	linux-gcc-extra can be run in parallel, on different agents.

Of course, I would rephrase that a little, because it is kinda okay for a
cover letter, but not for a commit message that needs to be understandable
out of context.

If you agree that this would make the change easier to swallow, I will
make it so.

> If it said something like:
>
>     There is no logical reason why these extras need to be tied to
>     the linux-gcc platform.  Instead of tying the extra
>     configuration tests only to linux-gcc, split it so that they are
>     also run on all other combinations, and this is merely a first
>     step of doing so

No, that would double the time taken by the CI build. It already taxes the
patience so much that most contributors already don't bother with it, and
you can see how many times I find issues and have to report them (which
flies in the face of the idea of using automated builds to catch bugs
early): we are seeing the detrimental effects of a regression test suite
that takes too long.

> it might at least have been possible to judge if the motivation is
> sane, but the proposed log message, while it is quite clear what is
> being done and what its shortcomings are, is silent on why we would
> want to do this in the first place, and that makes it even harder to
> swallow the shortcomings.

Okay, two things:

	- it makes it easier to identify *which* settings to use when a
	  test case fails in `linux-gcc` or `linux-gcc-extra`.

	- the overall CI build runtime is no longer dictated by the
	  runtime of this single build, which easily was the slowest of
	  the bunch.
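
To illustrate the first point, here is a minimal sketch of how the job
name could be mapped to the settings needed to reproduce a failure
locally. The helper `knobs_for_job` is hypothetical and not part of the
patch; it lists only the knobs visible in the diff hunks of
ci/run-build-and-tests.sh, and the hunks skip at least one line that is
not guessed at here.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: print the environment
# assignments a given CI job applies before running `make test`.
knobs_for_job () {
	case "$1" in
	linux-gcc-extra)
		# Only the settings visible in the posted hunks; the gap
		# between the two hunks is deliberately left out.
		printf '%s\n' \
			GIT_TEST_SPLIT_INDEX=yes \
			GIT_TEST_FULL_IN_PACK_ARRAY=true \
			GIT_TEST_OE_DELTA_SIZE=5 \
			GIT_TEST_COMMIT_GRAPH=1 \
			GIT_TEST_MULTI_PACK_INDEX=1
		;;
	*)
		# Every other job runs the test suite with default settings.
		;;
	esac
}

# Possible usage after building Git, with t0000-basic.sh standing in
# for whichever test script failed:
#   (cd t && env $(knobs_for_job linux-gcc-extra) sh ./t0000-basic.sh -v)
```

With the split, a failure reported under `linux-gcc` would need no extra
environment at all, while one reported under `linux-gcc-extra` would need
the settings printed above (plus whatever the elided line sets).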

Does that clear up the picture?
Dscho


* Re: [PATCH 1/1] ci: split the `linux-gcc` job into two jobs
  2019-06-13 16:51       ` Johannes Schindelin
@ 2019-06-13 17:43         ` SZEDER Gábor
  2019-06-14 19:35           ` Johannes Schindelin
  0 siblings, 1 reply; 8+ messages in thread
From: SZEDER Gábor @ 2019-06-13 17:43 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git

On Thu, Jun 13, 2019 at 06:51:04PM +0200, Johannes Schindelin wrote:
> Hi,
> 
> On Thu, 13 Jun 2019, Junio C Hamano wrote:
> 
> > SZEDER Gábor <szeder.dev@gmail.com> writes:
> >
> > > On Thu, Jun 13, 2019 at 05:53:51AM -0700, Johannes Schindelin via GitGitGadget wrote:
> > >> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > >>
> > >> This job was abused to not only run the test suite in a regular way but
> > >> also with all kinds of `GIT_TEST_*` options set to non-default values.
> > >>
> > >> Let's split this into two
> > >
> > > Why...?
> > >
> > >> with the `linux-gcc` job running the default
> > >> test suite, and the newly-introduced `linux-gcc-extra` job running the
> > >> test suite in the "special" ways.
> > >>
> > >> Technically, we would have to build Git only once, but it would not be
> > >> obvious how to teach Travis to transport build artifacts, so we keep it
> > >> simple and just build Git in both jobs.
> >
> > I had the same reaction.
> 
> So basically you are saying that the cover letter was the wrong location
> for this:
> 
> 	For people like me, who often look at our CI builds, it is hard to
> 	tell whether test suite failures in the linux-gcc job stem from
> 	the first make test run, or from the second one, after setting all
> 	kinds of GIT_TEST_* variables to non-default values.

Is this really an issue in practice?  In my experience there are only
two (and a half) cases:

  - if both the 'linux-gcc' and 'linux-clang' build jobs fail, then
    it's some sort of a general breakage.

  - if only the 'linux-gcc' build job fails, the 'linux-clang'
    succeeds, then it's a breakage in the test run with the various
    'GIT_TEST_*' test knobs enabled (unless the failing 'linux-gcc'
    build job's runtime is below, say, 5 minutes, in which case it's a
    build error only triggered by GCC(-8), and, as I recall, is rather
    rare).
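
The rule of thumb above could be written down roughly as follows. This is
only a sketch: the function name `triage`, its arguments and the output
strings are made up for illustration, and the 5-minute threshold is the
"say, 5 minutes" estimate from the text, not a measured value.

```shell
#!/bin/sh
# Hypothetical sketch of the triage heuristic described above.
# Usage: triage <linux-gcc result> <linux-clang result> <linux-gcc runtime in minutes>
# where each result is either "pass" or "fail".
triage () {
	gcc=$1 clang=$2 gcc_minutes=$3
	if test "$gcc" = fail && test "$clang" = fail
	then
		echo "general breakage"
	elif test "$gcc" = fail && test "$clang" = pass
	then
		if test "$gcc_minutes" -lt 5
		then
			# The job died before the tests could have run for long.
			echo "build error specific to GCC"
		else
			echo "breakage with GIT_TEST_* knobs enabled"
		fi
	else
		echo "not covered by this heuristic"
	fi
}

triage fail fail 30
triage fail pass 3
triage fail pass 30
```

Note that the heuristic attributes every slow `linux-gcc`-only failure to
the knobs; a failure that only shows up in the default test run with GCC
would be classified the same way.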


> 	Let's make it easier on people like me.
> 
> 	This also helps the problem where the CI builds often finish the other
> 	jobs waaaay before linux-gcc finally finishes

This is not the case on Travis CI, where the runtime of the macOS
build jobs is by far the longest, so this change won't help anything
there... on the contrary, it would make things slower by spending time
on installing dependencies and building Git in one more build job.

>       too: linux-gcc and
> 	linux-gcc-extra can be run in parallel, on different agents.


* Re: [PATCH 1/1] ci: split the `linux-gcc` job into two jobs
  2019-06-13 17:43         ` SZEDER Gábor
@ 2019-06-14 19:35           ` Johannes Schindelin
  2019-06-25  8:56             ` Johannes Schindelin
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin @ 2019-06-14 19:35 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git


Hi Gábor,

On Thu, 13 Jun 2019, SZEDER Gábor wrote:

> On Thu, Jun 13, 2019 at 06:51:04PM +0200, Johannes Schindelin wrote:
>
> > On Thu, 13 Jun 2019, Junio C Hamano wrote:
> >
> > > SZEDER Gábor <szeder.dev@gmail.com> writes:
> > >
> > > > On Thu, Jun 13, 2019 at 05:53:51AM -0700, Johannes Schindelin via
> > > > GitGitGadget wrote:
> > > >> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > > >>
> > > >> This job was abused to not only run the test suite in a regular way but
> > > >> also with all kinds of `GIT_TEST_*` options set to non-default values.
> > > >>
> > > >> Let's split this into two
> > > >
> > > > Why...?
> > > >
> > > >> with the `linux-gcc` job running the default
> > > >> test suite, and the newly-introduced `linux-gcc-extra` job running the
> > > >> test suite in the "special" ways.
> > > >>
> > > >> Technically, we would have to build Git only once, but it would not be
> > > >> obvious how to teach Travis to transport build artifacts, so we keep it
> > > >> simple and just build Git in both jobs.
> > >
> > > I had the same reaction.
> >
> > So basically you are saying that the cover letter was the wrong location
> > for this:
> >
> > 	For people like me, who often look at our CI builds, it is hard to
> > 	tell whether test suite failures in the linux-gcc job stem from
> > 	the first make test run, or from the second one, after setting all
> > 	kinds of GIT_TEST_* variables to non-default values.
>
> Is this really an issue in practice?

I don't think that this is the right question. The right question would
be: is this issue possible? And the answer is: yes, quite. The clang and
the GCC toolchains are different enough that they have different bugs and
strengths. And the test suite with extra knobs vs without them *also* is
different enough to expose different bugs. So obviously, you would want
to discern between them [*1*].

But I can even answer the wrong question. The answer is still: yes, quite.

For example, I saw quite a few flaky tests "prefer" one over the other. I
do not recall the specifics (as I investigated at least half a dozen
flaky tests in the past months, and I am prone to confuse them with one
another), but I distinctly remember debugging via patching
azure-pipelines.yml and ci/ heavily, using the one job that was failing *a
lot* more often (and deleting all the other jobs from that .yml file,
which accelerated the turn-around time, which is *everything* in
debugging).

And even if I had not experienced this. As I said, clang and GCC are
different enough, that's why we have both jobs in the first place. It
sounds rather curious to me that you suggest that they essentially do the
same further below:

> In my experience there are only two (and a half) cases:
>
>   - if both the 'linux-gcc' and 'linux-clang' build jobs fail, then
>     it's some sort of a general breakage.

Sure, that's the easy case.

What I want to help with this patch is the *hard* cases.

>   - if only the 'linux-gcc' build job fails, the 'linux-clang'
>     succeeds, then it's a breakage in the test run with the various
>     'GIT_TEST_*' test knobs enabled (unless the failing 'linux-gcc'
>     build job's runtime is below, say, 5 minutes, in which case it's a
>     build error only triggered by GCC(-8), and, as I recall, is rather
>     rare).

So what you are suggesting is that the part of the `linux-gcc` job where
it tests without all those knobs is totally useless because `linux-clang`
already tested the same stuff?

That does not sound right.

Because by that token, you would want to simply remove that part from the
`linux-gcc` job (instead of splitting out the rest, as my patch does).

I refuse to believe that you are saying that.

That would sound almost like "We don't need the test suite because 99.9%
of all test cases pass, anyway".

> > 	Let's make it easier on people like me.
> >
> > 	This also helps the problem where the CI builds often finish the
> > 	other jobs waaaay before linux-gcc finally finishes
>
> This is not the case on Travis CI, where the runtime of the macOS
> build jobs is by far the longest, so this change won't help anything
> there...

Right, Travis' macOS agents are ridiculously slow.

> on the contrary, it would make things slower by spending time on
> installing dependencies and building Git in one more build job.

No, it wouldn't. Because instead of waiting for the macOS jobs and the
linux-gcc job, we would only wait for the macOS jobs.

The fallacy here is to assume that the 2-3 minutes spent on *two* agents
instead of *one* would add 2-3 minutes to the overall build time. They
are spent in parallel instead.
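
A small sketch of that arithmetic, assuming one agent per job so that all
jobs start at the same time. The function names and all durations are
made up for illustration (in minutes); they are not measurements from
Travis or Azure Pipelines.

```shell
#!/bin/sh
# wall_clock: with one agent per job, the whole build takes as long as
# its slowest job.
wall_clock () {
	slowest=0
	for minutes in "$@"
	do
		if test "$minutes" -gt "$slowest"
		then
			slowest=$minutes
		fi
	done
	echo "$slowest"
}

# agent_minutes: total machine time consumed, summed over all jobs.
agent_minutes () {
	total=0
	for minutes in "$@"
	do
		total=$((total + minutes))
	done
	echo "$total"
}

# Made-up durations: linux-gcc = 2 (setup and build) + 14 + 14 = 30;
# after the split, each half takes 2 + 14 = 16.
wall_clock 40 30        # a 40-minute macOS job, before the split: 40
wall_clock 40 16 16     # same macOS job, after the split: 40
agent_minutes 40 30     # 70
agent_minutes 40 16 16  # 72
wall_clock 15 30        # linux-gcc is the slowest, before the split: 30
wall_clock 15 16 16     # linux-gcc is the slowest, after the split: 16
```

Under these assumptions the duplicated setup costs two extra agent
minutes but never lengthens the build, and the build gets shorter
whenever `linux-gcc` was the slowest job. If fewer agents than jobs are
available, the jobs queue up and the picture changes.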

And once Travis gets faster macOS agents, the Travis build will be overall
faster (instead of now waiting for `linux-gcc` all the time).

Or am I missing anything obvious? I am quite puzzled by your objections,
given your experience with the CI builds. You, too, have *got* to have
experienced the benefits of parallelizing longer-running jobs.

To me, it looks like a no-brainer to split apart a long-running job, to
benefit from running jobs side by side.

Of course, there is also the presentation of the test results, but then,
Travis does not have that. You cannot publish the test results in a visual
manner, nor analyze breakages over time. So in Travis, the name of the job
in which a test failed matters less than in Azure Pipelines (although not
by much); Travis really leaves the developer struggling to get to the root
cause by digging through the entire log. In Azure Pipelines, I click on
the Tests tab (see e.g.
https://dev.azure.com/git/git/_build/results?buildId=677&view=ms.vss-test-web.build-test-results-tab)
and I immediately see not only which test script and which test case
failed (with the corresponding part of the verbose output just one click
on the test case title away), but also in which job it failed, which can
help me debug a lot faster. Also, the analytics section allows me to see
in which jobs tests failed consistently.

And with the split I proposed, it would be obvious from that page, at one
glance, whether I need to use the GIT_TEST_* knobs to reproduce a test
failure locally or not.

So: I am still very, very puzzled why you think it to be a good idea to
have a job that runs twice as long as all the other Linux jobs, that makes
regressions harder to investigate than necessary, and that makes the
overall analysis e.g. of flaky tests more difficult than with my patch.

Ciao,
Dscho

Footnote *1*: Now, a question that Junio raised was whether we should have
the test runs with the GIT_TEST_* knobs *also* for clang. Alas, here I
would like to throw in the argument that a "too complete" test suite is so
useless as to be a wasted effort because *nobody runs it if it takes too
long*. And given the impression I have that Junio does not bother looking
at the CI builds, I wonder why he wanted this in the first place, it's not
like it would benefit him.

> >       too: linux-gcc and
> > 	linux-gcc-extra can be run in parallel, on different agents.
>


* Re: [PATCH 1/1] ci: split the `linux-gcc` job into two jobs
  2019-06-14 19:35           ` Johannes Schindelin
@ 2019-06-25  8:56             ` Johannes Schindelin
  0 siblings, 0 replies; 8+ messages in thread
From: Johannes Schindelin @ 2019-06-25  8:56 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git


Hi Gábor,

On Fri, 14 Jun 2019, Johannes Schindelin wrote:

> On Thu, 13 Jun 2019, SZEDER Gábor wrote:
>
> > On Thu, Jun 13, 2019 at 06:51:04PM +0200, Johannes Schindelin wrote:
> >
> > > On Thu, 13 Jun 2019, Junio C Hamano wrote:
> > >
> > > > SZEDER Gábor <szeder.dev@gmail.com> writes:
> > > >
> > > > > On Thu, Jun 13, 2019 at 05:53:51AM -0700, Johannes Schindelin via
> > > > > GitGitGadget wrote:
> > > > >> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > > > >>
> > > > >> This job was abused to not only run the test suite in a regular
> > > > >> way but also with all kinds of `GIT_TEST_*` options set to
> > > > >> non-default values.
> > > > >>
> > > > >> Let's split this into two
> > > > >
> > > > > Why...?
> > > > >
> > > > >> with the `linux-gcc` job running the default test suite, and
> > > > >> the newly-introduced `linux-gcc-extra` job running the test
> > > > >> suite in the "special" ways.
> > > > >>
> > > > >> Technically, we would have to build Git only once, but it would
> > > > >> not be obvious how to teach Travis to transport build
> > > > >> artifacts, so we keep it simple and just build Git in both
> > > > >> jobs.
> > > >
> > > > I had the same reaction.
> > >
> > > So basically you are saying that the cover letter was the wrong
> > > location for this:
> > >
> > > 	For people like me, who often look at our CI builds, it is hard to
> > > 	tell whether test suite failures in the linux-gcc job stem from
> > > 	the first make test run, or from the second one, after setting all
> > > 	kinds of GIT_TEST_* variables to non-default values.
> >
> > Is this really an issue in practice?
>
> I don't think that this is the right question.

I still think that this is the wrong question.

To put more water down the drain, I would like to challenge you to look at
this build and tell me as fast as you can what half of the linux-gcc job
fails, and whether the other half of the job fails, too, or whether the
test cases succeed there, and if so, why:

https://dev.azure.com/gitgitgadget/git/_build/results?buildId=11410&view=ms.vss-test-web.build-test-results-tab

We really need to split linux-gcc. It's not right that it throws two
completely separate concerns into the same bucket.

Ciao,
Dscho

