git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Han-Wen Nienhuys <hanwenn@gmail.com>,
	Han-Wen Nienhuys <hanwen@google.com>
Subject: Re: [PATCH] Make some commit hashes in tests reproducible
Date: Tue, 7 Jul 2020 16:54:18 -0400
Message-ID: <20200707205418.GB1396940@coredump.intra.peff.net>
In-Reply-To: <xmqqfta33y0m.fsf@gitster.c.googlers.com>

On Tue, Jul 07, 2020 at 12:50:33PM -0700, Junio C Hamano wrote:

> "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > From: Han-Wen Nienhuys <hanwen@google.com>
> >
> > Adds test_tick to t5801-remote-helpers.sh and t3203-branch-output.sh
> 
> That can be read from the patch.  Also the subject tells us a half
> of what you want to achieve with this change (by the way, your
> subject is malformatted and lacks the <area>: prefix; perhaps
> "[PATCH] tests: make commit object names reproducible" or something),
> but the readers are left hanging without knowing what motivated the
> change.  Do any test pieces in these scripts change their behaviour
> based on what exact object names are assigned to them, making them
> flaky and hard to test, and if so which one and in what way?

I agree that more discussion would be nice.

But I kind of wonder if we should be aiming for more determinism in
general, just to make debugging and reproduction simpler.

I.e., rather than pointing to _these_ tests, I think we could make an
argument for setting up a known timestamp in the test environment.
test_tick would continue to tick forward as usual, but any tests that
don't use it would get a deterministic outcome by default.

Something like this:

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 618a7c8d5b..d8adf5a199 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -441,15 +441,18 @@ TEST_AUTHOR_LOCALNAME=author
 TEST_AUTHOR_DOMAIN=example.com
 GIT_AUTHOR_EMAIL=${TEST_AUTHOR_LOCALNAME}@${TEST_AUTHOR_DOMAIN}
 GIT_AUTHOR_NAME='A U Thor'
+GIT_AUTHOR_DATE='1112911993 -0700'
 TEST_COMMITTER_LOCALNAME=committer
 TEST_COMMITTER_DOMAIN=example.com
 GIT_COMMITTER_EMAIL=${TEST_COMMITTER_LOCALNAME}@${TEST_COMMITTER_DOMAIN}
 GIT_COMMITTER_NAME='C O Mitter'
+GIT_COMMITTER_DATE='1112911993 -0700'
 GIT_MERGE_VERBOSITY=5
 GIT_MERGE_AUTOEDIT=no
 export GIT_MERGE_VERBOSITY GIT_MERGE_AUTOEDIT
 export GIT_AUTHOR_EMAIL GIT_AUTHOR_NAME
 export GIT_COMMITTER_EMAIL GIT_COMMITTER_NAME
+export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
 export EDITOR
 
 # Tests using GIT_TRACE typically don't want <timestamp> <file>:<line> output

That's using the same start point as test_tick, though really it could
be anything. I've intentionally _not_ called test_tick at the beginning
of each script, because that would throw off all of the scripts that do
use it by one tick (whereas the first test_tick will overwrite these
values).
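
To illustrate why the first test_tick overwrites the defaults rather than
shifting them, here is a minimal sketch. The test_tick body below is a
simplified model written for this example (an assumption, not a copy of
the function in t/test-lib-functions.sh); the start value and the 60-second
step are chosen to match the timestamp in the patch above.

```shell
#!/bin/sh
# Defaults as proposed in the patch above.
GIT_AUTHOR_DATE='1112911993 -0700'
GIT_COMMITTER_DATE='1112911993 -0700'
export GIT_AUTHOR_DATE GIT_COMMITTER_DATE

# Simplified model of test_tick (assumption): the first call sets the
# well-known start point, each later call advances it by 60 seconds.
test_tick () {
	if test -z "${test_tick+set}"
	then
		test_tick=1112911993
	else
		test_tick=$((test_tick + 60))
	fi
	GIT_COMMITTER_DATE="$test_tick -0700"
	GIT_AUTHOR_DATE="$test_tick -0700"
	export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
}

before=$GIT_COMMITTER_DATE	# default from the environment
test_tick
first=$GIT_COMMITTER_DATE	# same value: the default is overwritten, not bumped
test_tick
second=$GIT_COMMITTER_DATE	# 60 seconds later

echo "before: $before"
echo "first:  $first"
echo "second: $second"
```

Under that model, scripts that already call test_tick see exactly the
timestamps they saw before; only scripts that never call it change.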

Playing devil's advocate against this line of reasoning:

  - using the current timestamp introduces more randomness into the test
    suite, which could uncover problems. I'm somewhat skeptical, as the
    usual outcome I see here is that we realize a test's expected output
    is simply racy, and we remove the raciness by using test_tick.

  - using the current timestamp could alert us to problems that occur
    only as the clock ticks forward (e.g., if we had a Y2021 bug, we'd
    notice when the clock rolled forward).

  - some tests may rely on having a "recent" timestamp in commits (e.g.,
    when looking at relative date handling). I think all of the
    relative-time tests already use a specific date, though, because
    otherwise we have too many problems with raciness.

Note that the patch above does seem to cause two tests to fail. One of
them I _suspect_ is a raciness problem (order of commits output changes,
which implies the original was expecting the time to increment between
two commits without running test_tick). And the other looks like some
weird interaction with the perl test harness. I'd be happy to dig into
both if this direction seems sane.
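
As a rough check of the determinism argued for above, the sketch below
creates the same empty commit in two fresh repositories with the
identity and date variables pinned, and compares the resulting object
names. It assumes git is installed; make_commit is a hypothetical helper
introduced only for this example, and the email addresses are the expanded
forms of the variables in the patch.

```shell
#!/bin/sh
# Pin everything that feeds into the commit object besides tree and message.
GIT_AUTHOR_NAME='A U Thor'
GIT_AUTHOR_EMAIL=author@example.com
GIT_AUTHOR_DATE='1112911993 -0700'
GIT_COMMITTER_NAME='C O Mitter'
GIT_COMMITTER_EMAIL=committer@example.com
GIT_COMMITTER_DATE='1112911993 -0700'
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE
export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL GIT_COMMITTER_DATE

# Hypothetical helper: make one empty commit in a new repository at $1
# and print its object name. Signing is disabled so that a user-level
# commit.gpgsign setting cannot alter the commit object.
make_commit () {
	git init -q "$1" &&
	git -C "$1" -c commit.gpgsign=false commit -q --allow-empty -m initial &&
	git -C "$1" rev-parse HEAD
}

tmp=$(mktemp -d)
hash_a=$(make_commit "$tmp/a")
hash_b=$(make_commit "$tmp/b")
rm -rf "$tmp"

if test "$hash_a" = "$hash_b"
then
	echo "reproducible: $hash_a"
else
	echo "object names differ: $hash_a vs $hash_b"
fi
```

Without the two *_DATE variables, the object names depend on the wall
clock and may differ from one run to the next.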

-Peff

Thread overview: 8+ messages
2020-07-07 19:23 [PATCH] Make some commit hashes in tests reproducible Han-Wen Nienhuys via GitGitGadget
2020-07-07 19:50 ` Junio C Hamano
2020-07-07 20:54   ` Jeff King [this message]
2020-07-07 21:35     ` Junio C Hamano
2020-07-07 21:52       ` Jeff King
2020-07-07 22:37         ` Junio C Hamano
2020-07-07 21:41     ` Jeff King
2020-07-08  5:06   ` Han-Wen Nienhuys
