git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Elijah Newren <newren@gmail.com>
Cc: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 2/2] merge-recursive: fix merging a subdirectory into the root directory
Date: Mon, 14 Oct 2019 12:41:46 +0200 (CEST)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.1910141211130.46@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <CABPp-BFNCLJnt4NgFKVxURBGD1Z00gastc5q4ZPjcHmwS=kuFw@mail.gmail.com>

Hi Elijah,

On Sat, 12 Oct 2019, Elijah Newren wrote:

> On Sat, Oct 12, 2019 at 1:37 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> [...]
> > For the record: I am still a huge anti-fan of splitting `setup` test
> > cases from the test cases that do actual things, _unless_ it is
> > _one_, and _only one_, big, honking `setup` test case that is the
> > very first one in the test script.
> >
> > Splitting things into two inevitably leads to unnecessary churn when
> > investigating test failures.
> >
> > And that's really what test cases need to be optimized for: when
> > they report breakages. They need to help as much as they can to
> > investigate why things break. Nobody cares when test cases succeed.
> > The ~150k test cases that pass on every CI build: nobody is
> > interested. When a test case reports failure, that's when people pay
> > attention. At least some, including me.
> >
> > The most common case for me (and every other lazy person who relies
> > on CI/PR builds) is when a build breaks, and then I usually get to
> > the trace of the actually failing test case very quickly. The
> > previous test case's trace, not so easy. Clicks involved. Now I lose
> > context. Not good.
> >
> > A less common case for me is when I run test scripts locally, with
> > `-i -v -x`. Still, I need to scroll back to get context. And then,
> > really, I already lost context.
>
> I guess we have some strongly differing opinions here.

And probably experiences, too.

I literally debug something like a dozen per week test failures that are
reported via Azure Pipelines. And yes, I ran into some snags even with
your two-parter test cases in the past, and they did not help me. They
increased my burden of figuring out what is going wrong.

> The one thing I do agree with you on is test cases need to be
> optimized for when they report breakages, but that is precisely what
> led me to splitting setup and testing.

To me, it is so not helpful _not_ to see the output of a `setup` that
succeeded, and only the output of the actual test that actually failed.

It removes context.

I need to understand the scenario where the breakage happens, and the
only way I can understand is when I understand the context.

So the context needs to be as close as possible.

> Way too many tests in the testsuite intersperse several setup and test
> cases together making it really hard to disentangle, understand what
> is going on, or even reverse engineer what is relevant.  The absolute
> worst tests are the ones which just keep making additional changes to
> some existing repo to provide extra setup, causing all sorts of
> problems for skipping and resuming and understanding (to say nothing
> of test prerequisites that aren't always met).

I agree with this sentiment, and have to point out that this is yet
another fallout of the way our test suite is implemented. If you look
e.g. at JUnit, there are no "setup test cases". There are specific setup
steps that you can define, there is even a teardown step you can define,
and those individual test cases? They can run in parallel, or
randomized, and they run in their own sandbox, to make sure that they
don't rely on side effects of unrelated test cases.

We don't do that. We don't enforce the absence of side effects, and
therefore we have a megaton of them.

But note how this actually speaks _against_ separating the setup from
the test? Because then you _explicitly_ want those test cases to rely on
one another. Which flies in the _face_ of trying to disentangling them.

> But the ones with one big honking `setup` test case that is the very
> first one in the test script can often be pretty bad too when you've
> found a bug in testcase 82 and want to have a simple way to reproduce.

Indeed. One of the first things I try when `--run=82` fails is
`--run=1,82`. That's the best we can do in the absence of a clean design
like JUnit and its `@Before` method.

> It's awesome when someone can just run ./testcase.sh --ver --imm -x
> --run=86,87 and reproduce the problem.

But of course, often you would need `--run=1,86,87`. And it is totally
not obvious to _anybody_ who wants to contribute a new feature and whose
CI/PR build fails in one of your test cases.

> It feels sadly rare to be able to do that in much of git.git's
> testsuite.  Not accreting ever more changes into a repo to setup
> subsequent problems, using entirely separate repos for different cases
> where testing makes any changes, making it clear what is setup, making
> it clear what's the command being tested, and making it clear what all
> the commands are that are testing that the original test command
> produced the expected behavior all go a long way to making things way
> easier to investigate.

Sorry, I disagree. By squirreling away your setup phase into what looks
like an independent test case, the code is not more obvious, but less
so.

> Now, I will re-use a setup case for multiple tests and even did so in
> t6043, but only when the setup really is identical (e.g. not only are
> the tests expecting an identical setup, but the commands being tested
> are read-only or any changes they make are unconditionally reverted as
> the last step of the testcase).  Naturally, when I re-use a setup
> block multiple times, I really don't want to copy the setup into each
> of those tests either.
>
> It would be really nice, though, if there were some way for me to
> specify that a given "testcase" was just setup (so that they don't
> even get a number in the prove output and don't count as a "test"
> except maybe when they fail), and if there were some way to specify
> which testcases depend on which setup blocks (so that if I want to run
> just a given test, the setup test gets automatically included).
>
>
> Your example of CI/PR builds makes sense to me, and I really would
> like to make that nicer and improve other workflows too, especially if
> it can be done without breaking my ability to investigate test
> failures.  If someone can think of a clever way to do that (e.g. some
> way of implementing my "It would be really nice" in the last
> paragraph, and in a way that could also benefit CI/PR builds), I'd be
> happy to go through and help switch things over.

My suggestion: do not rip apart commands that belong together. The setup
phase is often more important to understand than the actually failing
command. _Especially_ when the setup test case reports success, but
actually failed to set things up as intended (which I encountered more
times than I care to count).

> > > +     (
> > > +             cd 12d &&
> > > +
> > > +             git checkout A^0 &&
> > > +
> > > +             git -c merge.directoryRenames=true merge -s recursive B^0 &&
> > > +
> > > +             git ls-files -s >out &&
> > > +             test_line_count = 2 out &&
> >
> > Isn't this a bit overzealous?
> >
> > > +
> > > +             git rev-parse >actual \
> > > +                     HEAD:subdir/foo.c.t   HEAD:bar.c.t &&
> > > +             git rev-parse >expect \
> > > +                     O:a/b/subdir/foo.c.t  B:a/b/bar.c.t &&
> > > +             test_cmp expect actual &&
> >
> > Likewise?
> >
> > > +
> > > +             git hash-object bar.c.t >actual &&
> > > +             git rev-parse B:a/b/bar.c.t >expect &&
> > > +             test_cmp expect actual &&
> >
> > Likewise?
> >
> > > +
> > > +             test_must_fail git rev-parse HEAD:a/b/subdir/foo.c.t &&
> > > +             test_must_fail git rev-parse HEAD:a/b/bar.c.t &&
> > > +             test_path_is_missing a/
> >
> > Makes sense, but the part that I am missing is
> >
> >                 test_path_is_file bar.c.t
> >
> > I.e. the _most_ important outcome of this test is: the rename was
> > detected, and the added file was correctly placed into the target
> > directory of the rename.
>
> That's a useful thing to add to the test, I'll add it.  (It's kind of
> included in the 'git hash-object bar.c.t' half a dozen or so lines
> above, but this line helps document the expectation a bit better.)
>
> I'd be very reticent to include only this test_path_is_file check, as
> it makes no guarantee that it has the right contents or that we didn't
> also keep around another copy in a/b/bar.c.t, etc.r

I agree that it has to strike a balance. There are multiple aspects you
need to consider:

- It needs to be easy to understand what the test case tries to ensure.

- While it is important to verify that Git works correctly, you do not
  want to make the test suite so slow as to be useless. I vividly
  remember how Duy mentioned that he does not have the time to run the
  test suite before contributing. That's how bad things are _already_.

As a rule of thumb, I would like to submit that a test case should fail
when the claimed goal (i.e. the test case's title) is, and a breakage in
that should both be sufficient and necessary for it to fail.

In this instance, the title is:

	12d-setup: Rename (merge) of subdirectory into the root

Now, this does not even tell me clearly what it tries to ensure. But
from the code, I guess that it wants to ensure that the directory rename
detection can detect when a directory's contents were moved into the
top-level directory.

Now, let's apply the rule of the "sufficient". I would expect the
outcome to be that the file that was added to that directory in one
branch is moved together with the directory's contents that the other
branch moved into the top-level directory.

I was missing that test, hence I pointed it out.

But testing the absence of the file `foo.c.t` in its original location
which was moved in one branch, and not touched in the other, i.e. with a
_trivial_ three-way merge resolution, that seems to be extra. It would
not break if the directory rename detection is broken, so it is an extra
test that does not do what the test case's title claims to test.

Now for the "necessary" part. It is totally unclear to me how all those
other things you test for would be failing _only_ if the directory
rename detection failed. I would actually expect that there are _many_
ways that this test case could fail in ways that have not the faintest
thing to do with directory rename detection.

As somebody who, as I stated above, investigates dozens of test case
failures per week (although, granted, most of them are relatively
trivial to resolve, it is invariably test cases like the one I am
commenting on that require the most time to get resolved), I have to
admit that there are few more frustrating things in debugging than
having to chase issues that have _nothing_ to do with the stated goal of
the test case.

> I understand my checks above look overzealous, but merge-recursive has
> been a bit of a pain to understand and deal with at times, and I've
> been bitten one too many times with tests (of merge-recursive and
> elsewhere in git.git) that passed but still did the wrong thing
> because they only tested one way in which the test problem failed in
> the past rather than specifying what correct behavior is.  And my
> "overzealousness" in the merge-recursive tests really have caught and
> prevented bugs would have been missed otherwise.

If you must have those tests, then please structure them in a way where
they are much more actionable. Ideally, it should be clear from a glance
at the trace of a failed test case (and _only_ from that test case) for
less than five seconds _what_ is being tested, and it should be obvious
what is going wrong.

I cannot say that about the test case in this mail.

> Oh, and I think there's another place in the code that needs to be
> tweaked to make sure we handle renaming subdirectories into the root
> directory that I missed (and just wasn't tested by this testcase), so
> I'll check into it and if so fix the code and add another testcase,
> and include the fixups I agreed to above and send out a v2.  Probably
> won't get to it until the middle of next week, though.
>
>
> Thanks again for the review and food for thought.  The CI/PR builds
> sound particularly interesting; I'd really like to make those better
> for you if it can be done in a way that doesn't break command line
> builds with clear documentation of setup and expectations while
> supporting test skipping and post-facto investigation (even without
> --immediate or re-running), etc.  Hmm...

My hope was that CI/PR builds could help improve the code quality in
`pu`, and to a large part it did.

The time I need to spend on understanding test cases' code (t0021 is a
particularly nasty one, not your fault, of course) before I can even get
to fire up a debugger, this amount of time is insane, though.

Ciao,
Dscho

  reply	other threads:[~2019-10-14 10:42 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-11 20:42 [PATCH 0/2] Dir rename fixes Elijah Newren via GitGitGadget
2019-10-11 20:42 ` [PATCH 1/2] merge-recursive: clean up get_renamed_dir_portion() Elijah Newren via GitGitGadget
2019-10-12 19:47   ` Johannes Schindelin
2019-10-11 20:42 ` [PATCH 2/2] merge-recursive: fix merging a subdirectory into the root directory Elijah Newren via GitGitGadget
2019-10-12 20:37   ` Johannes Schindelin
2019-10-13  0:40     ` Elijah Newren
2019-10-14 10:41       ` Johannes Schindelin [this message]
2019-10-22 19:15         ` Elijah Newren
2019-10-24 22:22           ` Johannes Schindelin
2019-10-25  0:12             ` Elijah Newren
2019-10-25 13:30               ` Johannes Schindelin
2019-10-29  1:20                 ` Junio C Hamano
2019-10-30  7:01                   ` Elijah Newren
2019-10-30 22:16                     ` Johannes Schindelin
2019-10-30 22:31                       ` Elijah Newren
2019-10-31  8:28                         ` Johannes Schindelin
2019-10-12 18:41 ` [PATCH 0/2] Dir rename fixes Johannes Schindelin
2019-10-22 21:22 ` [PATCH v2 0/3] " Elijah Newren via GitGitGadget
2019-10-22 21:22   ` [PATCH v2 1/3] merge-recursive: clean up get_renamed_dir_portion() Elijah Newren via GitGitGadget
2019-10-22 21:22   ` [PATCH v2 2/3] merge-recursive: fix merging a subdirectory into the root directory Elijah Newren via GitGitGadget
2019-10-22 21:22   ` [PATCH v2 3/3] t604[236]: do not run setup in separate tests Elijah Newren via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.1910141211130.46@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).