git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: "Taylor Blau" <me@ttaylorr.com>, "Jeff King" <peff@peff.net>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH 2/7] t: introduce tests for unexpected object types
Date: Tue, 9 Apr 2019 18:59:34 -0700	[thread overview]
Message-ID: <20190410015934.GB56470@Taylors-MBP.hsd1.wa.comcast.net> (raw)
In-Reply-To: <871s2ba6bb.fsf@evledraar.gmail.com>

Hi Ævar,

On Tue, Apr 09, 2019 at 11:14:48AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Tue, Apr 09 2019, Taylor Blau wrote:
>
> > Hi Ævar,
> >
> > On Sun, Apr 07, 2019 at 11:00:19PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >>
> >> On Fri, Apr 05 2019, Jeff King wrote:
> >>
> >> > On Fri, Apr 05, 2019 at 08:42:29PM +0200, SZEDER Gábor wrote:
> >> >
> >> >> > > Don't run git commands upstream of a pipe, because the pipe hides
> >> >> > > their exit code.  This applies to several other tests below as well.
> >> >> >
> >> >> > I disagree with that rule here. We're not testing "cat-file" in any
> >> >> > meaningful way, but just getting some stock output from a known-good
> >> >> > commit.
> >> >>
> >> >> It makes auditing harder and sets bad example.
> >> >
> >> > I disagree that it's a bad example. Just because a reader might not
> >> > realize that it is sometimes OK and sometimes not, does not make it bad
> >> > to do so in the OK case. It means the reader needs to understand the
> >> > rule in order to correctly apply it.
> >>
> >> FWIW I thought the rule was something to the effect of "we're hacking on
> >> git, any change might segfault in some unexpected test, let's make sure
> >> that fails right away", hence the blanket rule in t/README against "!
> >> git <cmd>" in favor of "test_must_fail git <cmd>", and "don't feed the
> >> output of a git command to a pipe" documented right after that.
> >
> > I have some loosely-held opinions on this. Certainly knowing if a change
> > to git caused a segfault in some test is something that we would want to
> > know about, though I'm not sure we're loosing anything by putting 'git'
> > on the left-hand side of a pipe here.
> >
> >   - If someone wrote a change to git that introduced a segfault in 'git
> >     cat-file', I'm sure that this would not be the only place that in
> >     the suite that would break because of it. Presumably, at least one
> >     of those places uses 'test_must_fail git ...' instead
> >
> >   - At the very least, 'broken-commit' doesn't contain what it needs to,
> >     the test breaks in another way (albeit further from the actual
> >     defect), and the developer finds out about their bug that way.
> >
> > In any case, these above two might be moot anyways, because I'm almost
> > certain that it will be a rarity for someone to _only_ run t6102, unless
> > it is included in a blanket 'make' from within 't'.
>
> First. I realize we're talking about the light fixtures in the bike shed
> at this point, but with that in mind...

I don't mind some bike-shedding ;-).

> I just think it's useful as a general rule, particularly since with the
> "special setups" in the test mode we've found that all sorts of odd
> tests we didn't expect to stress test some features turned out to cover
> edge cases of them, e.g. when "GIT_TEST_COMMIT_GRAPH" was added we found
> that a bunch of random stuff segfaulted all over the place.
>
> So it's hard to say with any confidence in advance that something isn't
> going to stress git in some unusual way and isn't useful to guard for
> segfaults.

Yes, I think that's certainly true. There's something to be said for my
argument, too, (i.e., that it's ``probably'' just fine), but as I think
about it more, I'm less convinced that mine is a position worth holding.

Even if we catch only a bug or two on an off chance, there aren't too
many things I'd sacrifice to not make it so. In other words, I'm willing
to do quite a lot in order to get even a little indication of when
things are broken (including replacing 'git ... | sed' with 'git ... >foo &&
sed ... <foo').

> Of course if and when it segfaults it's likely to be seen by subsequent
> tests. In this case I'll note that if git fails we'll happily run not
> only perl/sed, but then hash-object the subsequent empty file, and then
> (presumably) fail in the next test.

Sure, we'll probably find out about it anyway, but still...

> >> > I am sympathetic to the auditing issue, though.
> >
> > Just to satisfy my curiosity, I put git on the left-hand side of a pipe
> > to see how many such examples exist today:
> >
> >   ~/g/git (master) $ git grep 'git cat-file .* |' -- t/t*.sh | wc -l
> >       179
> >
> > I'm not going to claim that changing 179 -> 180 is neutral or bad, but
> > it's interesting nonetheless.
>
> Separate from the question of if we generally agree that some value of
> "Y" makes for good coding style or not, we're always going to be in some
> flux state where a bunch of examples in our existing codebase contradict
> that, particularly in the test suite.
>
> I think that's a bit unfortunate in some ways. It's the result of the
> default "policy" that refactoring for refactoring's sakes is generally
> frowned upon, so we don't tend to go back and mass-fix "X" to "Y" once
> we agree "Y" is better than "X" for new code, just do it as we go when
> new code is written, or existing tests are updated for other reasons
> "while we're at it".

Is that refactoring for its own sake, though? I would personally be
happy to write a series that either (1) identifies spots in 't' where
`git` is found on the left-hand side of a pipe, (2) fixes the bad-style
invocation in those spots, or (3) both.

> >> > I dunno. In this case it is not too bad to do:
> >> >
> >> >   git cat-file commit $commit >good-commit &&
> >> >   perl ... <good-commit >broken-commit
> >> >
> >> > but I am not sure I am on board with a blanket rule of "git must never
> >> > be on the left-hand side of a pipe".
> >
> > I think that I mostly agree with Peff here for the reasons above.
> >
> > That all said, I don't really want to hold up the series for this alone.
> > Since there don't seem to be many other comments or issues, I would be
> > quite happy to do whatever people are most in favor of.
>
> FWIW this series LGTM as a whole. I think it says something about the
> general quality of it that we're all in some giant nitpick thread about
> perl v.s. sed and git on the LHS of a pipe or not :) I'm happy to have
> it queued as-is. These test issues are minor...

Thanks, I appreciate that you think the series is in good shape. Since
there haven't been too many comments other than this thread, I'm going
to send v2 now.

I decided to ban t6102 of all such 'git ... | sed' invocations, and
described the change earlier in this thread.

> > I basically don't really feel strongly about writing:
> >
> >   git cat-file commit $commit | sed -e ... >broken-commit
> >
> > as opposed to:
> >
> >   git cat-file commit $commit >good-commit &&
> >   sed -e '' <commit >bad-commit
> >
> >> >> > > Wouldn't a 'sed' one-liner suffice, so we won't have yet another perl
> >> >> > > dependency?
> >> >> >
> >> >> > Heh, this was actually the subject of much discussion before the patches
> >> >> > hit the list. If you can write such a one-liner that is both readable
> >> >> > and portable, please share it. I got disgusted with sed and suggested
> >> >> > this perl.
> >> >>
> >> >> Hm, so the trivial s/// with '\n' in the replacement part is not
> >> >> portable, then?  Oh, well.
> >> >
> >> > I don't think it is, but I could be wrong. POSIX does say that "\n"
> >> > matches a newline in the pattern space, but nothing about it on the RHS
> >> > of a substitution. I have a vague feeling of running into problems in
> >> > the past, but I could just be misremembering.
> >> >
> >> > We also tried matching /^tree/ and using "a\" to append a line, but it
> >> > was both hard to read and hit portability issues with bsd sed.
> >
> > I think that all of this discussion is addressed elsewhere in thread, but
> > the gist is that Eric provided a suitable sed invocation that I am going
> > to use instead of Peff's Perl.
> >
> >> > -Peff
> >
> > Thanks,
> > Taylor

Thanks,
Taylor

  reply	other threads:[~2019-04-10  1:59 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-05  3:37 [PATCH 0/7] harden unexpected object types checks Taylor Blau
2019-04-05  3:37 ` [PATCH 1/7] t: move 'hex2oct' into test-lib-functions.sh Taylor Blau
2019-04-05  3:37 ` [PATCH 2/7] t: introduce tests for unexpected object types Taylor Blau
2019-04-05 10:50   ` SZEDER Gábor
2019-04-05 18:24     ` Jeff King
2019-04-05 18:42       ` SZEDER Gábor
2019-04-05 18:52         ` Jeff King
2019-04-07 21:00           ` Ævar Arnfjörð Bjarmason
2019-04-09  2:29             ` Taylor Blau
2019-04-09  9:14               ` Ævar Arnfjörð Bjarmason
2019-04-10  1:59                 ` Taylor Blau [this message]
2019-04-08  5:27           ` Junio C Hamano
2019-04-05 19:25       ` Eric Sunshine
2019-04-05 20:53         ` Jeff King
2019-04-06  5:33           ` Taylor Blau
2019-04-08  6:44         ` Junio C Hamano
2019-04-09  2:30           ` Taylor Blau
2019-04-09  3:28             ` Eric Sunshine
2019-04-09  5:08               ` Taylor Blau
2019-04-09  8:02                 ` Eric Sunshine
2019-04-10  1:54                   ` Taylor Blau
2019-04-06  5:31       ` Taylor Blau
2019-04-05 18:31   ` Jeff King
2019-04-06  5:23     ` Taylor Blau
2019-04-05  3:37 ` [PATCH 3/7] list-objects.c: handle unexpected non-blob entries Taylor Blau
2019-04-05  3:37 ` [PATCH 4/7] list-objects.c: handle unexpected non-tree entries Taylor Blau
2019-04-05  3:37 ` [PATCH 5/7] get_commit_tree(): return NULL for broken tree Taylor Blau
2019-04-05  3:37 ` [PATCH 6/7] rev-list: let traversal die when --missing is not in use Taylor Blau
2019-04-05 18:41   ` Jeff King
2019-04-06  5:36     ` Taylor Blau
2019-04-07 13:41       ` Jeff King
2019-04-09  2:11         ` Taylor Blau
2019-04-05  3:37 ` [PATCH 7/7] rev-list: detect broken root trees Taylor Blau
2019-04-10  2:13 ` [PATCH v2 0/7] harden unexpected object types checks Taylor Blau
2019-04-10  2:13   ` [PATCH v2 1/7] t: move 'hex2oct' into test-lib-functions.sh Taylor Blau
2019-04-10  2:13   ` [PATCH v2 2/7] t: introduce tests for unexpected object types Taylor Blau
2019-04-10  2:13   ` [PATCH v2 3/7] list-objects.c: handle unexpected non-blob entries Taylor Blau
2019-04-10  2:13   ` [PATCH v2 4/7] list-objects.c: handle unexpected non-tree entries Taylor Blau
2019-04-10  2:13   ` [PATCH v2 5/7] get_commit_tree(): return NULL for broken tree Taylor Blau
2019-04-10  2:13   ` [PATCH v2 6/7] rev-list: let traversal die when --missing is not in use Taylor Blau
2019-04-10  2:13   ` [PATCH v2 7/7] rev-list: detect broken root trees Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190410015934.GB56470@Taylors-MBP.hsd1.wa.comcast.net \
    --to=me@ttaylorr.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).