git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Brandon Williams <bmwill@google.com>, Jeff King <peff@peff.net>,
	Joachim Durchholz <jo@durchholz.org>,
	Stefan Beller <sbeller@google.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 2/2] test-lib: exhaustively insert non-alnum ASCII into the TRASH_DIRECTORY name
Date: Mon, 10 Apr 2017 13:40:13 +0200
Message-ID: <CACBZZX7kMcTgKFkFN3OvVKVHU693PYhRFe6gyO4AirihNsUYmg@mail.gmail.com>
In-Reply-To: <CAM0VKjnwbCgCjEBr895068k4veoSGZMf8Cu7neoH=oofgWS2Cw@mail.gmail.com>

On Mon, Apr 10, 2017 at 1:19 PM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> On Mon, Apr 10, 2017 at 10:02 AM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> On Mon, Apr 10, 2017 at 3:47 AM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
>>>> Change the test library to insert non-alphanumeric ASCII characters
>>>> into the TRASH_DIRECTORY name, that's the directory the test library
>>>> creates, chdirs to and runs each individual test from.
>>>>
>>>> Unless test_fails_on_unusual_directory_names=1 is declared before
>>>> importing test-lib.sh (or perl isn't available on the system), the
>>>> trash directory will contain every non-alphanumeric character in
>>>> ASCII, in order.
>>>
>>> At the very least there must be an easier way to disable this, e.g. a
>>> command line option.
>>>
>>> This change is sure effective in smoking out bugs, but it's a major
>>> annoyance during development when it comes to debugging a test.  At
>>> first I could not even cd into the trash directory, because TAB
>>> completing the directory name with all those non-printable characters
>>> didn't work (this may be a bug in the bash-completion package).  And
>>> simply copy-pasting the dirname didn't work either, because 'ls'
>>>
>>>   trash directory.t9902-completion.??????????????????????????????? !"#$%&'()*+,-:;<=>?@[\]^_`{|}~?
>
> Btw, it seems most of the failures in t9902-completion are triggered
> by remote URL parsing.  The trash directory's new name contains '[',
> ']' and even "@[", all of which are treated specially by
> connect.c:host_end(), a helper function of parse_connect_url(),
> basically breaking anything trying to e.g.:
>
>   git fetch "$(pwd)/other"

I'm going to work on this patch so that I can report on tests by type
of character that triggers a failure.

> What puzzles me most is that parse_connect_url() recognizes right at
> its beginning that a remote URL like this is not actually a URL, so
> why does it continue parsing it as if it were one?
>
> A few other failures are triggered by the ':' in the trash directory's
> name, breaking the following commonly used pattern:
>
>   export GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY" &&
>   cd subdir &&
>   test-git-pretending-it's-run-outside-of-a-repository

Does GIT_CEILING_DIRECTORIES support escaping somehow, e.g.
"foo\:bar"? If so, maybe we could use a wrapper to set it. If not,
that's surely a bug in the ceiling dir feature.
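
To make the concern concrete, here is a minimal sketch of how a naive
split on ':' sees a trash directory whose name contains a colon. This is
not Git's parsing code; the helper name count_ceiling_entries and the
sample paths are hypothetical and only illustrate the colon-separated
list format.

```shell
#!/bin/sh
# Hypothetical helper: count the entries that a naive ':' split would see
# in a GIT_CEILING_DIRECTORIES-style list. Not Git's implementation.
count_ceiling_entries () {
	(
		IFS=:
		set -f		# keep '*' and '?' in paths literal
		set -- $1	# unquoted on purpose: split on IFS
		echo $#
	)
}

# A single directory without a colon is seen as one entry...
count_ceiling_entries "/tmp/trash directory.t0000"
# ...but a single directory containing ':' is seen as two.
count_ceiling_entries "/tmp/trash directory.t0000.a:b"
```

Without an escaping mechanism, the second path cannot be expressed as a
single entry, which is why a wrapper would only help if escaping exists.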

> I think ':' should therefore be excluded from the trash directory, too.

I think it's preferable to have some mode that uses ":" in dirnames for
those tests that don't already fail, to protect them against future
regressions. Disabling the use of a tricky character like ":" invites
future bugs & regressions.

>>> After some headscratching, Sunday night may be my excuse, I figured
>>> out that 'cd tr*' works...  only to be greeted with the ugliest-ever
>>> three-line(!) shell prompt.
>>>
>>> Therefore I would say that this should not even be enabled by default
>>> in test-lib.sh, so at least running a test directly from the command
>>> line as ./t1234-foo.sh would considerately give us an easily
>>> accessible trash directory even without any command line options.  We
>>> could enable it for 'make test' by default via GIT_TEST_OPTS in
>>> t/Makefile, though.
>>
>> This definitely needs some tweaking as you and Joachim point out. E.g.
>> some capabilities check in the test suite to check if we can even
>> create these sorts of paths on the local filesystem.
>>
>> A couple of comments on the above though:
>>
>> a) If we have something that's a more strict mode that makes tests
>> fail due to buggy code in various scenarios, we gain the most from
>> having it on by default
>
> I know, and I basically agree...
>
>> and having some optional mode to have devs
>> e.g. disable it for manual inspection of the test directories.
>
> ... but this is just too gross to live as default outside of a CI
> environment.
>
>> Most of the running of the test suite that really matters, i.e. just
>> before the software is delivered to end users, is going to be running
>> in some non-interactive build system preparing a package.
>>
>> b) I think any sort of magic like using it with 'make test', but not
>> when the *.sh is manually run, will just lead to frustrating apparent
>> heisenbugs for people trying to debug the test suite when things do
>> fail, i.e. you run 'make test' on some obscure platform we haven't
>> fixed path bugs on, 10 fail, you manually inspect them and every one
>> of them succeeds, because some --use-garbage-dirs option wasn't
>> passed.
>
> That's not really an issue.  When a test fails during 'make test' with
> garbage in trash dir names, the dev comes and attempts to cd into the
> trash dir, and will be instantly reminded that non-printable
> characters might play a role in the failure when he can't do so with
> ordinary means.

When a test fails for me I cd to t/ and re-run the test *.sh manually.
I don't go straight to inspecting the existing trash.

If those manual invocations ran in some different mode & succeeded,
that would be very confusing.

In any case, I'll try to come up with something more granular, e.g. to
categorize tests by failure type.
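
As a rough illustration of what "more granular" could mean, the sketch
below builds the trash directory name from named character classes, so a
failing test can be re-run with one class at a time. Everything in it is
hypothetical: the function names unusual_chars_for_class and
trash_dir_name and the class names are made up for this example and are
not part of test-lib.sh.

```shell
#!/bin/sh
# Hypothetical helper, not part of test-lib.sh: print the characters to
# append to the trash directory name for a given (made-up) class name.
unusual_chars_for_class () {
	case "$1" in
	brackets) printf '%s' '[]{}()<>' ;;
	colon)    printf '%s' ':' ;;
	quotes)   printf '%s' "'\"\`" ;;
	glob)     printf '%s' '*?' ;;
	*)        return 1 ;;	# unknown class
	esac
}

# Build a trash directory name from a base name and a list of classes.
# Note: plain sh has no 'local', so name/class/chars are global here.
trash_dir_name () {
	name=$1
	shift
	for class
	do
		chars=$(unusual_chars_for_class "$class") || return 1
		name="$name.$chars"
	done
	printf '%s\n' "$name"
}

trash_dir_name "trash directory.t1234-foo" colon brackets
```

Running a test once per class would show which kind of character
triggers a given failure, instead of one pass/fail for all of them.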



>>>> This includes all the control characters, !, [], {} etc. The "."
>>>> character isn't included because it's already in the directory name,
>>>> and nor is "/" for obvious reasons, although that would actually work,
>>>> we'd just create a subdirectory, which would make the tests harder to
>>>> inspect when they fail.i
>>>
>>> 1. Heh.  How an additional subdirectory would make the tests harder to
>>>    inspect is nothing compared to the effect of all the other
>>>    characters.
>>>
>>> 2. s/i$//
>>>
>>>> This change is inspired by the "submodule: prevent backslash expantion
>>>> in submodule names" patch[1]. If we'd had backslashes in the
>>>> TRASH_DIRECTORY all along that bug would have been fixed a long time
>>>> ago. This will flag such issues by marking tests that currently fail
>>>> with "test_fails_on_unusual_directory_names=1", ensure that new tests
>>>> aren't added unless a discussion is had about why the code can't
>>>> handle unusual pathnames, and prevent future regressions.
>>>>
>>>> 1. <20170407172306.172673-1-bmwill@google.com>
>>>> ---
>>>>  t/README      | 12 ++++++++++++
>>>>  t/test-lib.sh |  4 ++++
>>>>  2 files changed, 16 insertions(+)
>>>>
>>>> diff --git a/t/README b/t/README
>>>> index ab386c3681..314dd40221 100644
>>>> --- a/t/README
>>>> +++ b/t/README
>>>> @@ -345,6 +345,18 @@ assignment to variable 'test_description', like this:
>>>>       This test registers the following structure in the cache
>>>>       and tries to run git-ls-files with option --frotz.'
>>>>
>>>> +By default the tests will be run from a directory with a highly
>>>> +unusual filename that includes control characters, a newline, various
>>>> +punctuation etc., this is done to smoke out any bugs related to path
>>>> +handling. If for whatever reason the tests can't deal with such
>>>> +unusual path names, set:
>>>> +
>>>> +    test_fails_on_unusual_directory_names=1
>>>> +
>>>> +Before sourcing 'test-lib.sh' as described below. This option is
>>>> +mainly intended to grandfather in existing broken tests & code, and
>>>> +should usually not be used in new code, instead your tests or code
>>>> +probably need fixing.
>>>>
>>>>  Source 'test-lib.sh'
>>>>  --------------------
>>>> diff --git a/t/test-lib.sh b/t/test-lib.sh
>>>> index 13b5696822..089ff5ac7d 100644
>>>> --- a/t/test-lib.sh
>>>> +++ b/t/test-lib.sh
>>>> @@ -914,6 +914,10 @@ fi
>>>>
>>>>  # Test repository
>>>>  TRASH_DIRECTORY="trash directory.$(basename "$0" .sh)"
>>>> +if test -z "$test_fails_on_unusual_directory_names" -a "$(perl -e 'print 1+1' 2>/dev/null)" = "2"
>>>> +then
>>>> +   TRASH_DIRECTORY="$TRASH_DIRECTORY.$(perl -e 'print join q[], grep { /[^[:alnum:]]/ and !m<[./]> } map chr, 0x01..0x7f')"
>>>> +fi
>>>>  test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY"
>>>>  case "$TRASH_DIRECTORY" in
>>>>  /*) ;; # absolute path is good
>>>> --
>>>> 2.11.0
>>>
>>>
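
For the capabilities check mentioned earlier (whether such paths can be
created on the local filesystem at all), a minimal sketch could probe
with mkdir before choosing the unusual name. The helper name
can_create_dir_with_suffix and the probe directory layout are
assumptions for illustration; nothing like this exists in the patch
above.

```shell
#!/bin/sh
# Hypothetical helper, not part of test-lib.sh: succeed if a directory
# whose name ends in the given suffix can be created and removed under
# the parent directory $1.
can_create_dir_with_suffix () {
	parent=$1
	suffix=$2
	probe="$parent/probe.$suffix"
	mkdir "$probe" 2>/dev/null &&
	test -d "$probe" &&
	rmdir "$probe"
}

# Example: probe inside a scratch directory, then clean up.
parent="${TMPDIR:-/tmp}/unusual-probe.$$"
mkdir "$parent" && {
	if can_create_dir_with_suffix "$parent" ':[]@'
	then
		echo "unusual names supported"
	else
		echo "falling back to a plain trash directory name"
	fi
	rmdir "$parent"
}
```

A check like this would let the test library fall back to the plain
name on filesystems that reject some characters, rather than failing
before any test runs.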
