From: Jeff King <peff@peff.net>
To: Taylor Blau <me@ttaylorr.com>
Cc: "René Scharfe" <l.s.r@web.de>, "Git List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>
Subject: Re: [PATCH] p5311: handle spaces in wc(1) output
Date: Mon, 4 Oct 2021 03:43:47 -0400
Message-ID: <YVqws1lZMD+l1MfK@coredump.intra.peff.net>
In-Reply-To: <YVk8SeuDIWwsrdO0@nand.local>

On Sun, Oct 03, 2021 at 01:14:49AM -0400, Taylor Blau wrote:

> On Sat, Oct 02, 2021 at 10:33:18PM +0200, René Scharfe wrote:
> > Some implementations of wc(1) align their output with leading spaces,
> > even when just a single number is requested, e.g. with "wc -c".  p5311
> > runs all tests successfully on such a platform, but fails to aggregate
> > their results and reports:
> 
> This makes sense, and makes me think that wc's platform-specific
> implementations are too tricky to use when we are being picky about
> leading spaces.
> 
> In other words, I think that your fix is absolutely correct, but I
> wonder if test_size should be friendlier in what it accepts, and to
> chomp off any leading space. So perhaps something like the below would
> work without any modification to p5311.

I do like this direction, because by centralizing, it's one less thing
for perf-script writers to mess up. And not only does it fix "wc -c",
but it is more friendly to any other tools (since test_size can really
be used with any scalar magnitude measurement we like; our current tests
just happen to use wc).

But...

> Subject: [PATCH] t/perf/aggregate.perl: tolerate leading spaces
> 
> When using `test_size` with `wc -c`, users on certain platforms can run
> into issues when `wc` emits leading space characters in its output,
> which confuses get_times.
> 
> Callers could switch to use test_file_size instead of `wc -c` (the
> former never prints leading space characters, so will always work with
> test_size regardless of platform), but this is an easy enough spot to
> miss that we should teach get_times to be more tolerant of the input it
> accepts.
> 
> Teach get_times to do just that by stripping any leading space
> characters.

This leaves the extra whitespace inside the test-results/foo.results
file, which is a bit unfortunate, just because anything else besides
aggregate.perl will have to do the same workaround. So we've traded one
gotcha for another. ;)

I don't have a strong opinion on which is worse. The ideal would be for
test_size() itself to handle it, though it's a bit awkward because it is
literally just redirecting the output of the test snippet into the
result file. It's probably not worth spending a ton of effort on that.
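
As a rough illustration of normalizing at the source rather than in the
reader, a perf script could pipe its measurement through a small filter
before the value is recorded. The helper name strip_blanks is
hypothetical and is not part of t/perf; this is only a sketch of the
idea, assuming the measurement is a single number on one line.

```shell
#!/bin/sh
# Hypothetical helper (not part of t/perf): delete spaces and tabs from
# a one-line numeric measurement read on stdin, keeping the newline.
strip_blanks () {
	tr -d ' \t'
}

# Simulate a wc(1) implementation that pads its output with spaces.
printf '     1234\n' | strip_blanks
```

With a filter like this, the padding never reaches the results file, so
other consumers of that file would not need their own workaround.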

> diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
> index 82c0df4553..575d2000cc 100755
> --- a/t/perf/aggregate.perl
> +++ b/t/perf/aggregate.perl
> @@ -17,8 +17,8 @@ sub get_times {
>  		my $rt = ((defined $1 ? $1 : 0.0)*60+$2)*60+$3;
>  		return ($rt, $4, $5);
>  	# size
> -	} elsif ($line =~ /^\d+$/) {
> -		return $&;
> +	} elsif ($line =~ /^\s*(\d+)$/) {
> +		return $1;

If we do go this route, it might be nice to ignore trailing whitespace,
too (I don't think it matters for wc, but just for general
friendliness). I'm tempted even to say that it should just drop the
anchors and match "\d+" anywhere, but perhaps that is a recipe for
mistakes (if somebody writes "foo 1234" we probably want to detect and
complain).
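
To make the trade-off concrete, here is a minimal shell sketch of the
"tolerant but still anchored" behavior: surrounding whitespace is
ignored, but anything other than a lone run of digits is rejected. The
function name parse_size is hypothetical and only illustrates the
matching rule; it is not the aggregate.perl code.

```shell
#!/bin/sh
# Hypothetical illustration: accept a size with optional leading and
# trailing whitespace, but reject any other stray text.
parse_size () {
	# Trim whitespace on both ends using POSIX character classes.
	trimmed=$(printf '%s' "$1" |
		sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
	case "$trimmed" in
	'' | *[!0-9]*)
		# Empty, or contains a non-digit: complain instead of guessing.
		echo "not a size: '$1'" >&2
		return 1
		;;
	esac
	printf '%s\n' "$trimmed"
}

parse_size '   1234  '
parse_size 'foo 1234' || echo "rejected"
```

Keeping the match anchored is what lets the second call fail loudly
instead of silently picking "1234" out of the surrounding text.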

-Peff
