From: Jakub Narebski <jnareb@gmail.com>
To: Josh Steadmon <steadmon@google.com>
Cc: git@vger.kernel.org, gitster@pobox.com, git@jeffhostetler.com,
	avarab@gmail.com, peff@peff.net
Subject: Re: [RFC PATCH v2 2/3] trace2: add a schema validator for trace2 events
Date: Thu, 11 Jul 2019 15:35:20 +0200
Message-ID: <86muhkae3r.fsf@gmail.com>
In-Reply-To: <3fa4e9eef84ba00c631c82fb3a2eacb8439df9e5.1562712943.git.steadmon@google.com> (Josh Steadmon's message of "Tue, 9 Jul 2019 16:05:44 -0700")

Josh Steadmon <steadmon@google.com> writes:

> trace_schema_validator can be used to verify that trace2 event output
> conforms to the expectations set by the API documentation and codified
> in event_schema.json (or strict_schema.json). This allows us to build a
> regression test to verify that trace2 output does not change
> unexpectedly.
>
> Signed-off-by: Josh Steadmon <steadmon@google.com>

Very nitpicky comments below.

> ---
>  t/trace_schema_validator/.gitignore           |  1 +
>  t/trace_schema_validator/Makefile             | 10 +++
>  .../trace_schema_validator.go                 | 78 +++++++++++++++++++
>  3 files changed, 89 insertions(+)
>  create mode 100644 t/trace_schema_validator/.gitignore
>  create mode 100644 t/trace_schema_validator/Makefile
>  create mode 100644 t/trace_schema_validator/trace_schema_validator.go
>
> diff --git a/t/trace_schema_validator/.gitignore b/t/trace_schema_validator/.gitignore
> new file mode 100644
> index 0000000000..c3f1e04e9e
> --- /dev/null
> +++ b/t/trace_schema_validator/.gitignore
> @@ -0,0 +1 @@
> +trace_schema_validator
> diff --git a/t/trace_schema_validator/Makefile b/t/trace_schema_validator/Makefile
> new file mode 100644
> index 0000000000..ed22675e5d
> --- /dev/null
> +++ b/t/trace_schema_validator/Makefile
> @@ -0,0 +1,10 @@
> +.PHONY: fetch_deps clean
> +
> +trace_schema_validator: fetch_deps trace_schema_validator.go
> +	go build

I don't know the Go build process, but shouldn't the name of the target
and the name of the actual source file be passed to the command?

Though I don't think we would _need_, for example, to be able to
configure the Go build process via Makefile variables, such as
$(GOBUILD) in https://sohlich.github.io/post/go_makefile/

> +
> +fetch_deps:
> +	go get github.com/xeipuuv/gojsonschema
> +
> +clean:
> +	rm -f trace_schema_validator

In Git's Makefile we use

  clean:
  	$(RM) $(PROGRAMS)

I'm not sure if it is needed for operating system independence, but
using $(RM) is a standard way to create 'clean' targets...

> diff --git a/t/trace_schema_validator/trace_schema_validator.go b/t/trace_schema_validator/trace_schema_validator.go
> new file mode 100644
> index 0000000000..f779ac5ff5
> --- /dev/null
> +++ b/t/trace_schema_validator/trace_schema_validator.go
> @@ -0,0 +1,78 @@
> +// trace_schema_validator validates individual lines of an input file against a
> +// provided JSON-Schema for git trace2 event output.
> +//
> +// Traces can be collected by setting the GIT_TRACE2_EVENT environment variable
> +// to an absolute path and running any Git command; traces will be appended to
> +// the file.
> +//
> +// Traces can then be verified like so:
> +//   trace_schema_validator \
> +//     --trace2_event_file /path/to/trace/output \
> +//     --schema_file /path/to/schema
> +package main
> +
> +import (
> +	"bufio"
> +	"flag"
> +	"log"
> +	"os"
> +	"path/filepath"
> +
> +	"github.com/xeipuuv/gojsonschema"
> +)
> +
> +// Required flags
> +var schemaFile = flag.String("schema_file", "", "JSON-Schema filename")
> +var trace2EventFile = flag.String("trace2_event_file", "", "trace2 event filename")

The standard for long options is to use "kebab case", not "snake case",
i.e. --schema-file rather than the current --schema_file, etc.
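
A minimal sketch of what the renamed flags could look like. It uses a
dedicated flag.FlagSet so that the parsing can be exercised on its own,
which is my choice for illustration and not something the patch does; the
options struct and the parseOptions helper are hypothetical names.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// options holds the two required file names (hypothetical helper type).
type options struct {
	schemaFile      string
	trace2EventFile string
}

// parseOptions parses args (without the program name) using kebab-case
// long option names instead of the snake_case ones in the patch.
func parseOptions(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("trace_schema_validator", flag.ContinueOnError)
	fs.StringVar(&opts.schemaFile, "schema-file", "", "JSON-Schema filename")
	fs.StringVar(&opts.trace2EventFile, "trace2-event-file", "", "trace2 event filename")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if opts.schemaFile == "" || opts.trace2EventFile == "" {
		return options{}, errors.New("both --schema-file and --trace2-event-file are required")
	}
	return opts, nil
}

func main() {
	// A real tool would pass os.Args[1:]; fixed arguments keep the sketch runnable.
	opts, err := parseOptions([]string{
		"--schema-file", "event_schema.json",
		"--trace2-event-file", "trace.json",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("schema:", opts.schemaFile, "events:", opts.trace2EventFile)
}
```

Go's flag package accepts both -schema-file and --schema-file, so only
the option names themselves need to change.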

> +
> +func main() {
> +	flag.Parse()
> +	if *schemaFile == "" || *trace2EventFile == "" {
> +		log.Fatal("Both --schema_file and --trace2_event_file are required.")
> +	}

I guess that you prefer required options with explicit arguments over
positional arguments (that is, requiring the command to be called with
two arguments, the first being the schema file and the second being the
event file to validate).
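
For comparison, the positional-argument variant could be sketched as
below. This only illustrates the alternative and is not a request to
switch; parsePositional is a hypothetical helper name, and the usage
string is made up for the example.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// parsePositional expects exactly two positional arguments:
// the schema file first, then the trace2 event file to validate.
func parsePositional(args []string) (string, string, error) {
	fs := flag.NewFlagSet("trace_schema_validator", flag.ContinueOnError)
	if err := fs.Parse(args); err != nil {
		return "", "", err
	}
	if fs.NArg() != 2 {
		return "", "", errors.New("usage: trace_schema_validator <schema-file> <trace2-event-file>")
	}
	return fs.Arg(0), fs.Arg(1), nil
}

func main() {
	// A real tool would pass os.Args[1:]; fixed arguments keep the sketch runnable.
	schemaFile, eventFile, err := parsePositional([]string{"event_schema.json", "trace.json"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("schema:", schemaFile, "events:", eventFile)
}
```

The explicit options are more verbose at the call site but cannot be
passed in the wrong order, which may well be why you chose them.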

> +	schemaURI, err := filepath.Abs(*schemaFile)
> +	if err != nil {
> +		log.Fatal("Can't get absolute path for schema file: ", err)
> +	}
> +	schemaURI = "file://" + schemaURI
> +
> +	schemaLoader := gojsonschema.NewReferenceLoader(schemaURI)
> +	schema, err := gojsonschema.NewSchema(schemaLoader)
> +	if err != nil {
> +		log.Fatal("Problem loading schema: ", err)
> +	}
> +
> +	tracesFile, err := os.Open(*trace2EventFile)
> +	if err != nil {
> +		log.Fatal("Problem opening trace file: ", err)
> +	}
> +	defer tracesFile.Close()
> +
> +	scanner := bufio.NewScanner(tracesFile)
> +
> +	count := 0
> +	for ; scanner.Scan(); count++ {

I see that you assume JSON-Lines format, i.e. one JSON object per line
in the file.
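
To spell out what that assumption means, here is a small sketch that
checks whether each line, taken on its own, is a single JSON object. It
uses encoding/json rather than gojsonschema and is only meant to
illustrate the input format; isJSONObject is a hypothetical helper and
the sample input is made up.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// isJSONObject reports whether line is exactly one JSON object.
func isJSONObject(line string) bool {
	var obj map[string]interface{}
	if err := json.Unmarshal([]byte(line), &obj); err != nil {
		return false
	}
	// "null" unmarshals without error but leaves the map nil.
	return obj != nil
}

func main() {
	// Made-up stand-in for a trace file: the second line is not an object.
	input := "{\"event\":\"start\"}\n[1,2]\n{\"event\":\"exit\"}\n"
	scanner := bufio.NewScanner(strings.NewReader(input))
	for line := 1; scanner.Scan(); line++ {
		fmt.Printf("line %d: object=%v\n", line, isJSONObject(scanner.Text()))
	}
	if err := scanner.Err(); err != nil {
		fmt.Println("scanning error:", err)
	}
}
```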

> +		if count%10000 == 0 {
> +			// Travis-CI expects regular output or it will time out.
> +			log.Print("Validated items: ", count)

I wonder if it wouldn't be better to provide a --progress flag, which
the Travis-CI job would turn on.
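
Roughly what I have in mind, as a sketch only: the --progress flag name
is my suggestion, shouldReport and progressInterval are hypothetical
names, and the in-memory input stands in for the trace file.

```go
package main

import (
	"bufio"
	"flag"
	"log"
	"strings"
)

// progressInterval mirrors the 10000-event interval used in the patch.
const progressInterval = 10000

// shouldReport says whether a progress line is due before validating
// the event with the given zero-based index.
func shouldReport(progress bool, count int) bool {
	return progress && count%progressInterval == 0
}

func main() {
	fs := flag.NewFlagSet("trace_schema_validator", flag.ContinueOnError)
	progress := fs.Bool("progress", false, "periodically print the number of validated events")
	// A real tool would parse os.Args[1:]; fixed arguments keep the sketch runnable.
	if err := fs.Parse([]string{"--progress"}); err != nil {
		log.Fatal(err)
	}

	// Stand-in for the trace file: three one-line events.
	scanner := bufio.NewScanner(strings.NewReader("{}\n{}\n{}\n"))
	count := 0
	for ; scanner.Scan(); count++ {
		if shouldReport(*progress, count) {
			log.Print("Validated items: ", count)
		}
		// Schema validation of scanner.Text() would go here.
	}
	if err := scanner.Err(); err != nil {
		log.Fatal("Scanning error: ", err)
	}
	log.Print("Validated events: ", count)
}
```

Without --progress the tool would then stay quiet until the final
summary, which seems friendlier for interactive use.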

> +		}
> +		event := gojsonschema.NewStringLoader(scanner.Text())
> +		result, err := schema.Validate(event)
> +		if err != nil {
> +			log.Fatal(err)
> +		}
> +		if !result.Valid() {
> +			log.Print("Trace event is invalid: ", scanner.Text())

It might be a good idea to print the line number (i.e. the value of the
`count` variable, plus one because it starts at zero).

I guess that conforming to the <filename>:<line>: prefix (like the one
gcc uses for errors) would be unnecessary.
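
Still, for reference, a sketch of what such a message could look like.
formatInvalid is a hypothetical helper and the message wording is only an
example; the one thing it relies on from the patch is that `count` is
zero-based inside the loop, so the one-based line number is count+1.

```go
package main

import "fmt"

// formatInvalid builds a gcc-style "<filename>:<line>:" message.
// count is the zero-based loop counter, so the line number is count+1.
func formatInvalid(filename string, count int, event string) string {
	return fmt.Sprintf("%s:%d: trace event is invalid: %s", filename, count+1, event)
}

func main() {
	// Prints: trace.json:1: trace event is invalid: {"event":"bogus"}
	fmt.Println(formatInvalid("trace.json", 0, `{"event":"bogus"}`))
}
```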

> +			for _, desc := range result.Errors() {
> +				log.Print("- ", desc)
> +			}
> +			os.Exit(1)
> +		}
> +	}
> +
> +	if err := scanner.Err(); err != nil {
> +		log.Fatal("Scanning error: ", err)
> +	}
> +
> +	log.Print("Validated events: ", count)
> +}
