From: Josh Steadmon <steadmon@google.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, git@jeffhostetler.com,
	avarab@gmail.com, peff@peff.net
Subject: Re: [RFC PATCH v2 2/3] trace2: add a schema validator for trace2 events
Date: Wed, 24 Jul 2019 15:47:05 -0700
Message-ID: <20190724224705.GC43313@google.com>
In-Reply-To: <86muhkae3r.fsf@gmail.com>

Thanks for the review; replies are inline below.

On 2019.07.11 15:35, Jakub Narebski wrote:
> Josh Steadmon <steadmon@google.com> writes:
> 
> > trace_schema_validator can be used to verify that trace2 event output
> > conforms to the expectations set by the API documentation and codified
> > in event_schema.json (or strict_schema.json). This allows us to build a
> > regression test to verify that trace2 output does not change
> > unexpectedly.
> >
> > Signed-off-by: Josh Steadmon <steadmon@google.com>
> 
> Very nitpicky comments below.
> 
> > ---
> >  t/trace_schema_validator/.gitignore           |  1 +
> >  t/trace_schema_validator/Makefile             | 10 +++
> >  .../trace_schema_validator.go                 | 78 +++++++++++++++++++
> >  3 files changed, 89 insertions(+)
> >  create mode 100644 t/trace_schema_validator/.gitignore
> >  create mode 100644 t/trace_schema_validator/Makefile
> >  create mode 100644 t/trace_schema_validator/trace_schema_validator.go
> >
> > diff --git a/t/trace_schema_validator/.gitignore b/t/trace_schema_validator/.gitignore
> > new file mode 100644
> > index 0000000000..c3f1e04e9e
> > --- /dev/null
> > +++ b/t/trace_schema_validator/.gitignore
> > @@ -0,0 +1 @@
> > +trace_schema_validator
> > diff --git a/t/trace_schema_validator/Makefile b/t/trace_schema_validator/Makefile
> > new file mode 100644
> > index 0000000000..ed22675e5d
> > --- /dev/null
> > +++ b/t/trace_schema_validator/Makefile
> > @@ -0,0 +1,10 @@
> > +.PHONY: fetch_deps clean
> > +
> > +trace_schema_validator: fetch_deps trace_schema_validator.go
> > +	go build
> 
> I don't know the Go build process, but shouldn't the name of the target and
> the name of the actual source file be passed to the command?
> 
> Though I don't think we would _need_ for example being able to configure
> Go build process via Makefile variables, like e.g. $(GOBUILD) in
> https://sohlich.github.io/post/go_makefile/

Yeah, it seems optional for Go, but I've fixed it to make things more
consistent with the other makefiles.

> > +
> > +fetch_deps:
> > +	go get github.com/xeipuuv/gojsonschema
> > +
> > +clean:
> > +	rm -f trace_schema_validator
> 
> In git Makefile we use
> 
>   clean:
>   	$(RM) $(PROGRAMS)
> 
> I'm not sure if it is needed for operating system independence, but
> using $(RM) is a standard way to create 'clean' targets...

Fixed in V3.

> > diff --git a/t/trace_schema_validator/trace_schema_validator.go b/t/trace_schema_validator/trace_schema_validator.go
> > new file mode 100644
> > index 0000000000..f779ac5ff5
> > --- /dev/null
> > +++ b/t/trace_schema_validator/trace_schema_validator.go
> > @@ -0,0 +1,78 @@
> > +// trace_schema_validator validates individual lines of an input file against a
> > +// provided JSON-Schema for git trace2 event output.
> > +//
> > +// Traces can be collected by setting the GIT_TRACE2_EVENT environment variable
> > +// to an absolute path and running any Git command; traces will be appended to
> > +// the file.
> > +//
> > +// Traces can then be verified like so:
> > +//   trace_schema_validator \
> > +//     --trace2_event_file /path/to/trace/output \
> > +//     --schema_file /path/to/schema
> > +package main
> > +
> > +import (
> > +	"bufio"
> > +	"flag"
> > +	"log"
> > +	"os"
> > +	"path/filepath"
> > +
> > +	"github.com/xeipuuv/gojsonschema"
> > +)
> > +
> > +// Required flags
> > +var schemaFile = flag.String("schema_file", "", "JSON-Schema filename")
> > +var trace2EventFile = flag.String("trace2_event_file", "", "trace2 event filename")
> 
> The standard for long options is to use "kebab case", not "snake case"
> for them, i.e. --schema-file not current --schema_file, etc.

Fixed in V3.
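
For illustration only (this is a sketch, not the actual V3 code), the
renamed options could be declared as below. The helper name parseFlags and
the sample file names are made up for the example. Go's flag package
accepts both "-name" and "--name", so the kebab-case spelling needs no
extra handling.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// parseFlags is a hypothetical helper that declares the two required
// options with kebab-case names and parses args. A dedicated FlagSet is
// used instead of the global one so the sketch is easy to exercise.
func parseFlags(args []string) (schemaFile, trace2EventFile string, err error) {
	fs := flag.NewFlagSet("trace_schema_validator", flag.ContinueOnError)
	fs.StringVar(&schemaFile, "schema-file", "", "JSON-Schema filename")
	fs.StringVar(&trace2EventFile, "trace2-event-file", "", "trace2 event filename")
	if err = fs.Parse(args); err != nil {
		return "", "", err
	}
	if schemaFile == "" || trace2EventFile == "" {
		return "", "", errors.New("both --schema-file and --trace2-event-file are required")
	}
	return schemaFile, trace2EventFile, nil
}

func main() {
	// Sample arguments for demonstration; a real tool would pass os.Args[1:].
	schema, events, err := parseFlags([]string{
		"--schema-file", "schema.json",
		"--trace2-event-file", "trace.json",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("schema:", schema, "events:", events)
}
```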

> > +
> > +func main() {
> > +	flag.Parse()
> > +	if *schemaFile == "" || *trace2EventFile == "" {
> > +		log.Fatal("Both --schema_file and --trace2_event_file are required.")
> > +	}
> 
> I guess that you prefer required options with explicit arguments instead
> of positional arguments (that is requiring the command to be called with
> two arguments, first being schema file, second being event file to
> validate).

Yeah, that's the style I'm used to at work, but I can change this if
positional args are more acceptable for Git.
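
To make the alternative concrete, here is a rough sketch of the
positional-argument style, assuming the order suggested above (schema file
first, event file second). The helper name positionalArgs and the usage
string are invented for the example.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// positionalArgs is a hypothetical helper that expects exactly two
// positional arguments: the schema file, then the trace2 event file.
func positionalArgs(args []string) (schemaFile, eventFile string, err error) {
	fs := flag.NewFlagSet("trace_schema_validator", flag.ContinueOnError)
	if err = fs.Parse(args); err != nil {
		return "", "", err
	}
	if fs.NArg() != 2 {
		return "", "", errors.New("usage: trace_schema_validator <schema-file> <trace2-event-file>")
	}
	return fs.Arg(0), fs.Arg(1), nil
}

func main() {
	// Sample arguments for demonstration; a real tool would pass os.Args[1:].
	schema, events, err := positionalArgs([]string{"schema.json", "trace.json"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("schema:", schema, "events:", events)
}
```

The trade-off is the usual one: positional arguments are shorter to type,
while named options are self-describing and cannot be swapped by accident.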

> > +	schemaURI, err := filepath.Abs(*schemaFile)
> > +	if err != nil {
> > +		log.Fatal("Can't get absolute path for schema file: ", err)
> > +	}
> > +	schemaURI = "file://" + schemaURI
> > +
> > +	schemaLoader := gojsonschema.NewReferenceLoader(schemaURI)
> > +	schema, err := gojsonschema.NewSchema(schemaLoader)
> > +	if err != nil {
> > +		log.Fatal("Problem loading schema: ", err)
> > +	}
> > +
> > +	tracesFile, err := os.Open(*trace2EventFile)
> > +	if err != nil {
> > +		log.Fatal("Problem opening trace file: ", err)
> > +	}
> > +	defer tracesFile.Close()
> > +
> > +	scanner := bufio.NewScanner(tracesFile)
> > +
> > +	count := 0
> > +	for ; scanner.Scan(); count++ {
> 
> I see that you assume JSON-Lines format, i.e. one JSON object per line
> in the file.

Yeah, I'll add a comment noting this. This won't work with the provided
list schemas unless we reformat everything to be on a single line.

> 
> > +		if count%10000 == 0 {
> > +			// Travis-CI expects regular output or it will time out.
> > +			log.Print("Validated items: ", count)
> 
> I wonder if it wouldn't be better to provide --progress flag, which
> Travis-CI job would turn on.

Done, thanks for the suggestion.
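
Roughly, the idea is something like the sketch below. It is not the V3
code: the flag description, the helper name shouldReport, and the loop
bound of 30000 are assumptions chosen for the example; only the 10000
interval comes from the patch above.

```go
package main

import (
	"flag"
	"log"
)

// shouldReport is a hypothetical helper: it says whether a progress line
// is due for the given item count. Nothing is reported unless progress
// output was requested.
func shouldReport(enabled bool, count, interval int) bool {
	return enabled && interval > 0 && count%interval == 0
}

func main() {
	progress := flag.Bool("progress", false,
		"periodically log how many events have been validated")
	flag.Parse()

	// Stand-in for the validation loop; each iteration represents one event.
	for count := 0; count < 30000; count++ {
		if shouldReport(*progress, count, 10000) {
			log.Print("Validated items: ", count)
		}
	}
}
```

A CI job would pass --progress to keep the log active, while interactive
runs stay quiet by default.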

> > +		}
> > +		event := gojsonschema.NewStringLoader(scanner.Text())
> > +		result, err := schema.Validate(event)
> > +		if err != nil {
> > +			log.Fatal(err)
> > +		}
> > +		if !result.Valid() {
> > +			log.Print("Trace event is invalid: ", scanner.Text())
> 
> It might be a good idea to print the line number (i.e. the value of the
> `count` variable).
> 
> I guess that conforming to the <filename>:<line>: prefix (like e.g. gcc
> uses for errors) would be unnecessary.

Done in V3.
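
As a purely illustrative sketch (the message format and the helper name
invalidEventMessage are assumptions, and the gcc-style prefix is the
optional variant mentioned above, not necessarily what V3 prints): since
`count` is still zero-based when an event is validated, the reported line
number is count+1.

```go
package main

import "fmt"

// invalidEventMessage is a hypothetical helper that formats a report for
// the event at zero-based index count, using a 1-based
// <filename>:<line>: prefix.
func invalidEventMessage(filename string, count int, event string) string {
	return fmt.Sprintf("%s:%d: trace event is invalid: %s", filename, count+1, event)
}

func main() {
	// Made-up file name and event for demonstration.
	fmt.Println(invalidEventMessage("trace.json", 0, `{"event":"bogus"}`))
}
```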

> > +			for _, desc := range result.Errors() {
> > +				log.Print("- ", desc)
> > +			}
> > +			os.Exit(1)
> > +		}
> > +	}
> > +
> > +	if err := scanner.Err(); err != nil {
> > +		log.Fatal("Scanning error: ", err)
> > +	}
> > +
> > +	log.Print("Validated events: ", count)
> > +}
