Re: [PATCH 0/2] routines to generate JSON data

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

From: Jeff King <peff@peff.net>
To: Jeff Hostetler <git@jeffhostetler.com>
Cc: git@vger.kernel.org, gitster@pobox.com,
	lars.schneider@autodesk.com,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH 0/2] routines to generate JSON data
Date: Tue, 20 Mar 2018 01:42:23 -0400	[thread overview]
Message-ID: <20180320054223.GC15813@sigill.intra.peff.net> (raw)
In-Reply-To: <e2a9d2a9-f2e2-f285-52c3-be11668da543@jeffhostetler.com>

On Mon, Mar 19, 2018 at 06:19:26AM -0400, Jeff Hostetler wrote:

> > To make the above work, I think you'd have to store a little more state.
> > E.g., the "array_append" functions check "out->len" to see if they need
> > to add a separating comma. That wouldn't work if we might be part of a
> > nested array. So I think you'd need a context struct like:
> > 
> >    struct json_writer {
> >      int first_item;
> >      struct strbuf out;
> >    };
> >    #define JSON_WRITER_INIT { 1, STRBUF_INIT }
> > 
> > to store the state and the output. As a bonus, you could also use it to
> > store some other sanity checks (e.g., keep a "depth" counter and BUG()
> > when somebody tries to access the finished strbuf with a hanging-open
> > object or array).
> 
> Yeah, I thought about that, but I think it gets more complex than that.
> I'd need a stack of "first_item" values.  Or maybe the _begin() needs to
> increment a depth and set first_item and the _end() needs to always
> unset first_item.  I'll look at this gain.

I think you may be able to get by with just unsetting first_item for any
"end". Because as you "pop" to whatever data structure is holding
whatever has ended, you know it's no longer the first item (whatever
just ended was there before it).

I admit I haven't thought too hard on it, though, so maybe I'm missing
something.

> The thing I liked about the bottom-up construction is that it is easier
> to collect multiple sets in parallel and combine them during the final
> roll-up.  With the in-line nesting, you're tempted to try to construct
> the resulting JSON in a single series and that may not fit what the code
> is trying to do.  For example, if I wanted to collect an array of error
> messages as they are generated and an array of argv arrays and any alias
> expansions, then put together a final JSON string containing them and
> the final exit code, I'd need to build it in parts.  I can build these
> parts in pieces of JSON and combine them at the end -- or build up other
> similar data structures (string arrays, lists, or whatever) and then
> have a JSON conversion step.  But we can make it work both ways, I just
> wanted to keep it simpler.

Yeah, I agree that kind of bottom-up construction would be nice for some
cases. I'm mostly worried about inefficiency copying the strings over
and over as we build up the final output.  Maybe that's premature
worrying, though.

If the first_item thing isn't too painful, then it might be nice to have
both approaches available.

> > In general I'd really prefer to keep the shell script as the driver for
> > the tests, and have t/helper programs just be conduits. E.g., something
> > like:
> > 
> >    cat >expect <<-\EOF &&
> >    {"key": "value", "foo": 42}
> >    EOF
> >    test-json-writer >actual \
> >      object_begin \
> >      object_append_string key value \
> >      object_append_int foo 42 \
> >      object_end &&
> >    test_cmp expect actual
> > 
> > It's a bit tedious (though fairly mechanical) to expose the API in this
> > way, but it makes it much easier to debug, modify, or add tests later
> > on (for example, I had to modify the C program to find out that my
> > append example above wouldn't work).
> 
> Yeah, I wasn't sure if such a simple api required exposing all that
> machinery to the shell or not.  And the api is fairly self-contained
> and not depending on a lot of disk/repo setup or anything, so my tests
> would be essentially static WRT everything else.
> 
> With my t0019 script you should have been able use -x -v to see what
> was failing.

I was able to run the test-helper directly. The tricky thing is that I
had to write new C code to test my theory about how the API worked.
Admittedly that's not something most people would do regularly, but I
often seem to end up doing that kind of probing and debugging. Many
times I've found the more generic t/helper programs useful.

I also wonder if various parts of the system embrace JSON, if we'd want
to have a tool for generating it as part of other tests (e.g., to create
"expect" files).

-Peff

next prev parent reply	other threads:[~2018-03-20  5:42 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-16 19:40 [PATCH 0/2] routines to generate JSON data git
2018-03-16 19:40 ` [PATCH 1/2] json_writer: new routines to create data in JSON format git
2018-03-16 19:40 ` [PATCH 2/2] json-writer: unit test git
2018-03-16 21:18 ` [PATCH 0/2] routines to generate JSON data Jeff King
2018-03-16 23:00   ` Ævar Arnfjörð Bjarmason
2018-03-20  5:52     ` Jeff King
2018-03-17  7:38   ` Jacob Keller
2018-03-19 17:31     ` Jeff Hostetler
2018-03-19 10:19   ` Jeff Hostetler
2018-03-20  5:42     ` Jeff King [this message]
2018-03-20 16:44       ` Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180320054223.GC15813@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=lars.schneider@autodesk.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).