git@vger.kernel.org mailing list mirror (one of many)
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Jeff King <peff@peff.net>
Cc: Martin Langhoff <martin.langhoff@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Structured (ie: json) output for query commands?
Date: Wed, 30 Jun 2021 20:19:49 +0000
Message-ID: <YNzR5ZZDTfcN2Q+s@camp.crustytoothpaste.net>
In-Reply-To: <YNyxD4qAHmbluNRe@coredump.intra.peff.net>

On 2021-06-30 at 17:59:43, Jeff King wrote:
> One complication we faced is that a lot of Git's data is bag-of-bytes,
> not utf8. And json technically requires utf8. I don't remember if we
> simply fudged that and output possibly non-utf8 sequences, or if we
> actually encode them.

I think we just emit invalid UTF-8 in that case, which is a problem.
That's why Git's data is not well suited to JSON output and why JSON
isn't a good choice for structured data here.  I'd like us not to do
more JSON in our codebase, since encoding issues make it practically
impossible for users to depend on that output[0].
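
To make the encoding problem concrete, here is a minimal sketch in
Python (not taken from Git's code).  The filename `raw_path` and the
helper `parse_strict` are made up for illustration; the point is only
that a consumer which decodes the document as UTF-8 before parsing
rejects it outright.

```python
import json

# Hypothetical path as it might be stored on disk: Latin-1 bytes,
# which are not a valid UTF-8 sequence.
raw_path = b"caf\xe9.txt"

# What a naive emitter would write: the raw bytes dropped straight
# into a JSON string literal.
document = b'{"path": "' + raw_path + b'"}'


def parse_strict(data: bytes):
    """Decode as UTF-8 first, then parse the resulting text as JSON."""
    return json.loads(data.decode("utf-8"))


try:
    parse_strict(document)
    accepted = True
except UnicodeDecodeError:
    # The document never reaches the JSON parser proper.
    accepted = False

print("accepted by a strict consumer:", accepted)
```

A lenient consumer could decode with a different codec or with error
replacement instead, but then two tools reading the same output may
disagree about what the path actually is.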

We could emit data in a different format, such as YAML, which does have an
encoding for arbitrary byte sequences.  However, in YAML, binary data is
always base64 encoded, which is less readable, although still
interchangeable.  CBOR is also a possibility, although it's not human
readable at all.
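
As a rough illustration of that readability trade-off, the sketch below
uses only Python's standard `base64` module; no YAML library is
involved, and the `path: !!binary ...` line is just a string built by
hand to show roughly what such a field would look like.  The sample
filename is made up.

```python
import base64

# Hypothetical path containing a byte that is not valid UTF-8.
raw_path = b"caf\xe9.txt"

# Base64 turns arbitrary bytes into plain ASCII text.
encoded = base64.b64encode(raw_path).decode("ascii")

# Hand-built illustration of a YAML binary field (not produced by a
# YAML library).
yaml_line = f"path: !!binary {encoded}"
print(yaml_line)

# The original bytes survive the round trip unchanged.
decoded = base64.b64decode(encoded)
print(decoded == raw_path)
```

The round trip is lossless, but a human skimming the output can no
longer see the filename without decoding it.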

I'm personally fine with the ad-hoc approach we use now, which is
actually very convenient to script and, in my view, not too terrible to
parse in other tools and languages.  Your mileage may vary, though.
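
For example, here is a sketch of parsing NUL-terminated records of the
kind `git ls-files -z` produces.  The helper `parse_nul_records` and
the sample bytes are hypothetical; the approach keeps every path as
raw bytes, so no encoding has to be assumed.

```python
from typing import List


def parse_nul_records(output: bytes) -> List[bytes]:
    """Split NUL-terminated records, keeping each one as raw bytes."""
    if not output:
        return []
    records = output.split(b"\0")
    # A well-formed stream ends with a NUL, which leaves one empty
    # trailing item after the split.
    if records[-1] == b"":
        records.pop()
    return records


# Made-up sample: an ASCII path, a non-UTF-8 path, and a path with an
# embedded newline.
sample = b"README.md\0caf\xe9.txt\0dir/with\nnewline\0"
paths = parse_nul_records(sample)

for path in paths:
    print(repr(path))
```

In a real script the bytes would come from something like
`subprocess.run(["git", "ls-files", "-z"], capture_output=True).stdout`
rather than from a literal.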

[0] I worked on a codebase for many years that exploited its JSON parser
not requiring UTF-8 and it was a colossal mess that I'd like us not to
repeat.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

Thread overview: 11+ messages
     [not found] <CACPiFC++fG-WL8uvTkiydf3wD8TY6dStVpuLcKA9cX_EnwoHGA@mail.gmail.com>
2021-06-30 17:00 ` Structured (ie: json) output for query commands? Martin Langhoff
2021-06-30 17:59   ` Jeff King
2021-06-30 18:20     ` Martin Langhoff
2021-07-01 15:47       ` Jeff King
2021-06-30 20:19     ` brian m. carlson [this message]
2021-06-30 23:27       ` Martin Langhoff
2021-07-01 16:00       ` Jeff King
2021-07-01 21:18         ` brian m. carlson
2021-07-01 21:48           ` Jeff King
2021-07-02 13:13           ` Ævar Arnfjörð Bjarmason
2021-07-01  8:18   ` Han-Wen Nienhuys
