git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: Jeff King <peff@peff.net>, Jeff Hostetler <jeffhost@microsoft.com>
Cc: git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH] Add very verbose porcelain output to status
Date: Tue, 12 Jul 2016 12:13:02 -0400	[thread overview]
Message-ID: <5785170E.9070000@jeffhostetler.com> (raw)
In-Reply-To: <20160712150749.GD613@sigill.intra.peff.net>


Thanks for the feedback.  I like the overall direction of your
suggestions.  Going for a porcelain V2 feels better than piling
onto verbose.  I think that would also give us a little more license
to address some of the line format issues that you point out.
Let me retool what I have along these lines and see how it looks.

Thank again,
Jeff




On 07/12/2016 11:07 AM, Jeff King wrote:
> On Thu, Jul 07, 2016 at 03:26:28PM -0400, Jeff Hostetler wrote:
>
>> Tools interacting with Git repositories may need to know the complete
>> state of the working directory. For efficiency, it would be good to have
>> a single command to obtain this information.
>>
>> We already have a `--porcelain` mode intended for tools' consumption,
>> and it only makes sense to enhance this mode to offer more information.
>>
>> Just like we do elsewhere in Git's source code, we now interpret
>> multiple `--verbose` flags accumulatively, and show substantially more
>> information in porcelain mode at verbosity level 2.
>
> Using "--verbose" to change the stable output feels a little funny to
> me. Usually --verbose is about increasing the human-readable output to
> stderr. Of course "--verbose" for git-status is already a little weird.
> A single "-v" seems to do nothing, but two will turn on a diff for the
> long-format output.
>
> But it seems to me this just confuses things more. Why does "git status
> --porcelain -v" do nothing, for example? What happens when we want to
> add more information to the porcelain output later? Do we have to
> require "-vvv" to trigger it, and if so, is there a way to get that
> information without the "-vv" information?
>
> I think the existing example for adding to the --porcelain format is
> the --branch option. Could we have "--state" or something to trigger
> the extra lines?
>
>> +If --branch is given, the first line shows a summary of the current
>> +operation in progress.  This line begins with "### state: ", the name
>> +of the operation in progress, and then operation-specific information.
>> +Fields are separated by a single space.
>
> This seems like a good bit of information to have, and AFAIK we don't
> expose the state information in a machine-readable way. A few nits:
>
>> +    Operation   Fields                     Explanation
>> +    ------------------------------------------------------------------
>> +    clean
>
> Using "clean" here seems weird, as that term is usually meant to mean
> the opposite of dirty (i.e., there is something in your working tree
> that can be committed). But there is no "dirty" here. Would "normal" or
> "none" be a good description?
>
>> +    ------------------------------------------------------------------
>> +    merge       <nr>                       Number unmerged
>
> This <nr> (and others below) is redundant with the rest of the output,
> right (i.e., you could count up the "U" entries yourself)? I don't mind
> it as a convenience, but I'm kind of curious of why it's useful.
>
>> +    ------------------------------------------------------------------
>> +    am          [E]                        Present if the current patch
>> +                                           is empty
>> +    ------------------------------------------------------------------
>> +    rebase      <nr>                       Number unmerged
>> +                [S]                        Present if split commit in
>> +                                           progress during rebase
>> +                [E]                        Present if editing a commit
>> +                                           during rebase
>
> Can a rebased commit be empty? If so, should that state get "E" to match
> "am"?
>
> Does editing differentiate between "stopped because of a conflict that
> needs addressing" versus "stopped because the interactive insn sheet
> told us to"?
>
>> +                [I(<done>/<total>)]        Present if in an interactive
>> +                                           rebase. Step counts are given.
>> +                [<current>:<onto>]         Rebase branches
>
> I wonder if we ought to just use "<current> <onto>" here, as separate
> arguments. A ":" between refs usually signifies a refspec, but that is
> not what this is.
>
> I note this is optional, though. Is there a case where we might have one
> but not the other? That would make things more difficult syntactically.
>
>> +If --branch is given, the second line shows branch tracking information.
>> +This line begins with "### track: ".  Fields are separated by a single
>> +space, unless otherwise indicated.
>> +
>> +    Field                    Meaning
>> +    --------------------------------------------------------
>> +    <sha> | "(initial)"      Current commit
>> +    <branch> | "(detached)"  Current branch
>> +    ":"<upstream>            Upstream branch, if set
>> +    "+"<ahead>               Ahead count, if upstream present
>> +    "-"<behind>              Behind count, if upstream present
>> +    --------------------------------------------------------
>
> This really _ought_ to have been how --branch worked in the first place
> with --porcelain. But unfortunately the existing format has been in the
> wild for a long time. To top it off, the existing format is translated!
> So you cannot even match "Initial commit on (.*)". Yech.
>
> So this looks like a big improvement. And I see now why you wanted
> something like "-vv" to implement this, instead of "--state". In
> addition to adding new lines, this is fixing mistakes made in the prior
> format.
>
> I'm still not convinced that it makes sense to trigger this more-sane
> format with a "verbose" option, though. It's not more verbose. It's just
> less crappy. It would make more sense to me if this was "porcelain v2",
> and triggered with "--porcelain=2" or similar.
>
> Ditto with the new per-file format you introduce (which at least _is_
> more verbose).
>
>> +A series of lines are then displayed for the tracked entries.
>> +Lines have one of the following formats:
>> +
>> +    XYS mH mI mW shaH shaI PATH
>> +    XYS mH mI mW shaH shaI score OLD_PATH\tPATH
>> +    XYS m1 m2 m3 mW sha1 sha2 sha3 PATH
>
> So this is sort of like a --raw diff, but covering the 3 states. That
> makes some sense to me.
>
> I notice in the middle one that there is room for only one rename
> (between the index and HEAD). I guess we cannot have a rename between
> the working tree and the index, because the dst file would have to be a
> new file, which by definition is untracked, and not eligible.
>
> One of the nasty points of the current format is that it is
> context-sensitive. You don't know if you have "PATH" with a space, or
> "score OLD_PATH" until you see whether "X" tells you if you have a
> rename or not. So simple parsing with things like "cut" doesn't work.
>
> Can we collapse these into a single line format where some entries just
> end up with some nil state (e.g., "0" for the score of something that
> isn't a rename)?
>
> While I'm thinking about it, have you looked at how this all works with
> "-z"?
>
>> +* X and Y were described in the short format section.
>> +* S is a one character summary of the submodule status with values '0'..'7'
>> +  representing the sum of: 4 when submodule has a new commit, 2 when the
>> +  submodule is modified, and 1 when the submodule contains untracked changes.
>
> Is there any reason not to give these non-numeric constants? I.e., to
> treat it as a sequence of flags, which can be present or not?
>
>> +  The value is ' ' if the entry is not a submodule.
>
> So would it be:
>
>    XY  100644 ...
>
> with two spaces in that case?
>
> I wonder if it would be less mysterious if the submodule data got its
> own bitfield, like:
>
>    N - not a submodule
>    S[C][M][U] - is a submodule, with one or more of:
>                  - commits (C)
> 		- modifications (M)
> 		- untracked files (U)
>
>> @@ -1380,7 +1382,20 @@ int cmd_status(int argc, const char **argv, const char *prefix)
>>   	fd = hold_locked_index(&index_lock, 0);
>>
>>   	s.is_initial = get_sha1(s.reference, sha1) ? 1 : 0;
>> +	if (!s.is_initial)
>> +		hashcpy(s.sha_commit, sha1);
>> +
>>   	s.ignore_submodule_arg = ignore_submodule_arg;
>> +	if (verbose > 1 && status_format == STATUS_FORMAT_PORCELAIN) {
>> +		/* Capture extra data for very verbose porcelain output. */
>> +		s.verbose = verbose;
>
> The existing s.verbose setup happens in the big format-specific switch
> statement below. Should this go there, too? (Though if you have read the
> rest of my email, you know I'd prefer this actually end up as a new
> format).
>
>> +		/* Force safe_crlf off to prevent normal LF/CRLF warning
>> +		 * message from being printed on stderr for each new file.
>> +		 */
>> +		safe_crlf = SAFE_CRLF_FALSE;
>> +	}
>
> Hmm, this seems like something that would be useful for all formats. It
> comes from doing a diff at all, right?
>
>> [...]
>
> I gave only a brief skim over the rest of the code, as I think the
> design issues are more interesting at this stage.
>
> Please don't let the length of my comments discourage you; I think this
> is a good direction to be going. I did find I had to piece together the
> rationale a bit from looking at the patch itself. I think it would be
> easier to understand as a few patches:
>
>    - add infrastructure and rules for a new v2 porcelain format; the
>      format is still in flux for the next few patches, but will be
>      cemented at release time (in your world-view this is the "more
>      verbose format", but I'm taking the liberty of pretending you agree
>      with everything I wrote above ;) ).
>
>    - add v2 per-file format; the rationale is that the existing format
>      leaves out some details. And also that it's hard to parse (if you
>      agree that we should clean up those cases).
>
>    - convert "--branch" output in v2 to what you have here; the rationale
>      is that the existing format is horribly unstable
>
>    - add "--state"; the rationale is that there isn't a machine-readable
>      way to get this data right now. This can be independent of the v1/v2
>      flag, since we output it only when asked for, and v1-era git didn't
>      have it at all.
>
>    - the SAFE_CRLF_FALSE tweak above; this seems orthogonal to the rest
>      of it, but offhand seems like a reasonable idea to me.
>
> -Peff
>

  reply	other threads:[~2016-07-12 16:20 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-07 19:26 [PATCH] Add very verbose porcelain output to status Jeff Hostetler
2016-07-08  6:29 ` Johannes Schindelin
2016-07-12 15:07 ` Jeff King
2016-07-12 16:13   ` Jeff Hostetler [this message]
2016-07-13  8:00     ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5785170E.9070000@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).