git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: Josh Steadmon <steadmon@google.com>,
	Derrick Stolee <stolee@gmail.com>,
	git@vger.kernel.org, szeder.dev@gmail.com
Subject: Re: [RFC PATCH v3 3/3] trace2: write overload message to sentinel files
Date: Fri, 20 Sep 2019 11:59:08 -0400	[thread overview]
Message-ID: <f6645671-b010-856c-821e-3b6b0ad5265a@jeffhostetler.com> (raw)
In-Reply-To: <20190919224733.GA224668@google.com>



On 9/19/2019 6:47 PM, Josh Steadmon wrote:
> On 2019.09.19 14:23, Jeff Hostetler wrote:
>>
>>
>> On 9/16/2019 2:20 PM, Josh Steadmon wrote:
>>> On 2019.09.16 10:11, Jeff Hostetler wrote:
>>>>
>>>>
>>>> On 9/16/2019 8:07 AM, Derrick Stolee wrote:
>>>>> On 9/13/2019 8:26 PM, Josh Steadmon wrote:
>>>>>> Add a new "overload" event type for trace2 event destinations. Write
>>>>>> this event into the sentinel file created by the trace2.maxFiles
>>>>>> feature. Bump up the event format version since we've added a new event
>>>>>> type.
>>>>>>
>>>>>> Writing this message into the sentinel file is useful for tracking how
>>>>>> often the overload protection feature is triggered in practice.
>>>>>
>>>>> Putting meaningful data into the sentinel file is valuable. It's
>>>>> important to know a bit about when and why this happened. A user
>>>>> would be able to inspect the modified time, and the directory info
>>>>> you include is unnecessary. The data you include is only for the
>>>>> log aggregator to keep valuable data around overloads.
>>>>>> +`"overload"`::
>>>>>> +	This event is created in a sentinel file if we are overloading a target
>>>>>> +	trace directory (see the trace2.maxFiles config option).
>>>>>> ++
>>>>>> +------------
>>>>>> +{
>>>>>> +	"event":"overload",
>>>>>> +	...
>>>>>> +	"dir":"/trace/target/dir/", # The configured trace2 target directory
>>>>>> +	"evt":"2",		    # EVENT format version
>>>>>> +}
>>>>> That said, do we really need to resort to a new event format and
>>>>> event type? Could we instead use the "data" event with a key
>>>>> "overload" and put the target dir in the value?
>>>>>
>>>>> Thanks,
>>>>> -Stolee
>>>>>
>>>>
>>>> If I understand the code here, the overload event/message is
>>>> only written to the sentinel file -- it is not written to a
>>>> regular trace2 log file, so regular log file consumers will
>>>> never see this event, right?
>>>
>>> Well, I guess it's hard to define what is a "regular log file consumer".
>>> In our case, our collection system will treat sentinel files like any
>>> other trace file, so it's useful to have it match the expected trace
>>> format.
>>>
>>> At least for our use, we don't want the sentinel files treated
>>> specially, because we want the log collection system to just do its
>>> thing and remove the file after processing, which lets Git know that
>>> it's ok to start writing traces again.
>>>
>>>> That message could be in any format, right?  And you could write
>>>> as much or as little data into the sentinel file as you want.
>>>
>>> To me it seems that it would be less surprising on the users' side if
>>> any data written to the sentinel file matches the format of the
>>> requested traces. If I have an automated process that's reading JSON
>>> from a directory full of files, I don't want to have to add a special
>>> case where one file might have perf-format data (or vice versa).
>>>
>>>> There's no compelling reason to extend the existing trace2 format
>>>> to have a new message type, so I'm not seeing a reason to add the
>>>> event-type nor to increment the version number.
>>>>
>>>> The existing trace2 formats and messages/event-types are defined
>>>> and driven by the Trace2 API calls presented to upper layers
>>>> (consumers of the public trace2_*() functions and macros defined
>>>> in trace2.h).  This overload event doesn't fit that model.
>>>
>>> Yeah, I did feel like this might be overkill. Do you think Stolee's
>>> suggestion to use a "data" event instead would be acceptable?
>>>
>>>> I think it'd be better to just directly write() a message -- in
>>>> plain-text or JSON or whatever -- in tr2_create_sentinel() and
>>>> not try to piggy-back on the existing format machinery in the
>>>> tr2_tgt_*.c files.
>>>
>>> I had a version that did this originally, but I don't really like having
>>> an unexpected special case where we just write a static JSON string. It
>>> feels like an ugly corner case, and would be surprising to users, IMO.
>>> But if everyone thinks this is a better approach, I suppose I could just
>>> add a switch statement in tr2_create_sentinel() that looks at the
>>> sysenv_var field of the tr2_dst.
>>>
>>
>>
>> You make some good points.  I suppose it would be good to be able
>> to parse the overload file using the same reader/scheme as the
>> other events.  Well, at least for the JSON format; the other formats
>> don't really matter for your purposes anyway.
>>
>> I am concerned that the new "overload" event will be the only event
>> in the file and therefore replace the "version" event in those files.
>> That is, we'll break the invariant that all JSON files begin with a
>> "version" event that containing the event version string.  That is,
>> in the current proposal, the format becomes:
>>
>>      v2 ::= <overload> | <<v1>>
>>      v1 ::= <version> <start> ... <atexit>
>>
>> V1 readers were promised that the first event in the file would
>> always be a <version> event.
> 
> Ah interesting, I wasn't aware of this. Thanks for clarifying. I do see
> that there's a recommendation that trace2_initialize() should be called
> "as early as possible", but perhaps I should add a note to the docs?
> 
> 
>> And that they can dispatch on the
>> version.evt field.  V1 readers won't recognize the <overload> event
>> and they won't know to look at the overload.evt field.  That might
>> cause V1 parsers to throw a harder error than a simpler version
>> mismatch.
>>
>> Just using a "data" event also feels wrong for the same reasons.
>> At that point in tr2_create_sentinel(), a new "data" event would
>> just give us:
>>
>>      V2 ::= <data key="overload", value="true"> | <<v1>>
>>      v1 ::= <version> <start> ... <atexit>
>>
>> Having said that I wonder if it would be better to have
>> tr2_create_sentinel() just set a flag (and leave the fd open).
>> And then either add the new event as:
>>
>>      V2 ::= <version evt=2> <overload dir=path, max=n> | <<v1>>
>>
>> or just add a column to the <version> event (and go ahead and
>> let the overload file be a full trace2 log from the command):
>>
>>      V1 ::= <version evt=1, [overload=true]> <start> ... <atexit>
>>
>> Does that make sense??
> 
> Yeah, that seems reasonable. We're already adding one more file to the
> target directory, so we might as well include the traces from this run.
> 
> However, you're still keeping evt=1 here; do we make any promises about
> fields within events being constant in a particular event version? After
> scanning api-trace2.txt, I don't see any explicit description of what
> may or may not change within a version, and the idea that we might add
> new fields would be surprising at least to me. I'm fine either way so
> long as it's documented. Would a section like the following in
> api-trace2.txt be acceptable?
> 
> """
> Adding new fields to events does not require a change in the EVT.
> Removing fields, or adding or removing event types does require an
> increment to the EVT.
> """

Um, good point.  Adding fields to an existing event is probably
safe -- unless it conveys a more significant change in behavior.
Something, for example, that changes the interpretation of existing
events or fields would require a version change.

I could go either way on this.  Perhaps it would be better to just
update the version number and add the event.  (Sorry to be wishy-washy
on this.)

     json ::=   <version evt=1> <<body>>
              | <version evt=2> [<overload>] [<<body>>]
     body ::= <start> ... <atexit>

That has a bit of a cleaner feel (in hindsight).

Jeff




  reply	other threads:[~2019-09-20 15:59 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-29 22:20 [RFC PATCH] trace2: don't overload target directories Josh Steadmon
2019-07-30 13:29 ` Derrick Stolee
2019-07-30 21:52   ` Josh Steadmon
2019-07-30 16:46 ` Jeff Hostetler
2019-07-30 22:01   ` Josh Steadmon
2019-07-30 22:02   ` Josh Steadmon
2019-07-30 18:00 ` Jeff Hostetler
2019-07-30 22:08   ` Josh Steadmon
2019-08-02 22:02 ` [RFC PATCH v2 0/2] " Josh Steadmon
2019-08-02 22:02   ` [RFC PATCH v2 1/2] docs: mention trace2 target-dir mode in git-config Josh Steadmon
2019-08-02 22:02   ` [RFC PATCH v2 2/2] trace2: don't overload target directories Josh Steadmon
2019-08-05 15:34     ` Jeff Hostetler
2019-08-05 18:17       ` Josh Steadmon
2019-08-05 18:01     ` SZEDER Gábor
2019-08-05 18:09       ` Josh Steadmon
2019-09-14  0:25 ` [RFC PATCH v3 0/3] " Josh Steadmon
2019-09-14  0:25   ` [RFC PATCH v3 1/3] docs: mention trace2 target-dir mode in git-config Josh Steadmon
2019-09-14  0:25   ` [RFC PATCH v3 2/3] trace2: don't overload target directories Josh Steadmon
2019-09-14  0:26   ` [RFC PATCH v3 3/3] trace2: write overload message to sentinel files Josh Steadmon
2019-09-16 12:07     ` Derrick Stolee
2019-09-16 14:11       ` Jeff Hostetler
2019-09-16 18:20         ` Josh Steadmon
2019-09-19 18:23           ` Jeff Hostetler
2019-09-19 22:47             ` Josh Steadmon
2019-09-20 15:59               ` Jeff Hostetler [this message]
2019-09-16 18:07       ` Josh Steadmon
2019-10-03 23:32 ` [PATCH v4 0/4] trace2: don't overload target directories Josh Steadmon
2019-10-03 23:32   ` [PATCH v4 1/4] docs: mention trace2 target-dir mode in git-config Josh Steadmon
2019-10-03 23:32   ` [PATCH v4 2/4] docs: clarify trace2 version invariants Josh Steadmon
2019-10-03 23:32   ` [PATCH v4 3/4] trace2: don't overload target directories Josh Steadmon
2019-10-04  0:25     ` Junio C Hamano
2019-10-04 21:57       ` Josh Steadmon
2019-10-04  9:12     ` Johannes Schindelin
2019-10-04 22:05       ` Josh Steadmon
2019-10-03 23:32   ` [PATCH v4 4/4] trace2: write overload message to sentinel files Josh Steadmon
2019-10-04 22:08 ` [PATCH v5 0/4] trace2: discard new traces if the target directory contains too many files Josh Steadmon
2019-10-04 22:08   ` [PATCH v5 1/4] docs: mention trace2 target-dir mode in git-config Josh Steadmon
2019-10-04 22:08   ` [PATCH v5 2/4] docs: clarify trace2 version invariants Josh Steadmon
2019-10-04 22:08   ` [PATCH v5 3/4] trace2: discard new traces if target directory has too many files Josh Steadmon
2019-10-04 22:08   ` [PATCH v5 4/4] trace2: write discard message to sentinel files Josh Steadmon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f6645671-b010-856c-821e-3b6b0ad5265a@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=steadmon@google.com \
    --cc=stolee@gmail.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).