git@vger.kernel.org mailing list mirror (one of many)
From: Jeff Hostetler <git@jeffhostetler.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Josh Steadmon" <steadmon@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/2] trace2: randomize/timestamp trace2 targets
Date: Fri, 15 Mar 2019 14:39:47 -0400
Message-ID: <1431dc76-1b1c-c581-6355-b796591e99a8@jeffhostetler.com>
In-Reply-To: <87h8c6baif.fsf@evledraar.gmail.com>



On 3/13/2019 7:49 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Mar 14 2019, Josh Steadmon wrote:
> 
>> When the value of a trace2 environment variable contains instances of
>> the string "%ISO8601%", expand them into the current UTC timestamp in
>> ISO 8601 format.
> 
> Any reason not to just support feeding the path to strbuf_addftime(), to
> e.g. support a daily/hourly log?
> 
>> When the value of a trace2 environment variable is an absolute path
>> referring to an existing directory, write output to randomly-named
>> files under the given directory. If the value is an absolute path
>> referring to a non-existent file and ends with a dash, use the value as
>> a prefix for randomly named files.
>>
>> The random filenames will consist of the value of the environment
>> variable (after potential timestamp expansion), followed by a 6
>> character random string such as would be produced by mkstemp(3).
>>
>> This makes it more convenient to collect traces for every git
>> invocation by unconditionally setting the relevant trace2 envvar to a
>> constant directory name.
> 
> Hrm, api-trace2.txt already specifies that the "sid" is going to be
> unique, couldn't we just have some mode where we use that?
> 
> But then of course when we have nested processes the "sid" will contain
> slashes, so we'd either run into deep nesting or need to munge the
> slashes, in which case we might bump against a file length limit
> (although I haven't seen process trees deeper than 3-4).

Using the "sid" would be a good place to start.  Just take the final
component in the string (after the last slash or the whole sid if there
are no slashes).  That will give you a filename with microseconds since
epoch of the command's start time and the PID.

That should be unique, should not require random strings, and not go
deep in the filesystem.  And it will let you correlate files between
child and parent commands, if you need to.

So maybe if GIT_TR2_* is set to a directory, we append the final portion
of the "sid" and create a file inside that directory.
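
To make the idea concrete, here is a minimal sketch of deriving the
per-command file name. It is an illustration only, not the patch under
discussion: trace_path_for_sid() is a hypothetical helper, and the
sample sid values in the usage note are made up.

```python
import os


def trace_path_for_sid(target_dir: str, sid: str) -> str:
    """Build a per-command trace path from the last "sid" component.

    A nested sid looks like "<parent>/<child>"; only the part after the
    final slash is kept, so the files stay flat inside target_dir.
    """
    leaf = sid.rsplit("/", 1)[-1]  # whole sid when there is no slash
    return os.path.join(target_dir, leaf)
```

For example, trace_path_for_sid("/tmp/traces", "111-1/222-2") keeps only
"222-2" as the file name, while a top-level sid such as "333-3" is used
unchanged.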

> 
> Just to pry about the use-case since I'm doing similar collecting, why
> are you finding this easier to process?
> 
> With the current O_APPEND semantics you're (unless I've missed
> something) guaranteed to get a single process tree in nested order,
> whereas with this they'll all end up in separate files and you'll need
> to slurp them up, sort the whole thing and stitch it together yourself
> without the benefit of stream-parsing it where you can cheat a bit
> knowing that e.g. a "reflog expire" entry is always coming after the
> corresponding "gc" that invoked it.
> 

Yes, with O_APPEND you should get the events as they happen on the
system, all properly interleaved, so you can also see concurrent
activity.  You can still grep that file to pick out individual
processes if you want to.
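
A rough sketch of the single-file approach, under simplifying
assumptions: each event is one JSON line carrying a "sid" key, and
append_event() / events_for() are hypothetical helpers rather than
anything in Git. The record layout is a stand-in, not the real trace2
format.

```python
import json
import os


def append_event(path, sid, event):
    """Append one event as a single line, opening the file with O_APPEND."""
    line = json.dumps({"sid": sid, "event": event}) + "\n"
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        # One write() call per event keeps each line in one piece.
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)


def events_for(path, sid):
    """Return, in file order, the events of `sid` and of its child processes."""
    matches = []
    with open(path, encoding="utf-8") as f:
        for raw in f:
            record = json.loads(raw)
            # A child's sid is assumed to be "<parent sid>/<child part>".
            if record["sid"] == sid or record["sid"].startswith(sid + "/"):
                matches.append(record["event"])
    return matches
```

Filtering on the sid prefix is the "grep" step: it pulls one command and
its children back out of the shared file while preserving the order in
which the lines were appended.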

Routing each command to a different file is fine if you want, but
that opens you up to having to manage and delete them.

Whether to have 1 file (with occasional rotation) or 1 file-per-command
depends, I guess, on how you want to process them.

I'm routing the Trace2 data to a named-pipe/socket and have a daemon
collecting and filtering, so I have a single pathname for output and
yet get the per-file stream handling that I think Josh is looking for.
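
The demultiplexing step of such a collector could look roughly like the
sketch below. It is not the daemon described here; it assumes one JSON
object per line with a "sid" key, and demux_by_root_command() is a
hypothetical name.

```python
import json
from collections import defaultdict


def demux_by_root_command(lines):
    """Group interleaved event lines by the top-level command that produced them.

    The first sid component (before any slash) is taken to identify the
    root command, so child processes land in their parent's group.
    """
    streams = defaultdict(list)
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        record = json.loads(raw)
        root = record["sid"].split("/", 1)[0]
        streams[root].append(record)
    return dict(streams)
```

Any iterable of lines works as input, including a file object opened on
a named pipe.  A long-running collector would more likely handle each
record as it arrives instead of accumulating everything in memory.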

Thanks,
Jeff

