git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Josh Steadmon <steadmon@google.com>
Cc: git@vger.kernel.org, gitster@pobox.com, git@jeffhostetler.com
Subject: Re: [PATCH v3 1/1] trace2: write to directory targets
Date: Sat, 23 Mar 2019 21:44:39 +0100	[thread overview]
Message-ID: <87bm21coco.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <ce5258610ffbc2e498ff33336c5c89b69468d4fd.1553202340.git.steadmon@google.com>


On Thu, Mar 21 2019, Josh Steadmon wrote:

> When the value of a trace2 environment variable is an absolute path
> referring to an existing directory, write output to files (one per
> process) underneath the given directory. Files will be named according
> to the final component of the trace2 SID, followed by a counter to avoid
> potential collisions.

Is this "counter to avoid collisions" something you've seen the need to
have in practice, or could we just squash this on top:

    diff --git a/trace2/tr2_dst.c b/trace2/tr2_dst.c
    index c3d82ca6a4..06cbef5837 100644
    --- a/trace2/tr2_dst.c
    +++ b/trace2/tr2_dst.c
    @@ -13,11 +13,6 @@
      */
     #define TR2_ENVVAR_DST_DEBUG "GIT_TR2_DST_DEBUG"

    -/*
    - * How many attempts we will make at creating an automatically-named trace file.
    - */
    -#define MAX_AUTO_ATTEMPTS 10
    -
     static int tr2_dst_want_warning(void)
     {
     	static int tr2env_dst_debug = -1;
    @@ -48,7 +43,6 @@ static int tr2_dst_try_auto_path(struct tr2_dst *dst, const char *tgt_prefix)
     	const char *last_slash, *sid = tr2_sid_get();
     	struct strbuf path = STRBUF_INIT;
     	size_t base_path_len;
    -	unsigned attempt_count;

     	last_slash = strrchr(sid, '/');
     	if (last_slash)
    @@ -60,17 +54,7 @@ static int tr2_dst_try_auto_path(struct tr2_dst *dst, const char *tgt_prefix)
     	strbuf_addstr(&path, sid);
     	base_path_len = path.len;

    -	for (attempt_count = 0; attempt_count < MAX_AUTO_ATTEMPTS; attempt_count++) {
    -		if (attempt_count > 0) {
    -			strbuf_setlen(&path, base_path_len);
    -			strbuf_addf(&path, ".%d", attempt_count);
    -		}
    -
    -		fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, 0666);
    -		if (fd != -1)
    -			break;
    -	}
    -
    +	fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, 0666);
     	if (fd == -1) {
     		if (tr2_dst_want_warning())
     			warning("trace2: could not open '%.*s' for '%s' tracing: %s",

The reason I'm raising this is that it seems like sweeping an existing
issue under the rug. We document that the "sid" is "unique", and it's just:

    <nanotime / 1000 (i.e. *nix time in microseconds)>-<pid>

So that might be a lie, and in particular I can imagine that say if
every machine at Google is logging traces into some magic mounted FS
that there'll be collisions there.

But then let's *fix that*, because we're also e.g. going to have other
consumers of these traces using the sid's as primary keys in a logging
system.

I wonder if we should just make it a bit longer, human-readable, and
include a hash of the hostname:

    perl -MTime::HiRes=gettimeofday -MSys::Hostname -MDigest::SHA=sha1_hex -MPOSIX=strftime -wE '
        my ($t, $m) = gettimeofday;
        my $host_hex = substr sha1_hex(hostname()), 0, 8;
        my $htime = strftime("%Y%m%d%H%M%S", localtime);
        my $sid = sprintf("%s-%6d-%s-%s",
            $htime,
            $m,
            $host_hex,
            $$ & 0xFFFF,
        );
        say $sid;
    '

Which gets you a SID like:

    20190323213918-404788-c2f5b994-19027

I.e.:

    <YYYYMMDDHHMMSS>-<microsecond-offset>-<8 chars of sha1(hostname -f)>-<pid>

There's obviously ways to make that more compact, but in this case I
couldn't see a reason to, also using UTC would be a good idea.

All the trace2 tests pass if I fake that up. Jeff H: Do you have
anything that relies on the current format?

> This makes it more convenient to collect traces for every git invocation
> by unconditionally setting the relevant trace2 envvar to a constant
> directory name.
>
> Signed-off-by: Josh Steadmon <steadmon@google.com>
> ---
>  Documentation/technical/api-trace2.txt |  5 ++
>  t/t0210-trace2-normal.sh               | 15 ++++++
>  trace2/tr2_dst.c                       | 63 +++++++++++++++++++++++++-
>  3 files changed, 81 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt
> index 2de565fa3d..d0948ba250 100644
> --- a/Documentation/technical/api-trace2.txt
> +++ b/Documentation/technical/api-trace2.txt
> @@ -109,6 +109,11 @@ values are recognized.
>
>  	Enables the target, opens and writes to the file in append mode.
>
> +	If the target already exists and is a directory, the traces will be
> +	written to files (one per process) underneath the given directory. They
> +	will be named according to the last component of the SID (optionally
> +	followed by a counter to avoid filename collisions).
> +
>  `af_unix:[<socket_type>:]<absolute-pathname>`::
>
>  	Enables the target, opens and writes to a Unix Domain Socket
> diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
> index 03a0aedb1d..819430658b 100755
> --- a/t/t0210-trace2-normal.sh
> +++ b/t/t0210-trace2-normal.sh
> @@ -80,6 +80,21 @@ test_expect_success 'normal stream, return code 1' '
>  	test_cmp expect actual
>  '
>
> +test_expect_success 'automatic filename' '
> +	test_when_finished "rm -r traces actual expect" &&
> +	mkdir traces &&
> +	GIT_TR2="$(pwd)/traces" test-tool trace2 001return 0 &&
> +	perl "$TEST_DIRECTORY/t0210/scrub_normal.perl" <"$(ls traces/*)" >actual &&
> +	cat >expect <<-EOF &&
> +		version $V
> +		start _EXE_ trace2 001return 0
> +		cmd_name trace2 (trace2)
> +		exit elapsed:_TIME_ code:0
> +		atexit elapsed:_TIME_ code:0
> +	EOF
> +	test_cmp expect actual
> +'
> +
>  # Verb 002exit
>  #
>  # Explicit exit(code) from within cmd_<verb> propagates <code>.
> diff --git a/trace2/tr2_dst.c b/trace2/tr2_dst.c
> index fd490a43ad..c3d82ca6a4 100644
> --- a/trace2/tr2_dst.c
> +++ b/trace2/tr2_dst.c
> @@ -1,5 +1,6 @@
>  #include "cache.h"
>  #include "trace2/tr2_dst.h"
> +#include "trace2/tr2_sid.h"
>
>  /*
>   * If a Trace2 target cannot be opened for writing, we should issue a
> @@ -12,6 +13,11 @@
>   */
>  #define TR2_ENVVAR_DST_DEBUG "GIT_TR2_DST_DEBUG"
>
> +/*
> + * How many attempts we will make at creating an automatically-named trace file.
> + */
> +#define MAX_AUTO_ATTEMPTS 10
> +
>  static int tr2_dst_want_warning(void)
>  {
>  	static int tr2env_dst_debug = -1;
> @@ -36,6 +42,55 @@ void tr2_dst_trace_disable(struct tr2_dst *dst)
>  	dst->need_close = 0;
>  }
>
> +static int tr2_dst_try_auto_path(struct tr2_dst *dst, const char *tgt_prefix)
> +{
> +	int fd;
> +	const char *last_slash, *sid = tr2_sid_get();
> +	struct strbuf path = STRBUF_INIT;
> +	size_t base_path_len;
> +	unsigned attempt_count;
> +
> +	last_slash = strrchr(sid, '/');
> +	if (last_slash)
> +		sid = last_slash + 1;
> +
> +	strbuf_addstr(&path, tgt_prefix);
> +	if (!is_dir_sep(path.buf[path.len - 1]))
> +		strbuf_addch(&path, '/');
> +	strbuf_addstr(&path, sid);
> +	base_path_len = path.len;
> +
> +	for (attempt_count = 0; attempt_count < MAX_AUTO_ATTEMPTS; attempt_count++) {
> +		if (attempt_count > 0) {
> +			strbuf_setlen(&path, base_path_len);
> +			strbuf_addf(&path, ".%d", attempt_count);
> +		}
> +
> +		fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, 0666);
> +		if (fd != -1)
> +			break;
> +	}
> +
> +	if (fd == -1) {
> +		if (tr2_dst_want_warning())
> +			warning("trace2: could not open '%.*s' for '%s' tracing: %s",
> +				(int) base_path_len, path.buf,
> +				dst->env_var_name, strerror(errno));
> +
> +		tr2_dst_trace_disable(dst);
> +		strbuf_release(&path);
> +		return 0;
> +	}
> +
> +	strbuf_release(&path);
> +
> +	dst->fd = fd;
> +	dst->need_close = 1;
> +	dst->initialized = 1;
> +
> +	return dst->fd;
> +}
> +
>  static int tr2_dst_try_path(struct tr2_dst *dst, const char *tgt_value)
>  {
>  	int fd = open(tgt_value, O_WRONLY | O_APPEND | O_CREAT, 0666);
> @@ -202,8 +257,12 @@ int tr2_dst_get_trace_fd(struct tr2_dst *dst)
>  		return dst->fd;
>  	}
>
> -	if (is_absolute_path(tgt_value))
> -		return tr2_dst_try_path(dst, tgt_value);
> +	if (is_absolute_path(tgt_value)) {
> +		if (is_directory(tgt_value))
> +			return tr2_dst_try_auto_path(dst, tgt_value);
> +		else
> +			return tr2_dst_try_path(dst, tgt_value);
> +	}
>
>  #ifndef NO_UNIX_SOCKETS
>  	if (starts_with(tgt_value, PREFIX_AF_UNIX))

  reply	other threads:[~2019-03-23 20:44 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-13 23:33 [PATCH 0/2] Randomize / timestamp trace2 targets Josh Steadmon
2019-03-13 23:33 ` [PATCH 1/2] date: make get_time() public Josh Steadmon
2019-03-13 23:33 ` [PATCH 2/2] trace2: randomize/timestamp trace2 targets Josh Steadmon
2019-03-13 23:49   ` Ævar Arnfjörð Bjarmason
2019-03-15 18:39     ` Jeff Hostetler
2019-03-15 19:26       ` Ævar Arnfjörð Bjarmason
2019-03-15 20:14         ` Jeff Hostetler
2019-03-15 20:43     ` Josh Steadmon
2019-03-15 20:49       ` Josh Steadmon
2019-03-18  1:40         ` Junio C Hamano
2019-03-19  3:17           ` Jeff King
2019-03-14  0:16   ` Jeff King
2019-03-14  6:07     ` Junio C Hamano
2019-03-14 14:34 ` [PATCH 0/2] Randomize / timestamp " Johannes Schindelin
2019-03-15 20:37   ` Josh Steadmon
2019-03-15 19:18 ` Jeff Hostetler
2019-03-15 20:38   ` Josh Steadmon
2019-03-18 12:50     ` Jeff Hostetler
2019-03-21  0:16 ` [PATCH v2 0/1] Write trace2 output to directories Josh Steadmon
2019-03-21  0:16   ` [PATCH v2 1/1] trace2: write to directory targets Josh Steadmon
2019-03-21  2:04     ` Junio C Hamano
2019-03-21 17:43       ` Jeff Hostetler
2019-03-22  3:30         ` Junio C Hamano
2019-03-22 14:20           ` Jeff Hostetler
2019-03-21 21:09 ` [PATCH v3 0/1] Write trace2 output to directories Josh Steadmon
2019-03-21 21:09   ` [PATCH v3 1/1] trace2: write to directory targets Josh Steadmon
2019-03-23 20:44     ` Ævar Arnfjörð Bjarmason [this message]
2019-03-24 12:33       ` Junio C Hamano
2019-03-24 14:51         ` Ævar Arnfjörð Bjarmason
2019-03-25  2:21           ` Junio C Hamano
2019-03-25  8:21             ` Ævar Arnfjörð Bjarmason
2019-03-25 16:29       ` Jeff Hostetler
2019-03-21 21:16   ` [PATCH v3 0/1] Write trace2 output to directories Jeff Hostetler
2019-03-22  5:23     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bm21coco.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=steadmon@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).