git@vger.kernel.org list mirror (unofficial, one of many)
From: Josh Steadmon <steadmon@google.com>
To: git@vger.kernel.org
Cc: stolee@gmail.com, git@jeffhostetler.com, szeder.dev@gmail.com
Subject: [RFC PATCH v3 2/3] trace2: don't overload target directories
Date: Fri, 13 Sep 2019 17:25:59 -0700
Message-ID: <bf20ec8ea25ae80ee32d2867fa168c94fbe07d29.1568419818.git.steadmon@google.com>
In-Reply-To: <cover.1568419818.git.steadmon@google.com>

trace2 can write files into a target directory. With heavy usage, this
directory can fill up with files, causing difficulty for
trace-processing systems.

This patch adds a config option (trace2.maxFiles) to set a maximum
number of files that trace2 will write to a target directory. The
following behavior is enabled when trace2.maxFiles is set to a positive
integer:
  When trace2 would write a file to a target directory, first check
  whether the directory is overloaded. A directory is overloaded if it
  contains a sentinel file declaring an overload, or if the number of
  files in it has reached trace2.maxFiles. In the latter case, create a
  sentinel file to speed up later overload checks.
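
For illustration only, the check described above could be sketched in
standalone C roughly as follows. This is not the code in the patch: it
uses plain POSIX calls instead of git's strbuf helpers, the helper name
dir_overloaded() is hypothetical, and unlike the patch it skips the "."
and ".." entries when counting.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define SENTINEL_NAME "git-trace2-overload"

/*
 * Return 1 if `dir` is overloaded, 0 otherwise. A `max_files` of zero
 * or less disables the check. Hypothetical helper for illustration.
 */
static int dir_overloaded(const char *dir, int max_files)
{
	char sentinel[4096];
	struct stat st;
	struct dirent *ent;
	DIR *dirp;
	int count = 0, fd, len;

	assert(dir && *dir);
	if (max_files <= 0)
		return 0;

	/* Fast path: an existing sentinel already declares an overload. */
	len = snprintf(sentinel, sizeof(sentinel), "%s/%s", dir, SENTINEL_NAME);
	if (len < 0 || len >= (int)sizeof(sentinel))
		return 0; /* path too long; treat as not overloaded */
	if (!stat(sentinel, &st))
		return 1;

	/* Slow path: count entries, stopping once the limit is reached. */
	dirp = opendir(dir);
	if (!dirp)
		return 0;
	while (count < max_files && (ent = readdir(dirp))) {
		if (strcmp(ent->d_name, ".") && strcmp(ent->d_name, ".."))
			count++;
	}
	closedir(dirp);

	if (count < max_files)
		return 0;

	/* Record the overload so later checks can take the fast path. */
	fd = open(sentinel, O_WRONLY | O_CREAT, 0666);
	if (fd >= 0)
		close(fd);
	return 1;
}
```

Once the sentinel exists, later calls return after a single stat() and
never scan the directory, which is the speed-up the sentinel is meant to
provide.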

The assumption is that a separate trace-processing system is dealing
with the generated traces; once it has processed them and removed the
sentinel file, it should be safe to generate new trace files again.
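
The patch does not define the trace-processing system, so the following
is only a sketch of how such a collector might drain a directory. The
function name drain_trace_dir() is hypothetical, and "processing" is
reduced to deleting each trace file. The sentinel is removed last so
that writers resume only after the directory has been emptied.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define SENTINEL_NAME "git-trace2-overload"

/*
 * Hypothetical collector: handle every trace file in `dir`, then remove
 * the sentinel. Returns the number of files drained, or -1 if the
 * directory cannot be opened.
 */
static int drain_trace_dir(const char *dir)
{
	char path[4096];
	struct dirent *ent;
	DIR *dirp = opendir(dir);
	int drained = 0, len;

	if (!dirp)
		return -1;
	while ((ent = readdir(dirp))) {
		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, "..") ||
		    !strcmp(ent->d_name, SENTINEL_NAME))
			continue;
		len = snprintf(path, sizeof(path), "%s/%s", dir, ent->d_name);
		if (len < 0 || len >= (int)sizeof(path))
			continue;
		/* A real collector would parse or upload the trace here. */
		if (!unlink(path))
			drained++;
	}
	closedir(dirp);

	/* Remove the sentinel last to re-enable tracing. */
	len = snprintf(path, sizeof(path), "%s/%s", dir, SENTINEL_NAME);
	if (len >= 0 && len < (int)sizeof(path))
		unlink(path);
	return drained;
}
```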

The default value for trace2.maxFiles is zero, which disables the
overload check.

The config can also be overridden with a new environment variable:
GIT_TRACE2_MAX_FILES.
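
As a rough illustration of that precedence, the sketch below resolves
the effective limit from an environment value and a config value. The
helper resolve_max_files() is hypothetical and is not part of the
patch, which reads the setting through tr2_sysenv_get() and parses it
with atoi(); strtol() is swapped in here so that malformed input can be
rejected explicitly.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: prefer a well-formed, non-negative environment
 * value; otherwise fall back to the config value. A result of 0 means
 * the overload check is disabled.
 */
static int resolve_max_files(const char *env_value, int config_value)
{
	int fallback = config_value > 0 ? config_value : 0;
	char *end;
	long n;

	if (!env_value || !*env_value)
		return fallback;

	errno = 0;
	n = strtol(env_value, &end, 10);
	if (errno || *end || n < 0 || n > INT_MAX)
		return fallback; /* malformed or out of range */
	return (int)n;
}
```

With this sketch, an environment value of "0" disables the check even
when the config sets a positive limit.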

Potential future work:
* Write a message into the sentinel file (should match the requested
  trace2 output format).
* Add a performance test to make sure that contention between multiple
  processes all writing to the same target directory does not become an
  issue.

Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 Documentation/config/trace2.txt |  6 +++
 t/t0212-trace2-event.sh         | 17 +++++++
 trace2/tr2_dst.c                | 86 +++++++++++++++++++++++++++++++++
 trace2/tr2_sysenv.c             |  3 ++
 trace2/tr2_sysenv.h             |  2 +
 5 files changed, 114 insertions(+)

diff --git a/Documentation/config/trace2.txt b/Documentation/config/trace2.txt
index 2edbfb02fe..4ce0b9a6d1 100644
--- a/Documentation/config/trace2.txt
+++ b/Documentation/config/trace2.txt
@@ -54,3 +54,9 @@ trace2.destinationDebug::
 	By default, these errors are suppressed and tracing is
 	silently disabled.  May be overridden by the
 	`GIT_TRACE2_DST_DEBUG` environment variable.
+
+trace2.maxFiles::
+	Integer.  When writing trace files to a target directory, do not
+	write additional traces if we would exceed this many files. Instead,
+	write a sentinel file that will block further tracing to this
+	directory. Defaults to 0, which disables this check.
diff --git a/t/t0212-trace2-event.sh b/t/t0212-trace2-event.sh
index ff5b9cc729..2ff97e72da 100755
--- a/t/t0212-trace2-event.sh
+++ b/t/t0212-trace2-event.sh
@@ -265,4 +265,21 @@ test_expect_success JSON_PP 'using global config, event stream, error event' '
 	test_cmp expect actual
 '
 
+test_expect_success "don't overload target directory" '
+	mkdir trace_target_dir &&
+	test_when_finished "rm -r trace_target_dir" &&
+	(
+		GIT_TRACE2_MAX_FILES=5 &&
+		export GIT_TRACE2_MAX_FILES &&
+		cd trace_target_dir &&
+		test_seq $GIT_TRACE2_MAX_FILES >../expected_filenames.txt &&
+		xargs touch <../expected_filenames.txt &&
+		cd .. &&
+		GIT_TRACE2_EVENT="$(pwd)/trace_target_dir" test-tool trace2 001return 0
+	) &&
+	echo git-trace2-overload >>expected_filenames.txt &&
+	ls trace_target_dir >ls_output.txt &&
+	test_cmp expected_filenames.txt ls_output.txt
+'
+
 test_done
diff --git a/trace2/tr2_dst.c b/trace2/tr2_dst.c
index 5dda0ca1cd..414053d550 100644
--- a/trace2/tr2_dst.c
+++ b/trace2/tr2_dst.c
@@ -1,3 +1,5 @@
+#include <dirent.h>
+
 #include "cache.h"
 #include "trace2/tr2_dst.h"
 #include "trace2/tr2_sid.h"
@@ -8,6 +10,19 @@
  */
 #define MAX_AUTO_ATTEMPTS 10
 
+/*
+ * Sentinel file used to detect when we're overloading a directory with too many
+ * trace files.
+ */
+#define OVERLOAD_SENTINEL_NAME "git-trace2-overload"
+
+/*
+ * When set to zero, disables directory overload checking. Otherwise, controls
+ * how many files we can write to a directory before entering overload mode.
+ * This can be overridden via the TR2_SYSENV_MAX_FILES setting.
+ */
+static int tr2env_max_files = 0;
+
 static int tr2_dst_want_warning(void)
 {
 	static int tr2env_dst_debug = -1;
@@ -32,6 +47,67 @@ void tr2_dst_trace_disable(struct tr2_dst *dst)
 	dst->need_close = 0;
 }
 
+/*
+ * Check to make sure we're not overloading the target directory with too many
+ * files. First get the threshold (if present) from the config or envvar. If
+ * it's zero or unset, disable this check.  Next check for the presence of a
+ * sentinel file, then check file count. If we are overloaded, create the
+ * sentinel file if it doesn't already exist.
+ *
+ * We expect that some trace processing system is gradually collecting files
+ * from the target directory; after it removes the sentinel file we'll start
+ * writing traces again.
+ */
+static int tr2_dst_overloaded(const char *tgt_prefix)
+{
+	int file_count = 0, max_files = 0, ret = 0;
+	const char *max_files_var;
+	DIR *dirp;
+	struct strbuf path = STRBUF_INIT, sentinel_path = STRBUF_INIT;
+	struct stat statbuf;
+
+	strbuf_addstr(&path, tgt_prefix);
+	if (!is_dir_sep(path.buf[path.len - 1])) {
+		strbuf_addch(&path, '/');
+	}
+
+	/* Get the config or envvar and decide if we should continue this check */
+	max_files_var = tr2_sysenv_get(TR2_SYSENV_MAX_FILES);
+	if (max_files_var && *max_files_var && ((max_files = atoi(max_files_var)) >= 0))
+		tr2env_max_files = max_files;
+
+	if (!tr2env_max_files) {
+		ret = 0;
+		goto cleanup;
+	}
+
+	/* check sentinel */
+	strbuf_addbuf(&sentinel_path, &path);
+	strbuf_addstr(&sentinel_path, OVERLOAD_SENTINEL_NAME);
+	if (!stat(sentinel_path.buf, &statbuf)) {
+		ret = 1;
+		goto cleanup;
+	}
+
+	/* check file count */
+	dirp = opendir(path.buf);
+	while (file_count < tr2env_max_files && dirp && readdir(dirp))
+		file_count++;
+	if (dirp)
+		closedir(dirp);
+
+	if (file_count >= tr2env_max_files) {
+		creat(sentinel_path.buf, 0666);
+		ret = 1;
+		goto cleanup;
+	}
+
+cleanup:
+	strbuf_release(&path);
+	strbuf_release(&sentinel_path);
+	return ret;
+}
+
 static int tr2_dst_try_auto_path(struct tr2_dst *dst, const char *tgt_prefix)
 {
 	int fd;
@@ -50,6 +126,16 @@ static int tr2_dst_try_auto_path(struct tr2_dst *dst, const char *tgt_prefix)
 	strbuf_addstr(&path, sid);
 	base_path_len = path.len;
 
+	if (tr2_dst_overloaded(tgt_prefix)) {
+		strbuf_release(&path);
+		if (tr2_dst_want_warning())
+			warning("trace2: not opening %s trace file due to too "
+				"many files in target directory %s",
+				tr2_sysenv_display_name(dst->sysenv_var),
+				tgt_prefix);
+		return 0;
+	}
+
 	for (attempt_count = 0; attempt_count < MAX_AUTO_ATTEMPTS; attempt_count++) {
 		if (attempt_count > 0) {
 			strbuf_setlen(&path, base_path_len);
diff --git a/trace2/tr2_sysenv.c b/trace2/tr2_sysenv.c
index 5958cfc424..3c3792eca2 100644
--- a/trace2/tr2_sysenv.c
+++ b/trace2/tr2_sysenv.c
@@ -49,6 +49,9 @@ static struct tr2_sysenv_entry tr2_sysenv_settings[] = {
 				       "trace2.perftarget" },
 	[TR2_SYSENV_PERF_BRIEF]    = { "GIT_TRACE2_PERF_BRIEF",
 				       "trace2.perfbrief" },
+
+	[TR2_SYSENV_MAX_FILES]     = { "GIT_TRACE2_MAX_FILES",
+				       "trace2.maxfiles" },
 };
 /* clang-format on */
 
diff --git a/trace2/tr2_sysenv.h b/trace2/tr2_sysenv.h
index 8dd82a7a56..d4364a7b85 100644
--- a/trace2/tr2_sysenv.h
+++ b/trace2/tr2_sysenv.h
@@ -24,6 +24,8 @@ enum tr2_sysenv_variable {
 	TR2_SYSENV_PERF,
 	TR2_SYSENV_PERF_BRIEF,
 
+	TR2_SYSENV_MAX_FILES,
+
 	TR2_SYSENV_MUST_BE_LAST
 };
 
-- 
2.23.0.237.gc6a4ce50a0-goog

