From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
"Emily Shaffer" <emilyshaffer@google.com>,
"Taylor Blau" <me@ttaylorr.com>,
"Eric Sunshine" <sunshine@sunshineco.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v2 0/6] tr2: plug memory leaks + logic errors + Win32 & Linux feature parity
Date: Thu, 26 Aug 2021 14:22:18 +0200 [thread overview]
Message-ID: <cover-v2-0.6-00000000000-20210826T121820Z-avarab@gmail.com> (raw)
In-Reply-To: <cover-0.6-00000000000-20210825T231400Z-avarab@gmail.com>
An early re-roll due to some very good & early review by Taylor Blau,
this also fixes the grammar/typo pointed out by Eric Sunshine. Thanks
both:
This should address all the comments Taylor brought up, and hopefully
a bit more. The only thing I left outstanding was the inclusion of the
"while we're at it" enum refactoring in 5/6. It's not necessary for
the end-state here, but I think since we're already reviewing most of
this file it makes sense to change it while we're at it to
future-proof the code vis-as-vis getting stricted checks from the
compiler.
Ævar Arnfjörð Bjarmason (6):
tr2: remove NEEDSWORK comment for "non-procfs" implementations
tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
tr2: stop leaking "thread_name" memory
tr2: fix memory leak & logic error in 2f732bf15e6
tr2: do compiler enum check in trace2_collect_process_info()
tr2: log N parent process names on Linux
compat/linux/procinfo.c | 169 ++++++++++++++++++++++++++++++++++------
trace2/tr2_tls.c | 1 +
2 files changed, 146 insertions(+), 24 deletions(-)
Range-diff against v1:
1: 8c649ce3b49 = 1: 8c649ce3b49 tr2: remove NEEDSWORK comment for "non-procfs" implementations
2: 0150e3402a7 = 2: 0150e3402a7 tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
3: 1fa1bbb6743 ! 3: 1d835d6767e tr2: stop leaking "thread_name" memory
@@ Commit message
Fix a memory leak introduced in ee4512ed481 (trace2: create new
combined trace facility, 2019-02-22), we were doing a free() of other
memory allocated in tr2tls_create_self(), but not the "thread_name"
- "strbuf strbuf".
+ "struct strbuf".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
4: 73e7d4eb6ac ! 4: 1aa0dbc394e tr2: fix memory leak & logic error in 2f732bf15e6
@@ Commit message
It was also using the strvec_push() API the wrong way. That API always
does an xstrdup(), so by detaching the strbuf here we'd leak
memory. Let's instead pass in our pointer for strvec_push() to
- xstrdup(), and then free our own strbuf.
+ xstrdup(), and then free our own strbuf. I do have some WIP changes to
+ make strvec_push_nodup() non-static, which makes this and some other
+ callsites nicer, but let's just follow the prevailing pattern of using
+ strvec_push() for now.
- Furthermore we need to free that "procfs_path" strbuf whether or not
+ We'll also need to free that "procfs_path" strbuf whether or not
strbuf_read_file() succeeds, which was another source of memory leaks
in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a
system where we could read the file from procfs.
+ Let's move all the freeing of the memory to the end of the
+ function. If we're still at STRBUF_INIT with "name" due to not haven
+ taken the branch where the strbuf_read_file() succeeds freeing it is
+ redundant, so we could move it into the body of the "if", but just
+ handling freeing the same way for all branches of the function makes
+ it more readable.
+
In combination with the preceding commit this makes all of
t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux.
@@ compat/linux/procinfo.c: static void get_ancestry_names(struct strvec *names)
strbuf_trim_trailing_newline(&name);
- strvec_push(names, strbuf_detach(&name, NULL));
+ strvec_push(names, name.buf);
-+ strbuf_release(&name);
}
+ strbuf_release(&procfs_path);
++ strbuf_release(&name);
++
return;
}
5: 4e378da2cce = 5: 70fef093d8d tr2: do compiler enum check in trace2_collect_process_info()
6: da003330800 ! 6: f6aac902484 tr2: log N parent process names on Linux
@@ Commit message
on Linux only the name of the immediate parent process was logged.
Extend the functionality added there to also log full parent chain on
- Linux. In 2f732bf15e6 it was claimed that "further ancestry info can
- be gathered with procfs, but it's unwieldy to do so.".
-
- I don't know what the author meant by that, but I think it probably
- referred to needing to slurp this up from the FS, as opposed to having
- an API. The underlying semantics on Linux are easier to deal with than
- on Windows though, at least as far as finding the parent PIDs
- goes. See the get_processes() function used on Windows. As shown in
- 353d3d77f4f (trace2: collect Windows-specific process information,
- 2019-02-22) it needs to deal with cycles.
-
- What is more complex on Linux is getting at the process name, a
- simpler approach is to use fscanf(), see [1] for an implementation of
- that, but as noted in the comment being added here it would fail in
- the face of some weird process names, so we need our own
- parse_proc_stat() to parse it out.
+ Linux.
+
+ This requires us to lookup "/proc/<getppid()>/stat" instead of
+ "/proc/<getppid()>/comm". The "comm" file just contains the name of the
+ process, but the "stat" file has both that information, and the parent
+ PID of that process, see procfs(5). We parse out the parent PID of our
+ own parent, and recursively walk the chain of "/proc/*/stat" files all
+ the way up the chain. A parent PID of 0 indicates the end of the
+ chain.
+
+ It's possible given the semantics of Linux's PID files that we end up
+ getting an entirely nonsensical chain of processes. It could happen if
+ e.g. we have a chain of processes like:
+
+ 1 (init) => 321 (bash) => 123 (git)
+
+ Let's assume that "bash" was started a while ago, and that as shown
+ the OS has already cycled back to using a lower PID for us than our
+ parent process. In the time it takes us to start up and get to
+ trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent
+ process might exit, and be replaced by an entirely different process!
+
+ We'd racily look up our own getppid(), but in the meantime our parent
+ would exit, and Linux would have cycled all the way back to starting
+ an entirely unrelated process as PID 321.
+
+ If that happens we'll just silently log incorrect data in our ancestry
+ chain. Luckily we don't need to worry about this except in this
+ specific cycling scenario, as Linux does not have PID
+ randomization. It appears it once did through a third-party feature,
+ but that it was removed around 2006[1]. For anyone worried about this
+ edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate
+ it, but not eliminate it.
+
+ One thing we don't need to worry about is getting into an infinite
+ loop when walking "/proc/*/stat". See 353d3d77f4f (trace2: collect
+ Windows-specific process information, 2019-02-22) for the related
+ Windows code that needs to deal with that, and [2] for an explanation
+ of that edge case.
+
+ Aside from potential race conditions it's also a bit painful to
+ correctly parse the process name out of "/proc/*/stat". A simpler
+ approach is to use fscanf(), see [3] for an implementation of that,
+ but as noted in the comment being added here it would fail in the face
+ of some weird process names, so we need our own parse_proc_stat() to
+ parse it out.
With this patch the "ancestry" chain for a trace2 event might look
like this:
@@ Commit message
"systemd"
]
- And in the case of naughty process names. This uses perl's ability to
- use prctl(PR_SET_NAME, ...). See Perl/perl5@7636ea95c5 (Set the legacy
- process name with prctl() on assignment to $0 on Linux, 2010-04-15)[2]:
+ And in the case of naughty process names like the following. This uses
+ perl's ability to use prctl(PR_SET_NAME, ...). See
+ Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on
+ assignment to $0 on Linux, 2010-04-15)[4]:
$ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry
[
@@ Commit message
"systemd"
]
- 1. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/
- 2. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e
+ 1. https://grsecurity.net/news#grsec2110
+ 2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/
+ 3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/
+ 4. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
@@ compat/linux/procinfo.c
-static void get_ancestry_names(struct strvec *names)
+/*
-+ * We need more complex parsing instat_parent_pid() and
++ * We need more complex parsing in stat_parent_pid() and
+ * parse_proc_stat() below than a dumb fscanf(). That's because while
+ * the statcomm field is surrounded by parentheses, the process itself
+ * is free to insert any arbitrary byte sequence its its name. That
-+ * can include newlines, spaces, closing parentheses etc. See
-+ * do_task_stat() in fs/proc/array.c in linux.git, this is in contrast
-+ * with the escaped version of the name found in /proc/%d/status.
++ * can include newlines, spaces, closing parentheses etc.
++ *
++ * See do_task_stat() in fs/proc/array.c in linux.git, this is in
++ * contrast with the escaped version of the name found in
++ * /proc/%d/status.
+ *
+ * So instead of using fscanf() we'll read N bytes from it, look for
+ * the first "(", and then the last ")", anything in-between is our
@@ compat/linux/procinfo.c
+ * that's 7 digits for a PID. We have 2 PIDs in the first four fields
+ * we're interested in, so 2 * 7 = 14.
+ *
-+ * We then have 4 spaces between those four values, which brings us up
-+ * to 18. Add the two parentheses and it's 20. The "state" is then one
-+ * character (now at 21).
++ * We then have 3 spaces between those four values, and we'd like to
++ * get to the space between the 4th and the 5th (the "pgrp" field) to
++ * make sure we read the entire "ppid" field. So that brings us up to
++ * 14 + 3 + 1 = 18. Add the two parentheses around the "comm" value
++ * and it's 20. The "state" value itself is then one character (now at
++ * 21).
+ *
+ * Finally the maximum length of the "comm" name itself is 15
+ * characters, e.g. a setting of "123456789abcdefg" will be truncated
@@ compat/linux/procinfo.c
+static int parse_proc_stat(struct strbuf *sb, struct strbuf *name,
+ int *statppid)
{
-+ const char *lhs = strchr(sb->buf, '(');
-+ const char *rhs = strrchr(sb->buf, ')');
++ const char *comm_lhs = strchr(sb->buf, '(');
++ const char *comm_rhs = strrchr(sb->buf, ')');
+ const char *ppid_lhs, *ppid_rhs;
+ char *p;
+ pid_t ppid;
+
-+ if (!lhs || !rhs)
++ if (!comm_lhs || !comm_rhs)
+ goto bad_kernel;
+
/*
@@ compat/linux/procinfo.c
+ * We're at the ")", that's followed by " X ", where X is a
+ * single "state" character. So advance by 4 bytes.
*/
-+ ppid_lhs = rhs + 4;
++ ppid_lhs = comm_rhs + 4;
+
++ /*
++ * Read until the space between the "ppid" and "pgrp" fields
++ * to make sure we're anchored after the untruncated "ppid"
++ * field..
++ */
+ ppid_rhs = strchr(ppid_lhs, ' ');
+ if (!ppid_rhs)
+ goto bad_kernel;
+
+ ppid = strtol(ppid_lhs, &p, 10);
+ if (ppid_rhs == p) {
-+ const char *comm = lhs + 1;
-+ int commlen = rhs - lhs - 1;
++ const char *comm = comm_lhs + 1;
++ size_t commlen = comm_rhs - comm;
+
-+ strbuf_addf(name, "%.*s", commlen, comm);
++ strbuf_add(name, comm, commlen);
+ *statppid = ppid;
+
+ return 0;
@@ compat/linux/procinfo.c
struct strbuf procfs_path = STRBUF_INIT;
- struct strbuf name = STRBUF_INIT;
+ struct strbuf sb = STRBUF_INIT;
-+ size_t n;
-+ FILE *fp = NULL;
++ FILE *fp;
+ int ret = -1;
/* try to use procfs if it's present. */
@@ compat/linux/procinfo.c
- if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) {
- strbuf_trim_trailing_newline(&name);
- strvec_push(names, name.buf);
-- strbuf_release(&name);
- }
+ strbuf_addf(&procfs_path, "/proc/%d/stat", pid);
+ fp = fopen(procfs_path.buf, "r");
+ if (!fp)
+ goto cleanup;
+
-+ n = strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp);
-+ if (n != STAT_PARENT_PID_READ_N)
++ /*
++ * We could be more strict here and assert that we read at
++ * least STAT_PARENT_PID_READ_N. My reading of procfs(5) is
++ * that on any modern kernel (at least since 2.6.0 released in
++ * 2003) even if all the mandatory numeric fields were zero'd
++ * out we'd get at least 100 bytes, but let's just check that
++ * we got anything at all and trust the parse_proc_stat()
++ * function to handle its "Bad Kernel?" error checking.
++ */
++ if (!strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp))
+ goto cleanup;
+ if (parse_proc_stat(&sb, name, statppid) < 0)
+ goto cleanup;
@@ compat/linux/procinfo.c
+ if (ppid)
+ push_ancestry_name(names, ppid);
+cleanup:
-+ strbuf_release(&name);
- return;
- }
+ strbuf_release(&name);
+ return;
@@ compat/linux/procinfo.c: void trace2_collect_process_info(enum trace2_process_info_reason reason)
*/
break;
--
2.33.0.733.ga72a4f1c2e1
next prev parent reply other threads:[~2021-08-26 12:22 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-07 0:29 [PATCH] tr2: log parent process name Emily Shaffer
2021-05-07 3:25 ` Bagas Sanjaya
2021-05-07 17:09 ` Emily Shaffer
2021-05-10 12:29 ` Ævar Arnfjörð Bjarmason
2021-05-11 21:31 ` Junio C Hamano
2021-05-14 22:06 ` Emily Shaffer
2021-05-16 3:48 ` Junio C Hamano
2021-05-17 20:17 ` Emily Shaffer
2021-05-11 17:28 ` Jeff Hostetler
2021-05-14 22:07 ` Emily Shaffer
2021-05-20 21:05 ` [PATCH v2] " Emily Shaffer
2021-05-20 21:36 ` Randall S. Becker
2021-05-20 23:23 ` Emily Shaffer
2021-05-21 13:20 ` Randall S. Becker
2021-05-21 16:24 ` Randall S. Becker
2021-05-21 2:09 ` Junio C Hamano
2021-05-21 19:02 ` Emily Shaffer
2021-05-21 23:22 ` Junio C Hamano
2021-05-24 18:37 ` Emily Shaffer
2021-05-21 19:15 ` Jeff Hostetler
2021-05-21 20:05 ` Emily Shaffer
2021-05-21 20:23 ` Randall S. Becker
2021-05-22 11:18 ` Jeff Hostetler
2021-05-24 23:33 ` Ævar Arnfjörð Bjarmason
2021-05-24 20:10 ` [PATCH v3] " Emily Shaffer
2021-05-24 20:49 ` Emily Shaffer
2021-05-25 3:54 ` Junio C Hamano
2021-05-25 13:33 ` Randall S. Becker
2021-06-08 18:58 ` [PATCH v4] " Emily Shaffer
2021-06-08 20:56 ` Emily Shaffer
2021-06-08 22:10 ` [PATCH v5] " Emily Shaffer
2021-06-08 22:16 ` Randall S. Becker
2021-06-08 22:24 ` Emily Shaffer
2021-06-08 22:39 ` Randall S. Becker
2021-06-09 20:17 ` Emily Shaffer
2021-06-16 8:42 ` Junio C Hamano
2021-06-28 16:45 ` Jeff Hostetler
2021-06-29 23:51 ` Emily Shaffer
2021-06-30 6:10 ` Ævar Arnfjörð Bjarmason
2021-07-22 0:21 ` Emily Shaffer
2021-07-22 1:27 ` [PATCH v6 0/2] " Emily Shaffer
2021-07-22 1:27 ` [PATCH v6 1/2] tr2: make process info collection platform-generic Emily Shaffer
2021-08-02 9:34 ` Ævar Arnfjörð Bjarmason
2021-07-22 1:27 ` [PATCH v6 2/2] tr2: log parent process name Emily Shaffer
2021-07-22 21:02 ` Junio C Hamano
2021-08-02 9:38 ` Ævar Arnfjörð Bjarmason
2021-08-02 12:45 ` Ævar Arnfjörð Bjarmason
2021-08-02 10:22 ` Ævar Arnfjörð Bjarmason
2021-08-02 12:47 ` Ævar Arnfjörð Bjarmason
2021-08-02 15:23 ` Jeff Hostetler
2021-08-02 16:10 ` Randall S. Becker
2021-08-02 18:41 ` Ævar Arnfjörð Bjarmason
2021-08-25 23:19 ` [PATCH 0/6] tr2: plug memory leaks + logic errors + Win32 & Linux feature parity Ævar Arnfjörð Bjarmason
2021-08-25 23:19 ` [PATCH 1/6] tr2: remove NEEDSWORK comment for "non-procfs" implementations Ævar Arnfjörð Bjarmason
2021-08-25 23:19 ` [PATCH 2/6] tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux Ævar Arnfjörð Bjarmason
2021-08-25 23:19 ` [PATCH 3/6] tr2: stop leaking "thread_name" memory Ævar Arnfjörð Bjarmason
2021-08-26 3:09 ` Taylor Blau
2021-08-25 23:19 ` [PATCH 4/6] tr2: fix memory leak & logic error in 2f732bf15e6 Ævar Arnfjörð Bjarmason
2021-08-26 3:21 ` Taylor Blau
2021-08-25 23:19 ` [PATCH 5/6] tr2: do compiler enum check in trace2_collect_process_info() Ævar Arnfjörð Bjarmason
2021-08-26 3:23 ` Taylor Blau
2021-08-25 23:19 ` [PATCH 6/6] tr2: log N parent process names on Linux Ævar Arnfjörð Bjarmason
2021-08-25 23:49 ` Eric Sunshine
2021-08-26 4:07 ` Taylor Blau
2021-08-26 12:24 ` "I don't know what the author meant by that..." (was "Re: [PATCH 6/6] tr2: log N parent process names on Linux") Ævar Arnfjörð Bjarmason
2021-08-26 12:22 ` Ævar Arnfjörð Bjarmason [this message]
2021-08-26 12:22 ` [PATCH v2 1/6] tr2: remove NEEDSWORK comment for "non-procfs" implementations Ævar Arnfjörð Bjarmason
2021-08-26 12:22 ` [PATCH v2 2/6] tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux Ævar Arnfjörð Bjarmason
2021-08-26 12:22 ` [PATCH v2 3/6] tr2: stop leaking "thread_name" memory Ævar Arnfjörð Bjarmason
2021-08-26 12:22 ` [PATCH v2 4/6] tr2: fix memory leak & logic error in 2f732bf15e6 Ævar Arnfjörð Bjarmason
2021-08-26 15:58 ` Eric Sunshine
2021-08-26 16:42 ` Junio C Hamano
2021-08-26 12:22 ` [PATCH v2 5/6] tr2: do compiler enum check in trace2_collect_process_info() Ævar Arnfjörð Bjarmason
2021-08-26 12:22 ` [PATCH v2 6/6] tr2: log N parent process names on Linux Ævar Arnfjörð Bjarmason
2021-08-26 22:38 ` [PATCH v2 0/6] tr2: plug memory leaks + logic errors + Win32 & Linux feature parity Taylor Blau
2021-08-27 8:02 ` [PATCH v3 " Ævar Arnfjörð Bjarmason
2021-08-27 8:02 ` [PATCH v3 1/6] tr2: remove NEEDSWORK comment for "non-procfs" implementations Ævar Arnfjörð Bjarmason
2021-08-27 8:02 ` [PATCH v3 2/6] tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux Ævar Arnfjörð Bjarmason
2021-08-27 8:02 ` [PATCH v3 3/6] tr2: stop leaking "thread_name" memory Ævar Arnfjörð Bjarmason
2021-08-27 8:02 ` [PATCH v3 4/6] tr2: leave the parent list empty upon failure & don't leak memory Ævar Arnfjörð Bjarmason
2021-08-27 8:02 ` [PATCH v3 5/6] tr2: do compiler enum check in trace2_collect_process_info() Ævar Arnfjörð Bjarmason
2021-08-27 8:02 ` [PATCH v3 6/6] tr2: log N parent process names on Linux Ævar Arnfjörð Bjarmason
2021-08-31 0:17 ` [PATCH v3 0/6] tr2: plug memory leaks + logic errors + Win32 & Linux feature parity Taylor Blau
2021-08-02 10:30 ` [PATCH v6 2/2] tr2: log parent process name Ævar Arnfjörð Bjarmason
2021-08-02 16:24 ` Junio C Hamano
2021-08-02 18:42 ` Ævar Arnfjörð Bjarmason
2021-07-22 16:59 ` [PATCH v6 0/2] " Jeff Hostetler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover-v2-0.6-00000000000-20210826T121820Z-avarab@gmail.com \
--to=avarab@gmail.com \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).