From: "Andrzej Hunt via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Chris Torek <chris.torek@gmail.com>, Jeff King <peff@peff.net>,
Andrzej Hunt <andrzej@ahunt.org>
Subject: [PATCH v2 0/3] Fix uninitialised reads found with MSAN
Date: Mon, 14 Jun 2021 15:51:13 +0000 [thread overview]
Message-ID: <pull.1033.v2.git.git.1623685877.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1033.git.git.1623343712.gitgitgadget@gmail.com>
V2 replaces an #if'd memset with some brace initialisation (patch 3/3) as
per review comments.
I've also removed an irrelevant "technically" from commit message 2/3, and
fixed a typo in commit message 3/3.
Andrzej Hunt (3):
bulk-checkin: make buffer reuse more obvious and safer
split-index: use oideq instead of memcmp to compare object_id's
builtin/checkout--worker: zero-initialise struct to avoid MSAN
complaints
builtin/checkout--worker.c | 2 +-
bulk-checkin.c | 3 +--
split-index.c | 3 ++-
3 files changed, 4 insertions(+), 4 deletions(-)
base-commit: 62a8d224e6203d9d3d2d1d63a01cf5647ec312c9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1033%2Fahunt%2Fmsan-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1033/ahunt/msan-v2
Pull-Request: https://github.com/git/git/pull/1033
Range-diff vs v1:
1: 7659d4bf13c2 = 1: 7659d4bf13c2 bulk-checkin: make buffer reuse more obvious and safer
2: 14b0d5dd7fce ! 2: 6943eb511bee split-index: use oideq instead of memcmp to compare object_id's
@@ Commit message
include that field when calling memcmp on a subset of the cache_entry.
Depending on which hashing algorithm is being used, only part of
object_id.hash is actually being used, therefore including it in a
- memcmp() is technically incorrect. Instead we choose to exclude the
- object_id when calling memcmp(), and call oideq() separately.
+ memcmp() is incorrect. Instead we choose to exclude the object_id when
+ calling memcmp(), and call oideq() separately.
This issue was found when running t1700-split-index with MSAN, see MSAN
output below (on my machine, offset 76 corresponds to 4 bytes after the
3: cd1e1f6985c7 ! 3: 4bdc0b77f6f2 builtin/checkout--worker: memset struct to avoid MSAN complaints
@@ Metadata
Author: Andrzej Hunt <ajrhunt@google.com>
## Commit message ##
- builtin/checkout--worker: memset struct to avoid MSAN complaints
+ builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints
report_result() sends a struct to the parent process, but that struct
- contains unintialised padding bytes. Running this code under MSAN
- rightly triggers a warning - but we also don't care about this warning
- because we control the receiving code, and we therefore know that those
- padding bytes won't be read on the receiving end. Therefore we add a
- memset to convince MSAN that this memory is safe to read - but only
- when building with MSAN to avoid this cost in normal usage.
+ would contain uninitialised padding bytes. Running this code under MSAN
+ rightly triggers a warning - but we don't particularly care about this
+ warning because we control the receiving code, and we therefore know
+ that those padding bytes won't be read on the receiving end.
+
+ We could simply suppress this warning under MSAN with the approporiate
+ ifdef'd attributes, but a less intrusive solution is to 0-initialise the
+ struct, which guarantees that the padding will also be initialised.
Interestingly, in the error-case branch, we only try to copy the first
two members of pc_item_result, by copying only PC_ITEM_RESULT_BASE_SIZE
@@ Commit message
after the end of the second last member. We could avoid doing this by
redefining PC_ITEM_RESULT_BASE_SIZE as
'offsetof(second_last_member) + sizeof(second_last_member)', but there's
- no huge benefit to doing so (and our memset hack silences the MSAN
- warning in this scenario either way).
+ no huge benefit to doing so (and this patch silences the MSAN warning in
+ this scenario either way).
MSAN output from t2080 (partially interleaved due to the
parallel work :) ):
@@ Commit message
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
## builtin/checkout--worker.c ##
-@@ builtin/checkout--worker.c: static void report_result(struct parallel_checkout_item *pc_item)
- struct pc_item_result res;
+@@ builtin/checkout--worker.c: static void packet_to_pc_item(const char *buffer, int len,
+
+ static void report_result(struct parallel_checkout_item *pc_item)
+ {
+- struct pc_item_result res;
++ struct pc_item_result res = { 0 };
size_t size;
-+#if defined(__has_feature)
-+# if __has_feature(memory_sanitizer)
-+ // MSAN workaround: res contains padding bytes, which will remain
-+ // permanently unintialised. Later, we read all of res in order to send
-+ // it to the parent process - and MSAN (rightly) complains that we're
-+ // reading those unintialised padding bytes. By memset'ing res we
-+ // guarantee that there are no uninitialised bytes.
-+ memset(&res, 0, sizeof(res));
-+#endif
-+#endif
-+
res.id = pc_item->id;
- res.status = pc_item->status;
-
--
gitgitgadget
next prev parent reply other threads:[~2021-06-14 15:53 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-10 16:48 [PATCH 0/3] Fix uninitialised reads found with MSAN Andrzej Hunt via GitGitGadget
2021-06-10 16:48 ` [PATCH 1/3] bulk-checkin: make buffer reuse more obvious and safer Andrzej Hunt via GitGitGadget
2021-06-10 16:48 ` [PATCH 2/3] split-index: use oideq instead of memcmp to compare object_id's Andrzej Hunt via GitGitGadget
2021-06-10 16:48 ` [PATCH 3/3] builtin/checkout--worker: memset struct to avoid MSAN complaints Andrzej Hunt via GitGitGadget
2021-06-11 4:43 ` Chris Torek
2021-06-11 6:28 ` Junio C Hamano
2021-06-11 15:37 ` Andrzej Hunt
2021-06-14 1:04 ` Junio C Hamano
2021-06-11 17:11 ` [PATCH 0/3] Fix uninitialised reads found with MSAN Jeff King
2021-06-14 15:51 ` Andrzej Hunt via GitGitGadget [this message]
2021-06-14 15:51 ` [PATCH v2 1/3] bulk-checkin: make buffer reuse more obvious and safer Andrzej Hunt via GitGitGadget
2021-06-14 15:51 ` [PATCH v2 2/3] split-index: use oideq instead of memcmp to compare object_id's Andrzej Hunt via GitGitGadget
2021-06-14 15:51 ` [PATCH v2 3/3] builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints Andrzej Hunt via GitGitGadget
2021-06-17 9:28 ` [PATCH 0/3] Fix uninitialised reads found with MSAN Philip Oakley
2021-06-20 15:19 ` Andrzej Hunt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.1033.v2.git.git.1623685877.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=andrzej@ahunt.org \
--cc=chris.torek@gmail.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).