git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Andrzej Hunt via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Chris Torek <chris.torek@gmail.com>, Jeff King <peff@peff.net>,
	Andrzej Hunt <andrzej@ahunt.org>
Subject: [PATCH v2 0/3] Fix uninitialised reads found with MSAN
Date: Mon, 14 Jun 2021 15:51:13 +0000	[thread overview]
Message-ID: <pull.1033.v2.git.git.1623685877.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1033.git.git.1623343712.gitgitgadget@gmail.com>

V2 replaces an #if'd memset with some brace initialisation (patch 3/3) as
per review comments.

I've also removed an irrelevant "technically" from commit message 2/3, and
fixed a typo in commit message 3/3.

Andrzej Hunt (3):
  bulk-checkin: make buffer reuse more obvious and safer
  split-index: use oideq instead of memcmp to compare object_id's
  builtin/checkout--worker: zero-initialise struct to avoid MSAN
    complaints

 builtin/checkout--worker.c | 2 +-
 bulk-checkin.c             | 3 +--
 split-index.c              | 3 ++-
 3 files changed, 4 insertions(+), 4 deletions(-)


base-commit: 62a8d224e6203d9d3d2d1d63a01cf5647ec312c9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1033%2Fahunt%2Fmsan-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1033/ahunt/msan-v2
Pull-Request: https://github.com/git/git/pull/1033

Range-diff vs v1:

 1:  7659d4bf13c2 = 1:  7659d4bf13c2 bulk-checkin: make buffer reuse more obvious and safer
 2:  14b0d5dd7fce ! 2:  6943eb511bee split-index: use oideq instead of memcmp to compare object_id's
     @@ Commit message
          include that field when calling memcmp on a subset of the cache_entry.
          Depending on which hashing algorithm is being used, only part of
          object_id.hash is actually being used, therefore including it in a
     -    memcmp() is technically incorrect. Instead we choose to exclude the
     -    object_id when calling memcmp(), and call oideq() separately.
     +    memcmp() is incorrect. Instead we choose to exclude the object_id when
     +    calling memcmp(), and call oideq() separately.
      
          This issue was found when running t1700-split-index with MSAN, see MSAN
          output below (on my machine, offset 76 corresponds to 4 bytes after the
 3:  cd1e1f6985c7 ! 3:  4bdc0b77f6f2 builtin/checkout--worker: memset struct to avoid MSAN complaints
     @@ Metadata
      Author: Andrzej Hunt <ajrhunt@google.com>
      
       ## Commit message ##
     -    builtin/checkout--worker: memset struct to avoid MSAN complaints
     +    builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints
      
          report_result() sends a struct to the parent process, but that struct
     -    contains unintialised padding bytes. Running this code under MSAN
     -    rightly triggers a warning - but we also don't care about this warning
     -    because we control the receiving code, and we therefore know that those
     -    padding bytes won't be read on the receiving end. Therefore we add a
     -    memset to convince MSAN that this memory is safe to read - but only
     -    when building with MSAN to avoid this cost in normal usage.
     +    would contain uninitialised padding bytes. Running this code under MSAN
     +    rightly triggers a warning - but we don't particularly care about this
     +    warning because we control the receiving code, and we therefore know
     +    that those padding bytes won't be read on the receiving end.
     +
     +    We could simply suppress this warning under MSAN with the approporiate
     +    ifdef'd attributes, but a less intrusive solution is to 0-initialise the
     +    struct, which guarantees that the padding will also be initialised.
      
          Interestingly, in the error-case branch, we only try to copy the first
          two members of pc_item_result, by copying only PC_ITEM_RESULT_BASE_SIZE
     @@ Commit message
          after the end of the second last member. We could avoid doing this by
          redefining PC_ITEM_RESULT_BASE_SIZE as
          'offsetof(second_last_member) + sizeof(second_last_member)', but there's
     -    no huge benefit to doing so (and our memset hack silences the MSAN
     -    warning in this scenario either way).
     +    no huge benefit to doing so (and this patch silences the MSAN warning in
     +    this scenario either way).
      
          MSAN output from t2080 (partially interleaved due to the
          parallel work :) ):
     @@ Commit message
          Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
      
       ## builtin/checkout--worker.c ##
     -@@ builtin/checkout--worker.c: static void report_result(struct parallel_checkout_item *pc_item)
     - 	struct pc_item_result res;
     +@@ builtin/checkout--worker.c: static void packet_to_pc_item(const char *buffer, int len,
     + 
     + static void report_result(struct parallel_checkout_item *pc_item)
     + {
     +-	struct pc_item_result res;
     ++	struct pc_item_result res = { 0 };
       	size_t size;
       
     -+#if defined(__has_feature)
     -+#  if __has_feature(memory_sanitizer)
     -+	// MSAN workaround: res contains padding bytes, which will remain
     -+	// permanently unintialised. Later, we read all of res in order to send
     -+	// it to the parent process - and MSAN (rightly) complains that we're
     -+	// reading those unintialised padding bytes. By memset'ing res we
     -+	// guarantee that there are no uninitialised bytes.
     -+	memset(&res, 0, sizeof(res));
     -+#endif
     -+#endif
     -+
       	res.id = pc_item->id;
     - 	res.status = pc_item->status;
     - 

-- 
gitgitgadget

  parent reply	other threads:[~2021-06-14 15:53 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-10 16:48 [PATCH 0/3] Fix uninitialised reads found with MSAN Andrzej Hunt via GitGitGadget
2021-06-10 16:48 ` [PATCH 1/3] bulk-checkin: make buffer reuse more obvious and safer Andrzej Hunt via GitGitGadget
2021-06-10 16:48 ` [PATCH 2/3] split-index: use oideq instead of memcmp to compare object_id's Andrzej Hunt via GitGitGadget
2021-06-10 16:48 ` [PATCH 3/3] builtin/checkout--worker: memset struct to avoid MSAN complaints Andrzej Hunt via GitGitGadget
2021-06-11  4:43   ` Chris Torek
2021-06-11  6:28     ` Junio C Hamano
2021-06-11 15:37       ` Andrzej Hunt
2021-06-14  1:04         ` Junio C Hamano
2021-06-11 17:11 ` [PATCH 0/3] Fix uninitialised reads found with MSAN Jeff King
2021-06-14 15:51 ` Andrzej Hunt via GitGitGadget [this message]
2021-06-14 15:51   ` [PATCH v2 1/3] bulk-checkin: make buffer reuse more obvious and safer Andrzej Hunt via GitGitGadget
2021-06-14 15:51   ` [PATCH v2 2/3] split-index: use oideq instead of memcmp to compare object_id's Andrzej Hunt via GitGitGadget
2021-06-14 15:51   ` [PATCH v2 3/3] builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints Andrzej Hunt via GitGitGadget
2021-06-17  9:28 ` [PATCH 0/3] Fix uninitialised reads found with MSAN Philip Oakley
2021-06-20 15:19   ` Andrzej Hunt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1033.v2.git.git.1623685877.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=andrzej@ahunt.org \
    --cc=chris.torek@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).