git@vger.kernel.org mailing list mirror (one of many)
From: "Martin Ågren" <martin.agren@gmail.com>
To: Git Mailing List <git@vger.kernel.org>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: What's cooking in git.git (Jan 2019, #01; Mon, 7)
Date: Wed, 9 Jan 2019 22:06:08 +0100
Message-ID: <CAN0heSqLUWpwRdeUvYj2KnDX-QxSOnWOdKWz77RjHKJ3AFUGEQ@mail.gmail.com>
In-Reply-To: <CAN0heSoRYYS3-UAamE9nibhORPoD+_TRHu5-ZTeYxYMS4BAnrA@mail.gmail.com>

On Wed, 9 Jan 2019 at 08:37, Martin Ågren <martin.agren@gmail.com> wrote:
>
> On Tue, 8 Jan 2019 at 00:34, Junio C Hamano <gitster@pobox.com> wrote:
> > * bc/sha-256 (2018-11-14) 12 commits
> >  - hash: add an SHA-256 implementation using OpenSSL
> >  - sha256: add an SHA-256 implementation using libgcrypt
> >  - Add a base implementation of SHA-256 support
> >  - commit-graph: convert to using the_hash_algo
> >  - t/helper: add a test helper to compute hash speed
> >  - sha1-file: add a constant for hash block size
> >  - t: make the sha1 test-tool helper generic
> >  - t: add basic tests for our SHA-1 implementation
> >  - cache: make hashcmp and hasheq work with larger hashes
> >  - hex: introduce functions to print arbitrary hashes
> >  - sha1-file: provide functions to look up hash algorithms
> >  - sha1-file: rename algorithm to "sha1"
> >
> >  Add sha-256 hash and plug it through the code to allow building Git
> >  with the "NewHash".
>
> AddressSanitizer barks at current pu (855f98be272f19d16564e) for a
> handful of tests.
>
> One example is t5702-protocol-v2.sh. [...]
>
> ==1691823==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6040000004f2 at pc 0x0000004ea0fd bp 0x7ffc53082590 sp 0x7ffc53081d40
> READ of size 32 at 0x6040000004f2 thread T0
>     #0 0x4ea0fc in __asan_memcpy llvm/projects/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cc:23
>     #1 0x8603ec in oidset_insert oidset.c
>     #2 0x86c977 in add_promisor_object packfile.c:2129:4
>     #3 0x86c07a in for_each_object_in_pack packfile.c:2070:7
>     #4 0x86c535 in for_each_packed_object packfile.c:2095:7
>     #5 0x86c651 in is_promisor_object packfile.c:2151:4

> 0x6040000004f2 is located 0 bytes to the right of 34-byte region [0x6040000004d0,0x6040000004f2)
> allocated by thread T0 here:
>     #0 0x4eb4cf in malloc llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146
>     #1 0x9fa1db in do_xmalloc wrapper.c:60:8
>     #2 0x9fa2fd in do_xmallocz wrapper.c:100:8
>     #3 0x9fa2fd in xmallocz_gently wrapper.c:113
>     #4 0x86a877 in unpack_compressed_entry packfile.c:1588:11
>     #5 0x86a02e in unpack_entry packfile.c:1737:11
>     #6 0x867431 in cache_or_unpack_entry packfile.c:1439:10
>     #7 0x867431 in packed_object_info packfile.c:1506
>     #8 0x96b7be in oid_object_info_extended sha1-file.c:1394:10
>     #9 0x96d7d0 in read_object sha1-file.c:1434:6
>     #10 0x96d7d0 in read_object_file_extended sha1-file.c:1476
>     #11 0x85cf40 in repo_read_object_file ./object-store.h:174:9
>     #12 0x85cf40 in parse_object object.c:273
>     #13 0x86c752 in add_promisor_object packfile.c:2108:23
>     #14 0x86c07a in for_each_object_in_pack packfile.c:2070:7
>     #15 0x86c535 in for_each_packed_object packfile.c:2095:7
>     #16 0x86c651 in is_promisor_object packfile.c:2151:4

I found some more time to look into this.

It seems we have a buffer with raw data and we set up a `struct
object_id *` pointing into it, at a (supposed) OID value. Then
`update_tree_entry_internal()` verifies that the buffer contains
sufficiently many bytes, i.e., at least `the_hash_algo->rawsz` (=20).
We then immediately call `oidset_insert()`, which copies an entire
struct, i.e., `sizeof(struct object_id)` (=32) bytes. That is 12 more
than what is known to be safe, and for this particular input data, we
read outside allocated memory.
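
To make the arithmetic concrete, here is a minimal standalone sketch. It
is not Git code: every `sketch_`/`SKETCH_` name is hypothetical, the
struct only mimics the size of `struct object_id`, and the helpers
compute sizes rather than touching memory, so nothing here reads out of
bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes, taken from the numbers in this message. */
#define SKETCH_SHA1_RAWSZ 20	/* the_hash_algo->rawsz for SHA-1 */
#define SKETCH_MAX_RAWSZ 32	/* room for a SHA-256 value */

/* Hypothetical stand-in for struct object_id. */
struct sketch_object_id {
	unsigned char hash[SKETCH_MAX_RAWSZ];
};

/* The kind of check the tree walk does: at least rawsz bytes remain. */
static int sketch_entry_fits(size_t buf_len, size_t offset, size_t rawsz)
{
	return offset <= buf_len && buf_len - offset >= rawsz;
}

/*
 * Number of bytes a whole-struct copy starting at `offset` would read
 * past the end of a buffer of `buf_len` bytes (0 if it stays inside).
 */
static size_t sketch_overread(size_t buf_len, size_t offset)
{
	size_t end;

	assert(offset <= buf_len);
	end = offset + sizeof(struct sketch_object_id);
	return end > buf_len ? end - buf_len : 0;
}
```

With the 34-byte allocation from the report and an assumed (made-up)
OID offset of 13, `sketch_entry_fits(34, 13, SKETCH_SHA1_RAWSZ)` is
true, yet `sketch_overread(34, 13)` is 11: the length check passes
while the struct copy still runs off the end.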

I can think of three possible approaches:

* Allocate with a margin (GIT_MAX_RAWSZ - the_hash_algo->rawsz) where
  "necessary" (TM). Maybe not so maintainable.

* Teach `oidset_insert()` (i.e., khash) to only copy
  `the_hash_algo->rawsz` bytes. Maybe not so good for performance.

* Ignore.
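
For illustration only, the first two options could look roughly like the
sketch below. None of this is existing Git or khash API: the `sketch_`
names are hypothetical, and the copy helper just shows the idea of
reading `rawsz` bytes from the raw buffer and zero-filling the rest of
the struct, so that later whole-struct copies and compares are safe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_RAWSZ 32	/* assumed size of the largest hash */

/* Hypothetical stand-in for struct object_id. */
struct sketch_object_id {
	unsigned char hash[SKETCH_MAX_RAWSZ];
};

/* Option 1: extra bytes to allocate so a full-struct read stays inside. */
static size_t sketch_alloc_margin(size_t rawsz)
{
	assert(rawsz <= SKETCH_MAX_RAWSZ);
	return SKETCH_MAX_RAWSZ - rawsz;
}

/*
 * Option 2: read only rawsz bytes from the raw buffer and zero the
 * tail, instead of copying sizeof(struct) bytes from it.
 */
static void sketch_oid_from_raw(struct sketch_object_id *dst,
				const unsigned char *raw, size_t rawsz)
{
	assert(rawsz <= SKETCH_MAX_RAWSZ);
	memset(dst->hash, 0, sizeof(dst->hash));
	memcpy(dst->hash, raw, rawsz);
}
```

The second helper costs an extra `memset()` per copy, which is the kind
of overhead the performance worry above refers to; whether it matters
in practice would need measuring.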

I wonder which of these is the least awful, or if there are other ideas.

Martin
