git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Jessica Clarke <jrtc27@jrtc27.com>, git@vger.kernel.org
Subject: Re: [PATCH] Properly align memory allocations and temporary buffers
Date: Fri, 7 Jan 2022 17:08:56 +0100	[thread overview]
Message-ID: <3b469866-3718-9f0e-419c-bdc793de18e9@web.de> (raw)
In-Reply-To: <20220105132324.6651-1-jrtc27@jrtc27.com>

Am 05.01.22 um 14:23 schrieb Jessica Clarke:
> Currently git_qsort_s allocates a buffer on the stack that has no
> alignment, and mem_pool_alloc assumes uintmax_t's size is adequate
> alignment for any type.
>
> On CHERI, and thus Arm's Morello prototype, pointers are implemented as
> hardware capabilities which, as well as having a normal integer address,
> have additional bounds, permissions and other metadata in a second word,
> so on a 64-bit architecture they are 128-bit quantities, including their
> alignment requirements. Despite being 128-bit, their integer component
> is still only a 64-bit field, so uintmax_t remains 64-bit, and therefore
> uintmax_t does not sufficiently align an allocation.
>
> Moreover, these capabilities have an additional "129th" tag bit, which
> tracks the validity of the capability and is cleared on any invalid
> operation that doesn't trap (e.g. partially overwriting a capability
> will invalidate it) which, combined with the architecture's strict
> checks on capability manipulation instructions, ensures it is
> architecturally impossible to construct a capability that gives more
> rights than those you were given in the first place. To store these tag
> bits, each capability sized and aligned word in memory gains a single
> tag bit that is stored in unaddressable (to the processor) memory. This
> means that it is impossible to store a capability at an unaligned
> address: a normal load or store of a capability will always take an
> alignment fault even if the (micro)architecture supports unaligned
> loads/stores for other data types, and a memcpy will, if the destination
> is not appropriately aligned, copy the byte representation but lose the
> tag, meaning that if it is eventually copied back and loaded from an
> aligned location any attempt to dereference it will trap with a tag
> fault. Thus, even char buffers that are memcpy'ed to or from must be
> properly aligned on CHERI architectures if they are to hold pointers.
>
> Address both of these by introducing a new git_max_align type put in a
> union with the on-stack buffer to force its alignment, as well as a new
> GIT_MAX_ALIGNMENT macro whose value is the alignment of git_max_align
> that gets used for mem_pool_alloc. As well as making the code work on
> CHERI, the former change likely also improves performance on some
> architectures by making memcpy faster (either because it can use larger
> block sizes or because the microarchitecture has inefficient unaligned
> accesses).
>
> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
> ---
>  compat/qsort_s.c  | 11 +++++++----
>  git-compat-util.h | 11 +++++++++++
>  mem-pool.c        |  6 +++---
>  3 files changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/compat/qsort_s.c b/compat/qsort_s.c
> index 52d1f0a73d..1ccdb87451 100644
> --- a/compat/qsort_s.c
> +++ b/compat/qsort_s.c
> @@ -49,16 +49,19 @@ int git_qsort_s(void *b, size_t n, size_t s,
>  		int (*cmp)(const void *, const void *, void *), void *ctx)
>  {
>  	const size_t size = st_mult(n, s);
> -	char buf[1024];
> +	union {
> +		char buf[1024];
> +		git_max_align align;
> +	} u;
>
>  	if (!n)
>  		return 0;
>  	if (!b || !cmp)
>  		return -1;
>
> -	if (size < sizeof(buf)) {
> -		/* The temporary array fits on the small on-stack buffer. */
> -		msort_with_tmp(b, n, s, cmp, buf, ctx);
> +	if (size < sizeof(u.buf)) {
> +		/* The temporary array fits in the small on-stack buffer. */
> +		msort_with_tmp(b, n, s, cmp, u.buf, ctx);

So buf gets maximum alignment instead of char alignment (i.e. none)
because some callers use it to sort pointers, which need that on your
platform.  Makes sense.

>  	} else {
>  		/* It's somewhat large, so malloc it.  */
>  		char *tmp = xmalloc(size);

tmp is used instead of buf if the latter is not big enough, so it can
also contain pointers.  No problem, because malloc(3) returns memory
that is properly aligned for anything already.


stable-qsort.c uses the same algorithm as compat/qsort_s.c, it just
lacks the context pointer.  Shouldn't it get the same treatment?  It
is used e.g. (via the macro STABLE_QSORT) in merge-ort.c to sort
pointers..

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 5fa54a7afe..28581a45c5 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -274,6 +274,17 @@ typedef unsigned long uintptr_t;
>  #define _ALL_SOURCE 1
>  #endif
>
> +typedef union {
> +	uintmax_t max_align_uintmax;
> +	void *max_align_pointer;
> +} git_max_align;

For your purposes just the void * member would suffice here, right?  And
with the added uintmax_t this currently gets maximum alignment, suitable
for any of our objects.  If we were to start using __int128 etc. then
we'd have to add that to this union as well to really get the maximum
possible alignment, though.

> +
> +typedef struct {
> +	char unalign;
> +	git_max_align aligned;
> +} git_max_alignment;
> +#define GIT_MAX_ALIGNMENT offsetof(git_max_alignment, aligned)

C11 has alignas, alignof and max_align_t.  We only recently started to
depend on some C99 features, so perhaps it's a bit early to use
stdalign.h in Git's code base.  That's a pity, though.  The
GIT_MAX_ALIGNMENT macro is sightly enough, but using a union to get
pointer alignment is a bit more cumbersome than something like

	alignas(alignof(max_align_t)) char buf[1024];

> +
>  /* used on Mac OS X */
>  #ifdef PRECOMPOSE_UNICODE
>  #include "compat/precompose_utf8.h"
> diff --git a/mem-pool.c b/mem-pool.c
> index ccdcad2e3d..748eff925a 100644
> --- a/mem-pool.c
> +++ b/mem-pool.c
> @@ -69,9 +69,9 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len)
>  	struct mp_block *p = NULL;
>  	void *r;
>
> -	/* round up to a 'uintmax_t' alignment */
> -	if (len & (sizeof(uintmax_t) - 1))
> -		len += sizeof(uintmax_t) - (len & (sizeof(uintmax_t) - 1));
> +	/* round up to a 'GIT_MAX_ALIGNMENT' alignment */
> +	if (len & (GIT_MAX_ALIGNMENT - 1))
> +		len += GIT_MAX_ALIGNMENT - (len & (GIT_MAX_ALIGNMENT - 1));

OK.

>
>  	if (pool->mp_block &&
>  	    pool->mp_block->end - pool->mp_block->next_free >= len)


  parent reply	other threads:[~2022-01-07 16:09 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-05 13:23 [PATCH] Properly align memory allocations and temporary buffers Jessica Clarke
2022-01-06 21:46 ` Taylor Blau
2022-01-06 21:56   ` Jessica Clarke
2022-01-06 22:27   ` Junio C Hamano
2022-01-06 22:56     ` Jessica Clarke
2022-01-07  0:10       ` Junio C Hamano
2022-01-07  0:22         ` Jessica Clarke
2022-01-07  0:31         ` brian m. carlson
2022-01-07  0:39           ` Jessica Clarke
2022-01-07  1:43             ` brian m. carlson
2022-01-07  2:08               ` Jessica Clarke
2022-01-07  2:11                 ` Jessica Clarke
2022-01-07 19:30               ` Junio C Hamano
2022-01-07 19:33                 ` Jessica Clarke
2022-01-07 20:56                 ` René Scharfe
2022-01-07 21:30                   ` Junio C Hamano
2022-01-07 23:30                     ` René Scharfe
2022-01-08  0:18                       ` Elijah Newren
2022-01-06 23:22 ` brian m. carlson
2022-01-06 23:31   ` Jessica Clarke
2022-01-07 14:57 ` Philip Oakley
2022-01-07 16:08 ` René Scharfe [this message]
2022-01-07 16:21   ` Jessica Clarke
2022-01-12 13:58 ` Jessica Clarke
2022-01-12 15:47   ` René Scharfe
2022-01-12 15:49     ` Jessica Clarke
2022-01-23 15:24 ` [PATCH v2] mem-pool: Don't assume uintmax_t is aligned enough for all types Jessica Clarke
2022-01-23 20:17   ` Junio C Hamano
2022-01-23 20:23     ` Jessica Clarke
2022-01-23 20:28       ` Junio C Hamano
2022-01-23 20:33   ` [PATCH v3] " Jessica Clarke
2022-01-24 17:11     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3b469866-3718-9f0e-419c-bdc793de18e9@web.de \
    --to=l.s.r@web.de \
    --cc=git@vger.kernel.org \
    --cc=jrtc27@jrtc27.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).