NOTE: This code is not yet in a state to be committed, but I wanted to publish something at this point in order to start public discussions on the approach before I develop it further. The code as published has been tested on a model running a kernel that enables memory tagging, but there are a number of issues that still need to be resolved before I would even consider asking for a merge. I'm not asking for a review of the code so much as a review of the approach at this point in time.

I'm posting this now, before the Cauldron, so that hopefully we can have some useful discussions on the matter there.

ARMv8.5 adds an extension known as MTE (memory tagging extension); the extension allows granules of memory (16 bytes per granule) to be individually 'coloured' with a 4-bit tag value. Unused bits in the top byte of a 64-bit address pointer can then be set to describe the colour expected at that address, and a protection fault can be raised if there is a mismatch. This can be used for a number of purposes, but it is primarily intended to assist with debugging a number of run-time faults that are common in code, including buffer overruns and use-after-free errors.

Nomenclature: The AArch64 extension is called MTE. I've tried to use the term 'memory tagging' (mtag) in the generic code to keep the layers separate. Ideally mtag can be used on multiple architectures.

This patch proposes a way to utilize the extension to provide such protection in glibc's malloc() API. Essentially the code here is divided into four logical sets of changes, though for the purposes of this discussion I've rolled them up into a single patch set.

Part 1 is a simple change to the configuration code to allow memory tagging support to be built into glibc.

Part 2 introduces a new (currently internal) API within glibc that provides access to the architecture-specific memory tagging operations.
The API essentially compiles down to either no-ops or existing standard library functions when the extension is disabled.

Part 3 is the bulk of the changes to malloc/malloc.c to use the API to colour memory; I've tried to ensure that when the extension is disabled there is no overhead on existing users. If the extension is enabled during the build but disabled at run time (e.g. to support systems that do not have the extension), then there are some minor overheads, but they are hopefully not significant.

Part 4 contains some target-specific changes for AArch64; when MTE is enabled we have to be very careful about buffer overruns on read operations. Consequently we have to constrain some of the string operations to ensure that they do not unsafely read across a tag granule boundary. This code is very preliminary at present; eventually we would want to be able to select the code at run time and revert to the standard code if tagging is disabled.

Parts 2 and 3 are obviously the focus of the discussion I'd like to have at present.

For part 2, the API is currently entirely private within glibc, but potentially in future (once we're happy that the concept is stable) it might be useful to open this up to users as well.

Part 3, which is the bulk of the changes, colours all memory requests obtained through the malloc API. Each call to malloc (or any of the more aligned variants) or realloc will return a coloured pointer if the extension is enabled.

- for realloc I've chosen to recolour all affected memory even if the same logical address is returned (the pointer will contain a different colour, ensuring that before-and-after pointers will not compare equal). This is perhaps the most extreme position, but in some cases it might catch assumptions or code that continues to use the pre-realloc address incorrectly.
- colouring is mostly done at the outermost level of the library.
  This is not necessarily the most efficient point to do this, but it certainly creates the least disruption in the code base. The main exception to this is realloc, where the separation of the layers is not quite as clean as for the other entry points. The advantage of colouring at the outermost level is that calloc() can combine the clearing of the memory with the colouring process if the architecture supports that (MTE does).
- one colour is retained for use by libc itself. This can be used to detect when user code goes outside the allocated buffer region.
- free() recolours the memory; this is a run-time overhead but is useful for catching use-after-free accesses.

Limitations in the prototype:

MTE has a granule size (minimum colourable memory block) of 16 bytes. This happens to fit well with malloc's block header, which on AArch64 is also 16 bytes in size, and thus leads to little overhead in the data structures. I haven't yet attempted to look at support for other sizes, but I suspect that things will become a bit messy if the granule is larger than the block header (it will certainly be less efficient).

At present, the code simply assumes that all memory could be tagged (though it works correctly if it is not). We are in discussions with the kernel folk about the possible syscall API extensions that might be needed to make memory requested from the kernel taggable.

I've written enough for now. Let the discussions begin...

-----

[mtag] Allow memory tagging to be enabled from the command line

This patch adds the configuration machinery to allow memory tagging to be enabled from the command line via the configure option --enable-memory-tagging.

The current default is off, though in time we may change that once the API is more stable.

[AArch64][mtag] Basic support for memory tagging

This patch adds the basic support for memory tagging. This is very much preliminary code and is unlikely to be in its final form.
- generic/libc-mtag.h - default implementation of the memory tagging interface. Maps most functions onto no-ops; a few are mapped back onto existing APIs (e.g. memset).
- aarch64/libc-mtag.h - implementation for AArch64.
- aarch64/__mtag_* - helper functions for memory tagging (unoptimized).
- malloc/malloc.c - updates to support tagging of memory allocations.

[AArch64][mtag] Mitigations for string functions when MTE is enabled.

This is an initial set of patches for mitigating MTE issues when the extension is enabled. Most of the changes are sub-optimal, but should avoid the boundary conditions that can cause spurious MTE faults.
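
As a rough illustration of the granule arithmetic behind these mitigations and the colouring scheme described earlier, here is a small portable sketch. It is not code from the patch set: every name in it is hypothetical, it performs no real tagging, and it only assumes the 16-byte granule and 4-bit colour described above, with the colour held in bits 56-59 of the address (where MTE keeps its logical tag).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants only; all names here are hypothetical and are
   not the internal API introduced by the patch.  */
#define MTAG_GRANULE_SIZE 16u
#define MTAG_TAG_SHIFT    56
#define MTAG_TAG_MASK     ((uint64_t) 0xf << MTAG_TAG_SHIFT)

/* Return the 4-bit colour carried in the top byte of ADDR.  */
static inline unsigned
mtag_get_colour (uint64_t addr)
{
  return (unsigned) ((addr & MTAG_TAG_MASK) >> MTAG_TAG_SHIFT);
}

/* Return ADDR with its colour replaced by COLOUR (0..15).  */
static inline uint64_t
mtag_set_colour (uint64_t addr, unsigned colour)
{
  assert (colour < 16);
  return (addr & ~MTAG_TAG_MASK) | ((uint64_t) colour << MTAG_TAG_SHIFT);
}

/* Round SIZE up to a whole number of granules.  */
static inline size_t
mtag_round_up (size_t size)
{
  return (size + MTAG_GRANULE_SIZE - 1) & ~(size_t) (MTAG_GRANULE_SIZE - 1);
}

/* Number of bytes that can be read starting at ADDR without crossing
   into the next granule (which may carry a different colour).  */
static inline size_t
mtag_bytes_to_boundary (uint64_t addr)
{
  return MTAG_GRANULE_SIZE - (size_t) (addr & (MTAG_GRANULE_SIZE - 1));
}
```

A string routine that wants to read ahead of the terminator would, under these assumptions, limit any speculative read to mtag_bytes_to_boundary (addr) bytes. Two pointers to the same logical address but with different colours compare unequal as integers, which is the property the realloc behaviour described earlier relies on.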