From: "René Scharfe" <l.s.r@web.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
Jessica Clarke <jrtc27@jrtc27.com>, Taylor Blau <me@ttaylorr.com>,
git@vger.kernel.org, Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH] Properly align memory allocations and temporary buffers
Date: Sat, 8 Jan 2022 00:30:44 +0100 [thread overview]
Message-ID: <f40c1b47-9aad-2dcc-ceeb-5dee2b517cd8@web.de> (raw)
In-Reply-To: <xmqqzgo76xpj.fsf@gitster.g>
Am 07.01.22 um 22:30 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>> If falling back to malloc(3) is fine in general then I wonder if we can
>> just it use always. This would save two branches and make buffer
>> management trivial here. How much worse is malloc(3) on platforms with
>> alloca(3)? Do we sort lots of short lists somewhere? In other words:
>> Does this stack allocation optimization actually make a measurable
>> difference?
>
> Well all the preimage of this came from your 04ee8b87 (compat: add
> qsort_s(), 2017-01-22), so you tell me ;-)
Right, except I stole the code almost verbatim from compat/qsort.c,
which had this optimization since 43fe901b71 (compat: Add simplified
merge sort implementation from glibc, 2008-02-05). :) The
optimization may have raised my eyebrow a bit at the time, but not
enough to come up with a meaningful benchmark..
https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/msort.c (the
original) still uses alloca(3) (and more tricks), by the way;
https://cgit.freebsd.org/src/tree/lib/libc/stdlib/merge.c still
doesn't.
>
>> Heap fragmention should not be a concern here, at least, because the
>> pattern of requesting, using and releasing a single allocation won't
>> leave holes.
>
> Sure. Even in a multi-threaded environment that should be true.
>
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 -----
> Subject: compat/qsort_s.c: avoid using potentially unaligned access
>
> The compatibility definition for qsort_s() uses "char buffer[1024]"
> on the stack to avoid making malloc() calls for small temporary
> space, which essentially hand-rolls alloca().
>
> But the elements of the array being sorted may have alignment needs
> more strict than what an array of bytes may have. &buf[0] may be
> word aligned, but using the address as if it stores the first
> element of an array of a struct, whose first member may need to be
> aligned on double-word boundary, would be a no-no.
>
> We could use xalloca() from git-compat-util.h, or alloca() directly
> on platforms with HAVE_ALLOCA_H, but let's try using unconditionally
> xmalloc() before we know the performance characteristics of the
> callers.
>
> It may not make much of an argument to inspect the current callers
> and say "it shouldn't matter to any of them", but anyway:
>
> * The one in object-name.c is used to sort potential matches to a
> given ambiguous object name prefix in the error path;
>
> * The one in pack-write.c is done once per a pack .idx file being
> written to create the reverse index, so (1) the cost of malloc()
> overhead is dwarfed by the cost of the packing operation, and (2)
> the number of entries being sorted is the number of objects in a
> pack;
>
> * The one in ref-filter.c is used by "branch --list", "tag --list",
> and "for-each-ref", only once per operation. We sort an array of
> pointers with entries, each corresponding to a ref that is shown.
>
> * The one in string-list.c is used by sort_string_list(), which is
> way too generic to assume any access patterns, so it may or may
> not matter, but I do not care too much ;-)
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> compat/qsort_s.c | 14 ++++----------
> 1 file changed, 4 insertions(+), 10 deletions(-)
>
> diff --git c/compat/qsort_s.c w/compat/qsort_s.c
> index 52d1f0a73d..0f7ff30f5f 100644
> --- c/compat/qsort_s.c
> +++ w/compat/qsort_s.c
> @@ -49,21 +49,15 @@ int git_qsort_s(void *b, size_t n, size_t s,
> int (*cmp)(const void *, const void *, void *), void *ctx)
> {
> const size_t size = st_mult(n, s);
> - char buf[1024];
> + char *tmp;
>
> if (!n)
> return 0;
> if (!b || !cmp)
> return -1;
>
> - if (size < sizeof(buf)) {
> - /* The temporary array fits on the small on-stack buffer. */
> - msort_with_tmp(b, n, s, cmp, buf, ctx);
> - } else {
> - /* It's somewhat large, so malloc it. */
> - char *tmp = xmalloc(size);
> - msort_with_tmp(b, n, s, cmp, tmp, ctx);
> - free(tmp);
> - }
> + tmp = xmalloc(size);
> + msort_with_tmp(b, n, s, cmp, tmp, ctx);
> + free(tmp);
> return 0;
> }
--- >8 ---
Subject: [PATCH] stable-qsort: avoid using potentially unaligned access
Like in the previous patch for compat/qsort_s.c, remove the optimization
of using an on-stack buffer to avoid small allocations. This ensures
maximum alignment for the array elements and simplifies the code a bit.
The performance impact for the current callers is unlikely to be
noticeable:
* compat/mingw.c::make_environment_block() uses ALLOC_ARRAY and
ALLOC_GROW several times already, so another allocation of up to 1KB
should not matter much.
* diffcore-rename.c::diffcore_rename_extended() is called once per diff
or twice per merge, and those require allocations for each object and
more already.
* merge-ort.c::detect_and_process_renames() is called once per merge.
It's responsible for the two per-merge diffcore_rename_extended()
calls mentioned above as well, though. So this is possibly the most
impacted caller. Per-object allocations are likely to dwarf the
additional small allocations in git_stable_qsort(), though.
Signed-off-by: René Scharfe <l.s.r@web.de>
---
stable-qsort.c | 16 +++++-----------
1 file changed, 5 insertions(+), 11 deletions(-)
diff --git a/stable-qsort.c b/stable-qsort.c
index 6cbaf39f7b..7ff12467cd 100644
--- a/stable-qsort.c
+++ b/stable-qsort.c
@@ -48,15 +48,9 @@ void git_stable_qsort(void *b, size_t n, size_t s,
int (*cmp)(const void *, const void *))
{
const size_t size = st_mult(n, s);
- char buf[1024];
-
- if (size < sizeof(buf)) {
- /* The temporary array fits on the small on-stack buffer. */
- msort_with_tmp(b, n, s, cmp, buf);
- } else {
- /* It's somewhat large, so malloc it. */
- char *tmp = xmalloc(size);
- msort_with_tmp(b, n, s, cmp, tmp);
- free(tmp);
- }
+ char *tmp;
+
+ tmp = xmalloc(size);
+ msort_with_tmp(b, n, s, cmp, tmp);
+ free(tmp);
}
--
2.34.1
next prev parent reply other threads:[~2022-01-07 23:31 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-05 13:23 [PATCH] Properly align memory allocations and temporary buffers Jessica Clarke
2022-01-06 21:46 ` Taylor Blau
2022-01-06 21:56 ` Jessica Clarke
2022-01-06 22:27 ` Junio C Hamano
2022-01-06 22:56 ` Jessica Clarke
2022-01-07 0:10 ` Junio C Hamano
2022-01-07 0:22 ` Jessica Clarke
2022-01-07 0:31 ` brian m. carlson
2022-01-07 0:39 ` Jessica Clarke
2022-01-07 1:43 ` brian m. carlson
2022-01-07 2:08 ` Jessica Clarke
2022-01-07 2:11 ` Jessica Clarke
2022-01-07 19:30 ` Junio C Hamano
2022-01-07 19:33 ` Jessica Clarke
2022-01-07 20:56 ` René Scharfe
2022-01-07 21:30 ` Junio C Hamano
2022-01-07 23:30 ` René Scharfe [this message]
2022-01-08 0:18 ` Elijah Newren
2022-01-06 23:22 ` brian m. carlson
2022-01-06 23:31 ` Jessica Clarke
2022-01-07 14:57 ` Philip Oakley
2022-01-07 16:08 ` René Scharfe
2022-01-07 16:21 ` Jessica Clarke
2022-01-12 13:58 ` Jessica Clarke
2022-01-12 15:47 ` René Scharfe
2022-01-12 15:49 ` Jessica Clarke
2022-01-23 15:24 ` [PATCH v2] mem-pool: Don't assume uintmax_t is aligned enough for all types Jessica Clarke
2022-01-23 20:17 ` Junio C Hamano
2022-01-23 20:23 ` Jessica Clarke
2022-01-23 20:28 ` Junio C Hamano
2022-01-23 20:33 ` [PATCH v3] " Jessica Clarke
2022-01-24 17:11 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f40c1b47-9aad-2dcc-ceeb-5dee2b517cd8@web.de \
--to=l.s.r@web.de \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jrtc27@jrtc27.com \
--cc=me@ttaylorr.com \
--cc=newren@gmail.com \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).