From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org, sandals@crustytoothpaste.net, j6t@kdbg.org,
jonathantanmy@google.com, peff@peff.net,
christian.couder@gmail.com, Fredrik Kuivinen <frekui@gmail.com>
Subject: Re: [PATCH v2 2/2] hex: make hash_to_hex_algop() and friends thread-safe
Date: Mon, 29 Jun 2020 17:11:59 +0200 (CEST)
Message-ID: <nycvar.QRO.7.76.6.2006291646420.56@tvgsbejvaqbjf.bet>
In-Reply-To: <b47445fa1cef6d4523dd0ca336f7ee22bce89466.1593208411.git.matheus.bernardino@usp.br>
Hi Matheus,
I am fine with the Windows changes (although I have to admit that I did
not find time to test things yet).
There is one problem: I am not sure that the memory is released correctly
when threads end. You will notice that the `pthread_key_create()` shim in
`compat/win32/pthread.h` does not use the `destructor` parameter at all.
The documentation at
https://docs.microsoft.com/en-us/windows/win32/procthread/using-thread-local-storage
is also not terribly clear about _how_ memory that was assigned via
`TlsSetValue()` is released. I notice that the example uses `LocalAlloc()`,
but we override `malloc()` via the `compat/nedmalloc/` functions, so that
should cause problems unless I am wrong.
Maybe there is an expert reading this who could jump in and help
understand the ramifications?
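For comparison, here is a minimal sketch of the POSIX behavior that the
Windows shim would have to match: a destructor registered with
`pthread_key_create()` runs at thread exit for any non-NULL value the thread
stored. It assumes a conforming pthreads implementation; `demo_key`,
`demo_destructor()`, `demo_worker()` and `run_destructor_demo()` are
hypothetical names used only for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t demo_key;
/* Written by the exiting worker, read only after pthread_join(). */
static int destructor_calls;

/* Called by the threads library at thread exit with the stored value. */
static void demo_destructor(void *value)
{
	free(value);
	destructor_calls++;
}

static void *demo_worker(void *unused)
{
	void *buf = malloc(64);

	(void)unused;
	if (buf && pthread_setspecific(demo_key, buf))
		free(buf); /* could not store it, so release it ourselves */
	return NULL; /* on exit, demo_destructor() receives buf */
}

/*
 * Returns how many times the destructor ran for one worker thread,
 * or -1 if the key or the thread could not be set up.
 */
int run_destructor_demo(void)
{
	pthread_t thread;

	destructor_calls = 0;
	if (pthread_key_create(&demo_key, demo_destructor))
		return -1;
	if (pthread_create(&thread, NULL, demo_worker, NULL) ||
	    pthread_join(thread, NULL)) {
		pthread_key_delete(demo_key);
		return -1;
	}
	pthread_key_delete(demo_key);
	return destructor_calls;
}
```

If a shim ignores the `destructor` argument, nothing frees the per-thread
buffer when the thread ends, so the count above would stay at zero and each
thread would leak its allocation.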
A couple more things:
On Fri, 26 Jun 2020, Matheus Tavares wrote:
> hash_to_hex_algop() returns a static buffer, relieving callers from the
> responsibility of freeing memory after use. But the current
> implementation uses the same static data for all threads and, thus, is
> not thread-safe. We could avoid using this function and its wrappers
> in threaded code, but they are sometimes too deep in the call stack to
> be noticed or even avoided.
>
> grep.c:grep_source_load_oid(), for example, uses the thread-unsafe
> oid_to_hex() (on errors) despite being called in threaded code. And
> oid_to_hex() -- which calls hash_to_hex_algop() -- is used in many other
> places, as well:
>
> $ git grep 'oid_to_hex(' | wc -l
> 818
>
> Although hash_to_hex_algop() and its wrappers don't seem to be causing
> problems out there for now (at least not reported), making them
> thread-safe makes the codebase more robust against race conditions. We
> can easily do that by replicating the static buffer in each thread's
> local storage.
>
> Original-patch-by: Fredrik Kuivinen <frekui@gmail.com>
> Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
> hex.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 42 insertions(+), 4 deletions(-)
>
> diff --git a/hex.c b/hex.c
> index da51e64929..4f2f163d5e 100644
> --- a/hex.c
> +++ b/hex.c
> @@ -1,4 +1,5 @@
> #include "cache.h"
> +#include "thread-utils.h"
>
> const signed char hexval_table[256] = {
> -1, -1, -1, -1, -1, -1, -1, -1, /* 00-07 */
> @@ -136,12 +137,49 @@ char *oid_to_hex_r(char *buffer, const struct object_id *oid)
> return hash_to_hex_algop_r(buffer, oid->hash, the_hash_algo);
> }
>
> +struct hexbuf_array {
> + int idx;
Is there a specific reason why you renamed `bufno` to `idx`? If not, I'd
rather keep the old name.
> + char bufs[4][GIT_MAX_HEXSZ + 1];
> +};
> +
> +#ifdef HAVE_THREADS
> +static pthread_key_t hexbuf_array_key;
> +static pthread_once_t hexbuf_array_once = PTHREAD_ONCE_INIT;
> +
> +static void init_hexbuf_array_key(void)
> +{
> + if (pthread_key_create(&hexbuf_array_key, free))
> + die(_("failed to initialize threads' key for hash to hex conversion"));
> +}
> +
> +#else
> +static struct hexbuf_array default_hexbuf_array;
> +#endif
> +
> char *hash_to_hex_algop(const unsigned char *hash, const struct git_hash_algo *algop)
> {
> - static int bufno;
> - static char hexbuffer[4][GIT_MAX_HEXSZ + 1];
> - bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);
> - return hash_to_hex_algop_r(hexbuffer[bufno], hash, algop);
> + struct hexbuf_array *ha;
> +
> +#ifdef HAVE_THREADS
> + void *value;
> +
> + if (pthread_once(&hexbuf_array_once, init_hexbuf_array_key))
> + die(_("failed to initialize threads' key for hash to hex conversion"));
> +
> + value = pthread_getspecific(hexbuf_array_key);
> + if (value) {
> + ha = (struct hexbuf_array *) value;
> + } else {
> + ha = xmalloc(sizeof(*ha));
> + if (pthread_setspecific(hexbuf_array_key, (void *)ha))
> + die(_("failed to set thread buffer for hash to hex conversion"));
> + }
> +#else
> + ha = &default_hexbuf_array;
> +#endif
This introduces two ugly `#ifdef HAVE_THREADS` constructs which are
problematic because they are the most likely places to introduce compile
errors.
I wonder whether you considered introducing a function (and probably a
macro) that transparently gives you a thread-specific instance of a given
data type? The caller would look something like
	struct hexbuf_array *hex_array;
	GET_THREADSPECIFIC(hex_array);
where the macro would look somewhat like this:
	#define GET_THREADSPECIFIC(var) \
		if (get_thread_specific(&(var), sizeof(*(var))) < 0) \
			die(_("Failed to get thread-specific %s"), #var);
and the function would allocate and assign the variable. I guess this
scheme won't work, though, as the init routine passed to `pthread_once()`
does not take a parameter, right? And we would need that parameter to be
able to create the `pthread_key_t`. Hmm.
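Since the init routine given to `pthread_once()` receives no argument, one
possible workaround is to let a macro generate a separate key, once-control
and init routine for each data type. This is only a sketch under the
assumption of POSIX threads; `DEFINE_THREADSPECIFIC`, `hexbuf_get()`,
`DEMO_HEXSZ`, `record_instance()` and `demo_instances_differ()` are
hypothetical names, not existing git API.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: generates name##_get(), which returns the calling
 * thread's zero-initialized instance of `type`, or NULL on failure.
 */
#define DEFINE_THREADSPECIFIC(name, type) \
	static pthread_key_t name##_key; \
	static pthread_once_t name##_once = PTHREAD_ONCE_INIT; \
	static int name##_key_err; \
	static void name##_init_key(void) \
	{ \
		/* free() is the destructor run when a thread exits */ \
		name##_key_err = pthread_key_create(&name##_key, free); \
	} \
	static type *name##_get(void) \
	{ \
		type *p; \
		if (pthread_once(&name##_once, name##_init_key) || name##_key_err) \
			return NULL; \
		p = pthread_getspecific(name##_key); \
		if (!p) { \
			p = calloc(1, sizeof(*p)); \
			if (p && pthread_setspecific(name##_key, p)) { \
				free(p); \
				p = NULL; \
			} \
		} \
		return p; \
	}

#define DEMO_HEXSZ 64 /* stand-in for GIT_MAX_HEXSZ */

struct hexbuf_array {
	int bufno;
	char bufs[4][DEMO_HEXSZ + 1];
};

DEFINE_THREADSPECIFIC(hexbuf, struct hexbuf_array)

/* Worker: record the address of this thread's instance as an integer. */
static void *record_instance(void *out)
{
	*(uintptr_t *)out = (uintptr_t)hexbuf_get();
	return NULL;
}

/*
 * Returns 1 if the caller keeps one stable instance and a worker thread
 * gets a different one, 0 if not, -1 on setup failure.
 */
int demo_instances_differ(void)
{
	pthread_t thread;
	uintptr_t theirs = 0;
	struct hexbuf_array *mine = hexbuf_get();

	if (!mine || pthread_create(&thread, NULL, record_instance, &theirs))
		return -1;
	if (pthread_join(thread, NULL) || !theirs)
		return -1;
	return mine == hexbuf_get() && theirs != (uintptr_t)mine;
}
```

The cost of this design is one generated function pair per type, and it
still relies on the platform honoring the destructor passed to
`pthread_key_create()`.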
Ciao,
Dscho
> +
> + ha->idx = (ha->idx + 1) % ARRAY_SIZE(ha->bufs);
> + return hash_to_hex_algop_r(ha->bufs[ha->idx], hash, algop);
> }
>
> char *hash_to_hex(const unsigned char *hash)
> --
> 2.26.2
>
>
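The rotation over four static buffers that the patch preserves can be
illustrated with a small single-threaded sketch. It mirrors the removed
`bufno`/`hexbuffer` logic but uses a hypothetical 4-byte "hash" and the
made-up names `demo_to_hex()`, `demo_to_hex_r()`, `DEMO_RAWSZ` and
`DEMO_HEXSZ`, so it is not git's actual implementation.

```c
#define DEMO_RAWSZ 4                /* hypothetical hash size in bytes */
#define DEMO_HEXSZ (2 * DEMO_RAWSZ) /* two hex digits per byte */

/* Reentrant variant: the caller supplies the output buffer. */
static char *demo_to_hex_r(char *buffer, const unsigned char *hash)
{
	static const char hex[] = "0123456789abcdef";
	char *p = buffer;
	int i;

	for (i = 0; i < DEMO_RAWSZ; i++) {
		*p++ = hex[hash[i] >> 4];
		*p++ = hex[hash[i] & 0xf];
	}
	*p = '\0';
	return buffer;
}

/*
 * Convenience variant: rotates through four static buffers, so up to
 * four results stay valid at once. All threads share this state, which
 * is the race the patch addresses.
 */
char *demo_to_hex(const unsigned char *hash)
{
	static int bufno;
	static char hexbuffer[4][DEMO_HEXSZ + 1];

	bufno = (bufno + 1) % 4;
	return demo_to_hex_r(hexbuffer[bufno], hash);
}
```

The fifth call returns the same slot as the first and overwrites its
contents. With a single shared `bufno`, two threads can be handed the same
slot at the same time, which is why the patch moves this state into
thread-local storage.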