* [PATCH/RFC 0/5] Some ThreadSanitizer results
From: Martin Ågren @ 2017-08-15 12:53 UTC
To: git

I tried running the test suite on Git compiled with ThreadSanitizer
(SANITIZE=thread). Maybe this series could be useful for someone else
trying to do the same.

I needed the first patch to avoid warnings when compiling, although it
actually has nothing to do with threads. The last four patches are
about avoiding some issues where ThreadSanitizer complains for
understandable reasons, but which to the best of my understanding are
not real problems. These patches could be useful to make "actual"
problems stand out more. Of course, if no-one ever runs
ThreadSanitizer, they are of little to no (or even negative) value...

I'll follow up with the two remaining issues that I found but which I
do not try to address in this series.

Martin

Martin Ågren (5):
  convert: initialize attr_action in convert_attrs
  pack-objects: take lock before accessing `remaining`
  Makefile: define GIT_THREAD_SANITIZER
  strbuf_reset: don't write to slopbuf with ThreadSanitizer
  ThreadSanitizer: add suppressions

 strbuf.h               | 12 ++++++++++++
 builtin/pack-objects.c |  6 ++++++
 convert.c              |  2 +-
 Makefile               |  3 +++
 .tsan-suppressions     | 12 ++++++++++++
 5 files changed, 34 insertions(+), 1 deletion(-)
 create mode 100644 .tsan-suppressions

--
2.14.1.151.gdfeca7a7e
* [PATCH 1/5] convert: initialize attr_action in convert_attrs
From: Martin Ågren @ 2017-08-15 12:53 UTC
To: git; +Cc: Torsten Bögershausen

convert_attrs populates a struct conv_attrs. The field attr_action is
not set in all code paths, but one caller still reads it
unconditionally. Since git_check_attr always returns the same value,
we'll always end up in the same code path and there is no problem right
now. But convert_attrs is obviously trying not to rely on such an
implementation detail of another component.

Initialize attr_action to CRLF_UNDEFINED in the dead code path.

Actually, in the code path that /is/ taken, the variable is assigned
twice and the first assignment has no effect. That's not wrong, but
let's remove that first assignment while we're here.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
I hit a warning about attr_action possibly being uninitialized when
building with SANITIZE=thread. I guess it's some random interaction
between the code added by tsan, the optimizer (-O3) and the warning
machinery. (This was with gcc 5.4.0.)

 convert.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/convert.c b/convert.c
index 1012462e3..943d957b4 100644
--- a/convert.c
+++ b/convert.c
@@ -1040,7 +1040,6 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 		ca->crlf_action = git_path_check_crlf(ccheck + 4);
 		if (ca->crlf_action == CRLF_UNDEFINED)
 			ca->crlf_action = git_path_check_crlf(ccheck + 0);
-		ca->attr_action = ca->crlf_action;
 		ca->ident = git_path_check_ident(ccheck + 1);
 		ca->drv = git_path_check_convert(ccheck + 2);
 		if (ca->crlf_action != CRLF_BINARY) {
@@ -1058,6 +1057,7 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 	} else {
 		ca->drv = NULL;
 		ca->crlf_action = CRLF_UNDEFINED;
+		ca->attr_action = CRLF_UNDEFINED;
 		ca->ident = 0;
 	}
 	if (ca->crlf_action == CRLF_TEXT)
--
2.14.1.151.gdfeca7a7e
* Re: [PATCH 1/5] convert: initialize attr_action in convert_attrs
From: Torsten Bögershausen @ 2017-08-15 14:17 UTC
To: Martin Ågren; +Cc: git

On Tue, Aug 15, 2017 at 02:53:01PM +0200, Martin Ågren wrote:
> convert_attrs populates a struct conv_attrs. The field attr_action is
> not set in all code paths, but one caller still reads it
> unconditionally. Since git_check_attr always returns the same value,
> we'll always end up in the same code path and there is no problem
> right now. But convert_attrs is obviously trying not to rely on such
> an implementation detail of another component.
>
> Initialize attr_action to CRLF_UNDEFINED in the dead code path.
>
> Actually, in the code path that /is/ taken, the variable is assigned
> twice and the first assignment has no effect. That's not wrong, but
> let's remove that first assignment while we're here.
>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> I hit a warning about attr_action possibly being uninitialized when
> building with SANITIZE=thread. I guess it's some random interaction
> between the code added by tsan, the optimizer (-O3) and the warning
> machinery. (This was with gcc 5.4.0.)
>
>  convert.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/convert.c b/convert.c
> index 1012462e3..943d957b4 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -1040,7 +1040,6 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
>  		ca->crlf_action = git_path_check_crlf(ccheck + 4);
>  		if (ca->crlf_action == CRLF_UNDEFINED)
>  			ca->crlf_action = git_path_check_crlf(ccheck + 0);
> -		ca->attr_action = ca->crlf_action;

I don't think the removal of that line is correct.

>  		ca->ident = git_path_check_ident(ccheck + 1);
>  		ca->drv = git_path_check_convert(ccheck + 2);
>  		if (ca->crlf_action != CRLF_BINARY) {
> @@ -1058,6 +1057,7 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
>  	} else {
>  		ca->drv = NULL;
>  		ca->crlf_action = CRLF_UNDEFINED;
> +		ca->attr_action = CRLF_UNDEFINED;

But this one can be avoided if the line

	ca->attr_action = ca->crlf_action;

is moved completely out of the "if/else" block.

>  		ca->ident = 0;
>  	}
>  	if (ca->crlf_action == CRLF_TEXT)
> --
> 2.14.1.151.gdfeca7a7e
>

Thanks for spotting my mess.
What do you think about the following:

diff --git a/convert.c b/convert.c
index 1012462e3c..fd91b91ada 100644
--- a/convert.c
+++ b/convert.c
@@ -1040,7 +1040,6 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 		ca->crlf_action = git_path_check_crlf(ccheck + 4);
 		if (ca->crlf_action == CRLF_UNDEFINED)
 			ca->crlf_action = git_path_check_crlf(ccheck + 0);
-		ca->attr_action = ca->crlf_action;
 		ca->ident = git_path_check_ident(ccheck + 1);
 		ca->drv = git_path_check_convert(ccheck + 2);
 		if (ca->crlf_action != CRLF_BINARY) {
@@ -1060,6 +1059,8 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 		ca->crlf_action = CRLF_UNDEFINED;
 		ca->ident = 0;
 	}
+	/* Save attr and make a decision for action */
+	ca->attr_action = ca->crlf_action;
 	if (ca->crlf_action == CRLF_TEXT)
 		ca->crlf_action = text_eol_is_crlf() ? CRLF_TEXT_CRLF : CRLF_TEXT_INPUT;
 	if (ca->crlf_action == CRLF_UNDEFINED && auto_crlf == AUTO_CRLF_FALSE)
* Re: [PATCH 1/5] convert: initialize attr_action in convert_attrs
From: Torsten Bögershausen @ 2017-08-15 14:29 UTC
To: Martin Ågren; +Cc: git

> > diff --git a/convert.c b/convert.c
> > index 1012462e3..943d957b4 100644
> > --- a/convert.c
> > +++ b/convert.c
> > @@ -1040,7 +1040,6 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
> >  		ca->crlf_action = git_path_check_crlf(ccheck + 4);
> >  		if (ca->crlf_action == CRLF_UNDEFINED)
> >  			ca->crlf_action = git_path_check_crlf(ccheck + 0);
> > -		ca->attr_action = ca->crlf_action;
>
> I don't think the removal of that line is correct.

Sorry, that went wrong; it should have been:

I think that the line can be removed here.
* Re: [PATCH 1/5] convert: initialize attr_action in convert_attrs
From: Martin Ågren @ 2017-08-15 14:40 UTC
To: Torsten Bögershausen; +Cc: Git Mailing List

On 15 August 2017 at 16:17, Torsten Bögershausen <tboegi@web.de> wrote:
> On Tue, Aug 15, 2017 at 02:53:01PM +0200, Martin Ågren wrote:
>> convert_attrs populates a struct conv_attrs. The field attr_action is
>> not set in all code paths, but one caller still reads it
>> unconditionally. Since git_check_attr always returns the same value,
>> we'll always end up in the same code path and there is no problem
>> right now. But convert_attrs is obviously trying not to rely on such
>> an implementation detail of another component.
>>
>> Initialize attr_action to CRLF_UNDEFINED in the dead code path.
>>
>> Actually, in the code path that /is/ taken, the variable is assigned
>> twice and the first assignment has no effect. That's not wrong, but
>> let's remove that first assignment while we're here.
>>
>> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
>> ---
>> I hit a warning about attr_action possibly being uninitialized when
>> building with SANITIZE=thread. I guess it's some random interaction
>> between the code added by tsan, the optimizer (-O3) and the warning
>> machinery. (This was with gcc 5.4.0.)
>>
>>  convert.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/convert.c b/convert.c
>> index 1012462e3..943d957b4 100644
>> --- a/convert.c
>> +++ b/convert.c
>> @@ -1040,7 +1040,6 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
>>  		ca->crlf_action = git_path_check_crlf(ccheck + 4);
>>  		if (ca->crlf_action == CRLF_UNDEFINED)
>>  			ca->crlf_action = git_path_check_crlf(ccheck + 0);
>> -		ca->attr_action = ca->crlf_action;
>
> I don't think the removal of that line is correct.

(Thanks for confirming in a follow-up mail that you meant that you /do/
think the removal is correct.)

>>  		ca->ident = git_path_check_ident(ccheck + 1);
>>  		ca->drv = git_path_check_convert(ccheck + 2);
>>  		if (ca->crlf_action != CRLF_BINARY) {
>> @@ -1058,6 +1057,7 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
>>  	} else {
>>  		ca->drv = NULL;
>>  		ca->crlf_action = CRLF_UNDEFINED;
>> +		ca->attr_action = CRLF_UNDEFINED;
>
> But this one can be avoided if the line
>
> 	ca->attr_action = ca->crlf_action;
>
> is moved completely out of the "if/else" block.
>
>>  		ca->ident = 0;
>>  	}
>>  	if (ca->crlf_action == CRLF_TEXT)
>> --
>> 2.14.1.151.gdfeca7a7e
>>
>
> Thanks for spotting my mess.
> What do you think about the following:
>
> diff --git a/convert.c b/convert.c
> index 1012462e3c..fd91b91ada 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -1040,7 +1040,6 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
>  		ca->crlf_action = git_path_check_crlf(ccheck + 4);
>  		if (ca->crlf_action == CRLF_UNDEFINED)
>  			ca->crlf_action = git_path_check_crlf(ccheck + 0);
> -		ca->attr_action = ca->crlf_action;
>  		ca->ident = git_path_check_ident(ccheck + 1);
>  		ca->drv = git_path_check_convert(ccheck + 2);
>  		if (ca->crlf_action != CRLF_BINARY) {
> @@ -1060,6 +1059,8 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
>  		ca->crlf_action = CRLF_UNDEFINED;
>  		ca->ident = 0;
>  	}
> +	/* Save attr and make a decision for action */
> +	ca->attr_action = ca->crlf_action;
>  	if (ca->crlf_action == CRLF_TEXT)
>  		ca->crlf_action = text_eol_is_crlf() ? CRLF_TEXT_CRLF : CRLF_TEXT_INPUT;
>  	if (ca->crlf_action == CRLF_UNDEFINED && auto_crlf == AUTO_CRLF_FALSE)

Yeah, makes lots of sense. Then we could also remove the second
assignment to attr_action. That is, this function would set attr_action
in one place, always. I'll do this in a v2. Thanks.
* [PATCH 2/5] pack-objects: take lock before accessing `remaining`
From: Martin Ågren @ 2017-08-15 12:53 UTC
To: git; +Cc: Johannes Sixt

When checking the conditional of "while (me->remaining)", we did not
hold the lock. Calling find_deltas would still be safe, since it
checks "remaining" (after taking the lock) and is able to handle all
values. In fact, this could (currently) not trigger any bug: a bug
could happen if `remaining` transitioning from zero to non-zero races
with the evaluation of the while-condition, but these are always
separated by the data_ready mechanism.

Make sure we have the lock when we read `remaining`. This does mean we
release it just so that find_deltas can take it immediately again. We
could tweak the contract so that the lock should be taken before
calling find_deltas, but let's defer that until someone can actually
show that "unlock+lock" has a measurable negative impact.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
I don't think this corrects any real error. The benefits of this patch
would be "future-proofs things slightly" and "silences tsan, so that
other errors don't drown in noise". Feel free to tell me those
benefits are negligible and that this change actually hurts.

 builtin/pack-objects.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index c753e9237..bd391e97a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2170,7 +2170,10 @@ static void *threaded_find_deltas(void *arg)
 {
 	struct thread_params *me = arg;
 
+	progress_lock();
 	while (me->remaining) {
+		progress_unlock();
+
 		find_deltas(me->list, &me->remaining,
 			    me->window, me->depth, me->processed);
 
@@ -2192,7 +2195,10 @@ static void *threaded_find_deltas(void *arg)
 		pthread_cond_wait(&me->cond, &me->mutex);
 		me->data_ready = 0;
 		pthread_mutex_unlock(&me->mutex);
+
+		progress_lock();
 	}
+	progress_unlock();
 	/* leave ->working 1 so that this doesn't get more work assigned */
 	return NULL;
 }
--
2.14.1.151.gdfeca7a7e
* Re: [PATCH 2/5] pack-objects: take lock before accessing `remaining`
From: Johannes Sixt @ 2017-08-15 19:50 UTC
To: Martin Ågren; +Cc: git

Am 15.08.2017 um 14:53 schrieb Martin Ågren:
> When checking the conditional of "while (me->remaining)", we did not
> hold the lock. Calling find_deltas would still be safe, since it
> checks "remaining" (after taking the lock) and is able to handle all
> values. In fact, this could (currently) not trigger any bug: a bug
> could happen if `remaining` transitioning from zero to non-zero races
> with the evaluation of the while-condition, but these are always
> separated by the data_ready mechanism.
>
> Make sure we have the lock when we read `remaining`. This does mean we
> release it just so that find_deltas can take it immediately again. We
> could tweak the contract so that the lock should be taken before
> calling find_deltas, but let's defer that until someone can actually
> show that "unlock+lock" has a measurable negative impact.
>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> I don't think this corrects any real error. The benefits of this patch
> would be "future-proofs things slightly" and "silences tsan, so that
> other errors don't drown in noise". Feel free to tell me those
> benefits are negligible and that this change actually hurts.
>
>  builtin/pack-objects.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index c753e9237..bd391e97a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -2170,7 +2170,10 @@ static void *threaded_find_deltas(void *arg)
>  {
>  	struct thread_params *me = arg;
>
> +	progress_lock();
>  	while (me->remaining) {
> +		progress_unlock();
> +
>  		find_deltas(me->list, &me->remaining,
>  			    me->window, me->depth, me->processed);
>
> @@ -2192,7 +2195,10 @@ static void *threaded_find_deltas(void *arg)
>  		pthread_cond_wait(&me->cond, &me->mutex);
>  		me->data_ready = 0;
>  		pthread_mutex_unlock(&me->mutex);
> +
> +		progress_lock();
>  	}
> +	progress_unlock();
>  	/* leave ->working 1 so that this doesn't get more work assigned */
>  	return NULL;
>  }

It is correct that this access of me->remaining requires a lock. It
could be solved in the way you did. I tried to do it differently, but
all cleaner solutions that I can think of are overkill...

-- Hannes
* [PATCH 3/5] Makefile: define GIT_THREAD_SANITIZER
From: Martin Ågren @ 2017-08-15 12:53 UTC
To: git

When using ThreadSanitizer (tsan), it can be useful to avoid triggering
false or irrelevant alarms. This should obviously be done with care,
and not to paper over real problems.

Detecting that tsan is in use in a compiler-independent way is
non-trivial. Tie into our existing SANITIZE=... framework and define
GIT_THREAD_SANITIZER when we are compiling for tsan.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Makefile b/Makefile
index 461c845d3..e77a60d1f 100644
--- a/Makefile
+++ b/Makefile
@@ -1033,6 +1033,9 @@ BASIC_CFLAGS += -fno-omit-frame-pointer
 ifneq ($(filter undefined,$(SANITIZERS)),)
 BASIC_CFLAGS += -DNO_UNALIGNED_LOADS
 endif
+ifneq ($(filter thread,$(SANITIZERS)),)
+BASIC_CFLAGS += -DGIT_THREAD_SANITIZER
+endif
 endif
--
2.14.1.151.gdfeca7a7e
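For context, a compile-time detection is also possible without Makefile
support: clang exposes __has_feature(thread_sanitizer), and newer
versions of gcc define __SANITIZE_THREAD__ when -fsanitize=thread is in
effect (the gcc 5.4.0 mentioned earlier in this thread predates that,
which illustrates the "non-trivial" point). A minimal sketch of that
alternative, shown for comparison only:

    /*
     * Hypothetical header-based detection; not what the patch above
     * does. The Makefile approach keeps the logic next to the
     * existing SANITIZE=... handling and works with older compilers.
     */
    #if defined(__has_feature)
    # if __has_feature(thread_sanitizer)      /* clang */
    #  define GIT_THREAD_SANITIZER 1
    # endif
    #endif
    #if !defined(GIT_THREAD_SANITIZER) && defined(__SANITIZE_THREAD__)
    # define GIT_THREAD_SANITIZER 1           /* newer gcc */
    #endif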
* [PATCH 4/5] strbuf_reset: don't write to slopbuf with ThreadSanitizer
From: Martin Ågren @ 2017-08-15 12:53 UTC
To: git

If two threads have one freshly initialized string buffer each and call
strbuf_reset on them at roughly the same time, both threads will be
writing a '\0' to strbuf_slopbuf. That is not a problem in practice
since it doesn't matter in which order the writes happen. But
ThreadSanitizer will consider this a race.

When compiling with GIT_THREAD_SANITIZER, avoid writing to
strbuf_slopbuf. Let's instead assert on the first byte of
strbuf_slopbuf being '\0', since it ensures the promised invariant of
"buf[len] == '\0'". (Writing to strbuf_slopbuf is normally bad, but
could become even more bad if we stop covering it up in strbuf_reset.)

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 strbuf.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/strbuf.h b/strbuf.h
index e705b94db..295654d39 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -153,7 +153,19 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len)
 /**
  * Empty the buffer by setting the size of it to zero.
  */
+#ifdef GIT_THREAD_SANITIZER
+#define strbuf_reset(sb) \
+	do { \
+		struct strbuf *_sb = sb; \
+		_sb->len = 0; \
+		if (_sb->buf == strbuf_slopbuf) \
+			assert(!strbuf_slopbuf[0]); \
+		else \
+			_sb->buf[0] = '\0'; \
+	} while (0)
+#else
 #define strbuf_reset(sb)  strbuf_setlen(sb, 0)
+#endif
 
 
 /**
--
2.14.1.151.gdfeca7a7e
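To make the reported race concrete, a hypothetical standalone program
(not part of the patch) that triggers it could look like the sketch
below: both threads' strbufs start out pointing at the shared
strbuf_slopbuf, so with the unpatched strbuf_reset both threads write
'\0' to the same byte.

    #include "git-compat-util.h"
    #include "strbuf.h"
    #include <pthread.h>

    static void *worker(void *arg)
    {
    	struct strbuf sb = STRBUF_INIT;	/* sb.buf == strbuf_slopbuf */
    	strbuf_reset(&sb);		/* writes strbuf_slopbuf[0] = '\0' */
    	return NULL;
    }

    int main(void)
    {
    	pthread_t a, b;

    	pthread_create(&a, NULL, worker, NULL);
    	pthread_create(&b, NULL, worker, NULL);
    	pthread_join(a, NULL);
    	pthread_join(b, NULL);
    	return 0;	/* tsan flags the two unsynchronized writes */
    }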
* Re: [PATCH 4/5] strbuf_reset: don't write to slopbuf with ThreadSanitizer
From: Junio C Hamano @ 2017-08-15 18:43 UTC
To: Martin Ågren; +Cc: git

Martin Ågren <martin.agren@gmail.com> writes:

> If two threads have one freshly initialized string buffer each and
> call strbuf_reset on them at roughly the same time, both threads will
> be writing a '\0' to strbuf_slopbuf. That is not a problem in practice
> since it doesn't matter in which order the writes happen. But
> ThreadSanitizer will consider this a race.
>
> When compiling with GIT_THREAD_SANITIZER, avoid writing to
> strbuf_slopbuf. Let's instead assert on the first byte of
> strbuf_slopbuf being '\0', since it ensures the promised invariant of
> "buf[len] == '\0'". (Writing to strbuf_slopbuf is normally bad, but
> could become even more bad if we stop covering it up in strbuf_reset.)
>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
>  strbuf.h | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/strbuf.h b/strbuf.h
> index e705b94db..295654d39 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -153,7 +153,19 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len)
>  /**
>   * Empty the buffer by setting the size of it to zero.
>   */
> +#ifdef GIT_THREAD_SANITIZER
> +#define strbuf_reset(sb) \
> +	do { \
> +		struct strbuf *_sb = sb; \
> +		_sb->len = 0; \
> +		if (_sb->buf == strbuf_slopbuf) \
> +			assert(!strbuf_slopbuf[0]); \
> +		else \
> +			_sb->buf[0] = '\0'; \
> +	} while (0)
> +#else
>  #define strbuf_reset(sb)  strbuf_setlen(sb, 0)
> +#endif
>
>
>  /**

The strbuf_slopbuf[] is a shared resource that is expected by everybody
to stay a holder of a NUL. Even though it is defined as "char [1]", it
in spirit ought to be considered const. And from that point of view,
your new definition that is conditionally used only when the sanitizer
is in use _is_ the more correct one than the current "we do not care if
it is slopbuf, we are writing \0 so it will be a no-op anyway" code.

I wonder if we call strbuf_reset() frequently enough in the real code
to make your version unacceptably expensive? If not, I somehow feel
that using this version unconditionally may be a better approach.

What happens when a caller calls "strbuf_setlen(&sb, 0)" on a strbuf
that happens to have nothing and whose buffer still points at the
slopbuf (instead of calling _reset())? Shouldn't your patch fix that
function instead, i.e. something like the following without the above?
Does that make things noticeably and measurably too expensive?

Thanks.

 strbuf.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/strbuf.h b/strbuf.h
index 2075384e0b..1a77fe146a 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -147,7 +147,10 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len)
 	if (len > (sb->alloc ? sb->alloc - 1 : 0))
 		die("BUG: strbuf_setlen() beyond buffer");
 	sb->len = len;
-	sb->buf[len] = '\0';
+	if (sb->buf != strbuf_slopbuf)
+		sb->buf[len] = '\0';
+	else
+		assert(!strbuf_slopbuf[0]);
 }
 
 /**
* Re: [PATCH 4/5] strbuf_reset: don't write to slopbuf with ThreadSanitizer
From: Martin Ågren @ 2017-08-15 19:06 UTC
To: Junio C Hamano; +Cc: Git Mailing List

On 15 August 2017 at 20:43, Junio C Hamano <gitster@pobox.com> wrote:
> Martin Ågren <martin.agren@gmail.com> writes:
>
>> If two threads have one freshly initialized string buffer each and
>> call strbuf_reset on them at roughly the same time, both threads will
>> be writing a '\0' to strbuf_slopbuf. That is not a problem in
>> practice since it doesn't matter in which order the writes happen.
>> But ThreadSanitizer will consider this a race.
>>
>> When compiling with GIT_THREAD_SANITIZER, avoid writing to
>> strbuf_slopbuf. Let's instead assert on the first byte of
>> strbuf_slopbuf being '\0', since it ensures the promised invariant of
>> "buf[len] == '\0'". (Writing to strbuf_slopbuf is normally bad, but
>> could become even more bad if we stop covering it up in
>> strbuf_reset.)
>>
>> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
>> ---
>>  strbuf.h | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/strbuf.h b/strbuf.h
>> index e705b94db..295654d39 100644
>> --- a/strbuf.h
>> +++ b/strbuf.h
>> @@ -153,7 +153,19 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len)
>>  /**
>>   * Empty the buffer by setting the size of it to zero.
>>   */
>> +#ifdef GIT_THREAD_SANITIZER
>> +#define strbuf_reset(sb) \
>> +	do { \
>> +		struct strbuf *_sb = sb; \
>> +		_sb->len = 0; \
>> +		if (_sb->buf == strbuf_slopbuf) \
>> +			assert(!strbuf_slopbuf[0]); \
>> +		else \
>> +			_sb->buf[0] = '\0'; \
>> +	} while (0)
>> +#else
>>  #define strbuf_reset(sb)  strbuf_setlen(sb, 0)
>> +#endif
>>
>>
>>  /**
>
> The strbuf_slopbuf[] is a shared resource that is expected by
> everybody to stay a holder of a NUL. Even though it is defined as
> "char [1]", it in spirit ought to be considered const. And from that
> point of view, your new definition that is conditionally used only
> when the sanitizer is in use _is_ the more correct one than the
> current "we do not care if it is slopbuf, we are writing \0 so it
> will be a no-op anyway" code.
>
> I wonder if we call strbuf_reset() frequently enough in the real code
> to make your version unacceptably expensive? If not, I somehow feel
> that using this version unconditionally may be a better approach.
>
> What happens when a caller calls "strbuf_setlen(&sb, 0)" on a strbuf
> that happens to have nothing and whose buffer still points at the
> slopbuf (instead of calling _reset())? Shouldn't your patch fix that
> function instead, i.e. something like the following without the
> above? Does that make things noticeably and measurably too expensive?

Good thinking. There are about 300 users of strbuf_reset and 10 users
of strbuf_setlen(., 0) with a literal zero. Obviously, there might be
more users which end up setting the length to 0 for some reason or
other. So your idea seems the better one.

I would assume that whoever resets a buffer is about to add something
to it, which should be more expensive, but that's obviously just
hand-waving. I'll see if I can find some interesting caller and/or
performance numbers.

>  strbuf.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/strbuf.h b/strbuf.h
> index 2075384e0b..1a77fe146a 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -147,7 +147,10 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len)
>  	if (len > (sb->alloc ? sb->alloc - 1 : 0))
>  		die("BUG: strbuf_setlen() beyond buffer");
>  	sb->len = len;
> -	sb->buf[len] = '\0';
> +	if (sb->buf != strbuf_slopbuf)
> +		sb->buf[len] = '\0';
> +	else
> +		assert(!strbuf_slopbuf[0]);
> }
>
> /**

When writing my patch, I used assert() and figured that with tsan,
we're in some sort of "debug" mode anyway. If we decide to always do
the check, would it make sense to do "else if (strbuf_slopbuf[0])
BUG(..);" instead of the assert? Or, if we do prefer the assert, would
the performance worry be moot?

Thanks for the feedback.
* Re: [PATCH 4/5] strbuf_reset: don't write to slopbuf with ThreadSanitizer
From: Junio C Hamano @ 2017-08-15 19:19 UTC
To: Martin Ågren; +Cc: Git Mailing List

Martin Ågren <martin.agren@gmail.com> writes:

>> diff --git a/strbuf.h b/strbuf.h
>> index 2075384e0b..1a77fe146a 100644
>> --- a/strbuf.h
>> +++ b/strbuf.h
>> @@ -147,7 +147,10 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len)
>>  	if (len > (sb->alloc ? sb->alloc - 1 : 0))
>>  		die("BUG: strbuf_setlen() beyond buffer");
>>  	sb->len = len;
>> -	sb->buf[len] = '\0';
>> +	if (sb->buf != strbuf_slopbuf)
>> +		sb->buf[len] = '\0';
>> +	else
>> +		assert(!strbuf_slopbuf[0]);
>> }
>>
>> /**
>
> When writing my patch, I used assert() and figured that with tsan,
> we're in some sort of "debug" mode anyway. If we decide to always do
> the check, would it make sense to do "else if (strbuf_slopbuf[0])
> BUG(..);" instead of the assert? Or, if we do prefer the assert, would
> the performance worry be moot?

I wasn't thinking about the performance impact of having an assert();
the use of it in the above was merely copied from yours ;-)
* [PATCH 5/5] ThreadSanitizer: add suppressions
From: Martin Ågren @ 2017-08-15 12:53 UTC
To: git

Add .tsan-suppressions for want_color() and transfer_debug(). Both of
these use the pattern

    static int foo = -1;
    if (foo == -1)
        foo = func();

where func always returns the same value. This can cause
ThreadSanitizer to diagnose a race when foo is written from two
threads, although it doesn't matter in practice since it's always the
same value that is written.

The suppressions file is used by setting the environment variable
TSAN_OPTIONS to, e.g., "suppressions=$(pwd)/.tsan-suppressions".
Observe that relative paths such as ".tsan-suppressions" might not
work.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
I am no memory-model expert. Maybe (aligned) stores and loads of int
are not actually atomic on all the various hardware that Git wants to
run on. Or maybe the compiler is allowed to compile them into four
one-byte accesses anyway...

 .tsan-suppressions | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 .tsan-suppressions

diff --git a/.tsan-suppressions b/.tsan-suppressions
new file mode 100644
index 000000000..910c02e59
--- /dev/null
+++ b/.tsan-suppressions
@@ -0,0 +1,12 @@
+# Suppressions for ThreadSanitizer (tsan).
+#
+# This file is used by setting the environment variable TSAN_OPTIONS to, e.g.,
+# "suppressions=$(pwd)/.tsan-suppressions". Observe that relative paths such as
+# ".tsan-suppressions" might not work.
+#
+# These suppressions can be, e.g., that a static variable is written to and it
+# is always the same value being written, so it doesn't really matter that two
+# or more such writes race.
+
+race:^want_color$
+race:^transfer_debug$
--
2.14.1.151.gdfeca7a7e
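The pattern being suppressed could alternatively be made race-free,
e.g. with pthread_once(); a sketch using the placeholder names from the
commit message (the suppression route presumably wins here because the
race is benign and the call sites also run in single-threaded code):

    #include <pthread.h>

    static int foo = -1;
    static pthread_once_t foo_once = PTHREAD_ONCE_INIT;

    static int func(void)
    {
    	return 42;	/* stand-in: always returns the same value */
    }

    static void foo_init(void)
    {
    	foo = func();
    }

    static int get_foo(void)
    {
    	/* pthread_once guarantees exactly one initialization */
    	pthread_once(&foo_once, foo_init);
    	return foo;
    }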
* tsan: t3008: hashmap_add touches size from multiple threads
From: Martin Ågren @ 2017-08-15 12:53 UTC
To: git; +Cc: Jeff Hostetler

Using SANITIZE=thread made t3008-ls-files-lazy-init-name-hash.sh hit
the potential race below.

What seems to happen is, threaded_lazy_init_name_hash ends up using
hashmap_add on the_index.dir_hash from two threads in a way that tsan
considers racy. While the buckets each have their own mutex, the
"size" does not. So it might end up with the wrong (lower) value. The
size is used to determine when to resize, but that should be fine,
since resizing is turned off due to threading anyway.

If the size is ever used for something else, the consequences might
range from cosmetic error to devastating. I have a "feeling" the size
is not used for anything else at the moment, although maybe it is, in
some roundabout way which I'm not seeing.

Martin

WARNING: ThreadSanitizer: data race (pid=10554)
  Read of size 4 at 0x00000082d488 by thread T2 (mutexes: write M16):
    #0 hashmap_add hashmap.c:209 (test-lazy-init-name-hash+0x000000438b7d)
    #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302 (test-lazy-init-name-hash+0x00000043ea6c)
    #2 handle_range_dir name-hash.c:347 (test-lazy-init-name-hash+0x00000043ea6c)
    #3 handle_range_1 name-hash.c:415 (test-lazy-init-name-hash+0x00000043ea6c)
    #4 lazy_dir_thread_proc name-hash.c:471 (test-lazy-init-name-hash+0x00000043ea6c)
    #5 <null> <null> (libtsan.so.0+0x0000000230d9)

  Previous write of size 4 at 0x00000082d488 by thread T1 (mutexes: write M31):
    #0 hashmap_add hashmap.c:209 (test-lazy-init-name-hash+0x000000438b90)
    #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302 (test-lazy-init-name-hash+0x00000043e0f2)
    #2 handle_range_dir name-hash.c:347 (test-lazy-init-name-hash+0x00000043e0f2)
    #3 handle_range_1 name-hash.c:415 (test-lazy-init-name-hash+0x00000043e0f2)
    #4 handle_range_dir name-hash.c:380 (test-lazy-init-name-hash+0x00000043e709)
    #5 handle_range_1 name-hash.c:415 (test-lazy-init-name-hash+0x00000043e709)
    #6 lazy_dir_thread_proc name-hash.c:471 (test-lazy-init-name-hash+0x00000043e709)
    #7 <null> <null> (libtsan.so.0+0x0000000230d9)

  Location is global 'the_index' of size 208 at 0x00000082d400 (test-lazy-init-name-hash+0x00000082d488)

  Mutex M16 (0x7d640001a5b8) created at:
    #0 pthread_mutex_init <null> (libtsan.so.0+0x0000000280b5)
    #1 init_recursive_mutex thread-utils.c:73 (test-lazy-init-name-hash+0x0000004d829b)
    #2 init_dir_mutex name-hash.c:241 (test-lazy-init-name-hash+0x00000043d714)
    #3 threaded_lazy_init_name_hash name-hash.c:526 (test-lazy-init-name-hash+0x00000043d714)
    #4 lazy_init_name_hash name-hash.c:588 (test-lazy-init-name-hash+0x00000043d714)
    #5 lazy_init_name_hash name-hash.c:581 (test-lazy-init-name-hash+0x00000043ec34)
    #6 test_lazy_init_name_hash name-hash.c:613 (test-lazy-init-name-hash+0x00000043ec34)
    #7 time_runs t/helper/test-lazy-init-name-hash.c:74 (test-lazy-init-name-hash+0x000000405ac2)
    #8 cmd_main t/helper/test-lazy-init-name-hash.c:261 (test-lazy-init-name-hash+0x0000004066c1)
    #9 main common-main.c:43 (test-lazy-init-name-hash+0x000000404f91)

  Mutex M31 (0x7d640001a810) created at:
    #0 pthread_mutex_init <null> (libtsan.so.0+0x0000000280b5)
    #1 init_recursive_mutex thread-utils.c:73 (test-lazy-init-name-hash+0x0000004d829b)
    #2 init_dir_mutex name-hash.c:241 (test-lazy-init-name-hash+0x00000043d714)
    #3 threaded_lazy_init_name_hash name-hash.c:526 (test-lazy-init-name-hash+0x00000043d714)
    #4 lazy_init_name_hash name-hash.c:588 (test-lazy-init-name-hash+0x00000043d714)
    #5 lazy_init_name_hash name-hash.c:581 (test-lazy-init-name-hash+0x00000043ec34)
    #6 test_lazy_init_name_hash name-hash.c:613 (test-lazy-init-name-hash+0x00000043ec34)
    #7 time_runs t/helper/test-lazy-init-name-hash.c:74 (test-lazy-init-name-hash+0x000000405ac2)
    #8 cmd_main t/helper/test-lazy-init-name-hash.c:261 (test-lazy-init-name-hash+0x0000004066c1)
    #9 main common-main.c:43 (test-lazy-init-name-hash+0x000000404f91)

  Thread T2 (tid=10557, running) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x000000027577)
    #1 threaded_lazy_init_name_hash name-hash.c:541 (test-lazy-init-name-hash+0x00000043d7ab)
    #2 lazy_init_name_hash name-hash.c:588 (test-lazy-init-name-hash+0x00000043d7ab)
    #3 lazy_init_name_hash name-hash.c:581 (test-lazy-init-name-hash+0x00000043ec34)
    #4 test_lazy_init_name_hash name-hash.c:613 (test-lazy-init-name-hash+0x00000043ec34)
    #5 time_runs t/helper/test-lazy-init-name-hash.c:74 (test-lazy-init-name-hash+0x000000405ac2)
    #6 cmd_main t/helper/test-lazy-init-name-hash.c:261 (test-lazy-init-name-hash+0x0000004066c1)
    #7 main common-main.c:43 (test-lazy-init-name-hash+0x000000404f91)

  Thread T1 (tid=10556, finished) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x000000027577)
    #1 threaded_lazy_init_name_hash name-hash.c:541 (test-lazy-init-name-hash+0x00000043d7ab)
    #2 lazy_init_name_hash name-hash.c:588 (test-lazy-init-name-hash+0x00000043d7ab)
    #3 lazy_init_name_hash name-hash.c:581 (test-lazy-init-name-hash+0x00000043ec34)
    #4 test_lazy_init_name_hash name-hash.c:613 (test-lazy-init-name-hash+0x00000043ec34)
    #5 time_runs t/helper/test-lazy-init-name-hash.c:74 (test-lazy-init-name-hash+0x000000405ac2)
    #6 cmd_main t/helper/test-lazy-init-name-hash.c:261 (test-lazy-init-name-hash+0x0000004066c1)
    #7 main common-main.c:43 (test-lazy-init-name-hash+0x000000404f91)

SUMMARY: ThreadSanitizer: data race hashmap.c:209 hashmap_add
* Re: tsan: t3008: hashmap_add touches size from multiple threads
From: Jeff Hostetler @ 2017-08-15 17:59 UTC
To: Martin Ågren, git; +Cc: Jeff Hostetler

On 8/15/2017 8:53 AM, Martin Ågren wrote:
> Using SANITIZE=thread made t3008-ls-files-lazy-init-name-hash.sh hit
> the potential race below.
>
> What seems to happen is, threaded_lazy_init_name_hash ends up using
> hashmap_add on the_index.dir_hash from two threads in a way that tsan
> considers racy. While the buckets each have their own mutex, the
> "size" does not. So it might end up with the wrong (lower) value. The
> size is used to determine when to resize, but that should be fine,
> since resizing is turned off due to threading anyway.
>
> If the size is ever used for something else, the consequences might
> range from cosmetic error to devastating. I have a "feeling" the size
> is not used for anything else at the moment, although maybe it is, in
> some roundabout way which I'm not seeing.

Good catch! Yes, the size field is unguarded. The only references to
it that I found were used in _add() and _remove() in the
need-to-rehash check.

Since auto-rehash is turned off, this shouldn't be a problem, but it
does feel odd to have a possibly incorrect size due to races.

Rather than adding something like (a cross-platform version of)
InterlockedIncrement(), I'm wondering if it would be better to
disable/invalidate the size field when auto-rehash is turned off, so
we don't have to worry about it.

Something like this:

$ git diff
diff --git a/hashmap.c b/hashmap.c
index 9b6a12859b..be97642daa 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -205,6 +205,9 @@ void hashmap_add(struct hashmap *map, void *entry)
 	((struct hashmap_entry *) entry)->next = map->table[b];
 	map->table[b] = entry;
 
+	if (map->disallow_rehash)
+		return;
+
 	/* fix size and rehash if appropriate */
 	map->size++;
 	if (map->size > map->grow_at)
@@ -223,6 +226,9 @@ void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata)
 	*e = old->next;
 	old->next = NULL;
 
+	if (map->disallow_rehash)
+		return old;
+
 	/* fix size and rehash if appropriate */
 	map->size--;
 	if (map->size < map->shrink_at)
diff --git a/hashmap.h b/hashmap.h
index 7a8fa7fa3d..ec9541b981 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -183,7 +183,8 @@ struct hashmap {
 	const void *cmpfn_data;
 
 	/* total number of entries (0 means the hashmap is empty) */
-	unsigned int size;
+	/* -1 means size is unknown for threading reasons */
+	int size;
 
 	/*
 	 * tablesize is the allocated size of the hash table. A non-0 value
@@ -360,6 +361,15 @@ int hashmap_bucket(const struct hashmap *map, unsigned int hash);
 static inline void hashmap_disallow_rehash(struct hashmap *map, unsigned value)
 {
 	map->disallow_rehash = value;
+	if (value) {
+		/*
+		 * Assume threaded operations starting, so don't
+		 * try to keep size current.
+		 */
+		map->size = -1;
+	} else {
+		/* TODO count items in all buckets and set size. */
+	}
 }
 
 /*

Jeff
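For comparison, the cross-platform interlocked increment mentioned
above could be had with C11 atomics; a self-contained sketch
(hypothetical -- Git did not assume C11 at the time, which is part of
what makes disabling the count attractive):

    #include <stdatomic.h>
    #include <pthread.h>
    #include <stdio.h>

    static atomic_uint size;	/* stand-in for map->size */

    static void *adder(void *arg)
    {
    	int i;

    	for (i = 0; i < 100000; i++)
    		atomic_fetch_add(&size, 1);	/* race-free "size++" */
    	return NULL;
    }

    int main(void)
    {
    	pthread_t a, b;

    	pthread_create(&a, NULL, adder, NULL);
    	pthread_create(&b, NULL, adder, NULL);
    	pthread_join(a, NULL);
    	pthread_join(b, NULL);
    	printf("%u\n", atomic_load(&size));	/* always 200000 */
    	return 0;
    }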
* Re: tsan: t3008: hashmap_add touches size from multiple threads
From: Stefan Beller @ 2017-08-15 18:17 UTC
To: Jeff Hostetler; +Cc: Martin Ågren, git@vger.kernel.org, Jeff Hostetler

On Tue, Aug 15, 2017 at 10:59 AM, Jeff Hostetler <git@jeffhostetler.com> wrote:
> On 8/15/2017 8:53 AM, Martin Ågren wrote:
>> Using SANITIZE=thread made t3008-ls-files-lazy-init-name-hash.sh hit
>> the potential race below.
>>
>> What seems to happen is, threaded_lazy_init_name_hash ends up using
>> hashmap_add on the_index.dir_hash from two threads in a way that tsan
>> considers racy. While the buckets each have their own mutex, the
>> "size" does not. So it might end up with the wrong (lower) value. The
>> size is used to determine when to resize, but that should be fine,
>> since resizing is turned off due to threading anyway.
>>
>> If the size is ever used for something else, the consequences might
>> range from cosmetic error to devastating. I have a "feeling" the size
>> is not used for anything else at the moment, although maybe it is, in
>> some roundabout way which I'm not seeing.
>
> Good catch! Yes, the size field is unguarded. The only references to
> it that I found were used in _add() and _remove() in the
> need-to-rehash check.
>
> Since auto-rehash is turned off, this shouldn't be a problem, but it
> does feel odd to have a possibly incorrect size due to races.
>
> Rather than adding something like (a cross-platform version of)
> InterlockedIncrement(), I'm wondering if it would be better to
> disable/invalidate the size field when auto-rehash is turned off, so
> we don't have to worry about it.
>
> Something like this:
>
> $ git diff
> diff --git a/hashmap.c b/hashmap.c
> index 9b6a12859b..be97642daa 100644
> --- a/hashmap.c
> +++ b/hashmap.c
> @@ -205,6 +205,9 @@ void hashmap_add(struct hashmap *map, void *entry)
>  	((struct hashmap_entry *) entry)->next = map->table[b];
>  	map->table[b] = entry;
>
> +	if (map->disallow_rehash)
> +		return;
> +
>  	/* fix size and rehash if appropriate */
>  	map->size++;
>  	if (map->size > map->grow_at)
> @@ -223,6 +226,9 @@ void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata)
>  	*e = old->next;
>  	old->next = NULL;
>
> +	if (map->disallow_rehash)
> +		return old;
> +

Once we have these two checks, the check in rehash itself becomes
redundant (as any code path leading to the check in rehash already had
the check).

Which leads me to wonder if we want to make the size (in/de)crease part
of the rehash function, which could be

    adjust_size(struct hashmap *map, int n)

with `n` either +1 or -1 (the increase value). Depending on 'n', we
could compute the newsize for the potential rehash in that function as
well. Not sure if that is a cleaner internal API; see the sketch after
this mail.

>  	/* fix size and rehash if appropriate */
>  	map->size--;
>  	if (map->size < map->shrink_at)
> diff --git a/hashmap.h b/hashmap.h
> index 7a8fa7fa3d..ec9541b981 100644
> --- a/hashmap.h
> +++ b/hashmap.h
> @@ -183,7 +183,8 @@ struct hashmap {
>  	const void *cmpfn_data;
>
>  	/* total number of entries (0 means the hashmap is empty) */
> -	unsigned int size;
> +	/* -1 means size is unknown for threading reasons */
> +	int size;

This double-encodes the state of disallow_rehash (i.e. if we had signed
size, then the invariant disallow_rehash === (size < 0) is true), such
that we could omit either the flag and just check for size < 0, or we
do not need the negative size as any user would need to check
disallow_rehash first. Not sure which API is harder to misuse. I'd
think just having the size and getting rid of disallow_rehash might be
hard to to reused.

>
>  	/*
>  	 * tablesize is the allocated size of the hash table. A non-0 value
> @@ -360,6 +361,15 @@ int hashmap_bucket(const struct hashmap *map, unsigned int hash);
>  static inline void hashmap_disallow_rehash(struct hashmap *map, unsigned value)
>  {
>  	map->disallow_rehash = value;
> +	if (value) {
> +		/*
> +		 * Assume threaded operations starting, so don't
> +		 * try to keep size current.
> +		 */
> +		map->size = -1;
> +	} else {
> +		/* TODO count items in all buckets and set size. */
> +	}
>  }
>
> /*
>
> Jeff
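A sketch of the adjust_size() helper Stefan describes, folding the
count update and both rehash checks into one place (hypothetical; field
and macro names as in hashmap.c of that era):

    static void adjust_size(struct hashmap *map, int n)
    {
    	/* n is +1 from hashmap_add(), -1 from hashmap_remove() */
    	if (map->disallow_rehash)
    		return;
    	map->size += n;
    	if (n > 0 && map->size > map->grow_at)
    		rehash(map, map->tablesize << HASHMAP_RESIZE_BITS);
    	else if (n < 0 && map->size < map->shrink_at)
    		rehash(map, map->tablesize >> HASHMAP_RESIZE_BITS);
    }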
* Re: tsan: t3008: hashmap_add touches size from multiple threads
From: Martin Ågren @ 2017-08-15 18:40 UTC
To: Stefan Beller; +Cc: Jeff Hostetler, git@vger.kernel.org, Jeff Hostetler

On 15 August 2017 at 20:17, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Aug 15, 2017 at 10:59 AM, Jeff Hostetler <git@jeffhostetler.com> wrote:
>> On 8/15/2017 8:53 AM, Martin Ågren wrote:
>>> Using SANITIZE=thread made t3008-ls-files-lazy-init-name-hash.sh hit
>>> the potential race below.
>>>
>>> What seems to happen is, threaded_lazy_init_name_hash ends up using
>>> hashmap_add on the_index.dir_hash from two threads in a way that
>>> tsan considers racy. While the buckets each have their own mutex,
>>> the "size" does not. So it might end up with the wrong (lower)
>>> value. The size is used to determine when to resize, but that should
>>> be fine, since resizing is turned off due to threading anyway.
>>>
>>> If the size is ever used for something else, the consequences might
>>> range from cosmetic error to devastating. I have a "feeling" the
>>> size is not used for anything else at the moment, although maybe it
>>> is, in some roundabout way which I'm not seeing.
>>
>> Good catch! Yes, the size field is unguarded. The only references to
>> it that I found were used in _add() and _remove() in the
>> need-to-rehash check.
>>
>> Since auto-rehash is turned off, this shouldn't be a problem, but it
>> does feel odd to have a possibly incorrect size due to races.
>>
>> Rather than adding something like (a cross-platform version of)
>> InterlockedIncrement(), I'm wondering if it would be better to
>> disable/invalidate the size field when auto-rehash is turned off, so
>> we don't have to worry about it.
>>
>> Something like this:
>>
>> $ git diff
>> diff --git a/hashmap.c b/hashmap.c
>> index 9b6a12859b..be97642daa 100644
>> --- a/hashmap.c
>> +++ b/hashmap.c
>> @@ -205,6 +205,9 @@ void hashmap_add(struct hashmap *map, void *entry)
>>  	((struct hashmap_entry *) entry)->next = map->table[b];
>>  	map->table[b] = entry;
>>
>> +	if (map->disallow_rehash)
>> +		return;
>> +
>>  	/* fix size and rehash if appropriate */
>>  	map->size++;
>>  	if (map->size > map->grow_at)
>> @@ -223,6 +226,9 @@ void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata)
>>  	*e = old->next;
>>  	old->next = NULL;
>>
>> +	if (map->disallow_rehash)
>> +		return old;
>> +
>
> Once we have these two checks, the check in rehash itself becomes
> redundant (as any code path leading to the check in rehash already had
> the check).
>
> Which leads me to wonder if we want to make the size (in/de)crease
> part of the rehash function, which could be
>
>     adjust_size(struct hashmap *map, int n)
>
> with `n` either +1 or -1 (the increase value). Depending on 'n', we
> could compute the newsize for the potential rehash in that function as
> well. Not sure if that is a cleaner internal API; see the sketch after
> this mail.
>
>>  	/* fix size and rehash if appropriate */
>>  	map->size--;
>>  	if (map->size < map->shrink_at)
>> diff --git a/hashmap.h b/hashmap.h
>> index 7a8fa7fa3d..ec9541b981 100644
>> --- a/hashmap.h
>> +++ b/hashmap.h
>> @@ -183,7 +183,8 @@ struct hashmap {
>>  	const void *cmpfn_data;
>>
>>  	/* total number of entries (0 means the hashmap is empty) */
>> -	unsigned int size;
>> +	/* -1 means size is unknown for threading reasons */
>> +	int size;
>
> This double-encodes the state of disallow_rehash (i.e. if we had
> signed size, then the invariant disallow_rehash === (size < 0) is
> true), such that we could omit either the flag and just check for
> size < 0, or we do not need the negative size as any user would
> need to check disallow_rehash first. Not sure which API is harder
> to misuse. I'd think just having the size and getting rid of
> disallow_rehash might be hard to to reused.

(Do you mean "might be hard to be misused"?)

One good thing about turning off the size-tracking with threading is
that someone who later wants to know the size in a threaded application
will not introduce any subtle bugs by misusing size, but will be forced
to provide and use some sort of InterlockedIncrement(). When/if that
change happens, it would be nice if no-one relied on the value of size
to say anything about threading. So it might make sense to have an
implementation-independent way of accessing disallow_rehash a.k.a.
(size < 0). For example a function hashmap_disallow_rehash(), except
that's obviously taken. :-) Maybe the existing function would then be
hashmap_set_disallow_rehash(). Oh well..

>>
>>  	/*
>>  	 * tablesize is the allocated size of the hash table. A non-0 value
>> @@ -360,6 +361,15 @@ int hashmap_bucket(const struct hashmap *map, unsigned int hash);
>>  static inline void hashmap_disallow_rehash(struct hashmap *map, unsigned value)
>>  {
>>  	map->disallow_rehash = value;
>> +	if (value) {
>> +		/*
>> +		 * Assume threaded operations starting, so don't
>> +		 * try to keep size current.
>> +		 */
>> +		map->size = -1;
>> +	} else {
>> +		/* TODO count items in all buckets and set size. */
>> +	}
>>  }
>>
>> /*
>>
>> Jeff
* Re: tsan: t3008: hashmap_add touches size from multiple threads
From: Stefan Beller @ 2017-08-15 18:48 UTC
To: Martin Ågren; +Cc: Jeff Hostetler, git@vger.kernel.org, Jeff Hostetler

>>>>  	/* total number of entries (0 means the hashmap is empty) */
>>>> -	unsigned int size;
>>>> +	/* -1 means size is unknown for threading reasons */
>>>> +	int size;
>>>
>>> This double-encodes the state of disallow_rehash (i.e. if we had
>>> signed size, then the invariant disallow_rehash === (size < 0) is
>>> true), such that we could omit either the flag and just check for
>>> size < 0, or we do not need the negative size as any user would
>>> need to check disallow_rehash first. Not sure which API is harder
>>> to misuse. I'd think just having the size and getting rid of
>>> disallow_rehash might be hard to to reused.
>>
>> (Do you mean "might be hard to be misused"?)

yes, I do.

>> One good thing about turning off the size-tracking with threading is
>> that someone who later wants to know the size in a threaded
>> application will not introduce any subtle bugs by misusing size, but
>> will be forced to provide and use some sort of
>> InterlockedIncrement().

agreed.

>> When/if that change happens, it would be nice if no-one relied on the
>> value of size to say anything about threading. So it might make sense
>> to have an implementation-independent way of accessing
>> disallow_rehash a.k.a. (size < 0).

Yes, and my point was whether we want to keep disallow_rehash around,
as when a patch like this is applied, we'd have it encoded twice: both
size < 0 and disallow_rehash being set indicate that rehashing is
disabled.

If we were to reduce it to one, we would not have "invalid" states
possible, such as size < 0 and disallow_rehash = 0.

In the future we may have more options that make size impossible to
compute efficiently, such that in that case we'd want to know which
condition led to it. In that case we'd want to have the flags around.

>> For example a function hashmap_disallow_rehash(), except that's
>> obviously taken. :-) Maybe the existing function would then be
>> hashmap_set_disallow_rehash(). Oh well..

Not sure I understand this one.
* Re: tsan: t3008: hashmap_add touches size from multiple threads
From: Martin Ågren @ 2017-08-15 19:21 UTC
To: Stefan Beller; +Cc: Jeff Hostetler, git@vger.kernel.org, Jeff Hostetler

On 15 August 2017 at 20:48, Stefan Beller <sbeller@google.com> wrote:
>>>>>  	/* total number of entries (0 means the hashmap is empty) */
>>>>> -	unsigned int size;
>>>>> +	/* -1 means size is unknown for threading reasons */
>>>>> +	int size;
>>>>
>>>> This double-encodes the state of disallow_rehash (i.e. if we had
>>>> signed size, then the invariant disallow_rehash === (size < 0) is
>>>> true), such that we could omit either the flag and just check for
>>>> size < 0, or we do not need the negative size as any user would
>>>> need to check disallow_rehash first. Not sure which API is harder
>>>> to misuse. I'd think just having the size and getting rid of
>>>> disallow_rehash might be hard to to reused.
>>>
>>> (Do you mean "might be hard to be misused"?)
>>
>> yes, I do.
>>
>>> One good thing about turning off the size-tracking with threading is
>>> that someone who later wants to know the size in a threaded
>>> application will not introduce any subtle bugs by misusing size, but
>>> will be forced to provide and use some sort of
>>> InterlockedIncrement().
>>
>> agreed.
>>
>>> When/if that change happens, it would be nice if no-one relied on
>>> the value of size to say anything about threading. So it might make
>>> sense to have an implementation-independent way of accessing
>>> disallow_rehash a.k.a. (size < 0).
>>
>> Yes, and my point was whether we want to keep disallow_rehash around,
>> as when a patch like this is applied, we'd have it encoded twice:
>> both size < 0 and disallow_rehash being set indicate that rehashing
>> is disabled.
>>
>> If we were to reduce it to one, we would not have "invalid" states
>> possible, such as size < 0 and disallow_rehash = 0.

Agreed.

>> In the future we may have more options that make size impossible to
>> compute efficiently, such that in that case we'd want to know which
>> condition led to it. In that case we'd want to have the flags around.

Good point.

>>> For example a function hashmap_disallow_rehash(), except that's
>>> obviously taken. :-) Maybe the existing function would then be
>>> hashmap_set_disallow_rehash(). Oh well..
>>
>> Not sure I understand this one.

Sorry. What I meant was, if we drop the disallow_rehash field, someone
might be tempted to use size < 0 (or size == -1) to answer the question
"is rehashing disallowed?". (Or "am I threaded?", which already is a
question the hashmap as it is today doesn't know about.)

So instead of looking at "disallow_rehash" one should perhaps be
calling "hashmap_is_disallow_rehash()" or
"hashmap_get_disallow_rehash()", which would be implemented as "return
disallow_rehash", or possibly "return size == -1".

Except such names are, to the best of my understanding, not the
Git-way, so it should be, e.g., "hashmap_disallow_rehash()". Except ...
that name is taken.... So to free that name up, the existing function
should perhaps be renamed "hashmap_set_disallow_rehash()", again
assuming I've picked up the right conventions in my recent browsing of
the Git code.

The final "Oh well" was a short form of "it began with an observation
which currently has no practical effect, and is slowly turning into a
chain of ideas on how to rebuild the interface".
* Re: tsan: t3008: hashmap_add touches size from multiple threads 2017-08-15 19:21 ` Martin Ågren @ 2017-08-15 20:46 ` Jeff Hostetler 0 siblings, 0 replies; 64+ messages in thread From: Jeff Hostetler @ 2017-08-15 20:46 UTC (permalink / raw) To: Martin Ågren, Stefan Beller; +Cc: git@vger.kernel.org, Jeff Hostetler On 8/15/2017 3:21 PM, Martin Ågren wrote: > On 15 August 2017 at 20:48, Stefan Beller <sbeller@google.com> wrote: >>>>> /* total number of entries (0 means the hashmap is empty) */ >>>>> - unsigned int size; >>>>> + /* -1 means size is unknown for threading reasons */ >>>>> + int size; >>>> >>>> This double-encodes the state of disallow_rehash (i.e. if we had >>>> signed size, then the invariant disallow_rehash === (size < 0) >>>> is true, such that we could omit either the flag and just check for >>>> size < 0 or we do not need the negative size as any user would >>>> need to check disallow_rehash first. Not sure which API is harder >>>> to misuse. I'd think just having the size and getting rid of >>>> disallow_rehash might be hard to to reused. >>> >>> (Do you mean "might be hard to be misused"?) >> >> yes, I do. >> >>> One good thing about turning off the size-tracking with threading is >>> that someone who later wants to know the size in a threaded application >>> will not introduce any subtle bugs by misusing size, but will be forced >>> to provide and use some sort of InterlockedIncrement(). >> >> agreed. >> >>> When/if that >>> change happens, it would be nice if no-one relied on the value of size >>> to say anything about threading. So it might make sense to have an >>> implementation-independent way of accessing disallow_rehash a.k.a. >>> (size < 0). >> >> Yes, and my point was whether we want to keep disallow_rehash around, >> as when a patch as this is applied, we'd have it encoded twice, >> both size < 0 as well as disallow_rehash set indicate the rehashing >> disabled. >> >> If we were to reduce it to one, we would not have "invalid" state possible >> such as size < 0 and disallow_rehash = 0. > > Agreed. > >> In the future we may have more options that make size impossible to >> compute efficiently, such that in that case we'd want to know which >> condition lead to it. In that case we'd want to have the flags around. > > Good point. I feel like we're trying to push hashmaps a little beyond their capability. I mean the core hashmap code is NOT thread safe. The caller is responsible for carefully controlling how the hashmap is used and whatever locking strategy it wants -- whether it is a single lock on the entire hashmap -- or a set of partition-specific locks like I created here. Whatever the strategy, it is outside of hashmap.[ch]. Perhaps it would be best to just define things as: * let (size < 0) mean we choose not to compute/track it (without saying why). * keep "disallow_rehash = 1" to mean we do not want automatic resizing (without saying why). Thread-aware callers will set both. Thread-aware callers (when finished with threaded operations) can themselves choose whether to compute the correct size and re-allow rehashing. And we can add a method to hashmap.c to re-calculate the size if we want. In my lazy_init_name_hash() I set "disallow", do the threaded code, and then unset "disallow" -- mainly to keep the usage consistent with the non-threaded case. I could just as easily set "disallow" and leave it that way -- the question is whether we care if the hashmap automatically resizes later. (I don't.) 
>>> For example a function hashmap_disallow_rehash(), except that's >>> obviously taken. :-) Maybe the existing function would then be >>> hashmap_set_disallow_rehash(). Oh well.. >> >> Not sure I understand this one. > > Sorry. What I meant was, if we drop the disallow_rehash-field, someone > might be tempted to use size < 0 (or size == -1) to answer the question > "is rehashing disallowed?". (Or "am I threaded?" which already is a > question which the hashmap as it is today doesn't know about.) > > So instead of looking at "disallow_rehash" one should perhaps be calling > "hashmap_is_disallow_rehash()" or "hashmap_get_disallow_rehash()", which > would be implemented as "return disallow_rehash", or possibly "return > size == -1". > > Except such names are, to the best of my understanding, not the Git-way, > so it should be, e.g., "hashmap_disallow_rehash()". > > Except ... that name is taken.... So to free that name up, the existing > function should perhaps be renamed "hashmap_set_disallow_rehash()", > again assuming I've picked up the right conventions in my recent > browsing of the Git-code. > > The final "Oh well" was a short form of "it began with an observation > which currently has no practical effect, and is slowly turning into a > chain of ideas on how to rebuild the interface". > ^ permalink raw reply [flat|nested] 64+ messages in thread
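To pin down the scheme Jeff proposes above, where (size < 0) means "we choose not to track it" and a method recalculates the size after the threaded phase, here is a rough sketch. The iterator API is the real one from hashmap.h; the helper name is invented:

    #include "hashmap.h"

    /* hypothetical helper: recompute the entry count after a
     * threaded phase, by walking every chain once */
    static unsigned int hashmap_recount(struct hashmap *map)
    {
            struct hashmap_iter iter;
            unsigned int n = 0;

            hashmap_iter_init(map, &iter);
            while (hashmap_iter_next(&iter))
                    n++;
            return n;
    }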
* [PATCH] hashmap: address ThreadSanitizer concerns 2017-08-15 12:53 ` tsan: t3008: hashmap_add touches size from multiple threads Martin Ågren 2017-08-15 17:59 ` Jeff Hostetler @ 2017-08-30 18:59 ` Jeff Hostetler 2017-08-30 18:59 ` [PATCH] hashmap: add API to disable item counting when threaded Jeff Hostetler 2017-09-06 15:43 ` [PATCH v2] hashmap: address ThreadSanitizer concerns Jeff Hostetler 1 sibling, 2 replies; 64+ messages in thread From: Jeff Hostetler @ 2017-08-30 18:59 UTC (permalink / raw) To: martin.agren; +Cc: git, jeffhost, gitster, peff From: Jeff Hostetler <jeffhost@microsoft.com> This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my previous "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ TODO This version contains methods to disable item counting and automatic rehashing independently. Since the latter is implicit with the former, we could get by with just the one, but I thought we could discuss it since it does provide a little extra clarity of intent. Jeff Hostetler (1): hashmap: add API to disable item counting when threaded attr.c | 14 ++++--- builtin/describe.c | 2 +- hashmap.c | 31 +++++++++++----- hashmap.h | 98 ++++++++++++++++++++++++++++++++++++++----------- name-hash.c | 6 ++- t/helper/test-hashmap.c | 2 +- 6 files changed, 113 insertions(+), 40 deletions(-) -- 2.9.3 ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] hashmap: add API to disable item counting when threaded 2017-08-30 18:59 ` [PATCH] hashmap: address ThreadSanitizer concerns Jeff Hostetler @ 2017-08-30 18:59 ` Jeff Hostetler 2017-09-01 23:31 ` Johannes Schindelin ` (3 more replies) 2017-09-06 15:43 ` [PATCH v2] hashmap: address ThreadSanitizer concerns Jeff Hostetler 1 sibling, 4 replies; 64+ messages in thread From: Jeff Hostetler @ 2017-08-30 18:59 UTC (permalink / raw) To: martin.agren; +Cc: git, jeffhost, gitster, peff From: Jeff Hostetler <jeffhost@microsoft.com> This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my previous "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ Add API to hashmap to disable item counting and to disable automatic rehashing. Also include APIs to re-enable item counting and automatica rehashing. When item counting is disabled, the map.size field is invalid. So to prevent accidents, the field has been renamed and an accessor function hashmap_get_size() has been added. All direct references to this field have been been updated. And the name of the field changed to map.private_size to communicate thie. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> --- attr.c | 14 ++++--- builtin/describe.c | 2 +- hashmap.c | 31 +++++++++++----- hashmap.h | 98 ++++++++++++++++++++++++++++++++++++++----------- name-hash.c | 6 ++- t/helper/test-hashmap.c | 2 +- 6 files changed, 113 insertions(+), 40 deletions(-) diff --git a/attr.c b/attr.c index 56961f0..a987f37 100644 --- a/attr.c +++ b/attr.c @@ -149,10 +149,12 @@ struct all_attrs_item { static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check) { int i; + unsigned int size; hashmap_lock(map); - if (map->map.size < check->all_attrs_nr) + size = hashmap_get_size(&map->map); + if (size < check->all_attrs_nr) die("BUG: interned attributes shouldn't be deleted"); /* @@ -161,13 +163,13 @@ static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check) * field), reallocate the provided attr_check instance's all_attrs * field and fill each entry with its corresponding git_attr. 
*/ - if (map->map.size != check->all_attrs_nr) { + if (size != check->all_attrs_nr) { struct attr_hash_entry *e; struct hashmap_iter iter; hashmap_iter_init(&map->map, &iter); - REALLOC_ARRAY(check->all_attrs, map->map.size); - check->all_attrs_nr = map->map.size; + REALLOC_ARRAY(check->all_attrs, size); + check->all_attrs_nr = size; while ((e = hashmap_iter_next(&iter))) { const struct git_attr *a = e->value; @@ -235,10 +237,10 @@ static const struct git_attr *git_attr_internal(const char *name, int namelen) if (!a) { FLEX_ALLOC_MEM(a, name, name, namelen); - a->attr_nr = g_attr_hashmap.map.size; + a->attr_nr = hashmap_get_size(&g_attr_hashmap.map); attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a); - assert(a->attr_nr == (g_attr_hashmap.map.size - 1)); + assert(a->attr_nr == (hashmap_get_size(&g_attr_hashmap.map) - 1)); } hashmap_unlock(&g_attr_hashmap); diff --git a/builtin/describe.c b/builtin/describe.c index 89ea1cd..1e3cbc7 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -505,7 +505,7 @@ int cmd_describe(int argc, const char **argv, const char *prefix) hashmap_init(&names, (hashmap_cmp_fn) commit_name_cmp, NULL, 0); for_each_rawref(get_name, NULL); - if (!names.size && !always) + if (!hashmap_get_size(&names) && !always) die(_("No names found, cannot describe anything.")); if (argc == 0) { diff --git a/hashmap.c b/hashmap.c index 9b6a128..054f334 100644 --- a/hashmap.c +++ b/hashmap.c @@ -116,9 +116,6 @@ static void rehash(struct hashmap *map, unsigned int newsize) unsigned int i, oldsize = map->tablesize; struct hashmap_entry **oldtable = map->table; - if (map->disallow_rehash) - return; - alloc_table(map, newsize); for (i = 0; i < oldsize; i++) { struct hashmap_entry *e = oldtable[i]; @@ -166,6 +163,17 @@ void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, while (initial_size > size) size <<= HASHMAP_RESIZE_BITS; alloc_table(map, size); + + /* + * By default, we want to: + * [] keep track of the number of items in the map and + * [] allow the map to automatically grow as necessary. + * + * We probably only need to disable these if we are being + * used by a threaded caller. 
+ */ + map->do_count_items = 1; + map->do_auto_rehash = 1; } void hashmap_free(struct hashmap *map, int free_entries) @@ -206,9 +214,11 @@ void hashmap_add(struct hashmap *map, void *entry) map->table[b] = entry; /* fix size and rehash if appropriate */ - map->size++; - if (map->size > map->grow_at) - rehash(map, map->tablesize << HASHMAP_RESIZE_BITS); + if (map->do_count_items) { + map->private_size++; + if (map->do_auto_rehash && map->private_size > map->grow_at) + rehash(map, map->tablesize << HASHMAP_RESIZE_BITS); + } } void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata) @@ -224,9 +234,12 @@ void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata) old->next = NULL; /* fix size and rehash if appropriate */ - map->size--; - if (map->size < map->shrink_at) - rehash(map, map->tablesize >> HASHMAP_RESIZE_BITS); + if (map->do_count_items) { + map->private_size--; + if (map->do_auto_rehash && map->private_size < map->shrink_at) + rehash(map, map->tablesize >> HASHMAP_RESIZE_BITS); + } + return old; } diff --git a/hashmap.h b/hashmap.h index 7a8fa7f..7b8e6f4 100644 --- a/hashmap.h +++ b/hashmap.h @@ -183,7 +183,7 @@ struct hashmap { const void *cmpfn_data; /* total number of entries (0 means the hashmap is empty) */ - unsigned int size; + unsigned int private_size; /* use hashmap_get_size() */ /* * tablesize is the allocated size of the hash table. A non-0 value @@ -196,8 +196,8 @@ struct hashmap { unsigned int grow_at; unsigned int shrink_at; - /* See `hashmap_disallow_rehash`. */ - unsigned disallow_rehash : 1; + unsigned int do_count_items : 1; + unsigned int do_auto_rehash : 1; }; /* hashmap functions */ @@ -253,6 +253,19 @@ static inline void hashmap_entry_init(void *entry, unsigned int hash) } /* + * Return the number of items in the map. + */ +inline unsigned int hashmap_get_size(struct hashmap *map) +{ + if (map->do_count_items) + return map->private_size; + + /* TODO Consider counting them and returning that. */ + die("hashmap_get_size: size not set"); + return 0; +} + +/* * Returns the hashmap entry for the specified key, or NULL if not found. * * `map` is the hashmap structure. @@ -345,24 +358,6 @@ extern void *hashmap_remove(struct hashmap *map, const void *key, int hashmap_bucket(const struct hashmap *map, unsigned int hash); /* - * Disallow/allow rehashing of the hashmap. - * This is useful if the caller knows that the hashmap needs multi-threaded - * access. The caller is still required to guard/lock searches and inserts - * in a manner appropriate to their usage. This simply prevents the table - * from being unexpectedly re-mapped. - * - * It is up to the caller to ensure that the hashmap is initialized to a - * reasonable size to prevent poor performance. - * - * A call to allow rehashing does not force a rehash; that might happen - * with the next insert or delete. - */ -static inline void hashmap_disallow_rehash(struct hashmap *map, unsigned value) -{ - map->disallow_rehash = value; -} - -/* * Used to iterate over all entries of a hashmap. Note that it is * not safe to add or remove entries to the hashmap while * iterating. @@ -387,6 +382,67 @@ static inline void *hashmap_iter_first(struct hashmap *map, return hashmap_iter_next(iter); } +/* + * Disable item counting when adding/removing items. + * + * Normally, the hashmap keeps track of the number of items in the map + * and uses it to dynamically resize it. 
This (both the counting and + * the resizing) can cause problems when the map is being used by + * threaded callers (because the hashmap code does not know about the + * locking strategy used by the threaded callers and therefore, does + * not know how to protect the "private_size" counter). + * + * (This will implicitly disable auto resizing.) + */ +static inline void hashmap_disable_item_counting(struct hashmap *map) +{ + map->do_count_items = 0; +} + +/* + * Re-enable item couting when adding/removing items. + * If counting is currently disabled, it will force count them. + * + * This might be used after leaving a threaded section. + */ +static inline void hashmap_enable_item_counting(struct hashmap *map) +{ + void *item; + unsigned int n = 0; + struct hashmap_iter iter; + + hashmap_iter_init(map, &iter); + while ((item = hashmap_iter_next(&iter))) + n++; + + map->do_count_items = 1; + map->private_size = n; +} + +/* + * Disable automatic resizing when adding/removing items. + * + * This prevents the hashmap from resizing the table size and + * redistributing the items into new bucket chains. This causes + * problems when the map is being used by threaded callers (because + * the hashmap code does not know about the locking strategy used by + * the threaded callers and therefore, does not know how to safely + * build new chains while other threads may be walking them). + */ +static inline void hashmap_disable_auto_rehash(struct hashmap *map) +{ + map->do_auto_rehash = 0; +} + +/* + * Re-enable automatic resizing when adding/removing items. + * This DOES NOT force an immediate rehash. + */ +static inline void hashmap_enable_auto_rehash(struct hashmap *map) +{ + map->do_auto_rehash = 1; +} + /* String interning */ /* diff --git a/name-hash.c b/name-hash.c index 0e10f3e..829ff59 100644 --- a/name-hash.c +++ b/name-hash.c @@ -580,9 +580,11 @@ static void lazy_init_name_hash(struct index_state *istate) NULL, istate->cache_nr); if (lookup_lazy_params(istate)) { - hashmap_disallow_rehash(&istate->dir_hash, 1); + hashmap_disable_item_counting(&istate->dir_hash); + hashmap_disable_auto_rehash(&istate->dir_hash); threaded_lazy_init_name_hash(istate); - hashmap_disallow_rehash(&istate->dir_hash, 0); + hashmap_enable_auto_rehash(&istate->dir_hash); + hashmap_enable_item_counting(&istate->dir_hash); } else { int nr; for (nr = 0; nr < istate->cache_nr; nr++) diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c index 095d739..84c57b9 100644 --- a/t/helper/test-hashmap.c +++ b/t/helper/test-hashmap.c @@ -237,7 +237,7 @@ int cmd_main(int argc, const char **argv) } else if (!strcmp("size", cmd)) { /* print table sizes */ - printf("%u %u\n", map.tablesize, map.size); + printf("%u %u\n", map.tablesize, hashmap_get_size(&map)); } else if (!strcmp("intern", cmd) && l1) { -- 2.9.3 ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-08-30 18:59 ` [PATCH] hashmap: add API to disable item counting when threaded Jeff Hostetler @ 2017-09-01 23:31 ` Johannes Schindelin 2017-09-01 23:50 ` Jonathan Nieder ` (2 more replies) 2017-09-02 8:05 ` Jeff King ` (2 subsequent siblings) 3 siblings, 3 replies; 64+ messages in thread From: Johannes Schindelin @ 2017-09-01 23:31 UTC (permalink / raw) To: Jeff Hostetler; +Cc: martin.agren, git, jeffhost, gitster, peff Hi Jeff, On Wed, 30 Aug 2017, Jeff Hostetler wrote: > From: Jeff Hostetler <jeffhost@microsoft.com> > > This is to address concerns raised by ThreadSanitizer on the mailing > list about threaded unprotected R/W access to map.size with my previous > "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). > See: > https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ > > Add API to hashmap to disable item counting and to disable automatic > rehashing. Also include APIs to re-enable item counting and automatica > rehashing. > > When item counting is disabled, the map.size field is invalid. So to > prevent accidents, the field has been renamed and an accessor function > hashmap_get_size() has been added. All direct references to this field > have been been updated. And the name of the field changed to > map.private_size to communicate thie. > > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> > --- The Git contribution process forces me to point out lines longer than 80 columns. I wish there was already an automated tool to fix that, but we (as in "the core Git developers") have not yet managed to agree on one. So I'll have to ask you to identify and fix them manually. > @@ -253,6 +253,19 @@ static inline void hashmap_entry_init(void *entry, unsigned int hash) > } > > /* > + * Return the number of items in the map. > + */ > +inline unsigned int hashmap_get_size(struct hashmap *map) > +{ > + if (map->do_count_items) > + return map->private_size; > + > + /* TODO Consider counting them and returning that. */ I'd rather not. If counting is disabled, it is disabled. > + die("hashmap_get_size: size not set"); Before anybody can ask for this message to be wrapped in _(...) to be translateable, let me suggest instead to add the prefix "BUG: ". > +static inline void hashmap_enable_item_counting(struct hashmap *map) > +{ > + void *item; > + unsigned int n = 0; > + struct hashmap_iter iter; > + > + hashmap_iter_init(map, &iter); > + while ((item = hashmap_iter_next(&iter))) > + n++; > + > + map->do_count_items = 1; > + map->private_size = n; > +} BTW this made me think that we may have a problem in our code since switching from my original hashmap implementation to the bucket one added in 6a364ced497 (add a hashtable implementation that supports O(1) removal, 2013-11-14): while it is not expected that there are many collisions, the "grow_at" logic still essentially assumes the number of buckets to be equal to the number of hashmap entries. Your code simply reiterates that assumption, so I do not blame you for anything here, nor ask you to change your patch. But it does look a bit weird to assume so much about the nature of our data, without having any real-life numbers. I wish I had more time so that I could afford to run a couple of tests on this hashmap, such as: what is the typical difference between bucket count and entry count, or the median of the bucket sizes when the map is 80% full (i.e. *just* below the grow threshold). 
> diff --git a/name-hash.c b/name-hash.c > index 0e10f3e..829ff59 100644 > --- a/name-hash.c > +++ b/name-hash.c > @@ -580,9 +580,11 @@ static void lazy_init_name_hash(struct index_state *istate) > NULL, istate->cache_nr); > > if (lookup_lazy_params(istate)) { > - hashmap_disallow_rehash(&istate->dir_hash, 1); > + hashmap_disable_item_counting(&istate->dir_hash); > + hashmap_disable_auto_rehash(&istate->dir_hash); > threaded_lazy_init_name_hash(istate); > - hashmap_disallow_rehash(&istate->dir_hash, 0); > + hashmap_enable_auto_rehash(&istate->dir_hash); > + hashmap_enable_item_counting(&istate->dir_hash); By your rationale, it would be enough to simply disable and re-enable counting... The rest of the patch looks just dandy to me. Thanks, Dscho ^ permalink raw reply [flat|nested] 64+ messages in thread
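For reference, the simplification suggested here, pairing only the counting calls, is exactly what the v2 patch later in this thread ends up doing in name-hash.c:

    hashmap_disable_item_counting(&istate->dir_hash);
    threaded_lazy_init_name_hash(istate);
    hashmap_enable_item_counting(&istate->dir_hash);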
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-01 23:31 ` Johannes Schindelin @ 2017-09-01 23:50 ` Jonathan Nieder 2017-09-05 16:39 ` Jeff Hostetler 2017-09-02 8:17 ` Jeff King 2017-09-05 16:33 ` Jeff Hostetler 2 siblings, 1 reply; 64+ messages in thread From: Jonathan Nieder @ 2017-09-01 23:50 UTC (permalink / raw) To: Johannes Schindelin Cc: Jeff Hostetler, martin.agren, git, jeffhost, gitster, peff Hi, Johannes Schindelin wrote: > On Wed, 30 Aug 2017, Jeff Hostetler wrote: >> This is to address concerns raised by ThreadSanitizer on the mailing >> list about threaded unprotected R/W access to map.size with my previous >> "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). Nice! What does the message from TSan look like? (The full message doesn't need to go in the commit message, but a snippet can help.) How can I reproduce it? >> See: >> https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ >> >> Add API to hashmap to disable item counting and to disable automatic >> rehashing. Also include APIs to re-enable item counting and automatica >> rehashing. >> >> When item counting is disabled, the map.size field is invalid. So to >> prevent accidents, the field has been renamed and an accessor function >> hashmap_get_size() has been added. All direct references to this field >> have been been updated. And the name of the field changed to >> map.private_size to communicate thie. >> >> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> >> --- > > The Git contribution process forces me to point out lines longer than 80 > columns. I wish there was already an automated tool to fix that, but we > (as in "the core Git developers") have not yet managed to agree on one. So > I'll have to ask you to identify and fix them manually. I *think* (but am not sure) that there is an implied bug report in this comment, but I am not sure what that bug report is and how I can help with it! If you have a link to an issue tracker, thread, wiki page, or other pointer where I can learn more, then that might help me. Have you experimented with the patches in Junio's branch bw/git-clang-format? [...] >> + /* TODO Consider counting them and returning that. */ > > I'd rather not. If counting is disabled, it is disabled. > >> + die("hashmap_get_size: size not set"); > > Before anybody can ask for this message to be wrapped in _(...) to be > translateable, let me suggest instead to add the prefix "BUG: ". Nowadays (since v2.13.2~26^2~3, 2017-05-12), there is a BUG() function that you can use in place of die(): BUG("hashmap_get_size: size not set"); Thanks and hope that helps, Jonathan ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-01 23:50 ` Jonathan Nieder @ 2017-09-05 16:39 ` Jeff Hostetler 2017-09-05 17:13 ` Martin Ågren 0 siblings, 1 reply; 64+ messages in thread From: Jeff Hostetler @ 2017-09-05 16:39 UTC (permalink / raw) To: Jonathan Nieder, Johannes Schindelin Cc: martin.agren, git, jeffhost, gitster, peff On 9/1/2017 7:50 PM, Jonathan Nieder wrote: > Hi, > > Johannes Schindelin wrote: >> On Wed, 30 Aug 2017, Jeff Hostetler wrote: > >>> This is to address concerns raised by ThreadSanitizer on the mailing >>> list about threaded unprotected R/W access to map.size with my previous >>> "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). > > Nice! > > What does the message from TSan look like? (The full message doesn't > need to go in the commit message, but a snippet can help.) How can I > reproduce it? I'll let Martin comment on how to run TSan; I'm just going on what he reported in the "tsan: t3008..." message from the URL I quoted. I didn't think to copy that text into the commit message because it is just stack traces and too long, but I could include a snippet. Thanks, Jeff ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-05 16:39 ` Jeff Hostetler @ 2017-09-05 17:13 ` Martin Ågren 0 siblings, 0 replies; 64+ messages in thread From: Martin Ågren @ 2017-09-05 17:13 UTC (permalink / raw) To: Jeff Hostetler Cc: Jonathan Nieder, Johannes Schindelin, Git Mailing List, Jeff Hostetler, Junio C Hamano, Jeff King On 5 September 2017 at 18:39, Jeff Hostetler <git@jeffhostetler.com> wrote: > > > On 9/1/2017 7:50 PM, Jonathan Nieder wrote: >> >> Hi, >> >> Johannes Schindelin wrote: >>> >>> On Wed, 30 Aug 2017, Jeff Hostetler wrote: >> >> >>>> This is to address concerns raised by ThreadSanitizer on the mailing >>>> list about threaded unprotected R/W access to map.size with my previous >>>> "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). >> >> >> Nice! >> >> What does the message from TSan look like? (The full message doesn't >> need to go in the commit message, but a snippet can help.) How can I >> reproduce it? > > > I'll let Martin comment on how to run TSan; I'm just going on > what he reported in the "tsan: t3008..." message from the URL > I quoted. I didn't think to copy that text into the commit > message because it is just stack traces and too long, but I > could include a snippet. I ran the test suite with ThreadSanitizer: $ make SANITIZE=thread test Any failures were then inspected: $ cd t $ ./t3008-ls-files-lazy-init-name-hash.sh --verbose That can be done with or without ma/ts-cleanups. That series adds a file .tsan-suppressions, which can be used by defining the environment variable TSAN_OPTIONS="suppressions=/some/absolute/path/.tsan-suppressions" in order to suppress some findings which are indeed races, but which are "not a problem in practice". Martin ^ permalink raw reply [flat|nested] 64+ messages in thread
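For anyone trying to reproduce this: the suppressions file uses ThreadSanitizer's standard one-pattern-per-line syntax. The entry below is purely illustrative, not the actual contents of .tsan-suppressions:

    # "race:" patterns match function or symbol names
    race:^some_function_with_a_benign_race$

and a one-shot invocation, assuming it is run from the top of the git tree, might look like:

    $ TSAN_OPTIONS="suppressions=$PWD/.tsan-suppressions" \
      make SANITIZE=thread test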
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-01 23:31 ` Johannes Schindelin 2017-09-01 23:50 ` Jonathan Nieder @ 2017-09-02 8:17 ` Jeff King 2017-09-04 15:59 ` Johannes Schindelin ` (2 more replies) 2017-09-05 16:33 ` Jeff Hostetler 2 siblings, 3 replies; 64+ messages in thread From: Jeff King @ 2017-09-02 8:17 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Jeff Hostetler, martin.agren, git, jeffhost, gitster On Sat, Sep 02, 2017 at 01:31:19AM +0200, Johannes Schindelin wrote: > > https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ > [...] > > Add API to hashmap to disable item counting and to disable automatic > > rehashing. Also include APIs to re-enable item counting and automatica > > rehashing. > [...] > > The Git contribution process forces me to point out lines longer than 80 > columns. I wish there was already an automated tool to fix that, but we > (as in "the core Git developers") have not yet managed to agree on one. So > I'll have to ask you to identify and fix them manually. Perhaps it would be helpful if you pointed out the lines that are too long. Because I don't see any being added by the patch. There are two in the commit message. One is a URL. For the other, which is 82 characters, I'm not sure there is a better tool than "turn on text wrapping in your editor". > > + /* TODO Consider counting them and returning that. */ > > I'd rather not. If counting is disabled, it is disabled. > > > + die("hashmap_get_size: size not set"); > > Before anybody can ask for this message to be wrapped in _(...) to be > translateable, let me suggest instead to add the prefix "BUG: ". Agreed on both (and Jonathan's suggestion to just use BUG()). > > +static inline void hashmap_enable_item_counting(struct hashmap *map) > > +{ > > + void *item; > > + unsigned int n = 0; > > + struct hashmap_iter iter; > > + > > + hashmap_iter_init(map, &iter); > > + while ((item = hashmap_iter_next(&iter))) > > + n++; > > + > > + map->do_count_items = 1; > > + map->private_size = n; > > +} > > BTW this made me think that we may have a problem in our code since > switching from my original hashmap implementation to the bucket one added > in 6a364ced497 (add a hashtable implementation that supports O(1) removal, > 2013-11-14): while it is not expected that there are many collisions, the > "grow_at" logic still essentially assumes the number of buckets to be > equal to the number of hashmap entries. I'm confused about what the problem is. If I am reading the code correctly, "size" is always the number of elements and "grow_at" is the table size times a load factor. Those are the same numbers you'd use to decide to grow in an open-address table. It's true that this does not take into account the actual number of collisions we see (or the average per bucket, or however you want to count it). But generally nor do open-address schemes (and certainly our other hash tables just use load factor to decide when to grow). Am I missing something? -Peff ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-02 8:17 ` Jeff King @ 2017-09-04 15:59 ` Johannes Schindelin 2017-09-05 16:54 ` Jeff Hostetler 2017-09-06 3:43 ` Junio C Hamano 2 siblings, 0 replies; 64+ messages in thread From: Johannes Schindelin @ 2017-09-04 15:59 UTC (permalink / raw) To: Jeff King; +Cc: Jeff Hostetler, martin.agren, git, jeffhost, gitster Hi Peff, On Sat, 2 Sep 2017, Jeff King wrote: > On Sat, Sep 02, 2017 at 01:31:19AM +0200, Johannes Schindelin wrote: > > > > https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ > > [...] > > > +static inline void hashmap_enable_item_counting(struct hashmap *map) > > > +{ > > > + void *item; > > > + unsigned int n = 0; > > > + struct hashmap_iter iter; > > > + > > > + hashmap_iter_init(map, &iter); > > > + while ((item = hashmap_iter_next(&iter))) > > > + n++; > > > + > > > + map->do_count_items = 1; > > > + map->private_size = n; > > > +} > > > > BTW this made me think that we may have a problem in our code since > > switching from my original hashmap implementation to the bucket one > > added in 6a364ced497 (add a hashtable implementation that supports > > O(1) removal, 2013-11-14): while it is not expected that there are > > many collisions, the "grow_at" logic still essentially assumes the > > number of buckets to be equal to the number of hashmap entries. > > I'm confused about what the problem is. If I am reading the code > correctly, "size" is always the number of elements and "grow_at" is the > table size times a load factor. Those are the same numbers you'd use to > decide to grow in an open-address table. > > It's true that this does not take into account the actual number of > collisions we see (or the average per bucket, or however you want to > count it). But generally nor do open-address schemes (and certainly our > other hash tables just use load factor to decide when to grow). In the worst case, there is only one bucket when the table is grown already, is all I tried to point out. Ciao, Dscho ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-02 8:17 ` Jeff King 2017-09-04 15:59 ` Johannes Schindelin @ 2017-09-05 16:54 ` Jeff Hostetler 2017-09-06 3:43 ` Junio C Hamano 2 siblings, 0 replies; 64+ messages in thread From: Jeff Hostetler @ 2017-09-05 16:54 UTC (permalink / raw) To: Jeff King, Johannes Schindelin; +Cc: martin.agren, git, jeffhost, gitster On 9/2/2017 4:17 AM, Jeff King wrote: > On Sat, Sep 02, 2017 at 01:31:19AM +0200, Johannes Schindelin wrote: >> Before anybody can ask for this message to be wrapped in _(...) to be >> translateable, let me suggest instead to add the prefix "BUG: ". > > Agreed on both (and Jonathan's suggestion to just use BUG()). will do. thanks. > >>> +static inline void hashmap_enable_item_counting(struct hashmap *map) >>> +{ >>> + void *item; >>> + unsigned int n = 0; >>> + struct hashmap_iter iter; >>> + >>> + hashmap_iter_init(map, &iter); >>> + while ((item = hashmap_iter_next(&iter))) >>> + n++; >>> + >>> + map->do_count_items = 1; >>> + map->private_size = n; >>> +} >> >> BTW this made me think that we may have a problem in our code since >> switching from my original hashmap implementation to the bucket one added >> in 6a364ced497 (add a hashtable implementation that supports O(1) removal, >> 2013-11-14): while it is not expected that there are many collisions, the >> "grow_at" logic still essentially assumes the number of buckets to be >> equal to the number of hashmap entries. > > I'm confused about what the problem is. If I am reading the code > correctly, "size" is always the number of elements and "grow_at" is the > table size times a load factor. Those are the same numbers you'd use to > decide to grow in an open-address table. > > It's true that this does not take into account the actual number of > collisions we see (or the average per bucket, or however you want to > count it). But generally nor do open-address schemes (and certainly our > other hash tables just use load factor to decide when to grow). > > Am I missing something? > > -Peff > Hashmap is not thread-safe by itself. There are several uses of it in a threaded context and they all handle their own locking before accessing the hashmap. Those usually work by locking the whole hashmap. My changes in "lazy-init-name-hash" deviated from that pattern by locking on individual hash chains. That is, n locks each controlling 1/nth of the chains. https://public-inbox.org/git/1490202865-31325-1-git-send-email-git@jeffhostetler.com/ To do that I had to disable automatic rehashing for the duration of my threaded computation. The problem that TSan identified is that "size" is always incremented during inserts and it doesn't have any locks protecting it. So even though auto-rehash was disabled, we are still counting the number of items in the map. Not a terrible problem, but still a race. Jeff ^ permalink raw reply [flat|nested] 64+ messages in thread
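A sketch of the per-chain (mod n) locking pattern Jeff describes above; the lock count and names are invented, not the name-hash.c originals. Note that this is only sound while the table size stays fixed, which is why rehashing must remain disabled: a rehash would move entries between chains guarded by different locks.

    #include <pthread.h>

    #define NR_CHAIN_LOCKS 32 /* illustrative */

    static pthread_mutex_t chain_lock[NR_CHAIN_LOCKS];

    static void init_chain_locks(void)
    {
            int i;
            for (i = 0; i < NR_CHAIN_LOCKS; i++)
                    pthread_mutex_init(&chain_lock[i], NULL);
    }

    static void lock_chain(unsigned int hash)
    {
            pthread_mutex_lock(&chain_lock[hash % NR_CHAIN_LOCKS]);
    }

    static void unlock_chain(unsigned int hash)
    {
            pthread_mutex_unlock(&chain_lock[hash % NR_CHAIN_LOCKS]);
    }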
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-02 8:17 ` Jeff King 2017-09-04 15:59 ` Johannes Schindelin 2017-09-05 16:54 ` Jeff Hostetler @ 2017-09-06 3:43 ` Junio C Hamano 2 siblings, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2017-09-06 3:43 UTC (permalink / raw) To: Jeff King Cc: Johannes Schindelin, Jeff Hostetler, martin.agren, git, jeffhost Jeff King <peff@peff.net> writes: > On Sat, Sep 02, 2017 at 01:31:19AM +0200, Johannes Schindelin wrote: > >> BTW this made me think that we may have a problem in our code since >> switching from my original hashmap implementation to the bucket one added >> in 6a364ced497 (add a hashtable implementation that supports O(1) removal, >> 2013-11-14): while it is not expected that there are many collisions, the >> "grow_at" logic still essentially assumes the number of buckets to be >> equal to the number of hashmap entries. > > I'm confused about what the problem is. If I am reading the code > correctly, "size" is always the number of elements and "grow_at" is the > table size times a load factor. Those are the same numbers you'd use to > decide to grow in an open-address table. > > It's true that this does not take into account the actual number of > collisions we see (or the average per bucket, or however you want to > count it). But generally nor do open-address schemes (and certainly our > other hash tables just use load factor to decide when to grow). Are we comparing the hashmap.[ch] with the hash.[ch] added in 9027f53c ("Do linear-time/space rename logic for exact renames", 2007-10-25)? I am a bit confused because Johannes calls it "my" original. Unless the real person in this discussion thread sending the messages under Johannes's name is Linus, that is ;-). Or maybe the "original" being compared is something other than the series with 6a364ced497 replaced with its hashmap.[ch]? In any case, I do think your reading of the code is correct in that the comparison between size and grow-at/shrink-at is done correctly with the true load factor of the table, not how many buckets out of the possible buckets are filled. Old one used to grow at 50% full and never shrunk it, but the current one grows at 80% and shrinks at a bit below 40%; I agree with Dscho's feeling (in part not quoted above) that 50% vs 80% doesn't seem to have been backed by any numbers, but optimizing the load factor is outside the scope of this series, I would think. 6a364ced ("add a hashtable implementation that supports O(1) removal", 2013-11-14) credits less frequent resizing for gain of insert performance, but my hunch is that the need for frequent resizing in the version before it primarily comes from the fact that the table started empty (as opposed to having an initial size of 64, which is what the current implementation uses). ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-01 23:31 ` Johannes Schindelin 2017-09-01 23:50 ` Jonathan Nieder 2017-09-02 8:17 ` Jeff King @ 2017-09-05 16:33 ` Jeff Hostetler 2 siblings, 0 replies; 64+ messages in thread From: Jeff Hostetler @ 2017-09-05 16:33 UTC (permalink / raw) To: Johannes Schindelin; +Cc: martin.agren, git, jeffhost, gitster, peff On 9/1/2017 7:31 PM, Johannes Schindelin wrote: > Hi Jeff, > > On Wed, 30 Aug 2017, Jeff Hostetler wrote: > >> From: Jeff Hostetler <jeffhost@microsoft.com> >> >> This is to address concerns raised by ThreadSanitizer on the mailing >> list about threaded unprotected R/W access to map.size with my previous >> "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). >> See: >> https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ >> >> Add API to hashmap to disable item counting and to disable automatic >> rehashing. Also include APIs to re-enable item counting and automatica >> rehashing. >> >> When item counting is disabled, the map.size field is invalid. So to >> prevent accidents, the field has been renamed and an accessor function >> hashmap_get_size() has been added. All direct references to this field >> have been been updated. And the name of the field changed to >> map.private_size to communicate thie. >> >> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> >> --- > > The Git contribution process forces me to point out lines longer than 80 > columns. I wish there was already an automated tool to fix that, but we > (as in "the core Git developers") have not yet managed to agree on one. So > I'll have to ask you to identify and fix them manually. I'm not sure which lines you're talking about, but I'll give it another scan and double check. There's not much I can do about the public-inbox.org URL. > >> @@ -253,6 +253,19 @@ static inline void hashmap_entry_init(void *entry, unsigned int hash) >> } >> >> /* >> + * Return the number of items in the map. >> + */ >> +inline unsigned int hashmap_get_size(struct hashmap *map) >> +{ >> + if (map->do_count_items) >> + return map->private_size; >> + >> + /* TODO Consider counting them and returning that. */ > > I'd rather not. If counting is disabled, it is disabled. > >> + die("hashmap_get_size: size not set"); > > Before anybody can ask for this message to be wrapped in _(...) to be > translateable, let me suggest instead to add the prefix "BUG: ". Good point. Thanks. > >> +static inline void hashmap_enable_item_counting(struct hashmap *map) >> +{ >> + void *item; >> + unsigned int n = 0; >> + struct hashmap_iter iter; >> + >> + hashmap_iter_init(map, &iter); >> + while ((item = hashmap_iter_next(&iter))) >> + n++; >> + >> + map->do_count_items = 1; >> + map->private_size = n; >> +} > > BTW this made me think that we may have a problem in our code since > switching from my original hashmap implementation to the bucket one added > in 6a364ced497 (add a hashtable implementation that supports O(1) removal, > 2013-11-14): while it is not expected that there are many collisions, the > "grow_at" logic still essentially assumes the number of buckets to be > equal to the number of hashmap entries. > > Your code simply reiterates that assumption, so I do not blame you for > anything here, nor ask you to change your patch. I'm not sure what you're saying here. 
The iterator iterates over all entries (and handles walking collision chains), so my newly computed count should be correct and all of this is independent of the "grow-at" and table-size logic. I'm not forcing a rehash when counting is enabled. I'm just reestablishing the expected state. The next insert may cause a rehash, but I'm not forcing it. However, there is an assumption that the caller pre-allocated sufficient table-size space to avoid poor performance for the duration of the non-counting period. > > But it does look a bit weird to assume so much about the nature of our > data, without having any real-life numbers. I wish I had more time so that > I could afford to run a couple of tests on this hashmap, such as: what is > the typical difference between bucket count and entry count, or the median > of the bucket sizes when the map is 80% full (i.e. *just* below the grow > threshold). Personally, I think the 80% threshold is too aggressive (and the default size is too small), but that's a different question. The hashmap in question contains directory pathnames, so the distribution will be completely dependent on the shape of the data. FWIW, I created a tool to dump some of this data. See: t/helper/test-lazy-init-name-hash.c > >> diff --git a/name-hash.c b/name-hash.c >> index 0e10f3e..829ff59 100644 >> --- a/name-hash.c >> +++ b/name-hash.c >> @@ -580,9 +580,11 @@ static void lazy_init_name_hash(struct index_state *istate) >> NULL, istate->cache_nr); >> >> if (lookup_lazy_params(istate)) { >> - hashmap_disallow_rehash(&istate->dir_hash, 1); >> + hashmap_disable_item_counting(&istate->dir_hash); >> + hashmap_disable_auto_rehash(&istate->dir_hash); >> threaded_lazy_init_name_hash(istate); >> - hashmap_disallow_rehash(&istate->dir_hash, 0); >> + hashmap_enable_auto_rehash(&istate->dir_hash); >> + hashmap_enable_item_counting(&istate->dir_hash); > > By your rationale, it would be enough to simply disable and re-enable > counting... > > The rest of the patch looks just dandy to me. > > Thanks, > Dscho > thanks Jeff ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-08-30 18:59 ` [PATCH] hashmap: add API to disable item counting when threaded Jeff Hostetler 2017-09-01 23:31 ` Johannes Schindelin @ 2017-09-02 8:05 ` Jeff King 2017-09-05 17:07 ` Jeff Hostetler 2017-09-02 8:39 ` Simon Ruderich 2017-09-06 1:24 ` Junio C Hamano 3 siblings, 1 reply; 64+ messages in thread From: Jeff King @ 2017-09-02 8:05 UTC (permalink / raw) To: Jeff Hostetler; +Cc: martin.agren, git, jeffhost, gitster On Wed, Aug 30, 2017 at 06:59:22PM +0000, Jeff Hostetler wrote: > From: Jeff Hostetler <jeffhost@microsoft.com> > > This is to address concerns raised by ThreadSanitizer on the > mailing list about threaded unprotected R/W access to map.size with my previous > "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: > https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ > > Add API to hashmap to disable item counting and to disable automatic rehashing. > Also include APIs to re-enable item counting and automatica rehashing. It may be worth summarizing the discussion at that thread here. At first glance, it looks like this is woefully inadequate for allowing multi-threaded access. But after digging, the issue is that we're really trying to accommodate one specific caller which is doing its own per-bucket locking, and which needs the internals of the hashmap to be truly read-only. I suspect the code might be easier to follow if that pattern were pushed into its own threaded_hashmap that disabled the size and handled the mod-n locking, but I don't insist on that as a blocker to this fix. -Peff ^ permalink raw reply [flat|nested] 64+ messages in thread
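The wrapper Peff is gesturing at might look roughly like this; nothing of the sort exists in git, and every name here is invented:

    #include <pthread.h>
    #include "hashmap.h"

    struct threaded_hashmap {
            struct hashmap map;     /* item counting off for its lifetime */
            pthread_mutex_t *locks; /* one lock per (bucket % nr_locks) */
            unsigned int nr_locks;
    };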
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-02 8:05 ` Jeff King @ 2017-09-05 17:07 ` Jeff Hostetler 0 siblings, 0 replies; 64+ messages in thread From: Jeff Hostetler @ 2017-09-05 17:07 UTC (permalink / raw) To: Jeff King; +Cc: martin.agren, git, jeffhost, gitster On 9/2/2017 4:05 AM, Jeff King wrote: > On Wed, Aug 30, 2017 at 06:59:22PM +0000, Jeff Hostetler wrote: > >> From: Jeff Hostetler <jeffhost@microsoft.com> >> >> This is to address concerns raised by ThreadSanitizer on the >> mailing list about threaded unprotected R/W access to map.size with my previous >> "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: >> https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ >> >> Add API to hashmap to disable item counting and to disable automatic rehashing. >> Also include APIs to re-enable item counting and automatica rehashing. > > It may be worth summarizing the discussion at that thread here. > > At first glance, it looks like this is woefully inadequate for allowing > multi-threaded access. But after digging, the issue is that we're really > trying to accommodate one specific caller which is doing its own > per-bucket locking, and which needs the internals of the hashmap to be > truly read-only. > > I suspect the code might be easier to follow if that pattern were pushed > into its own threaded_hashmap that disabled the size and handled the > mod-n locking, but I don't insist on that as a blocker to this fix. Agreed. It would be better and easier to understand to produce a thread-safe version of the hashmap code -- there are other uses of the code that are currently doing their own locking that might benefit, but I'm not looking to gratuitously refactor things. The other issue was that we are multi-threaded during the lazy-init phase, but the main status-or-whatever code that then uses the hashmap is not. So we only pay for the locking during the multi-threaded usage. Thanks, Jeff ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-08-30 18:59 ` [PATCH] hashmap: add API to disable item counting when threaded Jeff Hostetler 2017-09-01 23:31 ` Johannes Schindelin 2017-09-02 8:05 ` Jeff King @ 2017-09-02 8:39 ` Simon Ruderich 2017-09-06 1:24 ` Junio C Hamano 3 siblings, 0 replies; 64+ messages in thread From: Simon Ruderich @ 2017-09-02 8:39 UTC (permalink / raw) To: Jeff Hostetler; +Cc: martin.agren, git, jeffhost, gitster, peff On Wed, Aug 30, 2017 at 06:59:22PM +0000, Jeff Hostetler wrote: > [snip] > +/* > + * Re-enable item couting when adding/removing items. > + * If counting is currently disabled, it will force count them. The code always recounts them. Either the comment or the code should be adjusted. > + * This might be used after leaving a threaded section. > + */ > +static inline void hashmap_enable_item_counting(struct hashmap *map) > +{ > + void *item; > + unsigned int n = 0; > + struct hashmap_iter iter; > + > + hashmap_iter_init(map, &iter); > + while ((item = hashmap_iter_next(&iter))) > + n++; > + > + map->do_count_items = 1; > + map->private_size = n; > +} Regards Simon -- + privacy is necessary + using gnupg http://gnupg.org + public key id: 0x92FEFDB7E44C32F9 ^ permalink raw reply [flat|nested] 64+ messages in thread
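For the record, the v2 patch later in this thread addresses exactly this mismatch: hashmap_enable_item_counting() gains an early return at the top, so the recount really only happens when counting was disabled:

    if (map->do_count_items)
            return;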
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-08-30 18:59 ` [PATCH] hashmap: add API to disable item counting when threaded Jeff Hostetler ` (2 preceding siblings ...) 2017-09-02 8:39 ` Simon Ruderich @ 2017-09-06 1:24 ` Junio C Hamano 2017-09-06 15:33 ` Jeff Hostetler 3 siblings, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2017-09-06 1:24 UTC (permalink / raw) To: Jeff Hostetler; +Cc: martin.agren, git, jeffhost, peff Jeff Hostetler <git@jeffhostetler.com> writes: > From: Jeff Hostetler <jeffhost@microsoft.com> > > This is to address concerns raised by ThreadSanitizer on the > mailing list about threaded unprotected R/W access to map.size with my previous > "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: > https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ > > Add API to hashmap to disable item counting and to disable automatic rehashing. > Also include APIs to re-enable item counting and automatica rehashing. > > When item counting is disabled, the map.size field is invalid. So to > prevent accidents, the field has been renamed and an accessor function > hashmap_get_size() has been added. All direct references to this > field have been been updated. And the name of the field changed > to map.private_size to communicate thie. > > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> > --- > attr.c | 14 ++++--- > builtin/describe.c | 2 +- > hashmap.c | 31 +++++++++++----- > hashmap.h | 98 ++++++++++++++++++++++++++++++++++++++----------- > name-hash.c | 6 ++- > t/helper/test-hashmap.c | 2 +- > 6 files changed, 113 insertions(+), 40 deletions(-) I feel somewhat stupid to say this, especially after seeing many people applaud this patch, but I do not seem to be able to even build Git with this patch. I am getting: common-main.o: In function `hashmap_get_size': /home/gitster/w/git.git/hashmap.h:260: multiple definition of `hashmap_get_size' fast-import.o:/home/gitster/w/git.git/hashmap.h:260: first defined here libgit.a(argv-array.o): In function `hashmap_get_size': /home/gitster/w/git.git/hashmap.h:260: multiple definition of `hashmap_get_size' fast-import.o:/home/gitster/w/git.git/hashmap.h:260: first defined here libgit.a(attr.o): In function `hashmap_get_size': ... and wonder if others are building with different options or something.. > diff --git a/hashmap.h b/hashmap.h > index 7a8fa7f..7b8e6f4 100644 > --- a/hashmap.h > +++ b/hashmap.h > @@ -253,6 +253,19 @@ static inline void hashmap_entry_init(void *entry, unsigned int hash) > } > > /* > + * Return the number of items in the map. > + */ > +inline unsigned int hashmap_get_size(struct hashmap *map) I think this must become static inline like everybody else in this file, at least. I also wonder if this header is excessively inlining too many functions without measuring first, but that is a different story. ^ permalink raw reply [flat|nested] 64+ messages in thread
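For context on the link failure: depending on which inline semantics the compiler applies (gnu89 versus C99), a plain "inline" definition in a header either emits an external definition in every object file that includes it (the "multiple definition" errors seen here) or emits no external definition at all (undefined references instead). "static inline" sidesteps both by giving each translation unit its own private copy. A minimal illustration, unrelated to git's headers:

    /* demo.h: safe to include from many .c files */
    static inline int twice(int x)
    {
            return 2 * x; /* each includer gets its own internal copy */
    }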
* Re: [PATCH] hashmap: add API to disable item counting when threaded 2017-09-06 1:24 ` Junio C Hamano @ 2017-09-06 15:33 ` Jeff Hostetler 0 siblings, 0 replies; 64+ messages in thread From: Jeff Hostetler @ 2017-09-06 15:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: martin.agren, git, jeffhost, peff On 9/5/2017 9:24 PM, Junio C Hamano wrote: > Jeff Hostetler <git@jeffhostetler.com> writes: > >> From: Jeff Hostetler <jeffhost@microsoft.com> >> > > I feel somewhat stupid to say this, especially after seeing many > people applaud this patch, but I do not seem to be able to even > build Git with this patch. I am getting: > > common-main.o: In function `hashmap_get_size': > /home/gitster/w/git.git/hashmap.h:260: multiple definition of `hashmap_get_size' > fast-import.o:/home/gitster/w/git.git/hashmap.h:260: first defined here > libgit.a(argv-array.o): In function `hashmap_get_size': > /home/gitster/w/git.git/hashmap.h:260: multiple definition of `hashmap_get_size' > fast-import.o:/home/gitster/w/git.git/hashmap.h:260: first defined here > libgit.a(attr.o): In function `hashmap_get_size': > ... > > and wonder if others are building with different options or something.. That's odd. I'm not seeing that. My Ubuntu VM is reporting: $ gcc --version gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005 I'll change it to static in the next version. Hopefully that will take care of it. Thanks, Jeff ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH v2] hashmap: address ThreadSanitizer concerns 2017-08-30 18:59 ` [PATCH] hashmap: address ThreadSanitizer concerns Jeff Hostetler 2017-08-30 18:59 ` [PATCH] hashmap: add API to disable item counting when threaded Jeff Hostetler @ 2017-09-06 15:43 ` Jeff Hostetler 2017-09-06 15:43 ` [PATCH v2] hashmap: add API to disable item counting when threaded Jeff Hostetler 1 sibling, 1 reply; 64+ messages in thread From: Jeff Hostetler @ 2017-09-06 15:43 UTC (permalink / raw) To: git; +Cc: git, gitster, jeffhost, martin.agren, peff From: Jeff Hostetler <jeffhost@microsoft.com> Version 2 addresses the comments and suggestions on version 1. It removes the explicit disable/enable rehash and just relies on the state of hashmap counting. It changes the declaration of hashmap_get_size() to be static to avoid issues seen on some compilers. It uses BUG() rather than die() for an error condition. It adds a comment describing why lazy-init needs to disable counting. It fixes line length problems. It adds details from TSan in the commit message. Jeff Hostetler (1): hashmap: add API to disable item counting when threaded attr.c | 15 ++++++----- builtin/describe.c | 2 +- hashmap.c | 26 +++++++++++------- hashmap.h | 72 ++++++++++++++++++++++++++++++++++--------------- name-hash.c | 10 +++++-- t/helper/test-hashmap.c | 3 ++- 6 files changed, 88 insertions(+), 40 deletions(-) -- 2.9.3 ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH v2] hashmap: add API to disable item counting when threaded 2017-09-06 15:43 ` [PATCH v2] hashmap: address ThreadSanitizer concerns Jeff Hostetler @ 2017-09-06 15:43 ` Jeff Hostetler 0 siblings, 0 replies; 64+ messages in thread From: Jeff Hostetler @ 2017-09-06 15:43 UTC (permalink / raw) To: git; +Cc: git, gitster, jeffhost, martin.agren, peff From: Jeff Hostetler <jeffhost@microsoft.com> This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my previous "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ Add API to hashmap to disable item counting and thus automatic rehashing. Also include API to later re-enable them. When item counting is disabled, the map.size field is invalid. So to prevent accidents, the field has been renamed and an accessor function hashmap_get_size() has been added. All direct references to this field have been been updated. And the name of the field changed to map.private_size to communicate this. Here is the relevant output from ThreadSanitizer showing the problem: WARNING: ThreadSanitizer: data race (pid=10554) Read of size 4 at 0x00000082d488 by thread T2 (mutexes: write M16): #0 hashmap_add hashmap.c:209 #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302 #2 handle_range_dir name-hash.c:347 #3 handle_range_1 name-hash.c:415 #4 lazy_dir_thread_proc name-hash.c:471 #5 <null> <null> Previous write of size 4 at 0x00000082d488 by thread T1 (mutexes: write M31): #0 hashmap_add hashmap.c:209 #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302 #2 handle_range_dir name-hash.c:347 #3 handle_range_1 name-hash.c:415 #4 handle_range_dir name-hash.c:380 #5 handle_range_1 name-hash.c:415 #6 lazy_dir_thread_proc name-hash.c:471 #7 <null> <null> Martin gives instructions for running TSan on test t3008 in this post: https://public-inbox.org/git/CAN0heSoJDL9pWELD6ciLTmWf-a=oyxe4EXXOmCKvsG5MSuzxsA@mail.gmail.com/ Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> --- attr.c | 15 ++++++----- builtin/describe.c | 2 +- hashmap.c | 26 +++++++++++------- hashmap.h | 72 ++++++++++++++++++++++++++++++++++--------------- name-hash.c | 10 +++++-- t/helper/test-hashmap.c | 3 ++- 6 files changed, 88 insertions(+), 40 deletions(-) diff --git a/attr.c b/attr.c index 56961f0..195aa6d 100644 --- a/attr.c +++ b/attr.c @@ -149,10 +149,12 @@ struct all_attrs_item { static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check) { int i; + unsigned int size; hashmap_lock(map); - if (map->map.size < check->all_attrs_nr) + size = hashmap_get_size(&map->map); + if (size < check->all_attrs_nr) die("BUG: interned attributes shouldn't be deleted"); /* @@ -161,13 +163,13 @@ static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check) * field), reallocate the provided attr_check instance's all_attrs * field and fill each entry with its corresponding git_attr. 
*/ - if (map->map.size != check->all_attrs_nr) { + if (size != check->all_attrs_nr) { struct attr_hash_entry *e; struct hashmap_iter iter; hashmap_iter_init(&map->map, &iter); - REALLOC_ARRAY(check->all_attrs, map->map.size); - check->all_attrs_nr = map->map.size; + REALLOC_ARRAY(check->all_attrs, size); + check->all_attrs_nr = size; while ((e = hashmap_iter_next(&iter))) { const struct git_attr *a = e->value; @@ -235,10 +237,11 @@ static const struct git_attr *git_attr_internal(const char *name, int namelen) if (!a) { FLEX_ALLOC_MEM(a, name, name, namelen); - a->attr_nr = g_attr_hashmap.map.size; + a->attr_nr = hashmap_get_size(&g_attr_hashmap.map); attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a); - assert(a->attr_nr == (g_attr_hashmap.map.size - 1)); + assert(a->attr_nr == + (hashmap_get_size(&g_attr_hashmap.map) - 1)); } hashmap_unlock(&g_attr_hashmap); diff --git a/builtin/describe.c b/builtin/describe.c index 89ea1cd..1e3cbc7 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -505,7 +505,7 @@ int cmd_describe(int argc, const char **argv, const char *prefix) hashmap_init(&names, (hashmap_cmp_fn) commit_name_cmp, NULL, 0); for_each_rawref(get_name, NULL); - if (!names.size && !always) + if (!hashmap_get_size(&names) && !always) die(_("No names found, cannot describe anything.")); if (argc == 0) { diff --git a/hashmap.c b/hashmap.c index 9b6a128..d42f01f 100644 --- a/hashmap.c +++ b/hashmap.c @@ -116,9 +116,6 @@ static void rehash(struct hashmap *map, unsigned int newsize) unsigned int i, oldsize = map->tablesize; struct hashmap_entry **oldtable = map->table; - if (map->disallow_rehash) - return; - alloc_table(map, newsize); for (i = 0; i < oldsize; i++) { struct hashmap_entry *e = oldtable[i]; @@ -166,6 +163,12 @@ void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, while (initial_size > size) size <<= HASHMAP_RESIZE_BITS; alloc_table(map, size); + + /* + * Keep track of the number of items in the map and + * allow the map to automatically grow as necessary. + */ + map->do_count_items = 1; } void hashmap_free(struct hashmap *map, int free_entries) @@ -206,9 +209,11 @@ void hashmap_add(struct hashmap *map, void *entry) map->table[b] = entry; /* fix size and rehash if appropriate */ - map->size++; - if (map->size > map->grow_at) - rehash(map, map->tablesize << HASHMAP_RESIZE_BITS); + if (map->do_count_items) { + map->private_size++; + if (map->private_size > map->grow_at) + rehash(map, map->tablesize << HASHMAP_RESIZE_BITS); + } } void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata) @@ -224,9 +229,12 @@ void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata) old->next = NULL; /* fix size and rehash if appropriate */ - map->size--; - if (map->size < map->shrink_at) - rehash(map, map->tablesize >> HASHMAP_RESIZE_BITS); + if (map->do_count_items) { + map->private_size--; + if (map->private_size < map->shrink_at) + rehash(map, map->tablesize >> HASHMAP_RESIZE_BITS); + } + return old; } diff --git a/hashmap.h b/hashmap.h index 7a8fa7f..7cb29a6 100644 --- a/hashmap.h +++ b/hashmap.h @@ -183,7 +183,7 @@ struct hashmap { const void *cmpfn_data; /* total number of entries (0 means the hashmap is empty) */ - unsigned int size; + unsigned int private_size; /* use hashmap_get_size() */ /* * tablesize is the allocated size of the hash table. A non-0 value @@ -196,8 +196,7 @@ struct hashmap { unsigned int grow_at; unsigned int shrink_at; - /* See `hashmap_disallow_rehash`. 
*/ - unsigned disallow_rehash : 1; + unsigned int do_count_items : 1; }; /* hashmap functions */ @@ -253,6 +252,18 @@ static inline void hashmap_entry_init(void *entry, unsigned int hash) } /* + * Return the number of items in the map. + */ +static inline unsigned int hashmap_get_size(struct hashmap *map) +{ + if (map->do_count_items) + return map->private_size; + + BUG("hashmap_get_size: size not set"); + return 0; +} + +/* * Returns the hashmap entry for the specified key, or NULL if not found. * * `map` is the hashmap structure. @@ -345,24 +356,6 @@ extern void *hashmap_remove(struct hashmap *map, const void *key, int hashmap_bucket(const struct hashmap *map, unsigned int hash); /* - * Disallow/allow rehashing of the hashmap. - * This is useful if the caller knows that the hashmap needs multi-threaded - * access. The caller is still required to guard/lock searches and inserts - * in a manner appropriate to their usage. This simply prevents the table - * from being unexpectedly re-mapped. - * - * It is up to the caller to ensure that the hashmap is initialized to a - * reasonable size to prevent poor performance. - * - * A call to allow rehashing does not force a rehash; that might happen - * with the next insert or delete. - */ -static inline void hashmap_disallow_rehash(struct hashmap *map, unsigned value) -{ - map->disallow_rehash = value; -} - -/* * Used to iterate over all entries of a hashmap. Note that it is * not safe to add or remove entries to the hashmap while * iterating. @@ -387,6 +380,43 @@ static inline void *hashmap_iter_first(struct hashmap *map, return hashmap_iter_next(iter); } +/* + * Disable item counting and automatic rehashing when adding/removing items. + * + * Normally, the hashmap keeps track of the number of items in the map + * and uses it to dynamically resize it. This (both the counting and + * the resizing) can cause problems when the map is being used by + * threaded callers (because the hashmap code does not know about the + * locking strategy used by the threaded callers and therefore does + * not know how to protect the "private_size" counter). + */ +static inline void hashmap_disable_item_counting(struct hashmap *map) +{ + map->do_count_items = 0; +} + +/* + * Re-enable item counting when adding/removing items. + * If counting is currently disabled, it will force a count of the items. + * It WILL NOT automatically rehash them. + */ +static inline void hashmap_enable_item_counting(struct hashmap *map) +{ + void *item; + unsigned int n = 0; + struct hashmap_iter iter; + + if (map->do_count_items) + return; + + hashmap_iter_init(map, &iter); + while ((item = hashmap_iter_next(&iter))) + n++; + + map->do_count_items = 1; + map->private_size = n; +} + /* String interning */ /* diff --git a/name-hash.c b/name-hash.c index 0e10f3e..3ac2e57 100644 --- a/name-hash.c +++ b/name-hash.c @@ -580,9 +580,15 @@ static void lazy_init_name_hash(struct index_state *istate) NULL, istate->cache_nr); if (lookup_lazy_params(istate)) { - hashmap_disallow_rehash(&istate->dir_hash, 1); + /* + * Disable item counting and automatic rehashing because + * we do per-chain (mod n) locking rather than whole hashmap + * locking and we need to prevent the table-size from changing + * and bucket items from being redistributed.
+ */ + hashmap_disable_item_counting(&istate->dir_hash); threaded_lazy_init_name_hash(istate); - hashmap_disallow_rehash(&istate->dir_hash, 0); + hashmap_enable_item_counting(&istate->dir_hash); } else { int nr; for (nr = 0; nr < istate->cache_nr; nr++) diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c index 095d739..4956d1d 100644 --- a/t/helper/test-hashmap.c +++ b/t/helper/test-hashmap.c @@ -237,7 +237,8 @@ int cmd_main(int argc, const char **argv) } else if (!strcmp("size", cmd)) { /* print table sizes */ - printf("%u %u\n", map.tablesize, map.size); + printf("%u %u\n", map.tablesize, + hashmap_get_size(&map)); } else if (!strcmp("intern", cmd) && l1) { -- 2.9.3 ^ permalink raw reply related [flat|nested] 64+ messages in thread
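For readers skimming the archive, here is a minimal sketch of the calling pattern the new API enables, modeled on the name-hash.c hunk above. The worker function and its locking are hypothetical stand-ins for whatever strategy a threaded caller actually uses:

    #include "hashmap.h"

    /*
     * Sketch only: threaded_fill_map() is a hypothetical stand-in for
     * code like threaded_lazy_init_name_hash(), whose worker threads
     * call hashmap_add() under the caller's own (e.g. per-chain) locks.
     */
    static void fill_map_threaded(struct hashmap *map)
    {
    	/*
    	 * While workers run, the table size must not change and the
    	 * unprotected private_size counter must not be touched, so
    	 * turn off item counting (and with it automatic rehashing).
    	 */
    	hashmap_disable_item_counting(map);
    	threaded_fill_map(map);
    	/*
    	 * Single-threaded again: recount the items and resume
    	 * automatic resizing on future adds and removes.
    	 */
    	hashmap_enable_item_counting(map);
    }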
* tsan: t5400: set_try_to_free_routine 2017-08-15 12:53 [PATCH/RFC 0/5] Some ThreadSanitizer-results Martin Ågren ` (5 preceding siblings ...) 2017-08-15 12:53 ` tsan: t3008: hashmap_add touches size from multiple threads Martin Ågren @ 2017-08-15 12:53 ` Martin Ågren 2017-08-15 17:35 ` Stefan Beller 2017-08-17 10:57 ` Jeff King 2017-08-20 10:06 ` [PATCH/RFC 0/5] Some ThreadSanitizer-results Jeff King 2017-08-21 17:43 ` [PATCH v2 0/4] " Martin Ågren 8 siblings, 2 replies; 64+ messages in thread From: Martin Ågren @ 2017-08-15 12:53 UTC (permalink / raw) To: git Using SANITIZE=thread made t5400-send-pack.sh hit the potential race below. This is set_try_to_free_routine in wrapper.c. The race relates to the reading of the "old" value. The caller doesn't care about the "old" value, so this should be harmless right now. But it seems that using this mechanism from multiple threads and restoring the earlier value will probably not work out every time. (Not necessarily because of the race in set_try_to_free_routine, but, e.g., because callers might not restore the function pointer in the reverse order of how they originally set it.) Properly "fixing" this for thread-safety would probably require some redesigning, which at the time might not be warranted. I'm just posting this for completeness. Martin WARNING: ThreadSanitizer: data race (pid=21382) Read of size 8 at 0x000000979970 by thread T1: #0 set_try_to_free_routine wrapper.c:35 (git+0x0000006cde1c) #1 prepare_trace_line trace.c:105 (git+0x0000006a3bf0) #2 trace_strbuf_fl trace.c:185 (git+0x0000006a3bf0) #3 packet_trace pkt-line.c:80 (git+0x0000005f9f43) #4 packet_read pkt-line.c:309 (git+0x0000005fbe10) #5 recv_sideband sideband.c:37 (git+0x000000684c5e) #6 sideband_demux send-pack.c:216 (git+0x00000065a38c) #7 run_thread run-command.c:933 (git+0x000000655a93) #8 <null> <null> (libtsan.so.0+0x0000000230d9) Previous write of size 8 at 0x000000979970 by main thread: #0 set_try_to_free_routine wrapper.c:38 (git+0x0000006cde39) #1 prepare_trace_line trace.c:105 (git+0x0000006a3bf0) #2 trace_strbuf_fl trace.c:185 (git+0x0000006a3bf0) #3 packet_trace pkt-line.c:80 (git+0x0000005f9f43) #4 packet_read pkt-line.c:324 (git+0x0000005fc0bb) #5 packet_read_line_generic pkt-line.c:332 (git+0x0000005fc0bb) #6 packet_read_line pkt-line.c:342 (git+0x0000005fc0bb) #7 receive_unpack_status send-pack.c:149 (git+0x00000065c1e4) #8 send_pack send-pack.c:581 (git+0x00000065c1e4) #9 git_transport_push transport.c:574 (git+0x0000006ab2c1) #10 transport_push transport.c:1068 (git+0x0000006ae5d5) #11 push_with_options builtin/push.c:336 (git+0x00000049d580) #12 do_push builtin/push.c:419 (git+0x00000049f57d) #13 cmd_push builtin/push.c:591 (git+0x00000049f57d) #14 run_builtin git.c:330 (git+0x00000040949e) #15 handle_builtin git.c:538 (git+0x00000040949e) #16 run_argv git.c:590 (git+0x000000409a0e) #17 cmd_main git.c:667 (git+0x000000409a0e) #18 main common-main.c:43 (git+0x0000004079d1) Location is global 'try_to_free_routine' of size 8 at 0x000000979970 (git+0x000000979970) Thread T1 (tid=21405, running) created by main thread at: #0 pthread_create <null> (libtsan.so.0+0x000000027577) #1 start_async run-command.c:1130 (git+0x0000006582e7) #2 send_pack send-pack.c:562 (git+0x00000065b7c8) #3 git_transport_push transport.c:574 (git+0x0000006ab2c1) #4 transport_push transport.c:1068 (git+0x0000006ae5d5) #5 push_with_options builtin/push.c:336 (git+0x00000049d580) #6 do_push builtin/push.c:419 (git+0x00000049f57d) #7 cmd_push builtin/push.c:591 (git+0x00000049f57d) #8 
run_builtin git.c:330 (git+0x00000040949e) #9 handle_builtin git.c:538 (git+0x00000040949e) #10 run_argv git.c:590 (git+0x000000409a0e) #11 cmd_main git.c:667 (git+0x000000409a0e) #12 main common-main.c:43 (git+0x0000004079d1) SUMMARY: ThreadSanitizer: data race wrapper.c:35 set_try_to_free_routine ^ permalink raw reply [flat|nested] 64+ messages in thread
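For readers who want the shape of the code without opening wrapper.c, the mechanism at issue is roughly the following simplified sketch (not the exact wrapper.c code; the try_to_free_t signature matches git-compat-util.h to the best of my knowledge):

    #include <stddef.h>

    typedef void (*try_to_free_t)(size_t);

    /* Simplified sketch of wrapper.c, for illustration only. */
    static try_to_free_t try_to_free_routine;

    try_to_free_t set_try_to_free_routine(try_to_free_t routine)
    {
    	try_to_free_t old = try_to_free_routine;	/* the racy read */
    	try_to_free_routine = routine;			/* the racy write */
    	return old;
    }

Two threads calling this concurrently (as the traces above show happening, via prepare_trace_line) race on the unsynchronized read and write of the global pointer.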
* Re: tsan: t5400: set_try_to_free_routine 2017-08-15 12:53 ` tsan: t5400: set_try_to_free_routine Martin Ågren @ 2017-08-15 17:35 ` Stefan Beller 2017-08-15 18:44 ` Martin Ågren 2017-08-17 10:57 ` Jeff King 1 sibling, 1 reply; 64+ messages in thread From: Stefan Beller @ 2017-08-15 17:35 UTC (permalink / raw) To: Martin Ågren; +Cc: git@vger.kernel.org On Tue, Aug 15, 2017 at 5:53 AM, Martin Ågren <martin.agren@gmail.com> wrote: > Using SANITIZE=thread made t5400-send-pack.sh hit the potential race > below. > > This is set_try_to_free_routine in wrapper.c. The race relates to the > reading of the "old" value. The caller doesn't care about the "old" > value, so this should be harmless right now. But it seems that using > this mechanism from multiple threads and restoring the earlier value > will probably not work out every time. (Not necessarily because of the > race in set_try_to_free_routine, but, e.g., because callers might not > restore the function pointer in the reverse order of how they > originally set it.) > > Properly "fixing" this for thread-safety would probably require some > redesigning, which at the time might not be warranted. I'm just posting > this for completeness. Maybe related read (also error handling in threads): https://public-inbox.org/git/20170619220036.22656-1-avarab@gmail.com/ > > Martin > > WARNING: ThreadSanitizer: data race (pid=21382) > Read of size 8 at 0x000000979970 by thread T1: > #0 set_try_to_free_routine wrapper.c:35 (git+0x0000006cde1c) > #1 prepare_trace_line trace.c:105 (git+0x0000006a3bf0) > #2 trace_strbuf_fl trace.c:185 (git+0x0000006a3bf0) > #3 packet_trace pkt-line.c:80 (git+0x0000005f9f43) > #4 packet_read pkt-line.c:309 (git+0x0000005fbe10) > #5 recv_sideband sideband.c:37 (git+0x000000684c5e) > #6 sideband_demux send-pack.c:216 (git+0x00000065a38c) > #7 run_thread run-command.c:933 (git+0x000000655a93) > #8 <null> <null> (libtsan.so.0+0x0000000230d9) > > Previous write of size 8 at 0x000000979970 by main thread: > #0 set_try_to_free_routine wrapper.c:38 (git+0x0000006cde39) > #1 prepare_trace_line trace.c:105 (git+0x0000006a3bf0) > #2 trace_strbuf_fl trace.c:185 (git+0x0000006a3bf0) > #3 packet_trace pkt-line.c:80 (git+0x0000005f9f43) > #4 packet_read pkt-line.c:324 (git+0x0000005fc0bb) > #5 packet_read_line_generic pkt-line.c:332 (git+0x0000005fc0bb) > #6 packet_read_line pkt-line.c:342 (git+0x0000005fc0bb) > #7 receive_unpack_status send-pack.c:149 (git+0x00000065c1e4) > #8 send_pack send-pack.c:581 (git+0x00000065c1e4) > #9 git_transport_push transport.c:574 (git+0x0000006ab2c1) > #10 transport_push transport.c:1068 (git+0x0000006ae5d5) > #11 push_with_options builtin/push.c:336 (git+0x00000049d580) > #12 do_push builtin/push.c:419 (git+0x00000049f57d) > #13 cmd_push builtin/push.c:591 (git+0x00000049f57d) > #14 run_builtin git.c:330 (git+0x00000040949e) > #15 handle_builtin git.c:538 (git+0x00000040949e) > #16 run_argv git.c:590 (git+0x000000409a0e) > #17 cmd_main git.c:667 (git+0x000000409a0e) > #18 main common-main.c:43 (git+0x0000004079d1) > > Location is global 'try_to_free_routine' of size 8 at 0x000000979970 (git+0x000000979970) > > Thread T1 (tid=21405, running) created by main thread at: > #0 pthread_create <null> (libtsan.so.0+0x000000027577) > #1 start_async run-command.c:1130 (git+0x0000006582e7) > #2 send_pack send-pack.c:562 (git+0x00000065b7c8) > #3 git_transport_push transport.c:574 (git+0x0000006ab2c1) > #4 transport_push transport.c:1068 (git+0x0000006ae5d5) > #5 push_with_options builtin/push.c:336 (git+0x00000049d580) 
> #6 do_push builtin/push.c:419 (git+0x00000049f57d) > #7 cmd_push builtin/push.c:591 (git+0x00000049f57d) > #8 run_builtin git.c:330 (git+0x00000040949e) > #9 handle_builtin git.c:538 (git+0x00000040949e) > #10 run_argv git.c:590 (git+0x000000409a0e) > #11 cmd_main git.c:667 (git+0x000000409a0e) > #12 main common-main.c:43 (git+0x0000004079d1) > > SUMMARY: ThreadSanitizer: data race wrapper.c:35 set_try_to_free_routine > ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tsan: t5400: set_try_to_free_routine 2017-08-15 17:35 ` Stefan Beller @ 2017-08-15 18:44 ` Martin Ågren 0 siblings, 0 replies; 64+ messages in thread From: Martin Ågren @ 2017-08-15 18:44 UTC (permalink / raw) To: Stefan Beller; +Cc: git@vger.kernel.org On 15 August 2017 at 19:35, Stefan Beller <sbeller@google.com> wrote: > On Tue, Aug 15, 2017 at 5:53 AM, Martin Ågren <martin.agren@gmail.com> wrote: >> Using SANITIZE=thread made t5400-send-pack.sh hit the potential race >> below. >> >> This is set_try_to_free_routine in wrapper.c. The race relates to the >> reading of the "old" value. The caller doesn't care about the "old" >> value, so this should be harmless right now. But it seems that using >> this mechanism from multiple threads and restoring the earlier value >> will probably not work out every time. (Not necessarily because of the >> race in set_try_to_free_routine, but, e.g., because callers might not >> restore the function pointer in the reverse order of how they >> originally set it.) >> >> Properly "fixing" this for thread-safety would probably require some >> redesigning, which at the time might not be warranted. I'm just posting >> this for completeness. > > Maybe related read (also error handling in threads): > https://public-inbox.org/git/20170619220036.22656-1-avarab@gmail.com/ Thanks for the pointer. Threading is tricky business, indeed. I still haven't entirely grokked all the users of set_try_to_free_routine, i.e., what they want to avoid and what they want to achieve. I'm just happy that the only user which cares about the old value is not threaded, to the best of my understanding. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tsan: t5400: set_try_to_free_routine 2017-08-15 12:53 ` tsan: t5400: set_try_to_free_routine Martin Ågren 2017-08-15 17:35 ` Stefan Beller @ 2017-08-17 10:57 ` Jeff King 1 sibling, 0 replies; 64+ messages in thread From: Jeff King @ 2017-08-17 10:57 UTC (permalink / raw) To: Martin Ågren; +Cc: git On Tue, Aug 15, 2017 at 02:53:07PM +0200, Martin Ågren wrote: > Using SANITIZE=thread made t5400-send-pack.sh hit the potential race > below. > > This is set_try_to_free_routine in wrapper.c. The race relates to the > reading of the "old" value. The caller doesn't care about the "old" > value, so this should be harmless right now. But it seems that using > this mechanism from multiple threads and restoring the earlier value > will probably not work out every time. (Not necessarily because of the > race in set_try_to_free_routine, but, e.g., because callers might not > restore the function pointer in the reverse order of how they > originally set it.) > > Properly "fixing" this for thread-safety would probably require some > redesigning, which at the time might not be warranted. I'm just posting > this for completeness. > > Martin > > WARNING: ThreadSanitizer: data race (pid=21382) > Read of size 8 at 0x000000979970 by thread T1: > #0 set_try_to_free_routine wrapper.c:35 (git+0x0000006cde1c) > #1 prepare_trace_line trace.c:105 (git+0x0000006a3bf0) > #2 trace_strbuf_fl trace.c:185 (git+0x0000006a3bf0) > #3 packet_trace pkt-line.c:80 (git+0x0000005f9f43) > #4 packet_read pkt-line.c:309 (git+0x0000005fbe10) > #5 recv_sideband sideband.c:37 (git+0x000000684c5e) > #6 sideband_demux send-pack.c:216 (git+0x00000065a38c) > #7 run_thread run-command.c:933 (git+0x000000655a93) > #8 <null> <null> (libtsan.so.0+0x0000000230d9) I was curious why the trace code would care about the free routine in the first place. Digging in the mailing list, I didn't find a lot of discussion. But I think the problem is basically that the trace infrastructure wants to be thread-safe, but the default free-pack-memory callback isn't. It's ironic that we fix the thread-unsafety of the free-pack-memory function by using the also-thread-unsafe set_try_to_free_routine. Further irony: the trace routines aren't thread-safe in the first place, as they do lazy initialization of key->fd using an "initialized" field. In practice it probably means double-writing key->fd and leaking a descriptor (though there are no synchronizing operations there, so it's entirely possible a compiler could reorder the assignments to key->fd and key->initialized and a simultaneous reader could read a garbage key->fd value). We also call getenv(), which isn't thread-safe with other calls to getenv() or setenv(). I can think of a few possible directions: 1. Make set_try_to_free_routine() skip the write if it would be a noop. This is racy if threads are actually changing the value, but in practice they aren't (the first trace of any kind will set it to NULL, and it will remain there). 2. Make the free-packed routine thread-safe by taking a lock. It should hardly ever be called, so performance wouldn't matter. The big question is: _which_ lock. pack-objects, which uses threads already, has a version which does this. But it knows to take the big program-wide "I'm accessing unsafe parts of Git" lock that the rest of the program uses during its multi-threaded parts. There's no notion in the rest of Git for "now we're going into a multi-threaded part, so most calls will need to take a big global lock before doing anything interesting". 
For parts of Git that are explicitly multi-threaded (like the pack-objects delta search, or index-pack's delta resolution) that's not so bad. But the example above is just using a sideband demuxer. It would be unfortunate if the entire rest of send-pack had to start caring about taking that lock. 3. Is the free-pack-memory thing actually accomplishing much these days? It comes from 97bfeb34df (Release pack windows before reporting out of memory., 2006-12-24), and the primary issue is not actual allocated memory, but mmap'd packs clogging up the address space so that malloc can't find a suitable block. On 64-bit systems this is likely doing nothing. We have tons of address space. But even on 32-bit systems, the default core.packedGitLimit is only 256MiB (which was set around the same time). You can certainly come up with a corner case where freeing up that address space could matter. But I'd be surprised if this has actually helped much in practice over the years. And if you have a repo which is running so close to the address space limits of your system, the right answer is probably: upgrade to a 64-bit system. Even if the try-to-free thing helped in one run, it's likely that similar runs are not going to be so lucky, and even with it you're going to see sporadic out-of-memory failures. -Peff ^ permalink raw reply [flat|nested] 64+ messages in thread
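To make option 1 concrete, here is a minimal sketch building on the simplified set_try_to_free_routine shown earlier in this thread (an untested illustration of the suggestion, not a patch):

    try_to_free_t set_try_to_free_routine(try_to_free_t routine)
    {
    	try_to_free_t old = try_to_free_routine;

    	/*
    	 * Sketch of option 1: skip the store when it would be a no-op.
    	 * In practice the first trace call stores the value once and it
    	 * never changes afterwards, so the racy write disappears; the
    	 * racy read remains, but it always observes the same value.
    	 */
    	if (old != routine)
    		try_to_free_routine = routine;
    	return old;
    }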
* Re: [PATCH/RFC 0/5] Some ThreadSanitizer-results 2017-08-15 12:53 [PATCH/RFC 0/5] Some ThreadSanitizer-results Martin Ågren ` (6 preceding siblings ...) 2017-08-15 12:53 ` tsan: t5400: set_try_to_free_routine Martin Ågren @ 2017-08-20 10:06 ` Jeff King 2017-08-20 10:45 ` Martin Ågren 2017-08-21 17:43 ` [PATCH v2 0/4] " Martin Ågren 8 siblings, 1 reply; 64+ messages in thread From: Jeff King @ 2017-08-20 10:06 UTC (permalink / raw) To: Martin Ågren; +Cc: git On Tue, Aug 15, 2017 at 02:53:00PM +0200, Martin Ågren wrote: > I tried running the test suite on Git compiled with ThreadSanitizer > (SANITIZE=thread). Maybe this series could be useful for someone else > trying to do the same. I needed the first patch to avoid warnings when > compiling, although it actually has nothing to do with threads. > > The last four patches are about avoiding some issues where > ThreadSanitizer complains for reasonable reasons, but which to the best > of my understanding are not real problems. These patches could be useful > to make "actual" problems stand out more. Of course, if no-one ever runs > ThreadSanitizer, they are of little to no (or even negative) value... I think it's a chicken-and-egg. I'd love to run the test suite with tsan from time to time, but there's no point if it turns up a bunch of false positives. The general direction here looks good to me (and I agree with the comments made so far, especially that we should stop writing to strbuf_slopbuf entirely). > ThreadSanitizer: add suppressions This one is the most interesting because it really is just papering over the issues. I "solved" the transfer_debug one with actual code in: https://public-inbox.org/git/20170710133040.yom65mjol3nmf2it@sigill.intra.peff.net/ but it just feels really dirty. I'd be inclined to go with suppressions for now until somebody can demonstrate or argue for an actual breakage (just because it makes the tool more useful for finding _real_ problems). -Peff ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH/RFC 0/5] Some ThreadSanitizer-results 2017-08-20 10:06 ` [PATCH/RFC 0/5] Some ThreadSanitizer-results Jeff King @ 2017-08-20 10:45 ` Martin Ågren 0 siblings, 0 replies; 64+ messages in thread From: Martin Ågren @ 2017-08-20 10:45 UTC (permalink / raw) To: Jeff King; +Cc: Git Mailing List On 20 August 2017 at 12:06, Jeff King <peff@peff.net> wrote: > On Tue, Aug 15, 2017 at 02:53:00PM +0200, Martin Ågren wrote: > >> I tried running the test suite on Git compiled with ThreadSanitizer >> (SANITIZE=thread). Maybe this series could be useful for someone else >> trying to do the same. I needed the first patch to avoid warnings when >> compiling, although it actually has nothing to do with threads. >> >> The last four patches are about avoiding some issues where >> ThreadSanitizer complains for reasonable reasons, but which to the best >> of my understanding are not real problems. These patches could be useful >> to make "actual" problems stand out more. Of course, if no-one ever runs >> ThreadSanitizer, they are of little to no (or even negative) value... > > I think it's a chicken-and-egg. I'd love to run the test suite with tsan > from time to time, but there's no point if it turns up a bunch of false > positives. > > The general direction here looks good to me (and I agree with the > comments made so far, especially that we should stop writing to > strbuf_slopbuf entirely). > >> ThreadSanitizer: add suppressions > > This one is the most interesting because it really is just papering over > the issues. I "solved" the transfer_debug one with actual code in: > > https://public-inbox.org/git/20170710133040.yom65mjol3nmf2it@sigill.intra.peff.net/ Hmm, I did search for related posts, but obviously not good enough. Sorry that I missed this. > but it just feels really dirty. I'd be inclined to go with suppressions > for now until somebody can demonstrate or argue for an actual breakage > (just because it makes the tool more useful for finding _real_ > problems). I actually had a more intrusive version of my patch, which I didn't send, where I extracted transport_debug_enabled() exactly like that, except I did it in order to minimize the scope of the suppression. In the end, I figured the scope was already small enough. The obvious risk of introducing any kind of "temporary" suppression is that it remains forever... :-) I think I'll add a couple of NEEDSWORK comments in the code to make it a bit more likely that someone stumbles over the problem and addresses it. Thanks. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH v2 0/4] Some ThreadSanitizer-results 2017-08-15 12:53 [PATCH/RFC 0/5] Some ThreadSanitizer-results Martin Ågren ` (7 preceding siblings ...) 2017-08-20 10:06 ` [PATCH/RFC 0/5] Some ThreadSanitizer-results Jeff King @ 2017-08-21 17:43 ` Martin Ågren 2017-08-21 17:43 ` [PATCH v2 1/4] convert: always initialize attr_action in convert_attrs Martin Ågren ` (4 more replies) 8 siblings, 5 replies; 64+ messages in thread From: Martin Ågren @ 2017-08-21 17:43 UTC (permalink / raw) Cc: git, Torsten Bögershausen, Johannes Sixt, Junio C Hamano, Jeff King, Jeff Hostetler, Stefan Beller This is the second version of my series to try to address some issues noted by ThreadSanitizer. Thanks to all who took the time to provide input on the first version. The largest change is in the third patch, which moves the "avoid writing to slopbuf"-logic into strbuf_setlen, and compiles it unconditionally. Patch 2 hasn't been changed. The others have seen some minor changes. The patch introducing GIT_THREAD_SANITIZER is gone. The end result as far as ThreadSanitizer is concerned is the same: 1) set_try_to_free_routine, where Peff outlined some possibilities, and where I don't feel like I have any idea which of them is better. 2) hashmap_add, which I could try my hand at if Jeff doesn't beat me to it -- his proposed change should fix it and I doubt I could come up with anything "better", considering he knows the code. Martin Martin Ågren (4): convert: always initialize attr_action in convert_attrs pack-objects: take lock before accessing `remaining` strbuf_setlen: don't write to strbuf_slopbuf ThreadSanitizer: add suppressions strbuf.h | 5 ++++- builtin/pack-objects.c | 6 ++++++ color.c | 7 +++++++ convert.c | 5 +++-- transport-helper.c | 7 +++++++ .tsan-suppressions | 10 ++++++++++ 6 files changed, 37 insertions(+), 3 deletions(-) create mode 100644 .tsan-suppressions -- 2.14.1.151.gdfeca7a7e ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH v2 1/4] convert: always initialize attr_action in convert_attrs 2017-08-21 17:43 ` [PATCH v2 0/4] " Martin Ågren @ 2017-08-21 17:43 ` Martin Ågren 2017-08-21 17:43 ` [PATCH v2 2/4] pack-objects: take lock before accessing `remaining` Martin Ågren ` (3 subsequent siblings) 4 siblings, 0 replies; 64+ messages in thread From: Martin Ågren @ 2017-08-21 17:43 UTC (permalink / raw) Cc: git, Torsten Bögershausen convert_attrs contains an "if-else". In the "if", we set attr_action twice, and the first assignment has no effect. In the "else", we do not set it at all. Since git_check_attr always returns the same value, we'll always end up in the "if", so there is no problem right now. But convert_attrs is obviously trying not to rely on such an implementation detail of another component. Move the initialization of attr_action to after the if-else. Remove the earlier assignments. Suggested-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Martin Ågren <martin.agren@gmail.com> --- v2: adapted per input from Torsten convert.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/convert.c b/convert.c index 1012462e3..1f8116381 100644 --- a/convert.c +++ b/convert.c @@ -1040,7 +1040,6 @@ static void convert_attrs(struct conv_attrs *ca, const char *path) ca->crlf_action = git_path_check_crlf(ccheck + 4); if (ca->crlf_action == CRLF_UNDEFINED) ca->crlf_action = git_path_check_crlf(ccheck + 0); - ca->attr_action = ca->crlf_action; ca->ident = git_path_check_ident(ccheck + 1); ca->drv = git_path_check_convert(ccheck + 2); if (ca->crlf_action != CRLF_BINARY) { @@ -1054,12 +1053,14 @@ static void convert_attrs(struct conv_attrs *ca, const char *path) else if (eol_attr == EOL_CRLF) ca->crlf_action = CRLF_TEXT_CRLF; } - ca->attr_action = ca->crlf_action; } else { ca->drv = NULL; ca->crlf_action = CRLF_UNDEFINED; ca->ident = 0; } + + /* Save attr and make a decision for action */ + ca->attr_action = ca->crlf_action; if (ca->crlf_action == CRLF_TEXT) ca->crlf_action = text_eol_is_crlf() ? CRLF_TEXT_CRLF : CRLF_TEXT_INPUT; if (ca->crlf_action == CRLF_UNDEFINED && auto_crlf == AUTO_CRLF_FALSE) -- 2.14.1.151.gdfeca7a7e ^ permalink raw reply related [flat|nested] 64+ messages in thread
* [PATCH v2 2/4] pack-objects: take lock before accessing `remaining` 2017-08-21 17:43 ` [PATCH v2 0/4] " Martin Ågren 2017-08-21 17:43 ` [PATCH v2 1/4] convert: always initialize attr_action in convert_attrs Martin Ågren @ 2017-08-21 17:43 ` Martin Ågren 2017-08-21 17:43 ` [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf Martin Ågren ` (2 subsequent siblings) 4 siblings, 0 replies; 64+ messages in thread From: Martin Ågren @ 2017-08-21 17:43 UTC (permalink / raw) Cc: git, Johannes Sixt When checking the conditional of "while (me->remaining)", we did not hold the lock. Calling find_deltas would still be safe, since it checks "remaining" (after taking the lock) and is able to handle all values. In fact, this could (currently) not trigger any bug: a bug could happen if `remaining` transitioning from zero to non-zero races with the evaluation of the while-condition, but these are always separated by the data_ready-mechanism. Make sure we have the lock when we read `remaining`. This does mean we release it just so that find_deltas can take it immediately again. We could tweak the contract so that the lock should be taken before calling find_deltas, but let's defer that until someone can actually show that "unlock+lock" has a measurable negative impact. Signed-off-by: Martin Ågren <martin.agren@gmail.com> --- v2: unchanged from v1 builtin/pack-objects.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index c753e9237..bd391e97a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2170,7 +2170,10 @@ static void *threaded_find_deltas(void *arg) { struct thread_params *me = arg; + progress_lock(); while (me->remaining) { + progress_unlock(); + find_deltas(me->list, &me->remaining, me->window, me->depth, me->processed); @@ -2192,7 +2195,10 @@ static void *threaded_find_deltas(void *arg) pthread_cond_wait(&me->cond, &me->mutex); me->data_ready = 0; pthread_mutex_unlock(&me->mutex); + + progress_lock(); } + progress_unlock(); /* leave ->working 1 so that this doesn't get more work assigned */ return NULL; } -- 2.14.1.151.gdfeca7a7e ^ permalink raw reply related [flat|nested] 64+ messages in thread
* [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-21 17:43 ` [PATCH v2 0/4] " Martin Ågren 2017-08-21 17:43 ` [PATCH v2 1/4] convert: always initialize attr_action in convert_attrs Martin Ågren 2017-08-21 17:43 ` [PATCH v2 2/4] pack-objects: take lock before accessing `remaining` Martin Ågren @ 2017-08-21 17:43 ` Martin Ågren 2017-08-23 17:24 ` Junio C Hamano 2017-08-23 20:37 ` Brandon Casey 2017-08-21 17:43 ` [PATCH v2 4/4] ThreadSanitizer: add suppressions Martin Ågren 2017-08-28 20:56 ` [PATCH v2 0/4] Some ThreadSanitizer-results Jeff Hostetler 4 siblings, 2 replies; 64+ messages in thread From: Martin Ågren @ 2017-08-21 17:43 UTC (permalink / raw) Cc: git, Junio C Hamano strbuf_setlen(., 0) writes '\0' to sb.buf[0], where buf is either allocated and unique to sb, or the global slopbuf. The slopbuf is meant to provide a guarantee that buf is not NULL and that a freshly initialized buffer contains the empty string, but it is not supposed to be written to. That strbuf_setlen writes to slopbuf has at least two implications: First, it's wrong in principle. Second, it might be hiding misuses which are just waiting to wreak havoc. Third, ThreadSanitizer detects a race when multiple threads write to slopbuf at roughly the same time, thus potentially making any more critical races harder to spot. Avoid writing to strbuf_slopbuf in strbuf_setlen. Let's instead assert on the first byte of slopbuf being '\0', since it helps ensure the promised invariant of buf[len] == '\0'. (We know that "len" was already 0, or someone has messed with "alloc". If someone has fiddled with the fields that much beyond the correct interface, they're on their own.) This is a function which is used in many places, possibly also in hot code paths. There are two branches in strbuf_setlen already, and we are adding a third and possibly a fourth (in the assert). In hot code paths, we hopefully reuse the buffer in order to avoid continuous reallocations. Thus, after a start-up phase, we should always take the same path, which might help branch prediction, and we would never trip the assert. If a hot code path continuously reallocates, we probably have bigger performance problems than this new safety-check. Simple measurements do not contradict this reasoning. 100000000 times resetting a buffer and adding the empty string takes 5.29/5.26 seconds with/without this patch (best of three). Releasing at every iteration yields 18.01/17.87. Adding a 30-character string instead of the empty string yields 5.61/5.58 and 17.28/17.28(!). This patch causes the git binary emitted by gcc 5.4.0 -O2 on my machine to grow from 11389848 bytes to 11497184 bytes, an increase of 0.9%. I also tried to piggy-back on the fact that we already check alloc, which should already tell us whether we are using the slopbuf: if (sb->alloc) { if (len > sb->alloc - 1) die("BUG: strbuf_setlen() beyond buffer"); sb->buf[len] = '\0'; } else { if (len) die("BUG: strbuf_setlen() beyond buffer"); assert(!strbuf_slopbuf[0]); } sb->len = len; That didn't seem to be much slower (5.38, 18.02, 5.70, 17.32 seconds), but it does introduce some minor code duplication. The resulting git binary was 11510528 bytes large (another 0.1% increase).
Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> --- v2: no "ifdef TSAN"; moved check from strbuf_reset into strbuf_setlen strbuf.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/strbuf.h b/strbuf.h index e705b94db..7496cb8ec 100644 --- a/strbuf.h +++ b/strbuf.h @@ -147,7 +147,10 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len) if (len > (sb->alloc ? sb->alloc - 1 : 0)) die("BUG: strbuf_setlen() beyond buffer"); sb->len = len; - sb->buf[len] = '\0'; + if (sb->buf != strbuf_slopbuf) + sb->buf[len] = '\0'; + else + assert(!strbuf_slopbuf[0]); } /** -- 2.14.1.151.gdfeca7a7e ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-21 17:43 ` [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf Martin Ågren @ 2017-08-23 17:24 ` Junio C Hamano 2017-08-23 17:43 ` Martin Ågren 1 sibling, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2017-08-23 17:24 UTC (permalink / raw) To: Martin Ågren Martin Ågren <martin.agren@gmail.com> writes: > strbuf_setlen(., 0) writes '\0' to sb.buf[0], where buf is either > allocated and unique to sb, or the global slopbuf. The slopbuf is meant > to provide a guarantee that buf is not NULL and that a freshly > initialized buffer contains the empty string, but it is not supposed to > be written to. That strbuf_setlen writes to slopbuf has at least two > implications: > > First, it's wrong in principle. Second, it might be hiding misuses which > are just waiting to wreak havoc. Third, ThreadSanitizer detects a race > when multiple threads write to slopbuf at roughly the same time, thus > potentially making any more critical races harder to spot. There are two hard things in computer science ;-). > Suggested-by: Junio C Hamano <gitster@pobox.com> > Signed-off-by: Martin Ågren <martin.agren@gmail.com> > --- > v2: no "ifdef TSAN"; moved check from strbuf_reset into strbuf_setlen Looks much better. I have a mild objection to "suggested-by", though. It makes it sound as if this were my itch, but it is not. All the credit for being motivated to fix the issue should go to you. For what I did during the review of the previous one to lead to this simpler version, if you want to document it, "helped-by" would be more appropriate. > strbuf.h | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/strbuf.h b/strbuf.h > index e705b94db..7496cb8ec 100644 > --- a/strbuf.h > +++ b/strbuf.h > @@ -147,7 +147,10 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len) > if (len > (sb->alloc ? sb->alloc - 1 : 0)) > die("BUG: strbuf_setlen() beyond buffer"); > sb->len = len; > - sb->buf[len] = '\0'; > + if (sb->buf != strbuf_slopbuf) > + sb->buf[len] = '\0'; > + else > + assert(!strbuf_slopbuf[0]); > } > > /** ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-23 17:24 ` Junio C Hamano @ 2017-08-23 17:43 ` Martin Ågren 2017-08-23 18:30 ` Junio C Hamano 0 siblings, 1 reply; 64+ messages in thread From: Martin Ågren @ 2017-08-23 17:43 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git Mailing List On 23 August 2017 at 19:24, Junio C Hamano <gitster@pobox.com> wrote: > Martin Ågren <martin.agren@gmail.com> writes: > >> strbuf_setlen(., 0) writes '\0' to sb.buf[0], where buf is either >> allocated and unique to sb, or the global slopbuf. The slopbuf is meant >> to provide a guarantee that buf is not NULL and that a freshly >> initialized buffer contains the empty string, but it is not supposed to >> be written to. That strbuf_setlen writes to slopbuf has at least two >> implications: >> >> First, it's wrong in principle. Second, it might be hiding misuses which >> are just waiting to wreak havoc. Third, ThreadSanitizer detects a race >> when multiple threads write to slopbuf at roughly the same time, thus >> potentially making any more critical races harder to spot. > > There are two hard things in computer science ;-). Indeed. :-) >> Suggested-by: Junio C Hamano <gitster@pobox.com> >> Signed-off-by: Martin Ågren <martin.agren@gmail.com> >> --- >> v2: no "ifdef TSAN"; moved check from strbuf_reset into strbuf_setlen > > Looks much better. I have a mild objection to "suggested-by", > though. It makes it sound as if this were my itch, but it is not. > > All the credit for being motivate to fix the issue should go to you. > For what I did during the review of the previous one to lead to this > simpler version, if you want to document it, "helped-by" would be > more appropriate. Ok, so that's two things to tweak in the commit message. I'll hold off on v3 in case I get some more feedback in the coming days. Thanks. Martin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-23 17:43 ` Martin Ågren @ 2017-08-23 18:30 ` Junio C Hamano 0 siblings, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2017-08-23 18:30 UTC (permalink / raw) To: Martin Ågren; +Cc: Git Mailing List Martin Ågren <martin.agren@gmail.com> writes: > On 23 August 2017 at 19:24, Junio C Hamano <gitster@pobox.com> wrote: >> Martin Ågren <martin.agren@gmail.com> writes: >> >>> strbuf_setlen(., 0) writes '\0' to sb.buf[0], where buf is either >>> allocated and unique to sb, or the global slopbuf. The slopbuf is meant >>> to provide a guarantee that buf is not NULL and that a freshly >>> initialized buffer contains the empty string, but it is not supposed to >>> be written to. That strbuf_setlen writes to slopbuf has at least two >>> implications: >>> >>> First, it's wrong in principle. Second, it might be hiding misuses which >>> are just waiting to wreak havoc. Third, ThreadSanitizer detects a race >>> when multiple threads write to slopbuf at roughly the same time, thus >>> potentially making any more critical races harder to spot. >> >> There are two hard things in computer science ;-). > > Indeed. :-) > >>> Suggested-by: Junio C Hamano <gitster@pobox.com> >>> Signed-off-by: Martin Ågren <martin.agren@gmail.com> >>> --- >>> v2: no "ifdef TSAN"; moved check from strbuf_reset into strbuf_setlen >> >> Looks much better. I have a mild objection to "suggested-by", >> though. It makes it sound as if this were my itch, but it is not. >> >> All the credit for being motivate to fix the issue should go to you. >> For what I did during the review of the previous one to lead to this >> simpler version, if you want to document it, "helped-by" would be >> more appropriate. > > Ok, so that's two things to tweak in the commit message. I'll hold off > on v3 in case I get some more feedback the coming days. Thanks. Well, this one is good enough and your "at least two" is technically fine ;-) Let's not reroll this any further. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-21 17:43 ` [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf Martin Ågren 2017-08-23 17:24 ` Junio C Hamano @ 2017-08-23 20:37 ` Brandon Casey 2017-08-23 21:04 ` Junio C Hamano 1 sibling, 1 reply; 64+ messages in thread From: Brandon Casey @ 2017-08-23 20:37 UTC (permalink / raw) To: Martin Ågren; +Cc: git@vger.kernel.org, Junio C Hamano On Mon, Aug 21, 2017 at 10:43 AM, Martin Ågren <martin.agren@gmail.com> wrote: > strbuf_setlen(., 0) writes '\0' to sb.buf[0], where buf is either > allocated and unique to sb, or the global slopbuf. The slopbuf is meant > to provide a guarantee that buf is not NULL and that a freshly > initialized buffer contains the empty string, but it is not supposed to > be written to. That strbuf_setlen writes to slopbuf has at least two > implications: <snip very well written commit message> > diff --git a/strbuf.h b/strbuf.h > index e705b94db..7496cb8ec 100644 > --- a/strbuf.h > +++ b/strbuf.h > @@ -147,7 +147,10 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len) > if (len > (sb->alloc ? sb->alloc - 1 : 0)) > die("BUG: strbuf_setlen() beyond buffer"); > sb->len = len; > - sb->buf[len] = '\0'; > + if (sb->buf != strbuf_slopbuf) > + sb->buf[len] = '\0'; > + else > + assert(!strbuf_slopbuf[0]); > } > > /** > -- > 2.14.1.151.gdfeca7a7e > I know this must have been discussed before and/or I'm having a neuron misfire, but is there any reason why we didn't just make slopbuf a macro for ""? i.e. #define strbuf_slopbuf "" That way it should point to readonly memory and any attempt to assign to it should produce a crash. One benefit that I can think of for making strbuf_slopbuf be a real variable is that we can then compare the pointer stored in the strbuf to the strbuf_slopbuf address to detect whether the strbuf held the slopbuf. With the static string macro, each execution unit may get its own instance of the empty string. But, before this patch, we don't actually seem to be doing that anywhere and instead rely on the value of alloc being accurate to determine whether the strbuf contains the slopbuf or not. So is there any reason why we didn't do something like the following in the first place? diff --git a/strbuf.h b/strbuf.h index e705b94..fcca618 100644 --- a/strbuf.h +++ b/strbuf.h @@ -67,7 +67,7 @@ struct strbuf { char *buf; }; -extern char strbuf_slopbuf[]; +#define strbuf_slopbuf "" #define STRBUF_INIT { .alloc = 0, .len = 0, .buf = strbuf_slopbuf } /** @@ -147,7 +147,9 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len) if (len > (sb->alloc ? sb->alloc - 1 : 0)) die("BUG: strbuf_setlen() beyond buffer"); sb->len = len; - sb->buf[len] = '\0'; + if (sb->alloc) { + sb->buf[len] = '\0'; + } } -Brandon ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-23 20:37 ` Brandon Casey @ 2017-08-23 21:04 ` Junio C Hamano 2017-08-23 21:20 ` Brandon Casey 0 siblings, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2017-08-23 21:04 UTC (permalink / raw) To: Brandon Casey; +Cc: Martin Ågren, git@vger.kernel.org Brandon Casey <drafnel@gmail.com> writes: > So is there any reason why didn't do something like the following in > the first place? My guess is that we didn't bother; if we cared, we would have used a single instance of const char in a read-only segment, instead of such a macro. > diff --git a/strbuf.h b/strbuf.h > index e705b94..fcca618 100644 > --- a/strbuf.h > +++ b/strbuf.h > @@ -67,7 +67,7 @@ struct strbuf { > char *buf; > }; > > -extern char strbuf_slopbuf[]; > +#define strbuf_slopbuf "" > #define STRBUF_INIT { .alloc = 0, .len = 0, .buf = strbuf_slopbuf } > > /** > @@ -147,7 +147,9 @@ static inline void strbuf_setlen(struct strbuf > *sb, size_t len) > if (len > (sb->alloc ? sb->alloc - 1 : 0)) > die("BUG: strbuf_setlen() beyond buffer"); > sb->len = len; > - sb->buf[len] = '\0'; > + if (sb->alloc) { > + sb->buf[len] = '\0'; > + } > } > > -Brandon ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-23 21:04 ` Junio C Hamano @ 2017-08-23 21:20 ` Brandon Casey 2017-08-23 21:54 ` Brandon Casey 2017-08-23 22:24 ` Junio C Hamano 1 sibling, 2 replies; 64+ messages in thread From: Brandon Casey @ 2017-08-23 21:20 UTC (permalink / raw) To: Junio C Hamano; +Cc: Martin Ågren, git@vger.kernel.org On Wed, Aug 23, 2017 at 2:04 PM, Junio C Hamano <gitster@pobox.com> wrote: > Brandon Casey <drafnel@gmail.com> writes: > >> So is there any reason why didn't do something like the following in >> the first place? > > My guess is that we didn't bother; if we cared, we would have used a > single instance of const char in a read-only segment, instead of > such a macro. I think you mean something like this: const char * const strbuf_slopbuf = ""; ..with or without "const" at the beginning. We can't use an actual variable like that since we also want to be able to do initialization like: struct strbuf b = STRBUF_INIT; i.e. struct strbuf b = { 0, 0, strbuf_slopbuf }; So the compiler needs to be able to determine that everything within the curly braces is constant and apparently gcc cannot. On a related note... I was just looking at object.c which also uses a slopbuf. We could similarly protect it from inadvertent modification by doing something like this: diff --git a/object.c b/object.c index 321d7e9..4c7a041 100644 --- a/object.c +++ b/object.c @@ -303,7 +303,7 @@ int object_list_contains(struct object_list *list, struct object *obj) * A zero-length string to which object_array_entry::name can be * initialized without requiring a malloc/free. */ -static char object_array_slopbuf[1]; +static const char * const object_array_slopbuf = ""; void add_object_array_with_path(struct object *obj, const char *name, struct object_array *array, @@ -326,7 +326,7 @@ void add_object_array_with_path(struct object *obj, const char *name, entry->name = NULL; else if (!*name) /* Use our own empty string instead of allocating one: */ - entry->name = object_array_slopbuf; + entry->name = (char*) object_array_slopbuf; else entry->name = xstrdup(name); entry->mode = mode; ^ permalink raw reply related [flat|nested] 64+ messages in thread
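To spell out the constraint Brandon is pointing at, here is a self-contained sketch (the struct is a stand-in, not the real strbuf definition): the value of a const-qualified pointer variable is not a constant expression in C, while the address of an object with static storage duration is.

    #include <stddef.h>

    struct sb { size_t alloc, len; char *buf; };	/* stand-in struct */

    const char slopbuf[1] = "";

    /* OK: the array's address is an address constant. */
    struct sb good = { 0, 0, (char *)slopbuf };

    /*
     * By contrast, this does not compile at file scope, because the
     * value stored in a const pointer variable is not a constant
     * expression:
     *
     *	const char * const slop = "";
     *	struct sb bad = { 0, 0, (char *)slop };
     *	(error: initializer element is not constant)
     */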
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-23 21:20 ` Brandon Casey @ 2017-08-23 21:54 ` Brandon Casey 2017-08-23 22:11 ` Brandon Casey 2017-08-24 16:52 ` Junio C Hamano 2017-08-23 22:24 ` Junio C Hamano 1 sibling, 2 replies; 64+ messages in thread From: Brandon Casey @ 2017-08-23 21:54 UTC (permalink / raw) To: Junio C Hamano; +Cc: Martin Ågren, git@vger.kernel.org On Wed, Aug 23, 2017 at 2:20 PM, Brandon Casey <drafnel@gmail.com> wrote: > On Wed, Aug 23, 2017 at 2:04 PM, Junio C Hamano <gitster@pobox.com> wrote: >> Brandon Casey <drafnel@gmail.com> writes: >> >>> So is there any reason why didn't do something like the following in >>> the first place? >> >> My guess is that we didn't bother; if we cared, we would have used a >> single instance of const char in a read-only segment, instead of >> such a macro. > > I think you mean something like this: > > const char * const strbuf_slopbuf = ""; Ah, you probably meant something like this: const char strbuf_slopbuf = '\0'; which gcc will apparently place in the read-only segment. I did not know that. And assignment and initialization would look like: sb->buf = (char*) &strbuf_slopbuf; and #define STRBUF_INIT { .alloc = 0, .len = 0, .buf = (char*) &strbuf_slopbuf } respectively. Yeah, that's definitely preferable to a macro. Something similar could be done in object.c. -Brandon ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-23 21:54 ` Brandon Casey @ 2017-08-23 22:11 ` Brandon Casey 2017-08-24 16:52 ` Junio C Hamano 1 sibling, 0 replies; 64+ messages in thread From: Brandon Casey @ 2017-08-23 22:11 UTC (permalink / raw) To: Junio C Hamano; +Cc: Martin Ågren, git@vger.kernel.org On Wed, Aug 23, 2017 at 2:54 PM, Brandon Casey <drafnel@gmail.com> wrote: > On Wed, Aug 23, 2017 at 2:20 PM, Brandon Casey <drafnel@gmail.com> wrote: >> On Wed, Aug 23, 2017 at 2:04 PM, Junio C Hamano <gitster@pobox.com> wrote: >>> Brandon Casey <drafnel@gmail.com> writes: >>> >>>> So is there any reason why didn't do something like the following in >>>> the first place? >>> >>> My guess is that we didn't bother; if we cared, we would have used a >>> single instance of const char in a read-only segment, instead of >>> such a macro. >> >> I think you mean something like this: >> >> const char * const strbuf_slopbuf = ""; Hmm, apparently it is sufficient to mark our current strbuf_slopbuf array as const and initialize it with a static string to trigger its placement into the read-only section by gcc (and clang). const char strbuf_slopbuf[1] = ""; -Brandon ^ permalink raw reply [flat|nested] 64+ messages in thread
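A quick way to check this for yourself, with a hypothetical slop.c (the section placement is only verified for the gcc/clang versions mentioned in this thread):

    /* slop.c: where do const globals land? */
    const char slop_uninit[1];		/* const but uninitialized */
    const char slop_init[1] = "";	/* const and initialized */

Compiling with "gcc -c slop.c" and running "nm slop.o" shows the difference: the initialized array appears as a read-only 'R' symbol (.rodata) while the uninitialized one does not, and a stray write through a pointer into the read-only copy then faults at run time instead of silently scribbling on the buffer.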
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-23 21:54 ` Brandon Casey 2017-08-23 22:11 ` Brandon Casey @ 2017-08-24 16:52 ` Junio C Hamano 2017-08-24 18:29 ` Brandon Casey 1 sibling, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2017-08-24 16:52 UTC (permalink / raw) To: Brandon Casey; +Cc: Martin Ågren, git@vger.kernel.org Brandon Casey <drafnel@gmail.com> writes: > Ah, you probably meant something like this: > > const char strbuf_slopbuf = '\0'; > > which gcc will apparently place in the read-only segment. I did not know that. Yes but I highly suspect that it would be very compiler dependent and not something the language lawyers would recommend us to rely on. My response was primarily to answer "why?" with "because we did not bother". The above is a mere tangent, i.e. "multiple copies of empty strings is a horrible implementation (and there would be a way to do it with a single instance)". > #define STRBUF_INIT { .alloc = 0, .len = 0, .buf = (char*) &strbuf_slopbuf } > > respectively. Yeah, that's definitely preferable to a macro. > Something similar could be done in object.c. What is the main objective for doing this change? The "make sure we do not write into that slopbuf" assert() bothers you and you want to replace it with an address in the read-only segment? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf 2017-08-24 16:52 ` Junio C Hamano @ 2017-08-24 18:29 ` Brandon Casey 2017-08-24 19:16 ` Martin Ågren 0 siblings, 1 reply; 64+ messages in thread From: Brandon Casey @ 2017-08-24 18:29 UTC (permalink / raw) To: Junio C Hamano; +Cc: Martin Ågren, git@vger.kernel.org On Thu, Aug 24, 2017 at 9:52 AM, Junio C Hamano <gitster@pobox.com> wrote: > Brandon Casey <drafnel@gmail.com> writes: > >> Ah, you probably meant something like this: >> >> const char strbuf_slopbuf = '\0'; >> >> which gcc will apparently place in the read-only segment. I did not know that. > > Yes but I highly suspect that it would be very compiler dependent > and not something the language lawyers would recommend us to rely > on. I think compilers may have the option of placing variables that are explicitly initialized to zero in the bss segment too, in addition to those that are not explicitly initialized. So I agree that no one should write code that depends on their variables being placed in one segment or the other, but I could see someone using this behavior as an additional safety check; kind of a free assert, aside from the additional space in the .rodata segment. > My response was primarily to answer "why?" with "because we did not > bother". The above is a mere tangent, i.e. "multiple copies of > empty strings is a horrible implementation (and there would be a way > to do it with a single instance)". Merely adding const to our current strbuf_slopbuf declaration does not buy us anything since it will be allocated in r/w memory. i.e. it would still be possible to modify it via the buf member of strbuf. So you can't just do this: const char strbuf_slopbuf[1]; That's pretty much equivalent to what we currently have since it only restricts modifying the contents of strbuf_slopbuf directly through the strbuf_slopbuf variable, but it does not restrict modifying it through a pointer to that object. Until yesterday, I was under the impression that the only way to access data in the rodata segment was through a constant literal. So my initial thought was that we could do something like: const char * const strbuf_slopbuf = ""; ..but that variable cannot be used in a static assignment like: struct strbuf foo = {0, 0, (char*) strbuf_slopbuf}; So it seemed like our only option was to use a literal "" everywhere instead of a slopbuf variable _if_ we wanted to have the guarantee that our "slopbuf" could not be modified. But what I learned yesterday, is that at least gcc/clang will place the entire variable in the rodata segment if the variable is both marked const _and_ initialized. i.e. this will be allocated in the .rodata segment: const char strbuf_slopbuf[1] = ""; >> #define STRBUF_INIT { .alloc = 0, .len = 0, .buf = (char*) &strbuf_slopbuf } >> >> respectively. Yeah, that's definitely preferable to a macro. >> Something similar could be done in object.c. > > What is the main objective for doing this change? The "make sure we > do not write into that slopbuf" assert() bothers you and you want to > replace it with an address in the read-only segment? Actually nothing about the patch bothers me. The point of that patch is to make sure we don't accidentally modify the slopbuf. I was just looking for a way for the compiler to help out and wondering if there was a reason we didn't attempt to do so in the first place. 
I think the main takeaway here is that I learned something yesterday :-) I didn't actually intend to submit a patch for any of this, but if anything useful came out of the discussion I thought Martin may incorporate it into his patch if he wanted to. -Brandon ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf
  2017-08-24 18:29           ` Brandon Casey
@ 2017-08-24 19:16             ` Martin Ågren
  0 siblings, 0 replies; 64+ messages in thread
From: Martin Ågren @ 2017-08-24 19:16 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Junio C Hamano, git@vger.kernel.org

On 24 August 2017 at 20:29, Brandon Casey <drafnel@gmail.com> wrote:
> On Thu, Aug 24, 2017 at 9:52 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Brandon Casey <drafnel@gmail.com> writes:
>>
>>> Ah, you probably meant something like this:
>>>
>>>    const char strbuf_slopbuf = '\0';
>>>
>>> which gcc will apparently place in the read-only segment. I did not
>>> know that.
>>
>> Yes, but I highly suspect that it would be very compiler dependent
>> and not something the language lawyers would recommend us to rely
>> on.
>
> I think compilers may have the option of placing variables that are
> explicitly initialized to zero in the bss segment too, in addition to
> those that are not explicitly initialized.  So I agree that no one
> should write code that depends on their variables being placed in one
> segment or the other, but I could see someone using this behavior as
> an additional safety check; kind of a free assert, aside from the
> additional space in the .rodata segment.
>
>> My response was primarily to answer "why?" with "because we did not
>> bother".  The above is a mere tangent, i.e. "multiple copies of
>> empty strings is a horrible implementation (and there would be a way
>> to do it with a single instance)".
>
> Merely adding const to our current strbuf_slopbuf declaration does not
> buy us anything, since it will be allocated in r/w memory; i.e. it
> would still be possible to modify it via the buf member of strbuf.  So
> you can't just do this:
>
>    const char strbuf_slopbuf[1];
>
> That's pretty much equivalent to what we currently have, since it only
> restricts modifying the contents of strbuf_slopbuf directly through
> the strbuf_slopbuf variable; it does not restrict modifying it
> through a pointer to that object.
>
> Until yesterday, I was under the impression that the only way to
> access data in the rodata segment was through a constant literal.  So
> my initial thought was that we could do something like:
>
>    const char * const strbuf_slopbuf = "";
>
> ...but that variable cannot be used in a static initializer like:
>
>    struct strbuf foo = {0, 0, (char*) strbuf_slopbuf};
>
> So it seemed like our only option was to use a literal "" everywhere
> instead of a slopbuf variable _if_ we wanted the guarantee that our
> "slopbuf" could not be modified.
>
> But what I learned yesterday is that at least gcc/clang will place
> the entire variable in the rodata segment if the variable is both
> marked const _and_ initialized.
>
> I.e. this will be allocated in the .rodata segment:
>
>    const char strbuf_slopbuf[1] = "";
>
>>> #define STRBUF_INIT { .alloc = 0, .len = 0, .buf = (char*) &strbuf_slopbuf }
>>>
>>> respectively.  Yeah, that's definitely preferable to a macro.
>>> Something similar could be done in object.c.
>>
>> What is the main objective for doing this change?  The "make sure we
>> do not write into that slopbuf" assert() bothers you and you want to
>> replace it with an address in the read-only segment?
>
> Actually nothing about the patch bothers me.  The point of that patch
> is to make sure we don't accidentally modify the slopbuf.  I was just
> looking for a way for the compiler to help out, and wondering if there
> was a reason we didn't attempt to do so in the first place.
>
> I think the main takeaway here is that I learned something yesterday
> :-)  I didn't actually intend to submit a patch for any of this, but
> if anything useful came out of the discussion I thought Martin may
> incorporate it into his patch if he wanted to.

Thanks for the interesting information. I also learned something new. :-)

My first thought was, well, maybe someone writes '\0' to sb.buf[len].
That should intuitively be a no-op and "ok", but the documentation
actually only says that it's safe to write to positions 0..len-1, so
writing to sb.buf[len] is supposedly not safe (no-op or not). Maybe a
degenerate and rarely tested case of otherwise sane code could end up
writing '\0' to slopbuf[0]. (Arguably strbuf_setlen should have been
used instead.)

I can see the value of placing the slopbuf in read-only memory, but I
think that would be a follow-up patch with its own pros and cons.

Martin

^ permalink raw reply	[flat|nested] 64+ messages in thread
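To make the discussion above concrete, here is a minimal, self-contained
sketch of the idea being weighed: a const, constant-initialized slopbuf
that the compiler can place in .rodata. The struct and all names here are
simplified stand-ins, not Git's actual strbuf code, and this is only an
illustration of the follow-up Martin mentions, not part of the series.

    /* slopbuf-sketch.c: illustrative only, not git's strbuf.[ch] */
    #include <stdio.h>
    #include <stddef.h>

    struct sketch_strbuf {         /* simplified stand-in for struct strbuf */
            size_t alloc;
            size_t len;
            char *buf;
    };

    /*
     * const _and_ initialized with a constant, so gcc/clang place it in
     * .rodata, per the observation above.
     */
    const char sketch_slopbuf[1] = "";

    #define SKETCH_STRBUF_INIT { 0, 0, (char *)sketch_slopbuf }

    int main(void)
    {
            struct sketch_strbuf sb = SKETCH_STRBUF_INIT;

            printf("an empty buf reads as \"%s\"\n", sb.buf);

            /*
             * Writing through sb.buf here, e.g. sb.buf[0] = 'x', would
             * now typically die with SIGSEGV instead of silently
             * corrupting a buffer shared by every empty strbuf.
             */
            return 0;
    }

The trade-off, as noted in the thread, is that the "free assert" costs a
byte of .rodata and relies on compiler behavior rather than a language
guarantee.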
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf
  2017-08-23 21:20       ` Brandon Casey
  2017-08-23 21:54         ` Brandon Casey
@ 2017-08-23 22:24         ` Junio C Hamano
  2017-08-23 22:39           ` Brandon Casey
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2017-08-23 22:24 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Martin Ågren, git@vger.kernel.org

Brandon Casey <drafnel@gmail.com> writes:

> On Wed, Aug 23, 2017 at 2:04 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Brandon Casey <drafnel@gmail.com> writes:
>>
>>> So is there any reason why we didn't do something like the following
>>> in the first place?
>>
>> My guess is that we didn't bother; if we cared, we would have used a
>> single instance of const char in a read-only segment, instead of
>> such a macro.
>
> I think you mean something like this:
>
>    const char * const strbuf_slopbuf = "";

Rather, more like "const char slopbuf[1];"

^ permalink raw reply	[flat|nested] 64+ messages in thread
* Re: [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf
  2017-08-23 22:24         ` Junio C Hamano
@ 2017-08-23 22:39           ` Brandon Casey
  0 siblings, 0 replies; 64+ messages in thread
From: Brandon Casey @ 2017-08-23 22:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Martin Ågren, git@vger.kernel.org

On Wed, Aug 23, 2017 at 3:24 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Brandon Casey <drafnel@gmail.com> writes:
>
>> On Wed, Aug 23, 2017 at 2:04 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Brandon Casey <drafnel@gmail.com> writes:
>>>
>>>> So is there any reason why we didn't do something like the following
>>>> in the first place?
>>>
>>> My guess is that we didn't bother; if we cared, we would have used a
>>> single instance of const char in a read-only segment, instead of
>>> such a macro.
>>
>> I think you mean something like this:
>>
>>    const char * const strbuf_slopbuf = "";
>
> Rather, more like "const char slopbuf[1];"

You'd think that would be sufficient, right?  Apparently gcc and clang
will only place a variable marked as const in the read-only section if
you initialize it with a constant too.

(gcc 4.9.2, clang 3.8.1 on linux, clang 8.1.0 on osx)

I might have to adjust my stance on initializing global variables
moving forward :-)

-Brandon

^ permalink raw reply	[flat|nested] 64+ messages in thread
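The behavior Brandon describes can be checked directly. The following is
a sketch only, assuming a typical Linux toolchain; the variable names are
made up for illustration:

    /* sections.c: illustrative only */
    const char slop_uninit[1];       /* const, but no initializer        */
    const char slop_init[1] = "";    /* const and constant-initialized   */

    /*
     * Build and inspect where the symbols land:
     *
     *     cc -c sections.c
     *     objdump -t sections.o | grep slop
     *
     * With the compiler versions mentioned above, slop_init shows up in
     * .rodata, while slop_uninit is a tentative definition and may end
     * up as a common symbol or in a writable section, i.e. not actually
     * read-only at run time.
     */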
* [PATCH v2 4/4] ThreadSanitizer: add suppressions
  2017-08-21 17:43 ` [PATCH v2 0/4] " Martin Ågren
                     ` (2 preceding siblings ...)
  2017-08-21 17:43 ` [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf Martin Ågren
@ 2017-08-21 17:43 ` Martin Ågren
  2017-08-25 17:04   ` Jeff King
  2017-08-28 20:56   ` [PATCH v2 0/4] Some ThreadSanitizer-results Jeff Hostetler
  4 siblings, 1 reply; 64+ messages in thread
From: Martin Ågren @ 2017-08-21 17:43 UTC (permalink / raw)
  Cc: git, Jeff King

Add a file .tsan-suppressions and list two functions in it: want_color()
and transfer_debug(). Both of these use the pattern

	static int foo = -1;
	if (foo < 0)
		foo = bar();

where bar always returns the same non-negative value. This can cause
ThreadSanitizer to diagnose a race when foo is written from two threads.
That is indeed a race, although it arguably doesn't matter in practice
since it's always the same value that is written.

Add NEEDSWORK-comments to the functions so that this problem is not
forever swept under the carpet.

The suppressions-file is used by setting the environment variable
TSAN_OPTIONS to, e.g., "suppressions=$(pwd)/.tsan-suppressions". Observe
that relative paths such as ".tsan-suppressions" might not work.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
v2: added NEEDSWORK; reworded the comments in the new file

 color.c            |  7 +++++++
 transport-helper.c |  7 +++++++
 .tsan-suppressions | 10 ++++++++++
 3 files changed, 24 insertions(+)
 create mode 100644 .tsan-suppressions

diff --git a/color.c b/color.c
index 7aa8b076f..9ccd954d6 100644
--- a/color.c
+++ b/color.c
@@ -338,6 +338,13 @@ static int check_auto_color(void)
 
 int want_color(int var)
 {
+	/*
+	 * NEEDSWORK: This function is sometimes used from multiple threads, and
+	 * we end up using want_auto racily. That "should not matter" since
+	 * we always write the same value, but it's still wrong. This function
+	 * is listed in .tsan-suppressions for the time being.
+	 */
+
 	static int want_auto = -1;
 
 	if (var < 0)
diff --git a/transport-helper.c b/transport-helper.c
index 8f68d69a8..f50b34df2 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1117,6 +1117,13 @@ int transport_helper_init(struct transport *transport, const char *name)
 __attribute__((format (printf, 1, 2)))
 static void transfer_debug(const char *fmt, ...)
 {
+	/*
+	 * NEEDSWORK: This function is sometimes used from multiple threads, and
+	 * we end up using debug_enabled racily. That "should not matter" since
+	 * we always write the same value, but it's still wrong. This function
+	 * is listed in .tsan-suppressions for the time being.
+	 */
+
 	va_list args;
 	char msgbuf[PBUFFERSIZE];
 	static int debug_enabled = -1;
diff --git a/.tsan-suppressions b/.tsan-suppressions
new file mode 100644
index 000000000..8c85014a0
--- /dev/null
+++ b/.tsan-suppressions
@@ -0,0 +1,10 @@
+# Suppressions for ThreadSanitizer (tsan).
+#
+# This file is used by setting the environment variable TSAN_OPTIONS to, e.g.,
+# "suppressions=$(pwd)/.tsan-suppressions". Observe that relative paths such as
+# ".tsan-suppressions" might not work.
+
+# A static variable is written to racily, but we always write the same value, so
+# in practice it (hopefully!) doesn't matter.
+race:^want_color$
+race:^transfer_debug$
-- 
2.14.1.151.gdfeca7a7e

^ permalink raw reply related	[flat|nested] 64+ messages in thread
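For readers wondering what a properly synchronized version of the lazy
initialization pattern could look like: below is one conventional
approach, sketched with pthread_once(). This is an illustration only --
it is not what the series proposes (the series deliberately keeps the
benign race, documents it, and suppresses the warning), and the helper
names here are made up; the getenv() check stands in for whatever
transfer_debug() actually consults.

    #include <pthread.h>
    #include <stdlib.h>

    static int debug_enabled;                      /* written exactly once */
    static pthread_once_t debug_once = PTHREAD_ONCE_INIT;

    static void init_debug_enabled(void)
    {
            /* stand-in for the real "is transport debugging on?" check */
            debug_enabled = getenv("GIT_TRANSPORT_HELPER_DEBUG") != NULL;
    }

    static int transfer_debug_enabled(void)
    {
            /*
             * pthread_once() guarantees the init routine runs exactly
             * once and that every caller observes its effects, so there
             * is no race for ThreadSanitizer to report.
             */
            pthread_once(&debug_once, init_debug_enabled);
            return debug_enabled;
    }

The trade-off is a synchronization call on every use and a dependency on
the pthreads primitives at call sites that otherwise have none, which is
presumably why the suppression-plus-NEEDSWORK route was taken here.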
* Re: [PATCH v2 4/4] ThreadSanitizer: add suppressions
  2017-08-21 17:43 ` [PATCH v2 4/4] ThreadSanitizer: add suppressions Martin Ågren
@ 2017-08-25 17:04   ` Jeff King
  0 siblings, 0 replies; 64+ messages in thread
From: Jeff King @ 2017-08-25 17:04 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git

On Mon, Aug 21, 2017 at 07:43:48PM +0200, Martin Ågren wrote:

> Add a file .tsan-suppressions and list two functions in it: want_color()
> and transfer_debug(). Both of these use the pattern
>
>	static int foo = -1;
>	if (foo < 0)
>		foo = bar();
>
> where bar always returns the same non-negative value. This can cause
> ThreadSanitizer to diagnose a race when foo is written from two threads.
> That is indeed a race, although it arguably doesn't matter in practice
> since it's always the same value that is written.
>
> Add NEEDSWORK-comments to the functions so that this problem is not
> forever swept under the carpet.
>
> The suppressions-file is used by setting the environment variable
> TSAN_OPTIONS to, e.g., "suppressions=$(pwd)/.tsan-suppressions". Observe
> that relative paths such as ".tsan-suppressions" might not work.

This looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread
* Re: [PATCH v2 0/4] Some ThreadSanitizer-results
  2017-08-21 17:43 ` [PATCH v2 0/4] " Martin Ågren
                     ` (3 preceding siblings ...)
  2017-08-21 17:43 ` [PATCH v2 4/4] ThreadSanitizer: add suppressions Martin Ågren
@ 2017-08-28 20:56   ` Jeff Hostetler
  4 siblings, 0 replies; 64+ messages in thread
From: Jeff Hostetler @ 2017-08-28 20:56 UTC (permalink / raw)
  To: Martin Ågren
  Cc: git, Torsten Bögershausen, Johannes Sixt, Junio C Hamano,
	Jeff King, Stefan Beller

On 8/21/2017 1:43 PM, Martin Ågren wrote:
> This is the second version of my series to try to address some issues
> ...
> 2) hashmap_add, which I could try my hand at if Jeff doesn't beat me to
> it -- his proposed change should fix it and I doubt I could come up with
> anything "better", considering he knows the code.

Now that I'm back from vacation, let me take another stab at this.

Jeff

^ permalink raw reply	[flat|nested] 64+ messages in thread
end of thread, other threads:[~2017-09-06 15:44 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-15 12:53 [PATCH/RFC 0/5] Some ThreadSanitizer-results Martin Ågren
2017-08-15 12:53 ` [PATCH 1/5] convert: initialize attr_action in convert_attrs Martin Ågren
2017-08-15 14:17   ` Torsten Bögershausen
2017-08-15 14:29     ` Torsten Bögershausen
2017-08-15 14:40       ` Martin Ågren
2017-08-15 12:53 ` [PATCH 2/5] pack-objects: take lock before accessing `remaining` Martin Ågren
2017-08-15 19:50   ` Johannes Sixt
2017-08-15 12:53 ` [PATCH 3/5] Makefile: define GIT_THREAD_SANITIZER Martin Ågren
2017-08-15 12:53 ` [PATCH 4/5] strbuf_reset: don't write to slopbuf with ThreadSanitizer Martin Ågren
2017-08-15 18:43   ` Junio C Hamano
2017-08-15 19:06     ` Martin Ågren
2017-08-15 19:19       ` Junio C Hamano
2017-08-15 12:53 ` [PATCH 5/5] ThreadSanitizer: add suppressions Martin Ågren
2017-08-15 12:53 ` tsan: t3008: hashmap_add touches size from multiple threads Martin Ågren
2017-08-15 17:59   ` Jeff Hostetler
2017-08-15 18:17     ` Stefan Beller
2017-08-15 18:40       ` Martin Ågren
2017-08-15 18:48         ` Stefan Beller
2017-08-15 19:21           ` Martin Ågren
2017-08-15 20:46             ` Jeff Hostetler
2017-08-30 18:59   ` [PATCH] hashmap: address ThreadSanitizer concerns Jeff Hostetler
2017-08-30 18:59     ` [PATCH] hashmap: add API to disable item counting when threaded Jeff Hostetler
2017-09-01 23:31       ` Johannes Schindelin
2017-09-01 23:50         ` Jonathan Nieder
2017-09-05 16:39           ` Jeff Hostetler
2017-09-05 17:13             ` Martin Ågren
2017-09-02 8:17           ` Jeff King
2017-09-04 15:59             ` Johannes Schindelin
2017-09-05 16:54               ` Jeff Hostetler
2017-09-06 3:43                 ` Junio C Hamano
2017-09-05 16:33           ` Jeff Hostetler
2017-09-02 8:05         ` Jeff King
2017-09-05 17:07           ` Jeff Hostetler
2017-09-02 8:39       ` Simon Ruderich
2017-09-06 1:24       ` Junio C Hamano
2017-09-06 15:33         ` Jeff Hostetler
2017-09-06 15:43     ` [PATCH v2] hashmap: address ThreadSanitizer concerns Jeff Hostetler
2017-09-06 15:43       ` [PATCH v2] hashmap: add API to disable item counting when threaded Jeff Hostetler
2017-08-15 12:53 ` tsan: t5400: set_try_to_free_routine Martin Ågren
2017-08-15 17:35   ` Stefan Beller
2017-08-15 18:44     ` Martin Ågren
2017-08-17 10:57       ` Jeff King
2017-08-20 10:06 ` [PATCH/RFC 0/5] Some ThreadSanitizer-results Jeff King
2017-08-20 10:45   ` Martin Ågren
2017-08-21 17:43     ` [PATCH v2 0/4] " Martin Ågren
2017-08-21 17:43       ` [PATCH v2 1/4] convert: always initialize attr_action in convert_attrs Martin Ågren
2017-08-21 17:43       ` [PATCH v2 2/4] pack-objects: take lock before accessing `remaining` Martin Ågren
2017-08-21 17:43       ` [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf Martin Ågren
2017-08-23 17:24         ` Junio C Hamano
2017-08-23 17:43           ` Martin Ågren
2017-08-23 18:30             ` Junio C Hamano
2017-08-23 20:37               ` Brandon Casey
2017-08-23 21:04                 ` Junio C Hamano
2017-08-23 21:20                   ` Brandon Casey
2017-08-23 21:54                     ` Brandon Casey
2017-08-23 22:11                       ` Brandon Casey
2017-08-24 16:52                         ` Junio C Hamano
2017-08-24 18:29                           ` Brandon Casey
2017-08-24 19:16                             ` Martin Ågren
2017-08-23 22:24                     ` Junio C Hamano
2017-08-23 22:39                       ` Brandon Casey
2017-08-21 17:43       ` [PATCH v2 4/4] ThreadSanitizer: add suppressions Martin Ågren
2017-08-25 17:04         ` Jeff King
2017-08-28 20:56       ` [PATCH v2 0/4] Some ThreadSanitizer-results Jeff Hostetler