git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
	Brandon Williams <bmwill@google.com>,
	Duy Nguyen <pclouds@gmail.com>
Subject: Re: [PATCH 28/28] attr: convert to new threadsafe API
Date: Tue, 11 Oct 2016 11:56:04 -0700	[thread overview]
Message-ID: <CAGZ79kbdvs88Gt_XW801DPs7rj8kvcfPqCV7kjL_Jz9XiVn7fg@mail.gmail.com> (raw)
In-Reply-To: <xmqqa8eaagoq.fsf@gitster.mtv.corp.google.com>

On Tue, Oct 11, 2016 at 11:23 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>> I find this description a bit confusing.  At least the way I
>>> envisioned was that when this piece of code is run by multiple
>>> people at the same time,
>>>
>>>         static struct git_attr_check *check = NULL;
>>>         git_attr_check_initl(&check, ...);
>>>
>>> we won't waste the "check" by allocated by the first one by
>>> overwriting it with another one allocated by the second one.  So
>>> "the same arguments" does not come into the picture.  A single
>>> variable is either
>>>
>>>  * already allocated and initialised by the an earlier call to
>>>    initl() by somebody else, or
>>>
>>>  * originally NULL when you call initl(), and the implementation
>>>    makes sure that other people wait while you allocate, initialise
>>>    and assign it to the variable, or
>>>
>>>  * originally NULL when you call initl(), but the implementation
>>>    notices that somebody else is doing the second bullet point
>>>    above, and you wait until that somebody else finishes and then
>>>    you return without doing anything (because by that time, "check"
>>>    is filled by that other party doing the second bullet point
>>>    above).
>>>
>>> There is no need to check for "the same arguments".
>>>
>>
>> I see. So we assume that there are no different arguments at the same time,
>> i.e. all threads run the same code when it comes to attrs.
>
> Sorry, but I fail to see how you can jump to that conclusion.
> Puzzled.
>
> You can have many different callsites (think: hits "git grep" finds)
> that call git_attr_check_initl() and they all may be asking for
> different set of attributes.  As long as they are using different
> "check" instance to receive these sets of attributes, they are OK.

Right, but that requires a mutex per callsite; up to now I imagined
a global mutex only, which is how I came to the conclusion.

>
> It is insane to use the same "check" variable to receive sets of
> attributes for different attributes,

I agree.

> be it from the same call or
> different one, it is insane to do this:
>
>         func(char *anotherattr)
>         {
>                 static struct git_attr_check *check = NULL;
>                 git_attr_check_initl(&check, "one", anotherattr, ...);
>
>                 ... use "check" to ask question ...
>         }
>
> The whole point of having a static, initialize-once instance of
> "check" is so that initl() can do potentially heavy preparation just
> once and keep reusing it.  Allowing a later caller of func() to pass
> a value of anotherattr that is different from the one used in the
> first call that did cause initl() to allocate-initialise-assign to
> "check" is simply insane, even there is no threading issue.

I was imagining a file.c like that:

static struct git_attr_check *check = NULL;

void one_func()
{
    git_attr_check_initl(&check, "one", ...);
    ...
}

void two_func()
{
    git_attr_check_initl(&check, "two", ...);
    ...
}


int foo_cmd(const char *prefix int argc, char **argv)
{
    foreach_path(get_paths(...))
        one_func();
    check = NULL;
    foreach_path(get_paths(...))
        two_func();
}

This is correct single threaded code, but as soon as you want to
put phase one,two into threads, as they can be parallelized, this
goes horribly wrong.


>
> And in a threaded environment it is even worse; the first thread may
> call initl() to get one set of attributes in "check" and it may be
> about to ask the question, while the second call may call initl()
> and by your definition it will notice they have different sets of
> attributes and returns different "check"?  Either the earlier one is
> leaked, or it gets destroyed even though the first thread hasn't
> finished with "check" it got.
>
> It is perfectly OK to drop "static" from the above example code.
> Then it no longer is insane--it is perfectly normal code whose
> inefficiency cannot be avoided because it wants to do dynamic
> queries.

I think we had a misunderstanding here, as I was just assuming a
single mutex later on.

  reply	other threads:[~2016-10-11 18:56 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-11  0:20 [PATCH 00/28] Revamping the attr subsystem! Stefan Beller
2016-10-11  0:20 ` [PATCH 01/28] commit.c: use strchrnul() to scan for one line Stefan Beller
2016-10-11  0:20 ` [PATCH 02/28] attr.c: " Stefan Beller
2016-10-11  0:20 ` [PATCH 03/28] attr.c: update a stale comment on "struct match_attr" Stefan Beller
2016-10-11  0:20 ` [PATCH 04/28] attr.c: explain the lack of attr-name syntax check in parse_attr() Stefan Beller
2016-10-11  0:20 ` [PATCH 05/28] attr.c: complete a sentence in a comment Stefan Beller
2016-10-11  0:20 ` [PATCH 06/28] attr.c: mark where #if DEBUG ends more clearly Stefan Beller
2016-10-11  0:20 ` [PATCH 07/28] attr.c: simplify macroexpand_one() Stefan Beller
2016-10-11  0:20 ` [PATCH 08/28] attr.c: tighten constness around "git_attr" structure Stefan Beller
2016-10-11  0:20 ` [PATCH 09/28] attr.c: plug small leak in parse_attr_line() Stefan Beller
2016-10-11  0:20 ` [PATCH 10/28] attr: rename function and struct related to checking attributes Stefan Beller
2016-10-11  0:20 ` [PATCH 11/28] attr: (re)introduce git_check_attr() and struct git_attr_check Stefan Beller
2016-10-11 16:59   ` Brandon Williams
2016-10-11 17:42     ` Junio C Hamano
2016-10-11  0:20 ` [PATCH 12/28] attr: convert git_all_attrs() to use "struct git_attr_check" Stefan Beller
2016-10-11  0:21 ` [PATCH 13/28] attr: convert git_check_attrs() callers to use the new API Stefan Beller
2016-10-11  0:21 ` [PATCH 14/28] attr: retire git_check_attrs() API Stefan Beller
2016-10-11  0:21 ` [PATCH 15/28] attr: add counted string version of git_check_attr() Stefan Beller
2016-10-11  0:21 ` [PATCH 16/28] attr: add counted string version of git_attr() Stefan Beller
2016-10-11  0:21 ` [PATCH 17/28] attr: expose validity check for attribute names Stefan Beller
2016-10-11 17:28   ` Brandon Williams
2016-10-11 18:28     ` Stefan Beller
2016-10-11 18:40       ` Brandon Williams
2016-10-11 18:44         ` Stefan Beller
2016-10-11 18:49           ` Brandon Williams
2016-10-11  0:21 ` [PATCH 18/28] attr: support quoting pathname patterns in C style Stefan Beller
2016-10-11  0:21 ` [PATCH 19/28] attr.c: add push_stack() helper Stefan Beller
2016-10-11  0:21 ` [PATCH 20/28] attr.c: pass struct git_attr_check down the callchain Stefan Beller
2016-10-11  0:21 ` [PATCH 21/28] attr.c: rename a local variable check Stefan Beller
2016-10-11  0:21 ` [PATCH 22/28] attr.c: correct ugly hack for git_all_attrs() Stefan Beller
2016-10-11  0:21 ` [PATCH 23/28] attr.c: introduce empty_attr_check_elems() Stefan Beller
2016-10-11  0:21 ` [PATCH 24/28] attr.c: always pass check[] to collect_some_attrs() Stefan Beller
2016-10-11  0:21 ` [PATCH 25/28] attr.c: outline the future plans by heavily commenting Stefan Beller
2016-10-11  0:21 ` [PATCH 26/28] attr: make git_attr_counted static Stefan Beller
2016-10-11 17:37   ` Brandon Williams
2016-10-11 21:53     ` Stefan Beller
2016-10-11  0:21 ` [PATCH 27/28] attr: make git_check_attr_counted static Stefan Beller
2016-10-11  0:21 ` [PATCH 28/28] attr: convert to new threadsafe API Stefan Beller
2016-10-11 17:40   ` Junio C Hamano
2016-10-11 17:56     ` Stefan Beller
2016-10-11 18:23       ` Junio C Hamano
2016-10-11 18:56         ` Stefan Beller [this message]
2016-10-11 19:47           ` Junio C Hamano
2016-10-11 17:45   ` Brandon Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGZ79kbdvs88Gt_XW801DPs7rj8kvcfPqCV7kjL_Jz9XiVn7fg@mail.gmail.com \
    --to=sbeller@google.com \
    --cc=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).