From: Stefan Beller <sbeller@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Brandon Williams <bmwill@google.com>,
"git@vger.kernel.org" <git@vger.kernel.org>,
Johannes Sixt <j6t@kdbg.org>,
Jacob Keller <jacob.keller@gmail.com>
Subject: Re: [PATCHv3] attr: convert to new threadsafe API
Date: Tue, 18 Oct 2016 16:52:51 -0700 [thread overview]
Message-ID: <CAGZ79kbS4mP7sVTCM+QJXTwKsgZ40xvVDng-F3igZnJWLYek0A@mail.gmail.com> (raw)
In-Reply-To: <xmqq8ttr0wny.fsf@gitster.mtv.corp.google.com>
On Fri, Oct 14, 2016 at 8:37 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> *1* Would we need a wrapping struct around the array of results?
>
> By the way, I do see a merit on the "check" side (tl;dr: but I do
> not think "result" needs it, hence I do not see the need for the
> "ugly" variants).
So we'd rather go with const char **result instead of our own new struct there.
Ok, got it.
>
> Take "archive" for example. For each path, it wants to see the
> attribute "export-ignore" to decide if it is to be omitted. In
> addition, the usual set of attributes used to smudge blobs into the
> working tree representation are inspected by the convert.c API as
> part of its implementation of convert_to_working_tree(). This
> program has at least two sets of <"check", "result"> that are used
> by two git_check_attr() callsites that are unaware of each other.
>
> One of the optimizations we discussed is to trim down the attr-stack
> (which caches the attributes read from .gitattributes files that are
> in effect for the "last" directory that has the path for which
> attrbiutes are queried for) by reading/keeping only the entries that
> affect the attributes the caller is interested in. But when there
> are multiple callsites that are interested in different sets of
> attributes, we obviously cannot do such an optimization without
> taking too much cache-invalidation hit. Because these callsites are
> not unaware of each other, I do not think we can say "keep the
> entries that affects the union of all active callsites" very easily,
> even if it were possible.
>
> But we could tie this cache to "check", which keeps a constant
> subset of attributes that the caller is interested in (i.e. each
> callsite would keep its own cache that is useful for its query).
> While we are single-threaded, "struct git_attr_check" being a
> wrapping struct around the array of "what attributes are of
> interest?" is a good place to add that per-check attr-stack cache.
> When we go multi-threaded, the attr-stack cache must become
> per-thread, and needs to be moved to per-thread storage, and such a
> per-thread storage would have multiple attr-stack, one per "check"
> instance (i.e. looking up the attr-stack may have to say "who/what
> thread am I?" to first go to the thread-local storage for the
> current thread, where a table of pointers to attr-stacks is kept and
> from there, index into that table to find the attr-stack that
> corresponds to the particular "check"). We could use the address of
> "check" as the key into this table, but "struct git_attr_check" that
> wraps the array gives us another option to allocate a small
> consecutive integer every time initl() creates a new "check" and use
> it as the index into that attr-stack table, as that integer index
> can be in the struct that wraps the array of wanted attributes.
>
> Note. none of the above is a suggestion to do the attr
> caching the way exactly described. The above is primarily
> to illustrate how a wrapping struct may give us future
> flexibility without affecting a single line of code in the
> user of API.
>
> It may turn out that we do not need to have anything other than the
> array of wanted attributes in the "check" struct, but unlike
> "result", "check" is shared across threads, and do not have to live
> directly on the stack, so we can prepare for flexibility.
>
> I do not foresee a similar need for wrapping struct for "result",
> and given that we do want to keep the option of having them directly
> on the stack, I am inclined to say we shouldn't introduce one.
>
> If we were still to do the wrapping for result, I would say that
> basing it around the FLEX_ARRAY idiom, i.e.
>
>> struct git_attr_result {
>> int num_slots;
>> const char *value[FLEX_ARRAY];
>> };
>
> is a horrible idea. It would be less horrible if it were
>
> struct git_attr_result {
> int num_slots;
> const char **value;
> };
So const char** but with an additional number of slots, all we do
would be to compare this number of slots to the checks number of slots and
die("BUG:..."), which is just a burden and no help.
>
> then make the API user write via a convenience macro something like
> this
>
> const char *result_values[NUM_ATTRS_OF_INTEREST];
> struct git_attr_result result = {
> ARRAY_SIZE(result_values), &result_values
> };
>
> instead. That way, at least the side that implements git_check_attr()
> would not have to be type-unsafe like the example of ugliness in the
> message I am following-up on.
Ok I will reroll with the const char** instead of the macro stuff that
I came up with,
(that would be type safe though uglier than the pure variant).
next prev parent reply other threads:[~2016-10-18 23:52 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-12 22:41 [PATCHv3] attr: convert to new threadsafe API Stefan Beller
2016-10-12 23:33 ` Junio C Hamano
2016-10-13 18:42 ` Stefan Beller
2016-10-13 22:08 ` Stefan Beller
2016-10-14 15:37 ` Junio C Hamano
2016-10-18 23:52 ` Stefan Beller [this message]
2016-10-19 0:06 ` Junio C Hamano
2016-10-19 0:20 ` Stefan Beller
2016-10-19 0:40 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAGZ79kbS4mP7sVTCM+QJXTwKsgZ40xvVDng-F3igZnJWLYek0A@mail.gmail.com \
--to=sbeller@google.com \
--cc=bmwill@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=j6t@kdbg.org \
--cc=jacob.keller@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).