git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Brandon Williams <bmwill@google.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	Johannes Sixt <j6t@kdbg.org>,
	Jacob Keller <jacob.keller@gmail.com>
Subject: Re: [PATCHv3] attr: convert to new threadsafe API
Date: Tue, 18 Oct 2016 16:52:51 -0700	[thread overview]
Message-ID: <CAGZ79kbS4mP7sVTCM+QJXTwKsgZ40xvVDng-F3igZnJWLYek0A@mail.gmail.com> (raw)
In-Reply-To: <xmqq8ttr0wny.fsf@gitster.mtv.corp.google.com>

On Fri, Oct 14, 2016 at 8:37 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> *1* Would we need a wrapping struct around the array of results?
>
> By the way, I do see a merit on the "check" side (tl;dr: but I do
> not think "result" needs it, hence I do not see the need for the
> "ugly" variants).

So we'd rather go with const char **result instead of our own new struct there.
Ok, got it.

>
> Take "archive" for example.  For each path, it wants to see the
> attribute "export-ignore" to decide if it is to be omitted.  In
> addition, the usual set of attributes used to smudge blobs into the
> working tree representation are inspected by the convert.c API as
> part of its implementation of convert_to_working_tree().  This
> program has at least two sets of <"check", "result"> that are used
> by two git_check_attr() callsites that are unaware of each other.
>
> One of the optimizations we discussed is to trim down the attr-stack
> (which caches the attributes read from .gitattributes files that are
> in effect for the "last" directory that has the path for which
> attrbiutes are queried for) by reading/keeping only the entries that
> affect the attributes the caller is interested in.  But when there
> are multiple callsites that are interested in different sets of
> attributes, we obviously cannot do such an optimization without
> taking too much cache-invalidation hit.  Because these callsites are
> not unaware of each other, I do not think we can say "keep the
> entries that affects the union of all active callsites" very easily,
> even if it were possible.
>
> But we could tie this cache to "check", which keeps a constant
> subset of attributes that the caller is interested in (i.e. each
> callsite would keep its own cache that is useful for its query).
> While we are single-threaded, "struct git_attr_check" being a
> wrapping struct around the array of "what attributes are of
> interest?" is a good place to add that per-check attr-stack cache.
> When we go multi-threaded, the attr-stack cache must become
> per-thread, and needs to be moved to per-thread storage, and such a
> per-thread storage would have multiple attr-stack, one per "check"
> instance (i.e. looking up the attr-stack may have to say "who/what
> thread am I?" to first go to the thread-local storage for the
> current thread, where a table of pointers to attr-stacks is kept and
> from there, index into that table to find the attr-stack that
> corresponds to the particular "check").  We could use the address of
> "check" as the key into this table, but "struct git_attr_check" that
> wraps the array gives us another option to allocate a small
> consecutive integer every time initl() creates a new "check" and use
> it as the index into that attr-stack table, as that integer index
> can be in the struct that wraps the array of wanted attributes.
>
>         Note. none of the above is a suggestion to do the attr
>         caching the way exactly described.  The above is primarily
>         to illustrate how a wrapping struct may give us future
>         flexibility without affecting a single line of code in the
>         user of API.
>
> It may turn out that we do not need to have anything other than the
> array of wanted attributes in the "check" struct, but unlike
> "result", "check" is shared across threads, and do not have to live
> directly on the stack, so we can prepare for flexibility.
>
> I do not foresee a similar need for wrapping struct for "result",
> and given that we do want to keep the option of having them directly
> on the stack, I am inclined to say we shouldn't introduce one.
>
> If we were still to do the wrapping for result, I would say that
> basing it around the FLEX_ARRAY idiom, i.e.
>
>>         struct git_attr_result {
>>                 int num_slots;
>>                 const char *value[FLEX_ARRAY];
>>         };
>
> is a horrible idea.  It would be less horrible if it were
>
>         struct git_attr_result {
>                 int num_slots;
>                 const char **value;
>         };

So const char** but with an additional number of slots, all we do
would be to compare this number of slots to the checks number of slots and
die("BUG:..."),  which is just a burden and no help.

>
> then make the API user write via a convenience macro something like
> this
>
>         const char *result_values[NUM_ATTRS_OF_INTEREST];
>         struct git_attr_result result = {
>                 ARRAY_SIZE(result_values), &result_values
>         };
>
> instead.  That way, at least the side that implements git_check_attr()
> would not have to be type-unsafe like the example of ugliness in the
> message I am following-up on.

Ok I will reroll with the const char** instead of the macro stuff that
I came up with,
(that would be type safe though uglier than the pure variant).

  reply	other threads:[~2016-10-18 23:52 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-12 22:41 [PATCHv3] attr: convert to new threadsafe API Stefan Beller
2016-10-12 23:33 ` Junio C Hamano
2016-10-13 18:42   ` Stefan Beller
2016-10-13 22:08     ` Stefan Beller
2016-10-14 15:37   ` Junio C Hamano
2016-10-18 23:52     ` Stefan Beller [this message]
2016-10-19  0:06       ` Junio C Hamano
2016-10-19  0:20         ` Stefan Beller
2016-10-19  0:40           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGZ79kbS4mP7sVTCM+QJXTwKsgZ40xvVDng-F3igZnJWLYek0A@mail.gmail.com \
    --to=sbeller@google.com \
    --cc=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=jacob.keller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).