unofficial mirror of libc-alpha@sourceware.org
From: Florian Weimer via Libc-alpha <libc-alpha@sourceware.org>
To: Huang Shijie <shijie@os.amperecomputing.com>
Cc: zwang@amperecomputing.com,
	Huang Shijie via Libc-alpha <libc-alpha@sourceware.org>,
	patches@amperecomputing.com
Subject: Re: [PATCH] Add LD_NUMA_REPLICATION for glibc
Date: Fri, 10 Sep 2021 13:01:46 +0200
Message-ID: <878s04k82t.fsf@oldenburg.str.redhat.com>
In-Reply-To: <YTnfnSpCNJK+ZO/Y@hsj> (Huang Shijie's message of "Thu, 9 Sep 2021 10:19:09 +0000")

* Huang Shijie:

> Hi Florian,
> On Fri, Sep 03, 2021 at 08:28:57AM +0200, Florian Weimer wrote:
>> * Huang Shijie via Libc-alpha:
>> 
>> > This patch adds LD_NUMA_REPLICATION which influences the linkage of shared libraries at run time.
>> >
>> > If LD_NUMA_REPLICATION is set for program foo like this:
>> > 	#LD_NUMA_REPLICATION=1 ./foo
>> >
>> > When ld.so mmaps the shared libraries, it will use
>> > 	mmap(, c->prot | PROT_WRITE, MAP_COPY | MAP_FILE | MAP_POPULATE,)
>> > for them, and the mmap will trigger COW (copy on write) for the shared
>> > libraries on the NUMA node where the program `foo` runs. After the
>> > COW, foo will have a copy of each shared library segment (covered by
>> > the mmap) that belongs to the same NUMA node.
>> >
>> > So enabling LD_NUMA_REPLICATION consumes more memory,
>> > but reduces remote accesses on NUMA systems.
>> 
>> I think the kernel could do this in a much better way, avoiding
>> duplicating the pages within the same NUMA node.
>
> https://marc.info/?l=linux-kernel&m=163070220429222&w=2
> Since Linus did not think it a good choice to do this in the kernel,
> glibc is the only place to do it now.
> So could you please re-evaluate this patch?

The name of the environment variable is quite misleading.  It should
refer to MAP_POPULATE, not NUMA.
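
For reference, here is a minimal sketch of the kind of mapping the patch
description refers to.  It assumes Linux, where (as far as I know) ld.so
treats MAP_COPY as MAP_PRIVATE and MAP_FILE is zero, so both are left out.
The function name map_private_populated is hypothetical and is not taken
from the patch; the sketch maps a whole file rather than individual ELF
segments.

```c
#define _GNU_SOURCE             /* for MAP_POPULATE and O_CLOEXEC */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of glibc or the patch.  Map the whole
   file at PATH privately, with PROT_WRITE added to PROT, and ask the
   kernel to pre-fault the mapping.  Returns the mapping and stores its
   length in *LEN_OUT, or returns MAP_FAILED on error.  */
static void *
map_private_populated (const char *path, int prot, size_t *len_out)
{
  int fd = open (path, O_RDONLY | O_CLOEXEC);
  if (fd < 0)
    return MAP_FAILED;

  struct stat st;
  if (fstat (fd, &st) != 0 || st.st_size == 0)
    {
      close (fd);
      return MAP_FAILED;
    }

  /* A private mapping accepts PROT_WRITE even on a read-only descriptor;
     writes go to process-private pages and never reach the file.  */
  void *p = mmap (NULL, (size_t) st.st_size, prot | PROT_WRITE,
                  MAP_PRIVATE | MAP_POPULATE, fd, 0);
  close (fd);                   /* the mapping stays valid after close */
  if (p != MAP_FAILED)
    *len_out = (size_t) st.st_size;
  return p;
}
```

Nothing in this call mentions NUMA placement.  Where the private pages end
up depends on where the calling thread happens to run and on the memory
policy in effect at that time.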

As far as I can tell, it does not necessarily have the desired effect
for multi-threaded applications (if some threads end up running on other
NUMA nodes).

It would also be helpful to have some performance numbers.

I also wonder whether a FUSE file system could do better, by making one
backing copy per NUMA node instead of one copy per process.

>> The other issue is the temporary RWX mapping, which does not
>> interoperate well with some security hardening features.
> Could you please explain this in detail? I am confused by it.

Some environments block mapping files with PROT_WRITE | PROT_EXEC.
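
A quick way to check a given environment is a small probe along these
lines.  This is only a sketch: probe_file_wx is a hypothetical name, and
which errno a policy reports when it refuses the mapping is an assumption
that varies by system (EACCES and EPERM are the ones I would expect).

```c
#define _GNU_SOURCE             /* for O_CLOEXEC */
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical probe: try to map the first page of the file at PATH
   privately with PROT_WRITE | PROT_EXEC.  Returns 0 if the kernel
   allowed the mapping, otherwise the errno from open or mmap.  */
static int
probe_file_wx (const char *path)
{
  int fd = open (path, O_RDONLY | O_CLOEXEC);
  if (fd < 0)
    return errno;

  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  void *p = mmap (NULL, page, PROT_READ | PROT_WRITE | PROT_EXEC,
                  MAP_PRIVATE, fd, 0);
  int err = p == MAP_FAILED ? errno : 0;

  /* The mapping is never touched; it is only created and dropped.  */
  if (p != MAP_FAILED)
    munmap (p, page);
  close (fd);
  return err;
}
```

A nonzero result for a shared object that maps fine with PROT_READ |
PROT_EXEC suggests that a temporary RWX mapping in ld.so would fail in
that environment as well.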

Thanks,
Florian


