unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Guillaume Morin <guillaume@morinfr.org>
To: Siddhesh Poyarekar <siddhesh@gotplt.org>
Cc: fweimer@redhat.com, Siddhesh Poyarekar <siddhesh@sourceware.org>,
	Guillaume Morin <guillaume@morinfr.org>,
	libc-alpha@sourceware.org
Subject: Re: [PATCH v8 03/10] Remove __morecore and __default_morecore
Date: Wed, 14 Jul 2021 18:42:05 +0200	[thread overview]
Message-ID: <20210714164204.GA1103@bender.morinfr.org> (raw)
In-Reply-To: <45ba79b4-02e6-fae5-ae9b-db6c8a01aecb@gotplt.org>

On 14 Jul 19:43, Siddhesh Poyarekar wrote:
> > But basically, this breaks https://github.com/libhugetlbfs/libhugetlbfs/
> > which is old (started in 2006) and commonly used library.  The library
> > is plugging its own morecore() implementation to use hugetlb pages to
> > back the malloc heap.
> Thanks, AFAICT, libhugetlbfs only uses __morecore and not the rest of the
> interfaces.  The thing is, __morecore and friends have been deprecated for a
> year and building anything with __morecore ought to give the deprecation
> warning.

Probably. But you surely know that most people get glibc from their
distributions and do not build their own glibc. My main box is running
Debian stable for example. The next release (bullseye) seems like it'll
be shipping 2.31 (where it's not deprecated yet, I believe). RHEL8 ships
with a libhugetlbfs rpm (so users do not need to build it at all unlike
Debian), which will become useless with this change! Most users will
never have seen the deprecation warning before they see the actual
removal.

That said, I was personally aware since it was pointed out on a
libhugetlbfs github issue that it was deprecated in 2.32. I think the
hope was it would be replaced by something usable, and users would not
be left with *no* solution (I was also quietly hoping Eric Munson, the
libhugetlbfs maintainer would be reaching out). But if you remove it, I
am not quite sure what you expect libhugetlbfs users to do, really.

> I don't think we would like to export a different interface to do the same
> thing because it still doesn't allow us to clamp down further on the system
> malloc implementation.  Besides, the morecore hooking interface is severely
> limited because it only affects the main arena. Non-main arenas have their
> own allocation techniques and it doesn't really make sense to have an
> interface to control just the main arena.

Well, I do not see why morecore() does not make sense as a plugin. I am
not seeing any technical argument supporting that it will prevent
clamping down the implementation provided you document the currently
acceptable semantics for this interface.

libhugetlbfs users are aware this only works in the main arena. I
certainly agree this is not ideal and this is definitely something that
could be fixed as well. But again, keep in mind people have been using
this for 15+ years.

Obviously, if you're arguing to extend the glibc hookability so it works
for all arenas, I am all ears, certainly would help out and would be a
day one user :-)

> There is only the one known issue that you reported but that may well be
> because the interface isn't well tested.  There have been many changes to
> malloc over the years and I cannot say for certain that the idea that
> morecore could be anything other than brk has been consistently considered.
> We don't have tests to verify that either. 

I guess you mean sbrk() above and not brk().

Again, you seem to be ignoring the 15 years of people using
libhugetblfs. It's been tested extensively by users. I certainly agree
it should get testing in the glibc testsuite, but it's actually trivial
afaict:

- morecore() in effect has very simple semantics. It takes a size and
  returns a pointer to the beginning of the new area. The only thing
  that differs here from sbrk() is the fact that a new area might not be
  contiguous with the previous segment.  But that's supported in the
  malloc code, otherwise libhugetlbfs morecore() would never have worked
  (that's the MORECORE_CONTIGUOUS handling in the malloc code). That's
  really _all_ you need to test. And I would presume non contiguous
  heaps are implemented not just for morecore() sake (not sure though).
  If that's case, no actual specific testing needs to be done.

- If you _only_ allow morecore interposition through a DSO (as I was
  suggesting), you do _not_ even need to support non contiguous heaps
  (at least for morecore()) since at that point it becomes guaranteed
  that morecore() will always point to the same implementation. Then
  there is _nothing_ to test at that point. Proper morecore()
  implementations can semantically behave exactly like sbrk!

- A morecore2() hook that takes the old top of the heap as well as the
  size would also allow libhugetlbfs() to implement a perfectly
  continuous heap and so would semantically behave like sbrk().  But it
  does not look like it's being considered here :-)

> Further, morecore limits its influence to the main arena and it is
> possible that future changes could break this in ways that we cannot
> anticipate today.
 
As I said, we're aware and it not ideal :-). But if you're using single
threaded software, that covers all allocations under the mmap threshold
(which is tunable). So it's _very_ easy to make this hook usable for a lot
of software. The limitations are not such a big deal (but again, I'd prefer
if they were not there).

> Basically this is technical debt we've accumulated from the days when malloc
> assumed single threaded programs and was incrementally hacked to add in
> multi-threaded support.  We'd like to clean this up.

Honestly from my PoV, some usual feature was added a while ago. The rest
of the code was changed without paying attention to the feature (bug
report ignored for 5 years as well). And you're using this as an
argument to remove it.... I don't think this is tech debt at all, the
feature is simply paying the price of being ignored. It does _seem_ to
me that little effort have been made to see if it was workable at all.
 
> The malloc code in glibc is quite general purpose and has little dependency
> on libc internals.  It may well be possible for you to copy the source files
> over into libhugetlbfs to implement the interposer within libhugetlbfs that
> does exactly what you need.  In fact, that may even allow you to extend
> hugetlbfs support to non-main arenas.

To be clear, I am not the libhugetblfs maintainer, just a user (that
investigated the trimming issue). I think everybody is aware they can
fork the glibc malloc implementation.  Surely you realize that it's not
ideal but even feasible for most users.

Guillaume.

-- 
Guillaume Morin <guillaume@morinfr.org>

  reply	other threads:[~2021-07-14 16:45 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-13  7:38 [PATCH v8 00/10] malloc hooks removal Siddhesh Poyarekar via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 01/10] Make mcheck tests conditional on GLIBC_2.24 or earlier Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:03   ` Carlos O'Donell via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 02/10] Remove __after_morecore_hook Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:03   ` Carlos O'Donell via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 03/10] Remove __morecore and __default_morecore Siddhesh Poyarekar via Libc-alpha
2021-07-14  7:01   ` Siddhesh Poyarekar
2021-07-14 12:54     ` Guillaume Morin
2021-07-14 14:13       ` Siddhesh Poyarekar
2021-07-14 16:42         ` Guillaume Morin [this message]
2021-07-14 17:15           ` Carlos O'Donell via Libc-alpha
2021-07-14 17:42             ` Adhemerval Zanella via Libc-alpha
2021-07-14 18:37               ` Guillaume Morin
2021-07-14 18:48               ` Siddhesh Poyarekar
2021-07-14 18:31             ` Guillaume Morin
2021-07-14 17:32           ` Siddhesh Poyarekar
2021-07-14 18:25             ` Guillaume Morin
2021-07-14 18:43               ` Siddhesh Poyarekar
2021-07-14 18:51                 ` Guillaume Morin
2021-07-17 22:03   ` Carlos O'Donell via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 04/10] Move malloc hooks into a compat DSO Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:04   ` Carlos O'Donell via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 05/10] mcheck: Wean away from malloc hooks Siddhesh Poyarekar via Libc-alpha
2021-07-13 15:47   ` H.J. Lu via Libc-alpha
2021-07-14  2:44     ` Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:04   ` Carlos O'Donell via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 06/10] Simplify __malloc_initialized Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:04   ` Carlos O'Donell via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 07/10] mtrace: Wean away from malloc hooks Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:04   ` Carlos O'Donell via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 08/10] glibc.malloc.check: " Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:04   ` Carlos O'Donell via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 09/10] Remove " Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:04   ` Carlos O'Donell via Libc-alpha
2021-07-13  7:38 ` [PATCH v8 10/10] mcheck Fix malloc_usable_size [BZ #22057] Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:04   ` Carlos O'Donell via Libc-alpha
2021-07-13 14:48 ` [PATCH v8 00/10] malloc hooks removal H.J. Lu via Libc-alpha
2021-07-13 15:41   ` Siddhesh Poyarekar via Libc-alpha
2021-07-17 22:03 ` Carlos O'Donell via Libc-alpha

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/libc/involved.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210714164204.GA1103@bender.morinfr.org \
    --to=guillaume@morinfr.org \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    --cc=siddhesh@gotplt.org \
    --cc=siddhesh@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).