unofficial mirror of libc-alpha@sourceware.org
From: Zack Weinberg <zackw@panix.com>
To: "Carlos O'Donell" <carlos@redhat.com>
Cc: Joseph Myers <joseph@codesourcery.com>,
	GNU C Library <libc-alpha@sourceware.org>
Subject: Re: [PATCH v4] Use a proper C tokenizer to implement the obsolete typedefs test.
Date: Thu, 14 Mar 2019 09:21:34 -0400
Message-ID: <CAKCAbMhuP7-GXfd6bnw12Ecu9EMEEcXhbj__NCpm3pY5qMB9sA@mail.gmail.com>
In-Reply-To: <b27ba9aa-c809-ba50-29e8-0d799ec44be1@redhat.com>

On Thu, Mar 14, 2019 at 9:00 AM Carlos O'Donell <carlos@redhat.com> wrote:
>
> On 3/13/19 6:16 PM, Joseph Myers wrote:
> > I'm seeing failures from build-many-glibcs.py for
> > resource/check-obsolete-constructs:
> >
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 3198: ordinal not in range(128)
> >
> > This is with LC_ALL=C (and bits/resource.h headers containing UTF-8 µ in a
> > comment).

This did not happen in my build-many-glibcs run, possibly because I’m
running it in a UTF-8 locale.  Should build-many-glibcs perhaps be
setting LC_ALL=C for all subprocesses?

As an immediate fix, I am going to commit a patch to
check-obsolete-constructs that specifies encoding="utf-8" since that’s
what we have in header files right now.
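
A minimal sketch of why an explicit encoding helps, assuming a hypothetical header whose comment contains µ (this is not the actual check-obsolete-constructs code). Decoding as ASCII reproduces the reported UnicodeDecodeError; passing encoding="utf-8" to open() makes the result independent of the caller's locale:

```python
import os
import tempfile

# Hypothetical header content: a comment containing U+00B5 (micro sign),
# which UTF-8 encodes as the two bytes 0xC2 0xB5.
header_bytes = "/* Time in \u00b5s.  */\n#define EXAMPLE 1\n".encode("utf-8")

with tempfile.NamedTemporaryFile(suffix=".h", delete=False) as tmp:
    tmp.write(header_bytes)
    path = tmp.name


def read_header(path, encoding):
    """Read PATH as text, decoding with the given encoding."""
    with open(path, encoding=encoding) as fp:
        return fp.read()


# Decoding as ASCII fails on the first byte of the encoded micro sign.
try:
    read_header(path, "ascii")
    ascii_failed = False
    ascii_error = ""
except UnicodeDecodeError as exc:
    ascii_failed = True
    ascii_error = str(exc)

# Decoding as UTF-8 succeeds whatever LC_ALL is set to.
text = read_header(path, "utf-8")
os.unlink(path)
```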

> > There is also a case that the encoding specified should be
> > ASCII - that installed headers should be required to be pure ASCII so they
> > can be included in source files with any ASCII-compatible character set if
> > compiling with -finput-charset= (which affects included headers as well as
> > the main source file, so compiling "#include <sys/resource.h>" with
> > -finput-charset=ascii currently fails).
>
> Do we have a requirement that #include <sys/resource.h> be compilable with
> -finput-charset=ascii?

I think a requirement that our installed header files be compilable
with *any* valid setting of -finput-charset= by application Makefiles
is reasonable (or, in other words, all installed header files should
use only the basic source character set).  This is technically a
stronger constraint than requiring -finput-charset=ascii to work, but
in practice I think testing against -finput-charset=ascii would be
sufficient.
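
To illustrate why the basic source character set is the stronger constraint, here is a small sketch, assuming the set as defined in C99 and C11 (later revisions of the standard may differ). The names BASIC_SOURCE_CHARS and outside_basic_set are made up for this example. A few ASCII characters, such as $ and @, are not in that set, so a header can be pure ASCII and still use characters outside it:

```python
import string

# Basic source character set as listed in C99/C11 (assumption: later
# standards may extend it): letters, digits, 29 graphic characters, and
# space, horizontal tab, vertical tab, form feed, plus newline.
BASIC_SOURCE_CHARS = frozenset(
    string.ascii_letters
    + string.digits
    + "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~"
    + " \t\v\f\n"
)


def outside_basic_set(text):
    """Return the sorted distinct characters of TEXT not in the basic set."""
    return sorted({ch for ch in text if ch not in BASIC_SOURCE_CHARS})


# Hypothetical header text: '$' and '@' are ASCII but not in the basic
# set; the micro sign is neither.
sample = "/* cost in $, mail me @ host, \u00b5s */\nint x;\n"
extras = outside_basic_set(sample)
ascii_but_not_basic = [ch for ch in extras if ord(ch) < 128]
```

Here extras holds all three offending characters, while ascii_but_not_basic holds only the two that an ASCII-only test would let through.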

I think it’s a bug in GCC that -finput-charset=ascii causes an error
for non-ASCII characters inside comments, but there have been so many
releases with that bug that we have to cope.

A counterargument is that clang apparently only implements
-finput-charset=utf-8; *any other value* is rejected.  That this was
considered adequate Makefile compatibility for the feature strongly
suggests that nobody is using any other extended source character set,
so we should be OK to continue using UTF-8 in installed headers, at
least in comments.

Whatever we do should be enforced by some test or other.  It might be
more appropriate to add it to check-installed-headers.sh than
check-obsolete-constructs.py, though.
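
One possible shape for such a test, as a rough sketch only: it is written in Python rather than as an addition to check-installed-headers.sh, and the function names and the choice to scan every *.h file under a directory are assumptions. It reads each header as bytes, so it does not depend on the locale, and reports the lines that contain any byte outside ASCII:

```python
import os


def non_ascii_lines(data):
    """Return 1-based line numbers in DATA (bytes) holding a byte >= 0x80."""
    return [
        lineno
        for lineno, line in enumerate(data.split(b"\n"), start=1)
        if any(byte >= 0x80 for byte in line)
    ]


def scan_headers(root):
    """Map each .h file under ROOT that has non-ASCII bytes to its lines."""
    report = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".h"):
                continue
            path = os.path.join(dirpath, name)
            # Binary mode: no decoding, so no locale dependence.
            with open(path, "rb") as fp:
                lines = non_ascii_lines(fp.read())
            if lines:
                report[path] = lines
    return report
```

A test driver could call scan_headers on the directory holding the installed headers and fail if the returned dictionary is non-empty, printing each path with its line numbers.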

zw

Thread overview: 9+ messages
2019-03-11 14:59 [PATCH v4] Use a proper C tokenizer to implement the obsolete typedefs test Zack Weinberg
2019-03-11 18:57 ` Carlos O'Donell
2019-03-12  0:59   ` Zack Weinberg
2019-03-12  3:47     ` Carlos O'Donell
2019-03-13 13:47       ` Zack Weinberg
2019-03-13 22:16         ` Joseph Myers
2019-03-14 13:00           ` Carlos O'Donell
2019-03-14 13:21             ` Zack Weinberg [this message]
2019-03-14 18:06               ` Joseph Myers