git@vger.kernel.org mailing list mirror (one of many)
From: "D. Ben Knoble" <ben.knoble@gmail.com>
To: Jeff King <peff@peff.net>
Cc: demerphq <demerphq@gmail.com>, git@vger.kernel.org
Subject: Re: grep: fix multibyte regex handling under macOS (1819ad327b7a1f19540a819813b70a0e8a7f798f)
Date: Fri, 3 Feb 2023 12:06:55 -0500
Message-ID: <CALnO6CAfKA2atXwHXGJxnBGJ46EMbjGmFU54mb4FJL2O8ceXyQ@mail.gmail.com>
In-Reply-To: <Y908f2qaxKeljtj+@coredump.intra.peff.net>

On Fri, Feb 3, 2023 at 11:55 AM Jeff King <peff@peff.net> wrote:
> Just a guess, but does calling:
>
>   setlocale(LC_CTYPE, "");
>
> at the start of the program change things (you'll probably need to also
> include locale.h)?

Indeed. With that call added, the new output is

    illegal byte sequence

for the following program:

    #include <regex.h>
    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <locale.h>

    int main(void) {
        /* Adopt LC_CTYPE from the environment, as suggested above. */
        char *loc = setlocale(LC_CTYPE, "");
        assert(loc != NULL);
        regex_t re;
        int ret = regcomp(&re, "[\xc0-\xff][\x80-\xbf]+",
                          REG_EXTENDED | REG_NEWLINE);
        /* assert(ret != 0); */
        /* The first regerror() call sizes the buffer; the second fills it. */
        size_t errbuf_size = regerror(ret, &re, NULL, 0);
        char errbuf[errbuf_size];
        regerror(ret, &re, errbuf, errbuf_size);
        printf("%s\n", errbuf);
        /* Only a successfully compiled regex may be freed. */
        if (ret == 0)
            regfree(&re);
        return 0;
    }
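
To see the effect of the setlocale() call in isolation, the same pattern can be compiled once per locale within a single run. This is a minimal sketch, not code from the thread: try_regcomp() and report_in_locale() are hypothetical helper names, and the fixed 256-byte message buffer is an assumption made for brevity.

```c
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the thread): compile `pattern` as a POSIX
 * extended regex and store either "ok" or the regerror() text in `buf`.
 * Returns regcomp()'s return code, which is 0 on success.
 */
static int try_regcomp(const char *pattern, char *buf, size_t buflen)
{
    regex_t re;
    int ret;

    assert(buf != NULL && buflen > 0);
    ret = regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE);
    if (ret == 0) {
        snprintf(buf, buflen, "ok");
        regfree(&re);   /* only a successfully compiled regex may be freed */
    } else {
        regerror(ret, &re, buf, buflen);
    }
    return ret;
}

/*
 * Hypothetical helper: switch LC_CTYPE to `locale_name` ("" means "take it
 * from the environment"), then print what regcomp() makes of `pattern`.
 */
static void report_in_locale(const char *locale_name, const char *pattern)
{
    char msg[256];   /* assumed large enough for any regerror() text */
    const char *loc = setlocale(LC_CTYPE, locale_name);

    if (loc == NULL) {
        printf("setlocale(LC_CTYPE, \"%s\") failed\n", locale_name);
        return;
    }
    try_regcomp(pattern, msg, sizeof msg);
    printf("LC_CTYPE=%s: %s\n", loc, msg);
}
```

Calling report_in_locale("C", "[\xc0-\xff][\x80-\xbf]+") and then report_in_locale("", "[\xc0-\xff][\x80-\xbf]+") from main() prints one line per locale. On the setup described in this message the second call would be expected to report "illegal byte sequence"; other regex implementations may well accept the pattern in both locales.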

My own locale output, for completeness' sake:

    LANG="fr_FR.UTF-8"
    LC_COLLATE="fr_FR.UTF-8"
    LC_CTYPE="fr_FR.UTF-8"
    LC_MESSAGES="fr_FR.UTF-8"
    LC_MONETARY="fr_FR.UTF-8"
    LC_NUMERIC="fr_FR.UTF-8"
    LC_TIME="fr_FR.UTF-8"
    LC_ALL="fr_FR.UTF-8"
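
The listing above is what the shell reports. To confirm which character encoding the C library actually selects after setlocale(), a program can ask for the codeset directly. A minimal sketch, assuming POSIX nl_langinfo() is available; codeset_for() is a hypothetical name, not something from the thread.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: set LC_CTYPE to `locale_name` ("" means "take it
 * from the environment") and return the name of the resulting character
 * encoding, or NULL if the locale could not be set.
 */
static const char *codeset_for(const char *locale_name)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

With the environment shown above, codeset_for("") would be expected to return a UTF-8 codeset name, though the exact spelling is up to the platform.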


-- 
D. Ben Knoble
