git@vger.kernel.org mailing list mirror (one of many)
From: Johannes Sixt <j6t@kdbg.org>
To: "René Scharfe" <l.s.r@web.de>
Cc: "Git List" <git@vger.kernel.org>,
	"Diomidis Spinellis" <dds@aueb.gr>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	demerphq <demerphq@gmail.com>,
	"Mario Grgic" <mario_grgic@hotmail.com>,
	"D. Ben Knoble" <ben.knoble@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>
Subject: Re: [PATCH] userdiff: support regexec(3) with multi-byte support
Date: Fri, 7 Apr 2023 12:56:00 +0200
Message-ID: <a548ca26-2766-1a95-4928-c22063d5d890@kdbg.org>
In-Reply-To: <39eb2a9f-83e0-449e-1157-152c43d49b48@web.de>

On 07.04.23 at 09:49, René Scharfe wrote:
> On 07.04.23 at 00:35, Johannes Sixt wrote:
>> This is not equivalent. The original treated a sequence of non-ASCII
>> characters as a word. The new version treats each individual non-space
>> character (both ASCII and non-ASCII) as a word.
> 
> I assume you mean "The original treated [a single non-space as well as]
> a sequence of non-ASCII characters [making up a single multi-byte
> character] as a word.".  That works as intended by 664d44ee7f (userdiff:
> simplify word-diff safeguard, 2011-01-11).

I misread the original RE. I thought it would lump multiple multi-byte
characters together into one word, but it does not; sorry for that. It
looks like your suggested replacement is behaviorally identical to the
original after all, except perhaps for this one:

> The new one doesn't match multi-byte whitespace anymore.

but I did not find a reference that confirms it. I don't think we need
to bend over backwards to keep this compatibility, though.
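
For illustration only, here is a rough Python sketch of the two behaviors
discussed above. It is not the code from the patch. Both patterns are
assumptions about the shape of the regexes: the byte-wise one stands in for
the original safeguard (one non-space byte, or a UTF-8 lead byte plus its
continuation bytes), the character-wise one for the replacement. The helper
names are made up. Python tries alternatives in order rather than taking the
longest match as POSIX does, so the multi-byte branch is listed first to
imitate regexec(3). Python's `\s` treats U+3000 as whitespace; whether a
given regexec(3) does is exactly the point I could not confirm.

```python
import re

# Assumed byte-wise form of the original safeguard. The multi-byte branch
# comes first because Python picks the first matching alternative, while
# POSIX regexec(3) would pick the longest one.
ORIGINAL = re.compile(rb"[\xc0-\xff][\x80-\xbf]+|[^\s]")

# Assumed character-wise form of the replacement, applied to decoded text.
REPLACEMENT = re.compile(r"[^\s]")


def words_original(text):
    """Tokenize the UTF-8 bytes of `text` with the byte-wise pattern."""
    data = text.encode("utf-8")
    return [m.decode("utf-8") for m in ORIGINAL.findall(data)]


def words_replacement(text):
    """Tokenize `text` character by character with the new pattern."""
    return REPLACEMENT.findall(text)


if __name__ == "__main__":
    # Two adjacent multi-byte characters stay two separate words in both.
    print(words_original("äö"), words_replacement("äö"))
    # U+3000 (ideographic space) is a word byte-wise, but is skipped as
    # whitespace by Python's character-wise \s.
    print(words_original("a\u3000ä"), words_replacement("a\u3000ä"))
```

In this sketch the two tokenizers agree on everything except the multi-byte
whitespace character, which matches the one difference René pointed out.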

-- Hannes



Thread overview: 13+ messages
2023-03-29 22:55 regex compilation error with --color-words Eric Sunshine
2023-03-30  7:55 ` Diomidis Spinellis
2023-03-31 20:44   ` René Scharfe
2023-04-02  9:44     ` René Scharfe
2023-04-03 16:29       ` Junio C Hamano
2023-04-03 19:32         ` René Scharfe
2023-04-06 20:19           ` [PATCH] userdiff: support regexec(3) with multi-byte support René Scharfe
2023-04-06 22:35             ` Johannes Sixt
2023-04-07  7:49               ` René Scharfe
2023-04-07 10:56                 ` Johannes Sixt [this message]
2023-04-07 14:41             ` D. Ben Knoble
2023-04-07 16:02               ` Junio C Hamano
2023-04-07 17:23             ` Eric Sunshine
