git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: phillip.wood123@gmail.com
To: Jeff King <peff@peff.net>, phillip.wood@dunelm.org.uk
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"René Scharfe" <l.s.r@web.de>,
	git@vger.kernel.org, "Dragan Simic" <dsimic@manjaro.org>,
	"Kristoffer Haugsbakk" <code@khaugsbakk.name>,
	"Manlio Perillo" <manlio.perillo@gmail.com>
Subject: Re: [PATCH 11/15] find multi-byte comment chars in unterminated buffers
Date: Tue, 12 Mar 2024 14:36:36 +0000	[thread overview]
Message-ID: <9b63eab9-7b6d-4024-9ec4-b58c5ca3b084@gmail.com> (raw)
In-Reply-To: <20240312081931.GC47852@coredump.intra.peff.net>

Hi Peff

On 12/03/2024 08:19, Jeff King wrote:
> On Fri, Mar 08, 2024 at 04:20:12PM +0000, Phillip Wood wrote:
>> We could certainly leave it as-is and tell users they are only hurting
>> themselves if they complain when it does not work.
> 
> That was mostly my plan. To some degree I think this is orthogonal to my
> series. You can already set core.commentChar to space or newline, and
> I'm sure the results are not very good. Actually, I guess it is easy to
> try:
> 
>    git -c core.commentChar=$'\n' commit --allow-empty
> 
> treats everything as not-a-comment.
> 
> Maybe it's worth forbidding this at the start of the series, and then
> carrying it through. I really do think newline is the most special
> character here, just because it's obviously going to be meaningful to
> all of our line-oriented parsing. So you'll get weird results, as
> opposed to broken multibyte characters, where things would still work if
> you choose to consistently use them (and arguably we cannot even define
> "broken" as the user can use a different encoding).

I agree newline is a special case compared to broken multibyte 
characters, I see you've disallowed it in v2 which seems like a good idea.

> Likewise, I guess people might complain that their core.commentChar is
> NFD and their editor writes out NFC characters or something, and we
> don't match. I was hoping we could just punt on that and nobody would
> ever notice (certainly I think it is OK to punt for now and somebody who
> truly cares can make a utf8_starts_with() or similar).
> 
>>> Also, what exactly is the definition of "nonsense" will become can
>>> of worms.  I can sympathise if somebody wants to use "#\t" to give
>>> themselves a bit more room than usual on the left for visibility,
>>> for example, so there might be a case to want whitespace characters.
>>
>> That's fair, maybe we could just ban leading whitespace if we do decide to
>> restrict core.commentChar
> 
> Leading whitespace actually does work, though I think you'd be slightly
> insane to use it.

For "git rebase" in only works if you edit the todo list with "git 
rebase --edit-todo" which calls strbuf_stripspace() and therefore 
parse_insn_line() never sees the comments. If you edit the todo list 
directly then it will error out. You can see this with

     git -c core.commentChar=' ' rebase -x 'echo " this is a comment" 
 >>"$(git rev-parse --git-path rebase-merge/git-rebase-todo)"' HEAD^

which successfully picks HEAD but then gives

     error: invalid command 'this'

when it tries to parse the todo list after the exec command is run. 
Given it is broken already I'm not sure we should worry about it here. 
In any case it is not clear how much we should worry about problems 
caused by users editing the todo list without using "git rebase 
--edit-todo". There is code in parse_insn_line() which is clearly there 
to handle direct editing of the file but I don't think it is tested and 
directly editing the file probably bypasses the 
rebase.missingCommitsCheck checks as well.

Best Wishes

Phillip

> I'm currently using "! COMMENT !" (after using a unicode char for a few
> days). It's horribly ugly, but I wanted to see if any bugs cropped up
> (and vim's built-in git syntax highlighting colors it correctly ;) ).
> 
> -Peff


  reply	other threads:[~2024-03-12 14:36 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-05  8:43 Clarify the meaning of "character" in the documentation Manlio Perillo
2024-03-05  9:00 ` Kristoffer Haugsbakk
2024-03-05 15:32   ` Junio C Hamano
2024-03-05 15:42     ` Dragan Simic
2024-03-05 16:38       ` Junio C Hamano
2024-03-05 17:28         ` Dragan Simic
2024-03-06  8:08         ` [messy PATCH] multi-byte core.commentChar Jeff King
2024-03-07  9:14           ` [PATCH 0/15] allow " Jeff King
2024-03-07  9:15             ` [PATCH 01/15] strbuf: simplify comment-handling in add_lines() helper Jeff King
2024-03-07  9:16             ` [PATCH 02/15] strbuf: avoid static variables in strbuf_add_commented_lines() Jeff King
2024-03-07  9:18             ` [PATCH 03/15] commit: refactor base-case of adjust_comment_line_char() Jeff King
2024-03-07  9:19             ` [PATCH 04/15] strbuf: avoid shadowing global comment_line_char name Jeff King
2024-03-07  9:20             ` [PATCH 05/15] environment: store comment_line_char as a string Jeff King
2024-03-07  9:21             ` [PATCH 06/15] strbuf: accept a comment string for strbuf_stripspace() Jeff King
2024-03-07  9:53               ` Jeff King
2024-03-07  9:22             ` [PATCH 07/15] strbuf: accept a comment string for strbuf_commented_addf() Jeff King
2024-03-07  9:23             ` [PATCH 08/15] strbuf: accept a comment string for strbuf_add_commented_lines() Jeff King
2024-03-07  9:23             ` [PATCH 09/15] prefer comment_line_str to comment_line_char for printing Jeff King
2024-03-07  9:24             ` [PATCH 10/15] find multi-byte comment chars in NUL-terminated strings Jeff King
2024-03-07  9:26             ` [PATCH 11/15] find multi-byte comment chars in unterminated buffers Jeff King
2024-03-07 11:08               ` Jeff King
2024-03-07 19:41                 ` René Scharfe
2024-03-07 19:47                   ` René Scharfe
2024-03-07 19:42               ` René Scharfe
2024-03-08 10:17                 ` Phillip Wood
2024-03-08 15:58                   ` Junio C Hamano
2024-03-08 16:20                     ` Phillip Wood
2024-03-12  8:19                       ` Jeff King
2024-03-12 14:36                         ` phillip.wood123 [this message]
2024-03-13  6:23                           ` Jeff King
2024-03-12  8:05                 ` Jeff King
2024-03-14 19:37                   ` René Scharfe
2024-03-07  9:27             ` [PATCH 12/15] sequencer: handle multi-byte comment characters when writing todo list Jeff King
2024-03-08 10:20               ` Phillip Wood
2024-03-12  8:21                 ` Jeff King
2024-03-07  9:28             ` [PATCH 13/15] wt-status: drop custom comment-char stringification Jeff King
2024-03-07  9:30             ` [PATCH 14/15] environment: drop comment_line_char compatibility macro Jeff King
2024-03-07  9:34             ` [PATCH 15/15] config: allow multi-byte core.commentChar Jeff King
2024-03-08 11:07             ` [PATCH 0/15] " Phillip Wood
2024-03-12  9:10             ` [PATCH v2 0/16] " Jeff King
2024-03-12  9:17               ` [PATCH v2 01/16] config: forbid newline as core.commentChar Jeff King
2024-03-12  9:17               ` [PATCH v2 02/16] strbuf: simplify comment-handling in add_lines() helper Jeff King
2024-03-12  9:17               ` [PATCH v2 03/16] strbuf: avoid static variables in strbuf_add_commented_lines() Jeff King
2024-03-12  9:17               ` [PATCH v2 04/16] commit: refactor base-case of adjust_comment_line_char() Jeff King
2024-03-12  9:17               ` [PATCH v2 05/16] strbuf: avoid shadowing global comment_line_char name Jeff King
2024-03-12  9:17               ` [PATCH v2 06/16] environment: store comment_line_char as a string Jeff King
2024-03-12  9:17               ` [PATCH v2 07/16] strbuf: accept a comment string for strbuf_stripspace() Jeff King
2024-03-12  9:17               ` [PATCH v2 08/16] strbuf: accept a comment string for strbuf_commented_addf() Jeff King
2024-03-12  9:17               ` [PATCH v2 09/16] strbuf: accept a comment string for strbuf_add_commented_lines() Jeff King
2024-03-12  9:17               ` [PATCH v2 10/16] prefer comment_line_str to comment_line_char for printing Jeff King
2024-03-12  9:17               ` [PATCH v2 11/16] find multi-byte comment chars in NUL-terminated strings Jeff King
2024-03-12  9:17               ` [PATCH v2 12/16] find multi-byte comment chars in unterminated buffers Jeff King
2024-03-12  9:17               ` [PATCH v2 13/16] sequencer: handle multi-byte comment characters when writing todo list Jeff King
2024-03-12  9:17               ` [PATCH v2 14/16] wt-status: drop custom comment-char stringification Jeff King
2024-03-12  9:17               ` [PATCH v2 15/16] environment: drop comment_line_char compatibility macro Jeff King
2024-03-12  9:17               ` [PATCH v2 16/16] config: allow multi-byte core.commentChar Jeff King
2024-03-13 18:23                 ` Kristoffer Haugsbakk
2024-03-13 18:39                   ` Junio C Hamano
2024-03-15  5:59                   ` Jeff King
2024-03-15  7:16                     ` Kristoffer Haugsbakk
2024-03-15  8:10                       ` Jeff King
2024-03-15 13:30                         ` Kristoffer Haugsbakk
2024-03-15 15:40                         ` Junio C Hamano
2024-03-16  5:50                           ` Jeff King
2024-03-26 22:10                         ` Junio C Hamano
2024-03-26 22:12                           ` Kristoffer Haugsbakk
2024-03-27  7:46                           ` Jeff King
2024-03-27  8:19                             ` [PATCH 17/16] config: add core.commentString Jeff King
2024-03-27 12:45                               ` Chris Torek
2024-03-27 16:13                               ` Junio C Hamano
2024-03-28  9:47                                 ` Jeff King
2024-03-27 14:53                             ` [PATCH v2 16/16] config: allow multi-byte core.commentChar Junio C Hamano
2024-03-12 14:40               ` [PATCH v2 0/16] " phillip.wood123
2024-03-12 20:30                 ` Junio C Hamano
2024-03-05 16:58       ` Clarify the meaning of "character" in the documentation Kristoffer Haugsbakk
2024-03-05 17:20         ` Dragan Simic
2024-03-05 17:37           ` Kristoffer Haugsbakk
2024-03-05 21:19             ` Dragan Simic
2024-03-05 16:51     ` Kristoffer Haugsbakk
2024-03-05 17:37       ` Junio C Hamano
2024-03-05 17:49         ` Kristoffer Haugsbakk
2024-03-05 22:48   ` brian m. carlson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9b63eab9-7b6d-4024-9ec4-b58c5ca3b084@gmail.com \
    --to=phillip.wood123@gmail.com \
    --cc=code@khaugsbakk.name \
    --cc=dsimic@manjaro.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=manlio.perillo@gmail.com \
    --cc=peff@peff.net \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).