git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: ZheNing Hu <adlternative@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com>,
	Git List <git@vger.kernel.org>,
	Christian Couder <christian.couder@gmail.com>,
	Hariom Verma <hariom18599@gmail.com>,
	Karthik Nayak <karthik.188@gmail.com>
Subject: Re: [PATCH 0/3] [GSOC][RFC] ref-filter: add contents:raw atom
Date: Wed, 26 May 2021 14:45:58 +0800	[thread overview]
Message-ID: <CAOLTT8ReZffY5gznSDD=Fgbt7YTtA5aJWX+f8Q8npcj0OwcuFQ@mail.gmail.com> (raw)
In-Reply-To: <xmqq8s42cnyb.fsf@gitster.g>

>
> Another thing to keep in mind is that not all host languages may be
> capable of expressing a string with NUL in it.  Most notably, shell.
> The --shell quoting rule used by for-each-ref would produce an
> equivalent of the "script" produced like this:
>
>         $ tr Q '\000' >script <<\EOF
>         #!/bin/sh
>         varname='varQname'
>         echo "$varname"
>         EOF
>
> but I do not think it would say 'var' followed by a NUL followed by
> 'name'.  The NUL is likely lost when assigned to the variable.
>

Yes, in the following example you mentioned earlier, I have also
noticed the loss of '\0'.

> >     git for-each-ref --format='
> >               name=%(refname)
> >               var=%(placeholder)
> >                 mkdir -p "$(dirname "$name")"
> >               printf "%%s" "$var" >"$name"
> >     ' --shell | /bin/sh
> >

> So for some host languages, binaries may be useless with or without
> quoting.  But for ones that can use strings to hold arbitrary byte
> sequence, it should be OK to let for-each-ref to quote the byte
> sequence as a string literal for the language (so that the exact
> byte sequence will end up being in the variable after assignment).
>

I agree, and maybe some'\0' can be escaped appropriately to let host
languages recognize it....

> That reminds me of another thing.  The --python thing was written
> back when Python3 was still a distant dream and strings were the
> appropriate type for a random sequence of bytes (as opposed to
> unicode, which cannot have a random sequence of bytes).  Somebody
> needs to check if it needs any update to work with Python3.

$ printf '%b' "name='a\\\0b\\\0c'\nprint(name)" | python2.7 | od -c
0000000   a  \0   b  \0   c  \n
0000006

$ printf '%b' "name='a\\\0b\\\0c'\necho -e \"\$name\"" | sh | od -c
0000000   a  \0   b  \0   c  \n
0000006

In shell or python2/3, we can replace'\0' with "\\0".

In Tcl and perl, they are ok with '\0'.

$ printf '%b' "set name \"a\0b\0c\"\nputs \$name" | tclsh | od -c
0000000   '   a  \0   b  \0   c   '  \n
0000010

$ printf '%b' "\$name = 'a\0b\0c';\n print \"\$name\""  | perl | od -c
0000000   a  \0   b  \0   c
0000005

So I currently think that a possible strategy is to modify
`python_quote_buf_with_size()` and `sq_quote_buf_with_size()` from
'\0' to "\\0".

Thanks!

--
ZheNing Hu

  reply	other threads:[~2021-05-26  6:46 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-23  9:53 [PATCH 0/3] [GSOC][RFC] ref-filter: add contents:raw atom ZheNing Hu via GitGitGadget
2021-05-23  9:53 ` [PATCH 1/3] [GSOC] quote: add *.quote_buf_with_size functions ZheNing Hu via GitGitGadget
2021-05-23  9:53 ` [PATCH 2/3] [GSOC] ref-filter: support %(contents) for blob, tree ZheNing Hu via GitGitGadget
2021-05-25  5:03   ` Junio C Hamano
2021-05-25  5:47     ` Junio C Hamano
2021-05-25  9:28       ` ZheNing Hu
2021-05-25 17:11         ` Junio C Hamano
2021-05-26  7:48           ` ZheNing Hu
2021-05-23  9:53 ` [PATCH 3/3] [GSOC] ref-filter: add contents:raw atom ZheNing Hu via GitGitGadget
2021-05-24  1:09 ` [PATCH 0/3] [GSOC][RFC] " Junio C Hamano
2021-05-24  2:41   ` Felipe Contreras
2021-05-24  5:22     ` Bagas Sanjaya
2021-05-24 15:21       ` Junio C Hamano
2021-05-24 13:09   ` ZheNing Hu
2021-05-26  0:56   ` Junio C Hamano
2021-05-26  6:45     ` ZheNing Hu [this message]
2021-05-26  7:06       ` Junio C Hamano
2021-05-26  9:17         ` ZheNing Hu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOLTT8ReZffY5gznSDD=Fgbt7YTtA5aJWX+f8Q8npcj0OwcuFQ@mail.gmail.com' \
    --to=adlternative@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=hariom18599@gmail.com \
    --cc=karthik.188@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).