git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Thalia Archibald <thalia@archibald.dev>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Elijah Newren <newren@gmail.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	Jeff King <peff@peff.net>
Subject: Re: [PATCH v2 0/8] fast-import: tighten parsing of paths
Date: Sun, 07 Apr 2024 21:19:08 +0000	[thread overview]
Message-ID: <A820E635-14FA-403C-B932-D7F9FE57C092@archibald.dev> (raw)
In-Reply-To: <cover.1711960552.git.thalia@archibald.dev>

On Apr 1, 2024, at 02:02, Thalia Archibald wrote:
>> fast-import has subtle differences in how it parses file paths between each
>> occurrence of <path> in the grammar. Many errors are suppressed or not checked,
>> which could lead to silent data corruption. A particularly bad case is when a
>> front-end sent escapes that Git doesn't recognize (e.g., hex escapes are not
>> supported), it would be treated as literal bytes instead of a quoted string.
>> 
>> Bring path parsing into line with the documented behavior and improve
>> documentation to fill in missing details.
> 
> Thanks for the review, Patrick. I've made several changes, which I think address
> your feedback. What's the protocol for adding `Reviewed-by`? Since I don't know
> whether I, you, or Junio add it, I've refrained from attaching your name to
> these patches.

Hello! Friendly ping here. It’s been a week since the last emails for this patch
set, so I’d like to check in on the status.

As a new contributor to the project, I don’t yet have a full view of the
contribution flow, aside from what I’ve read. I suspect the latency is because I
may not have cc’d all the area experts. (When I sent v1, I used separate Cc
lines in send-email --compose, but it dropped all but the last. After Patrick
reviewed it, I figured I could leave the cc list as-is for v2, assuming I’d get
another review.) I’ve now cc’d everyone listed by contrib/contacts, as well as
the maintainer. For anyone not a part of the earlier discussion, the latest
version is at https://lore.kernel.org/git/cover.1711960552.git.thalia@archibald.dev/.
If that’s not the problem, I’d be keen to know what I could do better.

I have several more patch sets in the works, that I’ve held back on sending
until this one is finished, in case I’ve been doing something wrong. I want to
move forward. Thank you for your time.

Thalia

> Changes since v1:
> * In fast-import:
>  * Move `strbuf_release` outside of `parse_path_space` and `parse_path_eol`.
>  * Retain allocations for static `strbuf`s.
>  * Rename `allow_spaces` parameter of `parse_path` to `include_spaces`.
>  * Extract change to neighboring comments as patch 8.
> * In tests:
>  * Test `` for the root path additionally in all tests using `""`.
>  * Pass all arguments by positional variables.
>  * Use `local`.
>  * Use `test_when_finished` instead of manual cleanup.
> * In documentation:
>  * Better document conditions under which a path is considered quoted or
>    unquoted.
> * Reword commit messages.
> 
> Thalia
> 
> 
> Thalia Archibald (8):
>  fast-import: tighten path unquoting
>  fast-import: directly use strbufs for paths
>  fast-import: allow unquoted empty path for root
>  fast-import: remove dead strbuf
>  fast-import: improve documentation for unquoted paths
>  fast-import: document C-style escapes for paths
>  fast-import: forbid escaped NUL in paths
>  fast-import: make comments more precise
> 
> Documentation/git-fast-import.txt |  30 +-
> builtin/fast-import.c             | 156 ++++----
> t/t9300-fast-import.sh            | 617 +++++++++++++++++++++---------
> 3 files changed, 541 insertions(+), 262 deletions(-)
> 
> Range-diff against v1:
> 1:  8d9e0b25cb ! 1:  e790bdf714 fast-import: tighten parsing of paths
>    @@ Metadata
>     Author: Thalia Archibald <thalia@archibald.dev>
> 
>      ## Commit message ##
>    -    fast-import: tighten parsing of paths
>    +    fast-import: tighten path unquoting
> 
>         Path parsing in fast-import is inconsistent and many unquoting errors
>    -    are suppressed.
>    +    are suppressed or not checked.
> 
>    -    `<path>` appears in the grammar in these places:
>    +    <path> appears in the grammar in these places:
> 
>             filemodify ::= 'M' SP <mode> (<dataref> | 'inline') SP <path> LF
>             filedelete ::= 'D' SP <path> LF
>    @@ Commit message
>         and fast-import.c parses them in five different ways:
> 
>         1. For filemodify and filedelete:
>    -       If `<path>` is a valid quoted string, unquote it;
>    -       otherwise, treat it as literal bytes (including any number of SP).
>    +       Try to unquote <path>. If it unquotes without errors, use the
>    +       unquoted version; otherwise, treat it as literal bytes to the end of
>    +       the line (including any number of SP).
>         2. For filecopy (source) and filerename (source):
>    -       If `<path>` is a valid quoted string, unquote it;
>    -       otherwise, treat it as literal bytes until the next SP.
>    +       Try to unquote <path>. If it unquotes without errors, use the
>    +       unquoted version; otherwise, treat it as literal bytes up to, but not
>    +       including, the next SP.
>         3. For filecopy (dest) and filerename (dest):
>    -       Like 1., but an unquoted empty string is an error.
>    +       Like 1., but an unquoted empty string is forbidden.
>         4. For ls:
>    -       If `<path>` starts with `"`, unquote it and report parse errors;
>    -       otherwise, treat it as literal bytes (including any number of SP).
>    +       If <path> starts with `"`, unquote it and report parse errors;
>    +       otherwise, treat it as literal bytes to the end of the line
>    +       (including any number of SP).
>         5. For ls-commit:
>    -       Unquote `<path>` and report parse errors.
>    +       Unquote <path> and report parse errors.
>            (It must start with `"` to disambiguate from ls.)
> 
>         In the first three, any errors from trying to unquote a string are
>         suppressed, so a quoted string that contains invalid escapes would be
>         interpreted as literal bytes. For example, `"\xff"` would fail to
>         unquote (because hex escapes are not supported), and it would instead be
>    -    interpreted as the byte sequence `"` `\` `x` `f` `f` `"`, which is
>    +    interpreted as the byte sequence '"', '\\', 'x', 'f', 'f', '"', which is
>         certainly not intended. Some front-ends erroneously use their language's
>    -    standard quoting routine and could silently introduce escapes that would
>    -    be incorrectly parsed due to this.
>    +    standard quoting routine instead of matching Git's, which could silently
>    +    introduce escapes that would be incorrectly parsed due to this and lead
>    +    to data corruption.
> 
>    -    The documentation states that “To use a source path that contains SP the
>    -    path must be quoted.”, so it is expected that some implementations
>    -    depend on spaces being allowed in paths in the final position. Thus we
>    -    have two documented ways to parse paths, so simplify the implementation
>    -    to that.
>    +    The documentation states “To use a source path that contains SP the path
>    +    must be quoted.”, so it is expected that some implementations depend on
>    +    spaces being allowed in paths in the final position. Thus we have two
>    +    documented ways to parse paths, so simplify the implementation to that.
> 
>         Now we have:
> 
>         1. `parse_path_eol` for filemodify, filedelete, filecopy (dest),
>            filerename (dest), ls, and ls-commit:
> 
>    -       If `<path>` starts with `"`, unquote it and report parse errors;
>    -       otherwise, treat it as literal bytes (including any number of SP).
>    -       Garbage after a quoted string or an unquoted empty string are errors.
>    -       (In ls-commit, it must be quoted to disambiguate from ls.)
>    +       If <path> starts with `"`, unquote it and report parse errors;
>    +       otherwise, treat it as literal bytes to the end of the line
>    +       (including any number of SP).
> 
>         2. `parse_path_space` for filecopy (source) and filerename (source):
> 
>    -       If `<path>` starts with `"`, unquote it and report parse errors;
>    -       otherwise, treat it as literal bytes until the next SP.
>    -       It must be followed by a SP. An unquoted empty string is an error.
>    +       If <path> starts with `"`, unquote it and report parse errors;
>    +       otherwise, treat it as literal bytes up to, but not including, the
>    +       next SP. It must be followed by SP.
>    +
>    +    There remain two special cases: The dest <path> in filecopy and rename
>    +    cannot be an unquoted empty string (this will be addressed subsequently)
>    +    and <path> in ls-commit must be quoted to disambiguate it from ls.
> 
>         Signed-off-by: Thalia Archibald <thalia@archibald.dev>
> 
>    - ## Documentation/git-fast-import.txt ##
>    -@@ Documentation/git-fast-import.txt: The value of `<path>` must be in canonical form. That is it must not:
>    - * contain the special component `.` or `..` (e.g. `foo/./bar` and
>    -   `foo/../bar` are invalid).
>    -
>    --The root of the tree can be represented by an empty string as `<path>`.
>    -+The root of the tree can be represented by a quoted empty string (`""`)
>    -+as `<path>`.
>    -
>    - It is recommended that `<path>` always be encoded using UTF-8.
>    -
>    -
>      ## builtin/fast-import.c ##
>    -@@ builtin/fast-import.c: static int parse_mapped_oid_hex(const char *hex, struct object_id *oid, const ch
>    -  *
>    -  *   idnum ::= ':' bigint;
>    -  *
>    -- * Return the first character after the value in *endptr.
>    -+ * Update *endptr to point to the first character after the value.
>    -  *
>    -  * Complain if the following character is not what is expected,
>    -  * either a space or end of the string.
>    -@@ builtin/fast-import.c: static uintmax_t parse_mark_ref_eol(const char *p)
>    - }
>    -
>    - /*
>    -- * Parse the mark reference, demanding a trailing space.  Return a
>    -- * pointer to the space.
>    -+ * Parse the mark reference, demanding a trailing space. Update *p to
>    -+ * point to the first character after the space.
>    -  */
>    - static uintmax_t parse_mark_ref_space(const char **p)
>    - {
>     @@ builtin/fast-import.c: static uintmax_t parse_mark_ref_space(const char **p)
>       return mark;
>      }
>    @@ builtin/fast-import.c: static uintmax_t parse_mark_ref_space(const char **p)
>     +/*
>     + * Parse the path string into the strbuf. It may be quoted with escape sequences
>     + * or unquoted without escape sequences. When unquoted, it may only contain a
>    -+ * space if `allow_spaces` is nonzero.
>    ++ * space if `include_spaces` is nonzero.
>     + */
>    -+static void parse_path(struct strbuf *sb, const char *p, const char **endp, int allow_spaces, const char *field)
>    ++static void parse_path(struct strbuf *sb, const char *p, const char **endp, int include_spaces, const char *field)
>     +{
>    -+ strbuf_reset(sb);
>     + if (*p == '"') {
>     + if (unquote_c_style(sb, p, endp))
>     + die("Invalid %s: %s", field, command_buf.buf);
>     + } else {
>    -+ if (allow_spaces)
>    ++ if (include_spaces)
>     + *endp = p + strlen(p);
>     + else
>     + *endp = strchr(p, ' ');
>    -+ if (*endp == p)
>    -+ die("Missing %s: %s", field, command_buf.buf);
>     + strbuf_add(sb, p, *endp - p);
>     + }
>     +}
>    @@ builtin/fast-import.c: static uintmax_t parse_mark_ref_space(const char **p)
>       struct object_id oid;
>       uint16_t mode, inline_data = 0;
>     @@ builtin/fast-import.c: static void file_change_m(const char *p, struct branch *b)
>    - die("Missing space after SHA1: %s", command_buf.buf);
>       }
> 
>    -- strbuf_reset(&uq);
>    + strbuf_reset(&uq);
>     - if (!unquote_c_style(&uq, p, &endp)) {
>     - if (*endp)
>     - die("Garbage after path in: %s", command_buf.buf);
>    @@ builtin/fast-import.c: static void file_change_m(const char *p, struct branch *b
>       static struct strbuf uq = STRBUF_INIT;
>     - const char *endp;
> 
>    -- strbuf_reset(&uq);
>    + strbuf_reset(&uq);
>     - if (!unquote_c_style(&uq, p, &endp)) {
>     - if (*endp)
>     - die("Garbage after path in: %s", command_buf.buf);
>    @@ builtin/fast-import.c: static void file_change_m(const char *p, struct branch *b
>     - const char *endp;
>       struct tree_entry leaf;
> 
>    -- strbuf_reset(&s_uq);
>    + strbuf_reset(&s_uq);
>     - if (!unquote_c_style(&s_uq, s, &endp)) {
>     - if (*endp != ' ')
>     - die("Missing space after source: %s", command_buf.buf);
>    @@ builtin/fast-import.c: static void file_change_m(const char *p, struct branch *b
>     - strbuf_add(&s_uq, s, endp - s);
>     - }
>     + parse_path_space(&s_uq, p, &p, "source");
>    -+ parse_path_eol(&d_uq, p, "dest");
>       s = s_uq.buf;
>    --
>    +
>     - endp++;
>     - if (!*endp)
>    -- die("Missing dest: %s", command_buf.buf);
>    ++ if (!p)
>    + die("Missing dest: %s", command_buf.buf);
>     -
>     - d = endp;
>    -- strbuf_reset(&d_uq);
>    + strbuf_reset(&d_uq);
>     - if (!unquote_c_style(&d_uq, d, &endp)) {
>     - if (*endp)
>     - die("Garbage after dest in: %s", command_buf.buf);
>     - d = d_uq.buf;
>     - }
>    ++ parse_path_eol(&d_uq, p, "dest");
>     + d = d_uq.buf;
> 
>       memset(&leaf, 0, sizeof(leaf));
>       if (rename)
>    -@@ builtin/fast-import.c: static void parse_ls(const char *p, struct branch *b)
>    +@@ builtin/fast-import.c: static void print_ls(int mode, const unsigned char *hash, const char *path)
>    +
>    + static void parse_ls(const char *p, struct branch *b)
>      {
>    ++ static struct strbuf uq = STRBUF_INIT;
>       struct tree_entry *root = NULL;
>       struct tree_entry leaf = {NULL};
>    -+ static struct strbuf uq = STRBUF_INIT;
> 
>    - /* ls SP (<tree-ish> SP)? <path> */
>    - if (*p == '"') {
>     @@ builtin/fast-import.c: static void parse_ls(const char *p, struct branch *b)
>       root->versions[1].mode = S_IFDIR;
>       load_tree(root);
>    @@ builtin/fast-import.c: static void parse_ls(const char *p, struct branch *b)
>     - die("Garbage after path in: %s", command_buf.buf);
>     - p = uq.buf;
>     - }
>    ++ strbuf_reset(&uq);
>     + parse_path_eol(&uq, p, "path");
>     + p = uq.buf;
>       tree_content_get(root, p, &leaf, 1);
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     +# Path parsing
>     +#
>     +# There are two sorts of ways a path can be parsed, depending on whether it is
>    -+# the last field on the line. Additionally, ls without a <tree-ish> has a
>    -+# special case. Test every occurrence of <path> in the grammar against every
>    -+# error case.
>    ++# the last field on the line. Additionally, ls without a <dataref> has a special
>    ++# case. Test every occurrence of <path> in the grammar against every error case.
>     +#
>     +
>     +#
>     +# Valid paths at the end of a line: filemodify, filedelete, filecopy (dest),
>     +# filerename (dest), and ls.
>     +#
>    -+# commit :301 from root -- modify hello.c
>    ++# commit :301 from root -- modify hello.c (for setup)
>     +# commit :302 from :301 -- modify $path
>     +# commit :303 from :302 -- delete $path
>     +# commit :304 from :301 -- copy hello.c $path
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     +# ls :305 $path
>     +#
>     +test_path_eol_success () {
>    -+ test="$1" path="$2" unquoted_path="$3"
>    ++ local test="$1" path="$2" unquoted_path="$3"
>     + test_expect_success "S: paths at EOL with $test must work" '
>    ++ test_when_finished "git branch -D S-path-eol" &&
>    ++
>     + git fast-import --export-marks=marks.out <<-EOF >out 2>err &&
>     + blob
>     + mark :401
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     + hallo welt
>     + BLOB
>     +
>    -+ commit refs/heads/path-eol
>    ++ commit refs/heads/S-path-eol
>     + mark :301
>     + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
>     + data <<COMMIT
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     + COMMIT
>     + M 100644 :401 hello.c
>     +
>    -+ commit refs/heads/path-eol
>    ++ commit refs/heads/S-path-eol
>     + mark :302
>     + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
>     + data <<COMMIT
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     + from :301
>     + M 100644 :402 '"$path"'
>     +
>    -+ commit refs/heads/path-eol
>    ++ commit refs/heads/S-path-eol
>     + mark :303
>     + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
>     + data <<COMMIT
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     + from :302
>     + D '"$path"'
>     +
>    -+ commit refs/heads/path-eol
>    ++ commit refs/heads/S-path-eol
>     + mark :304
>     + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
>     + data <<COMMIT
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     + from :301
>     + C hello.c '"$path"'
>     +
>    -+ commit refs/heads/path-eol
>    ++ commit refs/heads/S-path-eol
>     + mark :305
>     + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
>     + data <<COMMIT
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     + git ls-tree $commit_r >tree_r.out &&
>     + test_cmp tree_r.exp tree_r.out &&
>     +
>    -+ test_cmp out tree_r.exp &&
>    -+
>    -+ git branch -D path-eol
>    ++ test_cmp out tree_r.exp
>     + '
>     +}
>     +
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     +#
>     +# Valid paths before a space: filecopy (source) and filerename (source).
>     +#
>    -+# commit :301 from root -- modify $path
>    ++# commit :301 from root -- modify $path (for setup)
>     +# commit :302 from :301 -- copy $path hello2.c
>     +# commit :303 from :301 -- rename $path hello2.c
>     +#
>     +test_path_space_success () {
>    -+ test="$1" path="$2" unquoted_path="$3"
>    ++ local test="$1" path="$2" unquoted_path="$3"
>     + test_expect_success "S: paths before space with $test must work" '
>    ++ test_when_finished "git branch -D S-path-space" &&
>    ++
>     + git fast-import --export-marks=marks.out <<-EOF 2>err &&
>     + blob
>     + mark :401
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     + hello world
>     + BLOB
>     +
>    -+ commit refs/heads/path-space
>    ++ commit refs/heads/S-path-space
>     + mark :301
>     + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
>     + data <<COMMIT
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     + COMMIT
>     + M 100644 :401 '"$path"'
>     +
>    -+ commit refs/heads/path-space
>    ++ commit refs/heads/S-path-space
>     + mark :302
>     + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
>     + data <<COMMIT
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     + from :301
>     + C '"$path"' hello2.c
>     +
>    -+ commit refs/heads/path-space
>    ++ commit refs/heads/S-path-space
>     + mark :303
>     + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
>     + data <<COMMIT
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     +
>     + printf "100644 blob $blob\thello2.c\n" >tree_r.exp &&
>     + git ls-tree $commit_r >tree_r.out &&
>    -+ test_cmp tree_r.exp tree_r.out &&
>    -+
>    -+ git branch -D path-space
>    ++ test_cmp tree_r.exp tree_r.out
>     + '
>     +}
>     +
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     +# of <path> in the grammar against all error kinds.
>     +#
>     +test_path_fail () {
>    -+ what="$1" path="$2" err_grep="$3"
>    ++ local change="$1" what="$2" prefix="$3" path="$4" suffix="$5" err_grep="$6"
>     + test_expect_success "S: $change with $what must fail" '
>     + test_must_fail git fast-import <<-EOF 2>err &&
>     + blob
>    @@ t/t9300-fast-import.sh: test_expect_success 'S: ls with garbage after sha1 must
>     +}
>     +
>     +test_path_base_fail () {
>    -+ test_path_fail 'unclosed " in '"$field"          '"hello.c'    "Invalid $field"
>    -+ test_path_fail "invalid escape in quoted $field" '"hello\xff"' "Invalid $field"
>    ++ local change="$1" prefix="$2" field="$3" suffix="$4"
>    ++ test_path_fail "$change" 'unclosed " in '"$field"          "$prefix" '"hello.c'    "$suffix" "Invalid $field"
>    ++ test_path_fail "$change" "invalid escape in quoted $field" "$prefix" '"hello\xff"' "$suffix" "Invalid $field"
>     +}
>     +test_path_eol_quoted_fail () {
>    -+ test_path_base_fail
>    -+ test_path_fail "garbage after quoted $field" '"hello.c"x' "Garbage after $field"
>    -+ test_path_fail "space after quoted $field"   '"hello.c" ' "Garbage after $field"
>    ++ local change="$1" prefix="$2" field="$3" suffix="$4"
>    ++ test_path_base_fail "$change" "$prefix" "$field" "$suffix"
>    ++ test_path_fail "$change" "garbage after quoted $field" "$prefix" '"hello.c"x' "$suffix" "Garbage after $field"
>    ++ test_path_fail "$change" "space after quoted $field"   "$prefix" '"hello.c" ' "$suffix" "Garbage after $field"
>     +}
>     +test_path_eol_fail () {
>    -+ test_path_eol_quoted_fail
>    -+ test_path_fail 'empty unquoted path' '' "Missing $field"
>    ++ local change="$1" prefix="$2" field="$3" suffix="$4"
>    ++ test_path_eol_quoted_fail "$change" "$prefix" "$field" "$suffix"
>     +}
>     +test_path_space_fail () {
>    -+ test_path_base_fail
>    -+ test_path_fail 'empty unquoted path' '' "Missing $field"
>    -+ test_path_fail "missing space after quoted $field" '"hello.c"x' "Missing space after $field"
>    ++ local change="$1" prefix="$2" field="$3" suffix="$4"
>    ++ test_path_base_fail "$change" "$prefix" "$field" "$suffix"
>    ++ test_path_fail "$change" "missing space after quoted $field" "$prefix" '"hello.c"x' "$suffix" "Missing space after $field"
>     +}
>     +
>    -+change=filemodify       prefix='M 100644 :1 ' field=path   suffix=''         test_path_eol_fail
>    -+change=filedelete       prefix='D '           field=path   suffix=''         test_path_eol_fail
>    -+change=filecopy         prefix='C '           field=source suffix=' world.c' test_path_space_fail
>    -+change=filecopy         prefix='C hello.c '   field=dest   suffix=''         test_path_eol_fail
>    -+change=filerename       prefix='R '           field=source suffix=' world.c' test_path_space_fail
>    -+change=filerename       prefix='R hello.c '   field=dest   suffix=''         test_path_eol_fail
>    -+change='ls (in commit)' prefix='ls :2 '       field=path   suffix=''         test_path_eol_fail
>    ++test_path_eol_fail   filemodify       'M 100644 :1 ' path   ''
>    ++test_path_eol_fail   filedelete       'D '           path   ''
>    ++test_path_space_fail filecopy         'C '           source ' world.c'
>    ++test_path_eol_fail   filecopy         'C hello.c '   dest   ''
>    ++test_path_space_fail filerename       'R '           source ' world.c'
>    ++test_path_eol_fail   filerename       'R hello.c '   dest   ''
>    ++test_path_eol_fail   'ls (in commit)' 'ls :2 '       path   ''
>     +
>    -+# When 'ls' has no <tree-ish>, the <path> must be quoted.
>    -+change='ls (without tree-ish in commit)' prefix='ls ' field=path suffix='' \
>    -+test_path_eol_quoted_fail &&
>    -+test_path_fail 'empty unquoted path' '' "Invalid dataref"
>    ++# When 'ls' has no <dataref>, the <path> must be quoted.
>    ++test_path_eol_quoted_fail 'ls (without dataref in commit)' 'ls ' path ''
>     +
>      ###
>      ### series T (ls)
> 2:  a2aca9f9e6 ! 2:  82a6f53c13 fast-import: directly use strbufs for paths
>    @@ builtin/fast-import.c: static void file_change_m(const char *p, struct branch *b
>       die("Missing space after SHA1: %s", command_buf.buf);
>       }
> 
>    +- strbuf_reset(&uq);
>     - parse_path_eol(&uq, p, "path");
>     - p = uq.buf;
>    ++ strbuf_reset(&path);
>     + parse_path_eol(&path, p, "path");
> 
>       /* Git does not track empty, non-toplevel directories. */
>    @@ builtin/fast-import.c: static void file_change_m(const char *p, struct branch *b
>     - static struct strbuf uq = STRBUF_INIT;
>     + static struct strbuf path = STRBUF_INIT;
> 
>    +- strbuf_reset(&uq);
>     - parse_path_eol(&uq, p, "path");
>     - p = uq.buf;
>     - tree_content_remove(&b->branch_tree, p, NULL, 1);
>    ++ strbuf_reset(&path);
>     + parse_path_eol(&path, p, "path");
>     + tree_content_remove(&b->branch_tree, path.buf, NULL, 1);
>      }
>    @@ builtin/fast-import.c: static void file_change_m(const char *p, struct branch *b
>     + static struct strbuf dest = STRBUF_INIT;
>       struct tree_entry leaf;
> 
>    +- strbuf_reset(&s_uq);
>     - parse_path_space(&s_uq, p, &p, "source");
>    -- parse_path_eol(&d_uq, p, "dest");
>     - s = s_uq.buf;
>    -- d = d_uq.buf;
>    ++ strbuf_reset(&source);
>     + parse_path_space(&source, p, &p, "source");
>    +
>    + if (!p)
>    + die("Missing dest: %s", command_buf.buf);
>    +- strbuf_reset(&d_uq);
>    +- parse_path_eol(&d_uq, p, "dest");
>    +- d = d_uq.buf;
>    ++ strbuf_reset(&dest);
>     + parse_path_eol(&dest, p, "dest");
> 
>       memset(&leaf, 0, sizeof(leaf));
>    @@ builtin/fast-import.c: static void file_change_m(const char *p, struct branch *b
>       &leaf.versions[1].oid,
>       leaf.versions[1].mode,
>       leaf.tree);
>    -@@ builtin/fast-import.c: static void parse_ls(const char *p, struct branch *b)
>    +@@ builtin/fast-import.c: static void print_ls(int mode, const unsigned char *hash, const char *path)
>    +
>    + static void parse_ls(const char *p, struct branch *b)
>      {
>    - struct tree_entry *root = NULL;
>    - struct tree_entry leaf = {NULL};
>     - static struct strbuf uq = STRBUF_INIT;
>     + static struct strbuf path = STRBUF_INIT;
>    + struct tree_entry *root = NULL;
>    + struct tree_entry leaf = {NULL};
> 
>    - /* ls SP (<tree-ish> SP)? <path> */
>    - if (*p == '"') {
>     @@ builtin/fast-import.c: static void parse_ls(const char *p, struct branch *b)
>       root->versions[1].mode = S_IFDIR;
>       load_tree(root);
>       }
>    +- strbuf_reset(&uq);
>     - parse_path_eol(&uq, p, "path");
>     - p = uq.buf;
>     - tree_content_get(root, p, &leaf, 1);
>    ++ strbuf_reset(&path);
>     + parse_path_eol(&path, p, "path");
>     + tree_content_get(root, path.buf, &leaf, 1);
>       /*
> 3:  ecaf4883d1 < -:  ---------- fast-import: release unfreed strbufs
> -:  ---------- > 3:  893bbf5e73 fast-import: allow unquoted empty path for root
> 4:  058a38416a ! 4:  cb05a184e6 fast-import: remove dead strbuf
>    @@ Metadata
>      ## Commit message ##
>         fast-import: remove dead strbuf
> 
>    -    The strbuf in `note_change_n` has been unused since the function was
>    +    The strbuf in `note_change_n` is to copy the remainder of `p` before
>    +    potentially invalidating it when reading the next line. However, `p` is
>    +    not used after that point. It has been unused since the function was
>         created in a8dd2e7d2b (fast-import: Add support for importing commit
>         notes, 2009-10-09) and looks to be a fossil from adapting
>    -    `note_change_m`. Remove it.
>    +    `file_change_m`. Remove it.
> 
>         Signed-off-by: Thalia Archibald <thalia@archibald.dev>
> 
> 5:  a5e8df0759 < -:  ---------- fast-import: document C-style escapes for paths
> 6:  9792940ba9 < -:  ---------- fast-import: forbid escaped NUL in paths
> -:  ---------- > 5:  1f34b632d7 fast-import: improve documentation for unquoted paths
> -:  ---------- > 6:  82a4da68af fast-import: document C-style escapes for paths
> -:  ---------- > 7:  c087c6a860 fast-import: forbid escaped NUL in paths
> -:  ---------- > 8:  a503c55b83 fast-import: make comments more precise
> --
> 2.44.0




  parent reply	other threads:[~2024-04-07 21:19 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-22  0:03 [PATCH 0/6] fast-import: tighten parsing of paths Thalia Archibald
2024-03-22  0:03 ` [PATCH 1/6] " Thalia Archibald
2024-03-22  0:11   ` Thalia Archibald
2024-03-28  8:21   ` Patrick Steinhardt
     [not found]     ` <E01C617F-3720-42C0-83EE-04BB01643C86@archibald.dev>
2024-04-01  9:05       ` Thalia Archibald
2024-03-22  0:03 ` [PATCH 2/6] fast-import: directly use strbufs for paths Thalia Archibald
2024-03-28  8:21   ` Patrick Steinhardt
2024-03-22  0:03 ` [PATCH 3/6] fast-import: release unfreed strbufs Thalia Archibald
2024-03-28  8:21   ` Patrick Steinhardt
2024-04-01  9:06     ` Thalia Archibald
2024-03-22  0:03 ` [PATCH 4/6] fast-import: remove dead strbuf Thalia Archibald
2024-03-28  8:21   ` Patrick Steinhardt
2024-03-22  0:03 ` [PATCH 5/6] fast-import: document C-style escapes for paths Thalia Archibald
2024-03-28  8:21   ` Patrick Steinhardt
2024-04-01  9:06     ` Thalia Archibald
2024-03-22  0:03 ` [PATCH 6/6] fast-import: forbid escaped NUL in paths Thalia Archibald
2024-04-01  9:02 ` [PATCH v2 0/8] fast-import: tighten parsing of paths Thalia Archibald
2024-04-01  9:02   ` [PATCH v2 1/8] fast-import: tighten path unquoting Thalia Archibald
2024-04-10  6:27     ` Patrick Steinhardt
2024-04-10  8:18       ` Chris Torek
2024-04-10  8:44         ` Thalia Archibald
2024-04-10  8:51           ` Chris Torek
2024-04-10  9:14             ` Thalia Archibald
2024-04-10  9:42               ` Patrick Steinhardt
2024-04-10  9:16             ` Thalia Archibald
2024-04-10  9:12       ` Thalia Archibald
2024-04-01  9:03   ` [PATCH v2 2/8] fast-import: directly use strbufs for paths Thalia Archibald
2024-04-10  6:27     ` Patrick Steinhardt
2024-04-10 10:07       ` Thalia Archibald
2024-04-10 10:18         ` Patrick Steinhardt
2024-04-01  9:03   ` [PATCH v2 3/8] fast-import: allow unquoted empty path for root Thalia Archibald
2024-04-10  6:27     ` Patrick Steinhardt
2024-04-01  9:03   ` [PATCH v2 4/8] fast-import: remove dead strbuf Thalia Archibald
2024-04-01  9:03   ` [PATCH v2 5/8] fast-import: improve documentation for unquoted paths Thalia Archibald
2024-04-01  9:03   ` [PATCH v2 6/8] fast-import: document C-style escapes for paths Thalia Archibald
2024-04-01  9:03   ` [PATCH v2 7/8] fast-import: forbid escaped NUL in paths Thalia Archibald
2024-04-01  9:03   ` [PATCH v2 8/8] fast-import: make comments more precise Thalia Archibald
2024-04-07 21:19   ` Thalia Archibald [this message]
2024-04-07 23:46     ` [PATCH v2 0/8] fast-import: tighten parsing of paths Eric Sunshine
2024-04-08  6:25       ` Patrick Steinhardt
2024-04-08  7:15         ` Thalia Archibald
2024-04-08  9:07           ` Patrick Steinhardt
2024-04-08 14:52         ` Junio C Hamano
2024-04-10  9:54   ` [PATCH v3 " Thalia Archibald
2024-04-10  9:55     ` [PATCH v3 1/8] fast-import: tighten path unquoting Thalia Archibald
2024-04-10  9:55     ` [PATCH v3 2/8] fast-import: directly use strbufs for paths Thalia Archibald
2024-04-10  9:55     ` [PATCH v3 3/8] fast-import: allow unquoted empty path for root Thalia Archibald
2024-04-11 19:59       ` Junio C Hamano
2024-04-10  9:55     ` [PATCH v3 4/8] fast-import: remove dead strbuf Thalia Archibald
2024-04-11 19:53       ` Junio C Hamano
2024-04-10  9:55     ` [PATCH v3 5/8] fast-import: improve documentation for unquoted paths Thalia Archibald
2024-04-11 19:51       ` Junio C Hamano
2024-04-10  9:56     ` [PATCH v3 6/8] fast-import: document C-style escapes for paths Thalia Archibald
2024-04-10 18:28       ` Junio C Hamano
2024-04-10 22:50         ` Thalia Archibald
2024-04-11  5:32           ` Junio C Hamano
2024-04-11  9:14             ` Patrick Steinhardt
2024-04-10  9:56     ` [PATCH v3 7/8] fast-import: forbid escaped NUL in paths Thalia Archibald
2024-04-10 18:51       ` Junio C Hamano
2024-04-10  9:56     ` [PATCH v3 8/8] fast-import: make comments more precise Thalia Archibald
2024-04-10 19:21       ` Junio C Hamano
2024-04-12  8:01     ` [PATCH v4 0/8] fast-import: tighten parsing of paths Thalia Archibald
2024-04-12  8:02       ` [PATCH v4 1/8] fast-import: tighten path unquoting Thalia Archibald
2024-04-12 16:34         ` Junio C Hamano
2024-04-13  0:07           ` Thalia Archibald
2024-04-13 18:33             ` Junio C Hamano
2024-04-12  8:03       ` [PATCH v4 2/8] fast-import: directly use strbufs for paths Thalia Archibald
2024-04-12  8:03       ` [PATCH v4 3/8] fast-import: allow unquoted empty path for root Thalia Archibald
2024-04-12  8:03       ` [PATCH v4 4/8] fast-import: remove dead strbuf Thalia Archibald
2024-04-12  8:03       ` [PATCH v4 5/8] fast-import: improve documentation for path quoting Thalia Archibald
2024-04-12  8:03       ` [PATCH v4 6/8] fast-import: document C-style escapes for paths Thalia Archibald
2024-04-12  8:03       ` [PATCH v4 7/8] fast-import: forbid escaped NUL in paths Thalia Archibald
2024-04-12  8:03       ` [PATCH v4 8/8] fast-import: make comments more precise Thalia Archibald
2024-04-14  1:11       ` [PATCH v5 0/8] fast-import: tighten parsing of paths Thalia Archibald
2024-04-14  1:11         ` [PATCH v5 1/8] fast-import: tighten path unquoting Thalia Archibald
2024-04-14  1:11         ` [PATCH v5 2/8] fast-import: directly use strbufs for paths Thalia Archibald
2024-04-14  1:11         ` [PATCH v5 3/8] fast-import: allow unquoted empty path for root Thalia Archibald
2024-04-14  1:11         ` [PATCH v5 4/8] fast-import: remove dead strbuf Thalia Archibald
2024-04-14  1:12         ` [PATCH v5 5/8] fast-import: improve documentation for path quoting Thalia Archibald
2024-04-14  1:12         ` [PATCH v5 6/8] fast-import: document C-style escapes for paths Thalia Archibald
2024-04-14  1:12         ` [PATCH v5 7/8] fast-import: forbid escaped NUL in paths Thalia Archibald
2024-04-14  1:12         ` [PATCH v5 8/8] fast-import: make comments more precise Thalia Archibald
2024-04-15  7:06         ` [PATCH v5 0/8] fast-import: tighten parsing of paths Patrick Steinhardt
2024-04-15 17:07           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=A820E635-14FA-403C-B932-D7F9FE57C092@archibald.dev \
    --to=thalia@archibald.dev \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).