From: Duy Nguyen <pclouds@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Eric Sunshine <sunshine@sunshineco.com>,
Git List <git@vger.kernel.org>,
Konstantin Kharlamov <hi-angel@yandex.ru>,
Ramsay Jones <ramsay@ramsayjones.plus.com>
Subject: Re: [PATCH v3 1/1] worktree add: sanitize worktree names
Date: Mon, 4 Mar 2019 19:04:24 +0700
Message-ID: <20190304120424.GA7966@ash>
In-Reply-To: <CACsJy8D0o6-ihNcpmfhCfQPNo-t2i=NySp65Y8h2e3md2GvXVw@mail.gmail.com>
On Mon, Mar 04, 2019 at 06:19:15PM +0700, Duy Nguyen wrote:
> On Wed, Feb 27, 2019 at 11:05 PM Jeff King <peff@peff.net> wrote:
> >
> > On Wed, Feb 27, 2019 at 09:23:33AM -0500, Eric Sunshine wrote:
> >
> > > > If we just cared about saying "is this worktree name valid", I'd suggest
> > > > actually constructing a sample refname with the worktree name embedded
> > > > in it and feeding that to check_refname_format(). But because you want
> > > > to actually sanitize, I don't think there's an easy way to reuse it.
> > > >
> > > > So this approach is probably the best we can do, though I do still think
> > > > it's worth renaming that function (and/or putting a big warning comment
> > > > in front of it).
> > >
> > > The above arguments seem to suggest the introduction of a companion to
> > > check_refname_format() for sanitizing, perhaps named
> > > sanitize_refname_format(), in ref.[hc]. The potential difficulty with
> > > that is defining exactly what "sanitize" means. Will it be contextual?
> > > (That is, will git-worktree have different sanitization needs than
> > > some other facility?) If so, perhaps a 'flags' argument could control
> > > how sanitization is done.
> >
> > I agree that sanitize_refname_format() would be nice, but I'm pretty
> > sure it's going to end up having to duplicate many of the rules from
> > check_refname_format(). Which is ugly if the two ever get out of sync.
> >
> > But if we could write it in a way that keeps the actual policy logic in
> > one factored-out portion, I think it would be worth doing.
>
> I think we could make check_refname_format() return the bad position
> and several different error codes depending on context. Then
> sanitize_.. can just repeatedly call check_refname_format() and fix up
> whatever error it reports. Performance goes straight to hell but I
> don't think that's a big deal for git-worktree, and it keeps
> check_refname_format() simple (relatively speaking).
The new refs.c code would look something like this.
do_check_refname_component() does not look so bad.
-- 8< --
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 21469eb52c..ca63dd3df6 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -262,36 +262,6 @@ static void validate_worktree_add(const char *path, const struct add_opts *opts)
free_worktrees(worktrees);
}
-static void sanitize_worktree_name(struct strbuf *name)
-{
- struct strbuf sb = STRBUF_INIT;
- int i;
-
- for (i = 0; i < name->len; i++) {
- int ch = name->buf[i];
-
- if (char_allowed_in_refname(ch))
- strbuf_addch(&sb, ch);
- else if (sb.len > 0 && sb.buf[sb.len - 1] != '-')
- strbuf_addch(&sb, '-');
- }
- if (sb.len > 0 && sb.buf[sb.len - 1] == '-')
- strbuf_setlen(&sb, sb.len - 1);
- /*
- * a worktree name of only special chars would be reduced to
- * an empty string
- */
- if (sb.len == 0)
- strbuf_addstr(&sb, "worktree");
-
- if (check_refname_format(sb.buf, REFNAME_ALLOW_ONELEVEL))
- BUG("worktree name '%s' (from '%s') is not a valid refname",
- sb.buf, name->buf);
-
- strbuf_swap(&sb, name);
- strbuf_release(&sb);
-}
-
static int add_worktree(const char *path, const char *refname,
const struct add_opts *opts)
{
@@ -322,7 +292,7 @@ static int add_worktree(const char *path, const char *refname,
name = worktree_basename(path, &len);
strbuf_add(&sb_name, name, path + len - name);
- sanitize_worktree_name(&sb_name);
+ sanitize_worktree_refname(&sb_name);
name = sb_name.buf;
git_path_buf(&sb_repo, "worktrees/%s", name);
len = sb_repo.len;
diff --git a/refs.c b/refs.c
index f23f583db1..2d9730e792 100644
--- a/refs.c
+++ b/refs.c
@@ -63,6 +63,17 @@ int char_allowed_in_refname(int ch)
refname_disposition[ch] == 0;
}
+enum check_code {
+ refname_ok = 0,
+ refname_contains_dotdot,
+ refname_contains_atopen,
+ refname_has_badchar,
+ refname_contains_wildcard,
+ refname_starts_with_dot,
+ refname_ends_with_dotlock,
+ refname_component_has_zero_length
+};
+
/*
* Try to read one refname component from the front of refname.
* Return the length of the component found, or -1 if the component is
@@ -78,10 +89,11 @@ int char_allowed_in_refname(int ch)
* - it ends with ".lock", or
* - it contains a "@{" portion
*/
-static int check_refname_component(const char *refname, int *flags)
+static enum check_code do_check_refname_component(const char *refname, int *flags, const char **cp_out)
{
const char *cp;
char last = '\0';
+ enum check_code ret = refname_ok;
for (cp = refname; ; cp++) {
int ch = *cp & 255;
@@ -90,18 +102,26 @@ static int check_refname_component(const char *refname, int *flags)
case 1:
goto out;
case 2:
- if (last == '.')
- return -1; /* Refname contains "..". */
+ if (last == '.') {
+ ret = refname_contains_dotdot;
+ goto done;
+ }
break;
case 3:
- if (last == '@')
- return -1; /* Refname contains "@{". */
+ if (last == '@') {
+ ret = refname_contains_atopen; /* @{ */
+ goto done;
+ }
break;
case 4:
- return -1;
+ ret = refname_has_badchar;
+ goto done;
case 5:
- if (!(*flags & REFNAME_REFSPEC_PATTERN))
- return -1; /* refspec can't be a pattern */
+ if (!(*flags & REFNAME_REFSPEC_PATTERN)) {
+ /* refspec can't be a pattern */
+ ret = refname_contains_wildcard;
+ goto done;
+ }
/*
* Unset the pattern flag so that we only accept
@@ -113,16 +133,67 @@ static int check_refname_component(const char *refname, int *flags)
last = ch;
}
out:
- if (cp == refname)
- return 0; /* Component has zero length. */
- if (refname[0] == '.')
- return -1; /* Component starts with '.'. */
+ if (cp == refname) {
+ ret = refname_component_has_zero_length;
+ goto done;
+ }
+ if (refname[0] == '.') {
+ ret = refname_starts_with_dot;
+ cp = refname;
+ goto done;
+ }
if (cp - refname >= LOCK_SUFFIX_LEN &&
- !memcmp(cp - LOCK_SUFFIX_LEN, LOCK_SUFFIX, LOCK_SUFFIX_LEN))
- return -1; /* Refname ends with ".lock". */
+ !memcmp(cp - LOCK_SUFFIX_LEN, LOCK_SUFFIX, LOCK_SUFFIX_LEN)) {
+ cp -= LOCK_SUFFIX_LEN;
+ ret = refname_ends_with_dotlock;
+ }
+done:
+ *cp_out = cp;
+ return ret;
+}
+
+static int check_refname_component(const char *refname, int *flags)
+{
+ const char *cp;
+ enum check_code ret;
+
+ ret = do_check_refname_component(refname, flags, &cp);
+ if (ret)
+ return -1;
return cp - refname;
}
+void sanitize_worktree_refname(struct strbuf *name)
+{
+ int last_length = -1;
+ int flags = 0;
+
+ while (1) {
+ const char *cp;
+
+ enum check_code ret = do_check_refname_component(name->buf, &flags, &cp);
+ if (last_length != -1 && cp - name->buf == last_length)
+ BUG("stuck in infinite loop! pos = %d buf = %s",
+ last_length, name->buf);
+ last_length = cp - name->buf;
+ switch (ret) {
+ case refname_ok:
+ return;
+ case refname_contains_dotdot:
+ case refname_contains_atopen:
+ case refname_has_badchar:
+ case refname_contains_wildcard:
+ case refname_ends_with_dotlock:
+ case refname_starts_with_dot:
+ name->buf[last_length] = '-';
+ break;
+ case refname_component_has_zero_length:
+ strbuf_addstr(name, "worktree");
+ return;
+ }
+ }
+}
+
int check_refname_format(const char *refname, int flags)
{
int component_len, component_count = 0;
diff --git a/refs.h b/refs.h
index 61b4073f76..3b65b8d27a 100644
--- a/refs.h
+++ b/refs.h
@@ -459,7 +459,7 @@ int for_each_reflog(each_ref_fn fn, void *cb_data);
* repeated slashes are accepted.
*/
int check_refname_format(const char *refname, int flags);
-int char_allowed_in_refname(int ch);
+void sanitize_worktree_refname(struct strbuf *name);
const char *prettify_refname(const char *refname);
diff --git a/t/t2025-worktree-add.sh b/t/t2025-worktree-add.sh
index ea22207361..24c574f365 100755
--- a/t/t2025-worktree-add.sh
+++ b/t/t2025-worktree-add.sh
@@ -571,10 +571,10 @@ test_expect_success '"add" an existing locked but missing worktree' '
'
test_expect_success 'sanitize generated worktree name' '
- git worktree add --detach ". weird*..?.lock.lock." &&
- test -d .git/worktrees/weird-lock-lock &&
+ git worktree add --detach ". weird*..?.lock.lock" &&
+ test -d .git/worktrees/---weird-.--.lock-lock &&
git worktree add --detach .... &&
- test -d .git/worktrees/worktree
+ test -d .git/worktrees/--.-
'
test_done
-- 8< --
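
For anyone who wants to play with the "check, fix up, check again" idea
outside of refs.c, here is a minimal standalone sketch. It is not the git
code: toy_check() and toy_sanitize() are hypothetical names, the rule set
is a simplified stand-in for refname_disposition[], and a name is assumed
to be a single component in a plain char buffer rather than a strbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified error codes; the real rule table is larger. */
enum toy_check_code {
	TOY_OK = 0,
	TOY_DOTDOT,
	TOY_ATOPEN,
	TOY_BADCHAR,
	TOY_STARTS_WITH_DOT,
	TOY_ENDS_WITH_DOTLOCK,
	TOY_EMPTY
};

/*
 * Check a single name component. On failure, *bad_pos is the offset of
 * the byte that the caller should patch; it is meaningless for TOY_OK.
 */
static enum toy_check_code toy_check(const char *name, size_t *bad_pos)
{
	size_t len = strlen(name);
	size_t i;

	for (i = 0; i < len; i++) {
		unsigned char ch = (unsigned char)name[i];

		*bad_pos = i;
		if (ch == '.' && i > 0 && name[i - 1] == '.')
			return TOY_DOTDOT;
		if (ch == '{' && i > 0 && name[i - 1] == '@')
			return TOY_ATOPEN;
		/* assumed set of forbidden bytes, not git's exact table */
		if (ch <= ' ' || ch == 127 || strchr("~^:?*[\\/", ch))
			return TOY_BADCHAR;
	}
	*bad_pos = 0;
	if (len == 0)
		return TOY_EMPTY;
	if (name[0] == '.')
		return TOY_STARTS_WITH_DOT;
	if (len >= 5 && !strcmp(name + len - 5, ".lock")) {
		*bad_pos = len - 5;
		return TOY_ENDS_WITH_DOTLOCK;
	}
	return TOY_OK;
}

/* Repeatedly ask the checker what is wrong and overwrite that byte. */
static void toy_sanitize(char *name, size_t size)
{
	size_t last_pos = (size_t)-1;

	for (;;) {
		size_t pos;
		enum toy_check_code ret = toy_check(name, &pos);

		if (ret == TOY_OK)
			return;
		if (ret == TOY_EMPTY) {
			/* nothing to patch: fall back to a fixed name */
			if (size >= sizeof("worktree"))
				memcpy(name, "worktree", sizeof("worktree"));
			return;
		}
		/* same role as the BUG() in the patch: we must make progress */
		assert(pos != last_pos);
		last_pos = pos;
		name[pos] = '-';
	}
}
```

With these toy rules, calling toy_sanitize() on a buffer holding "...."
leaves "--.-" behind, and an empty buffer becomes "worktree". The policy
lives only in the checker; the sanitizer knows nothing beyond "replace the
reported byte with a dash", at the cost of rescanning the name once per fix.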
--
Duy