git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] branch: Forbid to create local branches with confusing names
@ 2019-11-06 16:56 Ingo Rohloff
  2019-11-06 22:15 ` Johannes Schindelin
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Ingo Rohloff @ 2019-11-06 16:56 UTC (permalink / raw)
  To: git; +Cc: Ingo Rohloff

Without this patch, git allows to do something like this:
  git branch remotes/origin/master
  git branch refs/remotes/origin/master
  git branch heads/master
  git branch tags/v3.4
All of these local branch names lead to severe confusion,
not only for a user but also for git itself.

This patch forbids to create local branches, with names which start
with any of

  refs/
  heads/
  remotes/
  tags/

With this patch, you might still create these kind of local branches,
but now you have to additionally specify the '-f' option.

Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com>
---
 branch.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

 This patch refers way back to the discussion from 2014:
   From: Josef Wolf
   To: git@vger.kernel.org
   Subject: error: src refspec refs/heads/master matches more than one.
   Date: Fri, 14 Feb 2014 12:31:36 +0100
 See for example here:
   https://public-inbox.org/git/20140214113136.GA17817@raven.inka.de/

 The origin of the problem is, that git has (almost) no constraints
 what kind of names are allowed for local branches.
 There nowadays is a constraint that you are NOT allowed to create
 a branch which is called HEAD. See commit 16169285f1e6 ("Merge
 branch 'jc/branch-name-sanity'", 2017-11-28).

 In the old mail thread a lot of potential constraints for local
 branch names were discussed; in particular a lot of strategies
 were discussed what kind of local branch names might be a problem
 (the gist is: avoid ambiguities, by finding out which names
 lead to ambiguities NOW).

 I personally think it makes more sense to forbid a much
 bigger class of confusing branch names.
 In particular I think all local branch names starting with
   refs/
   heads/
   remotes/
   tags/
 should be forbidden (per default, can still be forced).
 This also avoids trouble for an unforseeable future.
 Example:
   git branch remotes/blub/master
 This might not be a problem now, because you have no 
 remote called "blub" right now.
 But if you later use
   git remote add blub <URL>
   git fetch blub
 you very likely run into trouble.

 The above approach still allows you to create local branches
 with a name of the form
    <remote name>/<something ...>
 but I cannot see how this might be avoided; remotes might
 be added later so what would you do in the case that a local
 branch already existed named like the remote or a remote
 tracking branch.

 With the proposed constraints you are at least are able to use
    heads/<remote name>/<something ...>
    remotes/<remote name>/<something ...>
 to differentiate between the two.

 This really is an issue; people seem to stumble over it
 and I guess this is particularly true if you control git
 via scripts. See for example (two years later):
   From: Duy Nguyen
   To: Junio C Hamano
   Cc: Git Mailing List <git@vger.kernel.org>,
   Subject: Re: error: src refspec refs/heads/master matches more than one.
   Date: Wed, 23 Mar 2016 18:17:05 +0700

 So with this patch I want to pick up this old discussion yet again.

 This code can probably be done a lot better I guess, but I wanted to
 send in something, to start the discussion.

diff --git a/branch.c b/branch.c
index 579494738a..e74872dac5 100644
--- a/branch.c
+++ b/branch.c
@@ -256,6 +256,16 @@ void create_branch(struct repository *r,
 	int dont_change_ref = 0;
 	int explicit_tracking = 0;
 
+	if (!force && (
+		starts_with(name, "refs/") ||
+		starts_with(name, "heads/") ||
+		starts_with(name, "remotes/") ||
+		starts_with(name, "tags/")
+	)) {
+		die(_("A local branch name should not start with "
+		      "\"refs/\", \"heads/\", \"remotes/\" or \"tags/\""));
+	}
+
 	if (track == BRANCH_TRACK_EXPLICIT || track == BRANCH_TRACK_OVERRIDE)
 		explicit_tracking = 1;
 
-- 
2.24.0.1.g6c2c19214d.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] branch: Forbid to create local branches with confusing names
  2019-11-06 16:56 [PATCH] branch: Forbid to create local branches with confusing names Ingo Rohloff
@ 2019-11-06 22:15 ` Johannes Schindelin
  2019-11-07 19:04   ` Ingo Rohloff
  2019-11-07  6:05 ` Junio C Hamano
  2019-11-07 19:07 ` [PATCH v2 0/4] Do not create new refnames "refs" or "refs/*" Ingo Rohloff
  2 siblings, 1 reply; 12+ messages in thread
From: Johannes Schindelin @ 2019-11-06 22:15 UTC (permalink / raw)
  To: Ingo Rohloff; +Cc: git

Hi Ingo,

On Wed, 6 Nov 2019, Ingo Rohloff wrote:

> Without this patch, git allows to do something like this:

Maybe start the patch with a description of the problem it tries to
solve? In other words, I would have appreciated a first paragraph that
starts with "Many Git users ...".

>   git branch remotes/origin/master
>   git branch refs/remotes/origin/master
>   git branch heads/master
>   git branch tags/v3.4
> All of these local branch names lead to severe confusion,
> not only for a user but also for git itself.
>
> This patch forbids to create local branches, with names which start
> with any of
>
>   refs/
>   heads/
>   remotes/
>   tags/
>
> With this patch, you might still create these kind of local branches,
> but now you have to additionally specify the '-f' option.
>
> Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com>
> ---
>  branch.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
>  This patch refers way back to the discussion from 2014:
>    From: Josef Wolf
>    To: git@vger.kernel.org
>    Subject: error: src refspec refs/heads/master matches more than one.
>    Date: Fri, 14 Feb 2014 12:31:36 +0100
>  See for example here:
>    https://public-inbox.org/git/20140214113136.GA17817@raven.inka.de/
>
>  The origin of the problem is, that git has (almost) no constraints
>  what kind of names are allowed for local branches.
>  There nowadays is a constraint that you are NOT allowed to create
>  a branch which is called HEAD. See commit 16169285f1e6 ("Merge
>  branch 'jc/branch-name-sanity'", 2017-11-28).
>
>  In the old mail thread a lot of potential constraints for local
>  branch names were discussed; in particular a lot of strategies
>  were discussed what kind of local branch names might be a problem
>  (the gist is: avoid ambiguities, by finding out which names
>  lead to ambiguities NOW).
>
>  I personally think it makes more sense to forbid a much
>  bigger class of confusing branch names.
>  In particular I think all local branch names starting with
>    refs/
>    heads/
>    remotes/
>    tags/
>  should be forbidden (per default, can still be forced).
>  This also avoids trouble for an unforseeable future.
>  Example:
>    git branch remotes/blub/master
>  This might not be a problem now, because you have no
>  remote called "blub" right now.
>  But if you later use
>    git remote add blub <URL>
>    git fetch blub
>  you very likely run into trouble.
>
>  The above approach still allows you to create local branches
>  with a name of the form
>     <remote name>/<something ...>
>  but I cannot see how this might be avoided; remotes might
>  be added later so what would you do in the case that a local
>  branch already existed named like the remote or a remote
>  tracking branch.
>
>  With the proposed constraints you are at least are able to use
>     heads/<remote name>/<something ...>
>     remotes/<remote name>/<something ...>
>  to differentiate between the two.
>
>  This really is an issue; people seem to stumble over it
>  and I guess this is particularly true if you control git
>  via scripts. See for example (two years later):
>    From: Duy Nguyen
>    To: Junio C Hamano
>    Cc: Git Mailing List <git@vger.kernel.org>,
>    Subject: Re: error: src refspec refs/heads/master matches more than one.
>    Date: Wed, 23 Mar 2016 18:17:05 +0700
>
>  So with this patch I want to pick up this old discussion yet again.
>
>  This code can probably be done a lot better I guess, but I wanted to
>  send in something, to start the discussion.

A lot of this text should probably go into the commit message itself,
possibly with accompanying Message-IDs or even public-inbox URLs right
away.

>
> diff --git a/branch.c b/branch.c
> index 579494738a..e74872dac5 100644
> --- a/branch.c
> +++ b/branch.c
> @@ -256,6 +256,16 @@ void create_branch(struct repository *r,
>  	int dont_change_ref = 0;
>  	int explicit_tracking = 0;
>
> +	if (!force && (
> +		starts_with(name, "refs/") ||
> +		starts_with(name, "heads/") ||
> +		starts_with(name, "remotes/") ||
> +		starts_with(name, "tags/")

A more common problem for me, personally, is when I manage to fool
myself by creating a local branch like `origin/master`. Clearly, I want
to refer to the remote-tracking branch, but by mistake I create a local
branch that now conflicts with the (short) name of the remote-tracking
branch.

To remedy this, you would not only have to ensure that `create_branch()`
verifies that the branch name does not have a `<remote-name>/` prefix
where `<remote-name>` refers to a valid remote, but you would also need
a corresponding patch that teaches `git add remote <nick> <url>` to
verify that no local branch starts with `<nick>/`.

What do you think?

Ciao,
Johannes

> +	)) {
> +		die(_("A local branch name should not start with "
> +		      "\"refs/\", \"heads/\", \"remotes/\" or \"tags/\""));
> +	}
> +
>  	if (track == BRANCH_TRACK_EXPLICIT || track == BRANCH_TRACK_OVERRIDE)
>  		explicit_tracking = 1;
>
> --
> 2.24.0.1.g6c2c19214d.dirty
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] branch: Forbid to create local branches with confusing names
  2019-11-06 16:56 [PATCH] branch: Forbid to create local branches with confusing names Ingo Rohloff
  2019-11-06 22:15 ` Johannes Schindelin
@ 2019-11-07  6:05 ` Junio C Hamano
  2019-11-07 12:54   ` Ingo Rohloff
  2019-11-07 19:07 ` [PATCH v2 0/4] Do not create new refnames "refs" or "refs/*" Ingo Rohloff
  2 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2019-11-07  6:05 UTC (permalink / raw)
  To: Ingo Rohloff; +Cc: git

Ingo Rohloff <ingo.rohloff@lauterbach.com> writes:

> Without this patch, git allows to do something like this:
>   git branch remotes/origin/master
>   git branch refs/remotes/origin/master
>   git branch heads/master
>   git branch tags/v3.4
> All of these local branch names lead to severe confusion,
> not only for a user but also for git itself.
>
> This patch forbids to create local branches, with names which start
> with any of
>
>   refs/
>   heads/
>   remotes/
>   tags/

Is there a reason why notes/ hierarchy is excluded?  What about
pull/?  Are we dealing with an unbounded set?  Should this list be
somehow end-user configurable so that say Gerrit users can add for/
and changes/ to the "forbidden" set?

> With this patch, you might still create these kind of local branches,
> but now you have to additionally specify the '-f' option.

This is not a change to builtin/branch.c, so other commands that
call create_branch() would be affected---are they all equipped to
toggle on the same bit that is affected by the '-f' option you have
in mind (presumably "git branch -f")?  Wouldn't forcing for those
other command have undesirable side effects?

I didn't check the code, but I think "checkout -b" also calls
create_branch() so presumably it also is affected.  Using "-B"
instead of "-b" for "checkout" might pass the force bit on, but
typically that is done only to recreate an existing branch.  Is it a
good idea to change the meaning of "-B" to also mean "do not bother
checking the sanity of the branch name"?

> --- a/branch.c
> +++ b/branch.c
> @@ -256,6 +256,16 @@ void create_branch(struct repository *r,
>  	int dont_change_ref = 0;
>  	int explicit_tracking = 0;
>  
> +	if (!force && (
> +		starts_with(name, "refs/") ||
> +		starts_with(name, "heads/") ||
> +		starts_with(name, "remotes/") ||
> +		starts_with(name, "tags/")
> +	)) {
> +		die(_("A local branch name should not start with "
> +		      "\"refs/\", \"heads/\", \"remotes/\" or \"tags/\""));
> +	}
> +
>  	if (track == BRANCH_TRACK_EXPLICIT || track == BRANCH_TRACK_OVERRIDE)
>  		explicit_tracking = 1;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] branch: Forbid to create local branches with confusing names
  2019-11-07  6:05 ` Junio C Hamano
@ 2019-11-07 12:54   ` Ingo Rohloff
  2019-11-08  2:54     ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Rohloff @ 2019-11-07 12:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 6337 bytes --]

On Thu, 07 Nov 2019 15:05:49 +0900
Junio C Hamano <gitster@pobox.com> wrote:
> Ingo Rohloff <ingo.rohloff@lauterbach.com> writes:
> 
> > ...
> > This patch forbids to create local branches, with names which start
> > with any of
> >
> >   refs/
> >   heads/
> >   remotes/
> >   tags/  
> 
> Is there a reason why notes/ hierarchy is excluded?  What about
> pull/?  Are we dealing with an unbounded set?  Should this list be
> somehow end-user configurable so that say Gerrit users can add for/
> and changes/ to the "forbidden" set?

I think I did not explain the intention that well.
What I basically want to avoid is a situation in which there is
no way at all to refer unambiguously to a particular reference.

Right now this is easily possible. 
I use the following sequence of commands to get into this
situation ("gcc_build" is just a test repository with nothing special).

  rm -rf gcc_build
  git clone ssh://ds1/home/irohloff/git/gcc_build.git
  cd gcc_build
  git checkout -b work
  echo "work..." >> 00_readme.txt
  git commit -a -m "work..."

  git branch origin/master
  git branch remotes/origin/master
  git branch refs/remotes/origin/master
  git branch -avv

The last "git branch -avv" outputs:

  master                     520d29e [refs/remotes/origin/master] initial scripts for building cross compiler GCC
  origin/master              b8efa13 work...
  refs/remotes/origin/master b8efa13 work...
  remotes/origin/master      b8efa13 work...
* work                       b8efa13 work...
  remotes/origin/HEAD        -> refs/remotes/origin/master
  remotes/origin/master      520d29e initial scripts for building cross compiler GCC


So this already is a major confusion, because git reports there are two references 
with the same name "remotes/origin/master" which point to DIFFERENT objects.

What's worse: I cannot figure out a way to unambiguously refer to 
the remote tracking branch remotes/origin/master:

git log origin/master
  warning: refname 'origin/master' is ambiguous.
  True: This refers to both
      refs/remotes/origin/master
      refs/heads/origin/master

git log remotes/origin/master
  warning: refname 'remotes/origin/master' is ambiguous.
  True: This refers to both
      refs/remotes/origin/master
      refs/heads/remotes/origin/master

git branch refs/remotes/origin/master
  warning: refname 'remotes/origin/master' is ambiguous.
  True: This refers to both
      refs/remotes/origin/master
      refs/heads/refs/remotes/origin/master

Now I have run out of ideas how to refer to the remote tracking branch.

So what I want to ensure is that there exists at least one way to 
address a reference unambiguously.
(The SHA-1 is not really an option, because it changes each time you
update the branch head.)

I do not want to prevent people from creating ambiguities in general
or weird branch names in general, because I think that's a much harder and 
maybe unsolvable problem.
I just want to make sure that there is at least one unambiguous way to 
refer to a specific reference.

So to answer your questions:

> Is there a reason why notes/ hierarchy is excluded?  
> What about pull/?

I basically wanted to exclude the prefixes which git auto-adds when 
expanding reference names.

As far as I understand the source code, this means excluding the prefixes 
which are a result of the declaration of
  ref_rev_parse_rules
So these are
  refs/
  remotes/
  tags/
  heads/
Maybe I do not really understand the source code (or my logic is bogus)
and git might somehow expand a reference abbreviation by adding 
"notes/" or "pull/" but I do not know how to trigger this expansion.

That is my rationale behind this set of prefixes and why 
I did not include things like "notes/", "pull/" ...

Having said this: 
I think it might be enough to just forbid any local branch names,
which have a prefix of "refs/".
Rationale behind that: 
I said that I want to have at least one way to unambiguously 
refer to a reference. 
If I forbid "refs/" as prefix, then I think in the above example,

I can always use

   git log refs/remotes/origin/master

because
   .git/refs/heads/refs/....  does not exist.

Thinking more about this:
Preventing local branches names with a "refs/" prefix 
is just the tip of the iceberg.
Someone might use
  git tag refs/remotes/origin/master
or
  git add remote refs/remotes/origin <URL>
  git fetch refs/remotes/origin

After this refs/remotes/origin/master is again ambiguous.

> Should this list be somehow end-user configurable so that say Gerrit users 
> can add for/ and changes/ to the "forbidden" set?

This might be interesting in the future, but at least for now the goal was to
prevent a situation in which there is no unambiguous name at all for a reference.

So I was not thinking about a "convenience" feature for users but really preventing
to get the repository into an almost unusable state.

So to answer your question: No; right now I did not think that this should be 
end-user configurable, because the set of prefixes is derived from the declaration 
of "ref_rev_parse_rules", which contains a non-configurable set of rules.



> 
> This is not a change to builtin/branch.c, so other commands that
> call create_branch() would be affected---are they all equipped to
> toggle on the same bit that is affected by the '-f' option you have
> in mind (presumably "git branch -f")?  Wouldn't forcing for those
> other command have undesirable side effects?
> 
> I didn't check the code, but I think "checkout -b" also calls
> create_branch() so presumably it also is affected.  Using "-B"
> instead of "-b" for "checkout" might pass the force bit on, but
> typically that is done only to recreate an existing branch.  Is it a
> good idea to change the meaning of "-B" to also mean "do not bother
> checking the sanity of the branch name"?
> 

Yes you are completely right: 
I was not at all sure where to put the check.
As mentioned above: If the goal is to prevent to get a git repository into
a super confusing state, then you probably also need to add constraints for
remote names and tag names.

Are there more names which are part of reference names ?

so long
  Ingo

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3017 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] branch: Forbid to create local branches with confusing names
  2019-11-06 22:15 ` Johannes Schindelin
@ 2019-11-07 19:04   ` Ingo Rohloff
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Rohloff @ 2019-11-07 19:04 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Hello Johannes,

On Wed, 6 Nov 2019 23:15:44 +0100 (CET)
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> Hi Ingo,
> 
> On Wed, 6 Nov 2019, Ingo Rohloff wrote:
> 
> > Without this patch, git allows to do something like this:  
> 
> Maybe start the patch with a description of the problem it tries to
> solve? In other words, I would have appreciated a first paragraph that
> starts with "Many Git users ...".

That's actually one of the problems: 
It's not clear what exactly the problem is :-).

After thinking about it more: The minimal goal I can think of is to make sure 
that if you use
   git log refs/<something>

you will never get a 
   warning: refname '...' is ambiguous

Rationale behind that: If even "refs/<something>" gives you this warning, 
then you might be in a lot of trouble. It means even giving a "full" refname 
is not enough to resolve ambiguities.
I think this is bad, because it means it might be hard to get out of this 
situation, because you might get the "ambiguous" warnings when you try to 
get rid of the offending refnames.


> 
> A lot of this text should probably go into the commit message itself,
> possibly with accompanying Message-IDs or even public-inbox URLs right
> away.

I did read "Documentation/SubmittingPatches". There it says:

    Try to make sure your explanation can be understood
    without external resources. Instead of giving a URL to a 
    mailing list archive, summarize the relevant points of 
    the discussion.

so that's what I tried to do.

> 
> A more common problem for me, personally, is when I manage to fool
> myself by creating a local branch like `origin/master`. Clearly, I want
> to refer to the remote-tracking branch, but by mistake I create a local
> branch that now conflicts with the (short) name of the remote-tracking
> branch.
> 
> To remedy this, you would not only have to ensure that `create_branch()`
> verifies that the branch name does not have a `<remote-name>/` prefix
> where `<remote-name>` refers to a valid remote, but you would also need
> a corresponding patch that teaches `git add remote <nick> <url>` to
> verify that no local branch starts with `<nick>/`.
> 
> What do you think?
> 

I agree: When I first started to use git, I was quite surprised that this
"double naming" is allowed.

But I also think, this is for another patch series; you probably need to 
honor "--force", or even add a git configuration option to allow this
anyway.

I am able to imagine that people intentionally set up a local branch
called "refs/heads/repoX/master" which tracks "refs/remotes/repoX/master".

For me this sounds like an unnecessary complication (because now you always
have to use the "long" refname), but if you put some software on top of git, 
I can imagine that this might make a lot of sense...

I am not enough of a git wizard to fully grasp the implications here.

so long
  Ingo



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 0/4] Do not create new refnames "refs" or "refs/*"
  2019-11-06 16:56 [PATCH] branch: Forbid to create local branches with confusing names Ingo Rohloff
  2019-11-06 22:15 ` Johannes Schindelin
  2019-11-07  6:05 ` Junio C Hamano
@ 2019-11-07 19:07 ` Ingo Rohloff
  2019-11-07 19:07   ` [PATCH v2 1/4] refs: new function newname_has_bad_prefix Ingo Rohloff
                     ` (3 more replies)
  2 siblings, 4 replies; 12+ messages in thread
From: Ingo Rohloff @ 2019-11-07 19:07 UTC (permalink / raw)
  To: git; +Cc: Ingo Rohloff

After the previous round of input:
The intention of this patch series is to make sure, that you always
might use "refs/<something>" to unambiguously refer to a specific 
reference.

Or put in another way: With this patch series, refnames for "git log", 
starting with "refs/" should never result in a
  warning: refname '...' is ambiguous.
Exceptions: 
You already have strangely named references in your repository.
You pull such strangely named references from a remote repository.

This patch series does not prevent you from creating various other
forms of ambiguous refnames. Example:

    git branch origin/master

This will very likely result in an ambiguity, because now the
references

   refs/heads/origin/master
   refs/remotes/origin/master

will very likely both exist. "origin/master" matches both refnames.

This patch series forbids to create new reference names, which start
with "refs" or "refs/*" with the commands

  git branch <name>
  git checkout -b <name>
  git tag <name>
  git remote add <name>

Note: This patch does NOT prevent you from working with references
which already exist.  It just prevents you from creating new ones
with the commands listed above.

That's also the reason why the '--force' option is not evaluated here.
I cannot think of a good reason, why you ever should want to create
refnames matching any of the following patterns:
   refs/heads/refs/*
   refs/tags/refs/*
   refs/remotes/refs/*
  

Ingo Rohloff (4):
  refs: new function newname_has_bad_prefix
  branch: Prevent creation of local branches named "refs" or "refs/*"
  remote: Prevent users from creating remotes named "refs" or "refs/*"
  tag: Prevent users from creating tags named "refs" or "refs/*"

 builtin/branch.c   |  6 ++++++
 builtin/checkout.c |  3 +++
 builtin/remote.c   |  3 +++
 builtin/tag.c      |  3 +++
 refs.c             | 30 ++++++++++++++++++++++++++++++
 refs.h             |  2 ++
 6 files changed, 47 insertions(+)

-- 
2.24.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/4] refs: new function newname_has_bad_prefix
  2019-11-07 19:07 ` [PATCH v2 0/4] Do not create new refnames "refs" or "refs/*" Ingo Rohloff
@ 2019-11-07 19:07   ` Ingo Rohloff
  2019-11-07 19:07   ` [PATCH v2 2/4] branch: Prevent creation of local branches named "refs" or "refs/*" Ingo Rohloff
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Ingo Rohloff @ 2019-11-07 19:07 UTC (permalink / raw)
  To: git; +Cc: Ingo Rohloff

The function is intended to check if a user specified name
for a new reference (for remotes, branches or tags) has
a bad prefix.

Intention: Prevent users from creating tags/remotes/branches
which are named "refs" or "refs/..."

Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com>
---
 refs.c | 30 ++++++++++++++++++++++++++++++
 refs.h |  2 ++
 2 files changed, 32 insertions(+)

diff --git a/refs.c b/refs.c
index 1ab0bb54d3..49e4f662df 100644
--- a/refs.c
+++ b/refs.c
@@ -321,6 +321,36 @@ int ref_exists(const char *refname)
 	return refs_ref_exists(get_main_ref_store(the_repository), refname);
 }
 
+
+static int starts_with_component(const char *str, const char *prefix)
+{
+	for (; ; str++, prefix++) {
+		if (!*prefix) {
+			if (!*str || *str == '/')
+				return 1;
+			return 0;
+		} else if (*str != *prefix)
+			return 0;
+	}
+}
+
+static const char *newname_bad_prefixes[] = {
+	"refs",
+	NULL
+};
+
+int newname_has_bad_prefix(const char *refname)
+{
+	const char **p;
+	p = newname_bad_prefixes;
+	while (*p) {
+		if (starts_with_component(refname, *p))
+			return 1;
+		p++;
+	}
+	return 0;
+}
+
 static int match_ref_pattern(const char *refname,
 			     const struct string_list_item *item)
 {
diff --git a/refs.h b/refs.h
index 730d05ad91..00c11ec589 100644
--- a/refs.h
+++ b/refs.h
@@ -107,6 +107,8 @@ int refs_verify_refname_available(struct ref_store *refs,
 
 int ref_exists(const char *refname);
 
+int newname_has_bad_prefix(const char *refname);
+
 int should_autocreate_reflog(const char *refname);
 
 int is_branch(const char *refname);
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/4] branch: Prevent creation of local branches named "refs" or "refs/*"
  2019-11-07 19:07 ` [PATCH v2 0/4] Do not create new refnames "refs" or "refs/*" Ingo Rohloff
  2019-11-07 19:07   ` [PATCH v2 1/4] refs: new function newname_has_bad_prefix Ingo Rohloff
@ 2019-11-07 19:07   ` Ingo Rohloff
  2019-11-07 19:07   ` [PATCH v2 3/4] remote: Prevent users from creating remotes " Ingo Rohloff
  2019-11-07 19:07   ` [PATCH v2 4/4] tag: Prevent users from creating tags " Ingo Rohloff
  3 siblings, 0 replies; 12+ messages in thread
From: Ingo Rohloff @ 2019-11-07 19:07 UTC (permalink / raw)
  To: git; +Cc: Ingo Rohloff

To explain the intention, here is an example:

A user executes
   git branch refs/remotes/origin/master
or
   git checkout -b refs/remotes/origin/master
after this operation
   git log refs/remotes/origin/master
will very likely complain that this reference is ambiguous.

The reason is, that you now very likely have the following two
references which both match:

   refs/remotes/origin/master
   refs/heads/refs/remotes/origin/master

git cannot decide which of the two references is meant.

By preventing the creation of local branches which are named
  refs  or  refs/*
this issue is circumvented:
  git log refs/*
will never refer to local branches located under
  refs/heads/refs/*
because such local branches should not exist.

Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com>
---
 builtin/branch.c   | 6 ++++++
 builtin/checkout.c | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/builtin/branch.c b/builtin/branch.c
index 2ef214632f..c2e45ff52c 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -497,6 +497,9 @@ static void copy_or_rename_branch(const char *oldname, const char *newname, int
 	const char *interpreted_newname = NULL;
 	int recovery = 0;
 
+	if (newname_has_bad_prefix(newname))
+		die(_("Invalid new branch name: '%s'"), newname);
+
 	if (!oldname) {
 		if (copy)
 			die(_("cannot copy the current branch while not on any."));
@@ -844,6 +847,9 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
 		if (track == BRANCH_TRACK_OVERRIDE)
 			die(_("the '--set-upstream' option is no longer supported. Please use '--track' or '--set-upstream-to' instead."));
 
+		if (newname_has_bad_prefix(argv[0]))
+			die(_("Invalid new branch name: '%s'"), argv[0]);
+
 		create_branch(the_repository,
 			      argv[0], (argc == 2) ? argv[1] : head,
 			      force, 0, reflog, quiet, track);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 3634a3dac1..51ac2cae43 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1566,6 +1566,9 @@ static int checkout_main(int argc, const char **argv, const char *prefix,
 	if (opts->new_orphan_branch)
 		opts->new_branch = opts->new_orphan_branch;
 
+	if (opts->new_branch && newname_has_bad_prefix(opts->new_branch))
+		die(_("Invalid new branch name: '%s'"), opts->new_branch);
+
 	/* --track without -b/-B/--orphan should DWIM */
 	if (opts->track != BRANCH_TRACK_UNSPECIFIED && !opts->new_branch) {
 		const char *argv0 = argv[0];
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/4] remote: Prevent users from creating remotes named "refs" or "refs/*"
  2019-11-07 19:07 ` [PATCH v2 0/4] Do not create new refnames "refs" or "refs/*" Ingo Rohloff
  2019-11-07 19:07   ` [PATCH v2 1/4] refs: new function newname_has_bad_prefix Ingo Rohloff
  2019-11-07 19:07   ` [PATCH v2 2/4] branch: Prevent creation of local branches named "refs" or "refs/*" Ingo Rohloff
@ 2019-11-07 19:07   ` Ingo Rohloff
  2019-11-07 19:07   ` [PATCH v2 4/4] tag: Prevent users from creating tags " Ingo Rohloff
  3 siblings, 0 replies; 12+ messages in thread
From: Ingo Rohloff @ 2019-11-07 19:07 UTC (permalink / raw)
  To: git; +Cc: Ingo Rohloff

To explain the intention, here is an example:

A user executes
   git remote add refs/heads <URL>
   git fetch refs/heads
after this operation
   git log refs/heads/master
will very likely complain that this reference is ambiguous.

The reason is, that you now very likely have the following two
references which both match:

   refs/heads/master
   refs/remotes/refs/heads/master

git cannot decide which of the two references is meant.

By preventing the creation of remotes which are named
  refs  or  refs/*
this issue is circumvented:
  git log refs/*
will never refer to a remote tracking branch located under
  refs/remotes/refs/*
because such remotes should not exist.

Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com>
---
 builtin/remote.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/remote.c b/builtin/remote.c
index 5591cef775..2272c16d18 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -189,6 +189,9 @@ static int add(int argc, const char **argv)
 	name = argv[0];
 	url = argv[1];
 
+	if (newname_has_bad_prefix(name))
+		die(_("Invalid new remote name: '%s'"), name);
+
 	remote = remote_get(name);
 	if (remote_is_configured(remote, 1))
 		die(_("remote %s already exists."), name);
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 4/4] tag: Prevent users from creating tags named "refs" or "refs/*"
  2019-11-07 19:07 ` [PATCH v2 0/4] Do not create new refnames "refs" or "refs/*" Ingo Rohloff
                     ` (2 preceding siblings ...)
  2019-11-07 19:07   ` [PATCH v2 3/4] remote: Prevent users from creating remotes " Ingo Rohloff
@ 2019-11-07 19:07   ` Ingo Rohloff
  3 siblings, 0 replies; 12+ messages in thread
From: Ingo Rohloff @ 2019-11-07 19:07 UTC (permalink / raw)
  To: git; +Cc: Ingo Rohloff

To explain the intention, here is an example:

A user executes
   git tag -m "a tag" refs/heads/master
after this operation
   git log refs/heads/master
will very likely complain that this reference is ambiguous.

The reason is, that you now very likely have the following two
references which both match:

   refs/heads/master
   refs/tags/refs/heads/master

git cannot decide which of the two references is meant.

By preventing the creation of tags which are named
  refs  or  refs/*
this issue is circumvented:
  git log refs/*
will never refer to a tag located under
  refs/tags/refs/*
because such a tag should not exist.

Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com>
---
 builtin/tag.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/tag.c b/builtin/tag.c
index e0a4c25382..c818fe3fcd 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -537,6 +537,9 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 
 	tag = argv[0];
 
+	if (!cmdmode && newname_has_bad_prefix(tag))
+		die(_("Invalid new tag name: '%s'"), tag);
+
 	object_ref = argc == 2 ? argv[1] : "HEAD";
 	if (argc > 2)
 		die(_("too many params"));
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] branch: Forbid to create local branches with confusing names
  2019-11-07 12:54   ` Ingo Rohloff
@ 2019-11-08  2:54     ` Junio C Hamano
  2019-11-08 12:45       ` Ingo Rohloff
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2019-11-08  2:54 UTC (permalink / raw)
  To: Ingo Rohloff; +Cc: git

Ingo Rohloff <ingo.rohloff@lauterbach.com> writes:

> On Thu, 07 Nov 2019 15:05:49 +0900
> Junio C Hamano <gitster@pobox.com> wrote:
>> Ingo Rohloff <ingo.rohloff@lauterbach.com> writes:
>> 
>> > ...
>> > This patch forbids to create local branches, with names which start
>> > with any of
>> >
>> >   refs/
>> >   heads/
>> >   remotes/
>> >   tags/  
>> 
>> Is there a reason why notes/ hierarchy is excluded?  What about
>> pull/?  Are we dealing with an unbounded set?  Should this list be
>> somehow end-user configurable so that say Gerrit users can add for/
>> and changes/ to the "forbidden" set?
>
> I think I did not explain the intention that well.
> What I basically want to avoid is a situation in which there is
> no way at all to refer unambiguously to a particular reference.

Hmph, I thought this was a solved problem, but maybe I am still
misunderstanding you.

When you have a possibly ambigous branch whose name is $X, whether
$X begins with one of the strings you listed above or not, you can
always

 - refer to the commit at the tip of that branch by saying
   refs/heads/$X (e.g. "git show refs/heads/$X") and that always
   refers to the commit, even if there are other branches and tags
   that may begin with refs/ or refs/heads/.  You would of course
   get a warning about possible ambiguity because saying just $X
   (e.g. "git show $X") may not refer to refs/heads/$X when you have
   other refs that make $X ambiguous.

 - refer to the branch by saying $X (and not refs/heads/$X).

The thing to know is when you are naming a branch and when you are
naming a commit.

 - "git branch -d $X" is referring to the branch itself and removes
   refs/heads/$X.  

 - "git branch new <start-point>" uses <start-point> to specify the
   commit to begin the 'new' branch at, and does not necessarily
   take a branch name, so you should say refs/heads/$X.  You may not
   be able to rely on lazy-man's short-hands, like "git checkout -t
   origin/master" that DWIMs what you did not say, as you have to
   say refs/heads/$X that is *not* in the <remote-nick>/<branch>
   form (you can still do so with "git checkout -b master
   refs/heads/$X" followed by necessary configuration updates), but
   that is a price to pay for using ambiguous names.

And for those who do not want to pay for using ambiguous names, we
issue warnings when resolving refnames.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] branch: Forbid to create local branches with confusing names
  2019-11-08  2:54     ` Junio C Hamano
@ 2019-11-08 12:45       ` Ingo Rohloff
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Rohloff @ 2019-11-08 12:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hello Junio,

> > I think I did not explain the intention that well.
> > What I basically want to avoid is a situation in which there is
> > no way at all to refer unambiguously to a particular reference.  
> 
> Hmph, I thought this was a solved problem, but maybe I am still
> misunderstanding you.

You are right, this is partly solved: 
Deleting such a reference to a local branch is always possible.
But these kind of problems are not solved fully.

To explain my issue more concisely:
I start in any "regular" git repository.
   Regular ==  Repository was cloned from somewhere so a remote
   named "origin" and a remote-tracking branch named "origin/master"
   do exist. 

I fire off the following four commands
(Yes: This is malicious and stupid ;-))

   git checkout master
   git branch origin/master
   git branch remotes/origin/master
   git branch refs/remotes/origin/master

git does not complain at all here.

Then I try the following three commands.

   git branch somework origin/master
      warning: refname 'origin/master' is ambiguous.
      fatal: Ambiguous object name: 'origin/master'.

   git branch somework remotes/origin/master
      warning: refname 'remotes/origin/master' is ambiguous.
      fatal: Ambiguous object name: 'remotes/origin/master'.

   git branch somework refs/remotes/origin/master
      warning: refname 'refs/remotes/origin/master' is ambiguous.
      fatal: Ambiguous object name: 'refs/remotes/origin/master'.


QUESTION: 
I think I am lost now. That's where I might have overlooked something ?



I might continue with (this is the solved case):

   git branch -d refs/remotes/origin/master
      Deleted branch refs/remotes/origin/master (was 3454f30).

Sounds rather scary (because this sounds like you deleted a 
remote-tracking branch), but actually does the right thing I guess.
(The command deletes refs/heads/refs/remotes/origin/master)


After which this succeeds:

   git branch somework refs/remotes/origin/master
      Branch 'somework' set up to track remote branch 'master' from 'origin'.




PATCH:
Make this fail:

   git branch refs/remotes/origin/master
      fatal: Invalid new branch name: 'refs/remotes/origin/master'

This avoids the failure for 

   git branch somework refs/remotes/origin/master


and to avoid very similar issues make these fail too:

   git tag -m "a tag" refs/remotes/origin/master
      fatal: Invalid new tag name: 'refs/remotes/origin/master'

   git remote add refs/heads ssh://ds1/home/irohloff/git/gcc_build.git
      fatal: Invalid new remote name: 'refs/heads'


All of these examples use really pathological names for tags/remotes/branches.
I cannot believe that anyone wants to do this intentionally.

QUESTION:
Are there more user created, command line specified refnames 
in addition to tags/remotes/branches ?




If you have time:

Some more background.
The whole idea behind the patch: 
Make sure "refs/" is a "unique" prefix, which only
appears as ".git/refs/".
This should ensure that "refs/" only matches
to the very first entry from:
	static const char *ref_rev_parse_rules[] = {
		"%.*s",
		"refs/%.*s",
		"refs/tags/%.*s",
		"refs/heads/%.*s",
		"refs/remotes/%.*s",
		"refs/remotes/%.*s/HEAD",
		NULL
	};  

So goal: Make sure
  refs/refs/*          does not exist
  refs/tags/refs/*     does not exist
  refs/heads/refs/*    does not exist
  refs/remotes/refs/*  does not exist

To avoid the existence of refs/remotes/refs/* it is necessary to 
also prohibit a standalone "refs" as remote name (not just "refs/*");
and to handle that more easily I also prohibit a standalone "refs"
for tags and branches.

Of course you might still create all these nasty subdirs with plumbing.
I try to avoid that this is done with porcelain.
(At least that's as far as I understand git terminology.)
Of course future git extensions might try to create something like
  .git/refs/refs/*
but since these extensions are reviewed, I guess it is easy to nudge
authors of extensions (like git-svn, git-bisect, ...) to NOT do this.

so long
  Ingo


PS: 
I really think per default more prefixes than just "refs/" should
be forbidden when creating tags/remotes/branches.
But I also agree with you that this is much less straight forward
(Which prefixes to forbid ? Config option ? How much does this break ? ...).
As far as I can tell tags/remotes/branches, which 
are called "refs/*" or "refs" are completely pathological; 
I think unconditionally forbidding to create these kind of 
refnames for tags/remotes/branches with porcelain is OK.

BTW: This is also quite confusing 
(but does not really hurt and is consistent with what you described)

   git branch -r -d refs/remotes/origin/master
      error: remote-tracking branch 'refs/remotes/origin/master' not found.

What is meant here is I think

   remote-tracking branch 'refs/remotes/refs/remotes/origin/master' not found.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-11-08 12:45 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-06 16:56 [PATCH] branch: Forbid to create local branches with confusing names Ingo Rohloff
2019-11-06 22:15 ` Johannes Schindelin
2019-11-07 19:04   ` Ingo Rohloff
2019-11-07  6:05 ` Junio C Hamano
2019-11-07 12:54   ` Ingo Rohloff
2019-11-08  2:54     ` Junio C Hamano
2019-11-08 12:45       ` Ingo Rohloff
2019-11-07 19:07 ` [PATCH v2 0/4] Do not create new refnames "refs" or "refs/*" Ingo Rohloff
2019-11-07 19:07   ` [PATCH v2 1/4] refs: new function newname_has_bad_prefix Ingo Rohloff
2019-11-07 19:07   ` [PATCH v2 2/4] branch: Prevent creation of local branches named "refs" or "refs/*" Ingo Rohloff
2019-11-07 19:07   ` [PATCH v2 3/4] remote: Prevent users from creating remotes " Ingo Rohloff
2019-11-07 19:07   ` [PATCH v2 4/4] tag: Prevent users from creating tags " Ingo Rohloff

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).