From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, peff@peff.net, pclouds@gmail.com
Subject: Re: [PATCH v2 5/6] clone: fix hostname parsing when guessing dir
Date: Thu, 30 Jul 2015 14:18:11 +0200
Message-ID: <20150730121811.GA24635@pks-pc.localdomain>
In-Reply-To: <xmqq7fpiamiq.fsf@gitster.dls.corp.google.com>

On Wed, Jul 29, 2015 at 10:42:21AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > We fail to guess a sensible directory name for a newly cloned
> > repository when the path component of the URL is empty. E.g.
> > cloning a repository 'ssh://user:password@example.com/' we create
> > a directory 'password@example.com' for the clone.
> >
> > Fix this by ...
> 
> It is clear that you do not want to have that password in the
> resulting directory name from the problem description, but you
> started saying "Fix this" without saying what the desired outcome
> is.  "We want to use only the hostname, e.g. 'example.com', in such
> a case instead." or something, perhaps, at the end of the first
> paragraph?  "Fix this by doing such and such" becomes understandable
> only after we know what end result you want to achieve by "doing
> such and such".

Agreed, I will fix this in the next iteration.

> > ... using parse_connect_url to split host and path
> > components and explicitly checking whether we need to fall back
> > to the hostname for guessing a directory name.
> 
> I cannot help wonder why this much change (including patches 3 and
> 4) is needed.  Isn't it just the matter of making this part of the
> existing code be aware of '@' in addition to ':'?

Actually no, as host and path components need to be treated
differently. See below.

> > -	/*
> > -	 * Find last component, but be prepared that repo could have
> > -	 * the form  "remote.example.com:foo.git", i.e. no slash
> > -	 * in the directory part.
> > -	 */
> > -	start = end;
> > -	while (repo < start && !is_dir_sep(start[-1]) && start[-1] != ':')
> > -		start--;
> 
> Regardless of the issue you are trying to address, we may want to
> limit that "be prepared for and careful with ':'" logic in the
> existing code to the case where the "last component" does not have
> any other component before it.  That is:
> 
> 	http://example.com/foo:bar.git/
> 
> would be stripped to
> 
> 	http://example.com/foo:bar
> 
> and then we scan backwards for ':' or '/' and declare that "bar" is
> the name of the repository, but we would probably want "foo:bar"
> instead (or we may not, as some filesystems do not want to have a
> colon in its path components).

This case is exactly why I included patches 3 and 4. We've got
two cases that need to be distinguished:

1. we've got a non-empty path component (that is, it contains more
   than just a '/'). In this case we want to take its last part
   and strip it of things like '.git'. We should only honor ':'
   as a path separator if it is the first character in the path
   component, otherwise only honor '/'.

2. we've got an empty path component. In this case we want to
   inspect the host part. If it is empty we have to error out,
   otherwise we want to strip it of authentication information
   (everything up to and including '@') and port information
   (everything following the ':').

So the two cases are treated entirely differently. For your
example we'd first split up 'http://example.com/foo:bar.git' into
the host 'example.com' and the path '/foo:bar.git'. As
'parse_connect_url()' does exactly what we need, i.e. splitting
host and path, I think it is only natural to reuse it.
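
To illustrate the two cases, here is a rough standalone sketch. It
is not the patch itself: 'guess_dir()' is a hypothetical helper
that assumes host and path have already been split (as
'parse_connect_url()' would do), and it ignores IPv6 literals and
other corner cases.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only; not git's actual
 * guess_dir_name(). Assumes the URL was already split into
 * "host" and "path". Writes the guessed name into "out" and
 * returns 0, or returns -1 if no name could be guessed.
 */
int guess_dir(const char *host, const char *path,
	      char *out, size_t outlen)
{
	const char *start, *end;
	size_t len;

	/* Honor ':' as a separator only as the very first character. */
	if (*path == ':')
		path++;

	/* Ignore trailing slashes so that "/" counts as empty. */
	end = path + strlen(path);
	while (end > path && end[-1] == '/')
		end--;

	if (end > path) {
		/* Case 1: non-empty path; use its last part. */
		if (end - path > 4 && !strncmp(end - 4, ".git", 4))
			end -= 4;
		start = end;
		while (start > path && start[-1] != '/')
			start--;
	} else {
		/* Case 2: empty path; fall back to the host part. */
		const char *at = strrchr(host, '@');

		/* Drop everything up to and including '@'. */
		start = at ? at + 1 : host;
		/* Drop the port, i.e. everything following ':'. */
		end = strchr(start, ':');
		if (!end)
			end = start + strlen(start);
	}

	len = (size_t)(end - start);
	if (!len || len >= outlen)
		return -1; /* nothing usable, or buffer too small */
	memcpy(out, start, len);
	out[len] = '\0';
	return 0;
}
```

With this sketch, host 'example.com' and path '/foo:bar.git' give
'foo:bar', while host 'user:password@example.com' and path '/'
give 'example.com'. An empty host together with an empty path
makes the helper return -1.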

But actually you are right: currently I still have the old logic
in place that splits on colons in the path component. With my
approach it would be 'parse_connect_url()'s responsibility to
detect whether host and path are separated by ':' rather than
'/', so we would not run into the problem with 'foo:bar.git'.
I'll verify that behavior, though, and write some tests.

Patrick
