git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: Jiang Xin <worldhello.net@gmail.com>,
	Lea Wiemann <lewiemann@gmail.com>,
	git@vger.kernel.org, Johannes Sixt <j6t@kdbg.org>
Subject: Re: Proposed function path_in_directory()
Date: Mon, 08 Oct 2012 09:13:07 -0700	[thread overview]
Message-ID: <7vy5jhlz70.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <506FE619.1020608@alum.mit.edu> (Michael Haggerty's message of "Sat, 06 Oct 2012 10:04:41 +0200")

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 10/01/2012 06:51 AM, Michael Haggerty wrote:
>> I think I would advocate that the prefix has to match the front of the
>> path exactly (including any trailing slashes) and either
>> 
>>     strlen(prefix) == 0
>>     or the prefix ended with a '/'
>>     or the prefix and path are identical
>>     or the character in path following the matching part is a '/'
>> 
>> This would allow the "is path its own prefix" policy to be decided by
>> the caller by either including or omitting a trailing slash on the
>> prefix argument.
>
> Thinking about this more, I don't think it will work.  As usual, the
> special cases around "/" and "//" make things awkward.  I think it is
> necessary to have a separate argument to specify whether "path is its
> own prefix".
>
> So I am trying to decide how a function path_in_directory() should work,
> and would like to get some feedback, especially on the following two points:
>
> 1. How should "//" be handled?  I don't really have experience with the
> peculiar paths that start with "//", so I'm not sure how they should be
> handled (or even if the handling needs to be platform-dependent).  My
> working hypothesis is that the inputs should be normalized by the
> caller, so if the caller wants "//" to be treated as equivalent to "/"
> then the caller should normalize them *before* calling this function.
> Conversely, if the caller passes "//" to the function, that implies that
> "//" is distinct from "/" and "//" is considered a proper subdirectory
> of "/".  See the cases marked with "??????" below.

I think POSIX essentially says that anything that begins with "//"
is up to the platform, but in practice, the only real-life variant
we see is "//dosshare/path/from/root".  I agree with your "caller
should normalize for the semantics it wants to see".

If our lazy programming creates paths with duplicated slashes, we
should be fixing such codepaths anyway, so assuming that all paths
we create ourselves are free of double-slashes _inside_ a path (i.e.
when concatenating a subdirectory name to a directory name), the
only case we need to worry about is the double-slashes given by the
user (either from the command line, environment, or configuration).
The path normalizing logic removes double-slashes inside a path, and
I think it should do so while keeping the leading slashes, so that we
do not lose distinction between "//dosshare/" and "/dosshare/".

> 2. Does there need to be any special related to DOS paths?

The ceiling computation may need special case for dos.  What does
the getcwd() give us?  Do we learn only the path within the "current
drive" and need to prefix C: (or D: or X:) ourselves if we really
want to tell C:\bin and D:\bin apart?

Assuming that is the case, the ceiling computation would need a
helper function that hides the gory details of prefixing getcwd()
result with drive letter or whatever needed, and another that
normalizes the elements of the environment variables (I presume that
if an element in it without the drive prefix should be normalized to
add the current drive letter to it so that the normalized getcwd()
result can be compared with it).  E.g. if the ceiling list is
"D:/a/b;/trash/" then getcwd() returning "/a" alone does not make it
outside the ceiling due to "D:/a/b"---our current drive must be "D"
for that pattern to kick in.  The unqualified /trash would apply to
any drive.

Does that make sense?

  reply	other threads:[~2012-10-08 16:13 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-29  6:15 [PATCH v2 0/9] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
2012-09-29  6:15 ` [PATCH v2 1/9] Introduce new static function real_path_internal() Michael Haggerty
2012-09-29  6:15 ` [PATCH v2 2/9] real_path_internal(): add comment explaining use of cwd Michael Haggerty
2012-09-29  6:15 ` [PATCH v2 3/9] Introduce new function real_path_if_valid() Michael Haggerty
2012-09-29  6:15 ` [PATCH v2 4/9] longest_ancestor_length(): use string_list_split() Michael Haggerty
2012-09-29  6:15 ` [PATCH v2 5/9] longest_ancestor_length(): explicitly filter list before loop Michael Haggerty
2012-09-29  6:15 ` [PATCH v2 6/9] longest_ancestor_length(): always add a slash to the end of prefixes Michael Haggerty
2012-09-29  6:16 ` [PATCH v2 7/9] longest_ancestor_length(): use string_list_longest_prefix() Michael Haggerty
2012-09-29  6:16 ` [PATCH v2 8/9] longest_ancestor_length(): resolve symlinks before comparing paths Michael Haggerty
2012-09-30  8:00   ` Junio C Hamano
2012-10-01  4:51     ` Michael Haggerty
2012-10-01  5:30       ` Junio C Hamano
2012-10-06  8:04       ` Proposed function path_in_directory() [was: Re: [PATCH v2 8/9] longest_ancestor_length(): resolve symlinks before comparing paths] Michael Haggerty
2012-10-08 16:13         ` Junio C Hamano [this message]
2012-10-08 18:20           ` Proposed function path_in_directory() Johannes Sixt
2012-10-08 18:23             ` Junio C Hamano
2012-09-29  6:16 ` [PATCH v2 9/9] t1504: stop resolving symlinks in GIT_CEILING_DIRECTORIES Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vy5jhlz70.fsf@alter.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=j6t@kdbg.org \
    --cc=lewiemann@gmail.com \
    --cc=mhagger@alum.mit.edu \
    --cc=worldhello.net@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).