git@vger.kernel.org mailing list mirror (one of many)
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"SZEDER Gábor" <szeder.dev@gmail.com>
Subject: Re: [PATCH v2 1/2] abspath: add a function to resolve paths with missing components
Date: Sat, 10 Oct 2020 01:10:48 +0000
Message-ID: <20201010011048.GQ1392312@camp.crustytoothpaste.net>
In-Reply-To: <xmqqk0vzrtqr.fsf@gitster.c.googlers.com>

On 2020-10-09 at 21:10:04, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> 
> > We'd like to canonicalize paths such that we can preserve any number of
> > trailing components that may be missing.
> 
> Sorry, but at least to me, the above gives no clue what kind of
> operation is desired to be done on paths.  How would one preserve
> what does not exist (i.e. are missing)?
> 
> Do you mean some leading components in a path point at existing
> directories and after some point a component names a directory
> that does not exist, so everything after that does not yet exist
> until you "mkdir -p" them?
> 
> I guess my confusion comes primarily from the fuzziness of the verb
> "canonicalize" in the sentence.  We want to handle a/b/../c/d and
> > there are various combinations of missing and existing directories,
> e.g. a/b may not exist or a/b may but a/c may not, etc.  Is that
> what is going on?  Makes me wonder if it makes sense to canonicalize
> a/b/../c/d into a/c/d when a/b does not exist in the first place,
> though.

The behavior that I'm proposing is the realpath -m behavior.  If the
path we're canonicalizing doesn't exist, we find the closest parent that
does exist, canonicalize it (à la realpath(3)), and then append the
components that don't exist to the canonicalized portion.

> > Let's add a function to do
> > that: it calls strbuf_realpath to find the canonical path for the
> > portion we do have and then appends the missing part.  We adjust
> > strip_last_component to return us the component it has stripped and use
> > that to help us accumulate the missing part.
> 
> OK, so if we have a/b/c/d and know a/b/c/d does not exist on the
> filesystem, we start by splitting it to a/b/c and d, see if a/b/c
> exists, and if not, do the same recursively to a/b/c to split it
> into a/b and c, and prefix the latter to 'd' that we split earlier
> (i.e. now we have a/b and c/d), until we have an existing directory
> on the first half?

Correct.
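
To make the loop concrete, here is a rough standalone sketch of the same
idea outside of Git.  It is not the patch: it uses POSIX realpath(3)
instead of strbuf_realpath, fixed-size buffers instead of strbufs, and
the names realpath_missing and strip_last are made up for illustration.
It also assumes that a relative path with no existing leading component
should be resolved against ".", and it copies missing "." and ".."
components through without interpreting them.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/*
 * Hypothetical helper (not Git's strip_last_component): cut the last
 * component off `path` in place and copy it into `comp`.
 */
static void strip_last(char *path, char *comp, size_t comp_size)
{
	size_t end = strlen(path), start;

	/* Ignore trailing separators: "a/b//" ends in component "b". */
	while (end > 0 && path[end - 1] == '/')
		end--;
	start = end;
	while (start > 0 && path[start - 1] != '/')
		start--;
	snprintf(comp, comp_size, "%.*s", (int)(end - start), path + start);

	/* Drop the separators before the component, but keep a root "/". */
	while (start > 1 && path[start - 1] == '/')
		start--;
	path[start] = '\0';
}

/*
 * Resolve the longest existing leading part of `path` and append the
 * missing components. `out` must hold PATH_MAX bytes. Returns `out`,
 * or NULL on failure.
 */
static char *realpath_missing(const char *path, char *out)
{
	char remaining[PATH_MAX], trailing[PATH_MAX] = "";
	char comp[PATH_MAX], resolved[PATH_MAX];

	assert(path && out);
	if (strlen(path) >= sizeof(remaining))
		return NULL;
	strcpy(remaining, path);

	for (;;) {
		/* Assumption: fall back to "." once nothing is left. */
		const char *base = *remaining ? remaining : ".";
		size_t clen, tlen;

		if (realpath(base, resolved)) {
			/* Avoid "//x" when the existing part is the root. */
			if (!strcmp(resolved, "/") && *trailing)
				resolved[0] = '\0';
			if (snprintf(out, PATH_MAX, "%s%s", resolved, trailing) >= PATH_MAX)
				return NULL;
			return out;
		}
		if (!*remaining)
			return NULL;	/* even "." could not be resolved */

		strip_last(remaining, comp, sizeof(comp));
		clen = strlen(comp);
		tlen = strlen(trailing);
		if (clen + 1 + tlen >= sizeof(trailing))
			return NULL;
		/* Prepend "/<comp>" to what we have accumulated so far. */
		memmove(trailing + clen + 1, trailing, tlen + 1);
		trailing[0] = '/';
		memcpy(trailing + 1, comp, clen);
	}
}
```

With this sketch, a path such as /no-such-dir/a/b where only / exists
goes through "/no-such-dir/a" + "/b", then "/no-such-dir" + "/a/b", then
"/" + "/no-such-dir/a/b", at which point realpath(3) succeeds and the
accumulated tail is appended.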

> > +/*
> > + * Like strbuf_realpath, but trailing components which do not exist are copied
> > + * through.
> > + */
> > +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path)
> > +{
> > +	struct strbuf remaining = STRBUF_INIT;
> > +	struct strbuf trailing = STRBUF_INIT;
> > +	struct strbuf component = STRBUF_INIT;
> > +
> > +	strbuf_addstr(&remaining, path);
> > +
> > +	while (remaining.len) {
> > +		if (strbuf_realpath(resolved, remaining.buf, 0)) {
> > +			strbuf_addbuf(resolved, &trailing);
> > +
> > +			strbuf_release(&component);
> > +			strbuf_release(&remaining);
> > +			strbuf_release(&trailing);
> > +
> > +			return resolved->buf;
> > +		}
> > +		strip_last_component(&remaining, &component);
> > +		strbuf_insertstr(&trailing, 0, "/");
> > +		strbuf_insertstr(&trailing, 1, component.buf);
> 
> I may be utterly confused, but is this where
> 
>     - we started with a/b/c/d, pushed 'd' into trailing and decided
>       to redo with a/b/c
> 
>     - now we split the a/b/c into a/b and c, and adjusting what is
>       in trailing from 'd' to 'c/d'
> 
> happens?  It's a bit sad that we need to repeatedly use
> insertstr to prepend in front, instead of appending.

Yes, that's true.  It really isn't avoidable, though, with the functions
the way that they are.  We can't use the original path and keep track of
the offset because it may contain runs of consecutive path separators,
and we don't want to carry those over into the result.
-- 
brian m. carlson: Houston, Texas, US

Thread overview: 13+ messages
2020-10-09 19:15 [PATCH v2 0/2] rev-parse options for absolute or relative paths brian m. carlson
2020-10-09 19:15 ` [PATCH v2 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
2020-10-09 21:10   ` Junio C Hamano
2020-10-10  1:10     ` brian m. carlson [this message]
2020-11-09 13:57       ` Johannes Schindelin
2020-11-09 13:55   ` Johannes Schindelin
2020-11-16  2:21     ` brian m. carlson
2020-10-09 19:15 ` [PATCH v2 2/2] rev-parse: add option for absolute or relative path formatting brian m. carlson
2020-11-09 14:46   ` Johannes Schindelin
2020-11-16  2:15     ` brian m. carlson
2020-11-04 23:01 ` [PATCH v2 0/2] rev-parse options for absolute or relative paths Emily Shaffer
2020-11-05  3:20   ` brian m. carlson
2020-11-09 13:33 ` Johannes Schindelin
