On 2020-10-09 at 21:10:04, Junio C Hamano wrote:
> "brian m. carlson" writes:
>
> > We'd like to canonicalize paths such that we can preserve any number of
> > trailing components that may be missing.
>
> Sorry, but at least to me, the above gives no clue what kind of
> operation is desired to be done on paths. How would one preserve
> what does not exist (i.e. are missing)?
>
> Do you mean some leading components in a path point at existing
> directories and after some point a component names a directory
> that does not exist, so everything after that does not yet exist
> until you "mkdir -p" them?
>
> I guess my confusion comes primarily from the fuzziness of the verb
> "canonicalize" in the sentence. We want to handle a/b/../c/d and
> there are various combinations of missing and existing directories,
> e.g. a/b may not exist or a/b may but a/c may not, etc. Is that
> what is going on? Makes me wonder if it makes sense to canonicalize
> a/b/../c/d into a/c/d when a/b does not exist in the first place,
> though.

The behavior that I'm proposing is the realpath -m behavior. If the path
we're canonicalizing doesn't exist, we find the closest parent that does
exist, canonicalize it (à la realpath(3)), and then append the components
that don't exist to the canonicalized portion.

> > Let's add a function to do
> > that that calls strbuf_realpath to find the canonical path for the
> > portion we do have and then append the missing part. We adjust
> > strip_last_component to return us the component it has stripped and use
> > that to help us accumulate the missing part.
>
> OK, so if we have a/b/c/d and know a/b/c/d does not exist on the
> filesystem, we start by splitting it to a/b/c and d, see if a/b/c
> exists, and if not, do the same recursively to a/b/c to split it
> into a/b and c, and prefix the latter to 'd' that we split earlier
> (i.e. now we have a/b and c/d), until we have an existing directory
> on the first half?

Correct.

> > +/*
> > + * Like strbuf_realpath, but trailing components which do not exist are copied
> > + * through.
> > + */
> > +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path)
> > +{
> > +	struct strbuf remaining = STRBUF_INIT;
> > +	struct strbuf trailing = STRBUF_INIT;
> > +	struct strbuf component = STRBUF_INIT;
> > +
> > +	strbuf_addstr(&remaining, path);
> > +
> > +	while (remaining.len) {
> > +		if (strbuf_realpath(resolved, remaining.buf, 0)) {
> > +			strbuf_addbuf(resolved, &trailing);
> > +
> > +			strbuf_release(&component);
> > +			strbuf_release(&remaining);
> > +			strbuf_release(&trailing);
> > +
> > +			return resolved->buf;
> > +		}
> > +		strip_last_component(&remaining, &component);
> > +		strbuf_insertstr(&trailing, 0, "/");
> > +		strbuf_insertstr(&trailing, 1, component.buf);
>
> I may be utterly confused, but is this where
>
>  - we started with a/b/c/d, pushed 'd' into trailing and decided
>    to redo with a/b/c
>
>  - now we split the a/b/c into a/b and c, and adjusting what is
>    in trailing from 'd' to 'c/d'
>
> takes place? It's a bit sad that we need to repeatedly use
> insertstr to prepend in front, instead of appending.

Yes, that's true. It really isn't avoidable, though, with the functions
the way that they are. We can't use the original path and keep track of
the offset because it may contain multiple path separators and we don't
want to include those in the path.
-- 
brian m. carlson: Houston, Texas, US