git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Johannes Sixt <j6t@kdbg.org>, Klaus Ethgen <Klaus@Ethgen.ch>,
	git@vger.kernel.org
Subject: Re: [BUG] Colon in remote urls
Date: Sat, 10 Dec 2016 03:51:34 -0500	[thread overview]
Message-ID: <20161210085133.2pnkz6eqlxoxdckg@sigill.intra.peff.net> (raw)
In-Reply-To: <xmqqwpf8rkeq.fsf@gitster.mtv.corp.google.com>

On Fri, Dec 09, 2016 at 11:07:25AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > (One other option is to just declare that the quarantine feature doesn't
> > work with colons in the pathname, but stop turning it on by default. I'm
> > not sure I like that, though).
> 
> I think we long time ago in 2005 have declared that a colon in a
> directory name would not work for Git repositories because of things
> like GIT_CEILING_DIRECTORIES, GIT_ALTERNATE_OBJECT_DIRECTORIES; so I
> do not think we terribly mind that direction.

I don't mind declaring the feature incompatible. But I'm not sure
whether I like not having it on by default, if only because it means we
have two distinct code paths to care about. They're sufficiently
different that I'd worry about a bug creeping into one and not the
other.

> > Here's a rough idea of what the quoting solution could look like. It
> > should make your case work, but I'm not sure what we want to do about
> > backwards-compatibility, if anything.
> 
> Yes, obviously it robs from those with backslash in their pathnames
> to please those with colons; we've never catered to the latter, so I
> am not sure if the trade-off is worth it.
> 
> I can see how adding a new environment variable could work, though.
> A naive way would be to add GIT_ALT_OBJ_DIRS that uses your quoting
> when splitting its elements, give it precedence over the existing
> one (or allow to use both and take union as the set of alternate
> object stores) and make sure that the codepaths we use internally
> uses the new variable.

Yeah, a new variable solves the backwards-compatibility issue, though if
we continue to use backslash as an escape, then people on Windows will
annoying _have_ to backslash-escape "C:\\" (worse, AIUI the conversion
from "/unix/path" to "C:\unix\path" happens transparently via msys
magic, and it would not know that we need to quote).

I think the least-gross alternative would be to make newline the new
delimiter. It's already the delimiter (and not quotable) in the
objects/info/alternates file, and I have a lot less trouble telling
people with newline in their filenames that they're Doing It Wrong.

> But if the quarantine codepath will stay to
> be the only user of this mechanism (and I strongly suspect that will
> be the case), the new environment could just be pointing at a single
> directory, i.e. GIT_OBJECT_QUARANTINE_DIRECTORY, whose value is
> added without splitting to the list of alternate object stores, and
> the quarantine codepath can export that instead.

Yes, I also considered that approach. The big downside is that it does
not help other users of GIT_ALTERNATE_OBJECT_DIRECTORIES. I doubt that
matters much in practice, though.

My other question is whether we care about compatibility between Git
versions here. If receive-pack sets the variable, we can assume that
index-pack will be run from the same version. But we also run hooks with
the quarantine variables set. The nice thing about the existing scheme
is that older versions of Git (or alternate implementations of Git) will
just work, because it builds on a mechanism that has existed for ages.

And that's the one thing that a quoting scheme has going for it: it only
impacts the variable when there is something to be quoted. So if your
repo path has a colon in it (or semi-colon on Windows) _and_ you are
calling something like jgit from your hook, then you might see a
failure. But either of those by itself is not a problem.

Side note: It makes me wonder if all implementations even support
GIT_ALTERNATE_OBJECT_DIRECTORIES. JGit seems to, looks like libgit2 does
not. The most backwards-compatible thing is not enabling quarantine by
default, and then there's no chance of regression, and you are
responsible for making sure you have a compatible setup before turning
the feature on. But like I said, I'd love to avoid that if we can.

So here's the array of options, as I see it:

  1. Disable quarantine by default; only turn it on if you don't have
     any of the funny corner cases.

  2. Introduce an extra single-item environment variable that gets
     appended to the list of alternates, and side-steps quoting issues.

  3. Introduce a new variable that can hold an alternate list, and
     either teach it reasonable quoting semantics, or give it a
     less-common separator.

  4. Introduce quoting to the existing variable, but make the quoting
     character sufficiently obscure that nobody cares about the lack of
     backwards compatibility.

I actually think (4) is reasonably elegant, except that the resulting
quoting scheme would be vaguely insane to look at. E.g., if we pick
newline as the quote character (not the separator), then you end up
with:

  $ repo=foo:bar.git
  $ GIT_ALTERNATE_OBJECT_DIRECTORIES=$(echo $repo | perl -pe 's/:/\n$&/')
  $ echo "$GIT_ALTERNATE_OBJECT_DIRECTORIES"
  foo
  :bar.git

It's pleasant for machines, but not for humans.

So I dunno. Opinions on those 4 options are welcome.

-Peff

  reply	other threads:[~2016-12-10  8:52 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-09 14:02 [BUG] Colon in remote urls Klaus Ethgen
2016-12-09 15:22 ` Jeff King
2016-12-09 19:07   ` Junio C Hamano
2016-12-10  8:51     ` Jeff King [this message]
2016-12-12 19:49       ` [PATCH 0/2] handling alternates paths with colons Jeff King
2016-12-12 19:52         ` [PATCH 1/2] alternates: accept double-quoted paths Jeff King
2016-12-13 11:30           ` Duy Nguyen
2016-12-13 11:52             ` Jeff King
2016-12-13 18:08             ` Junio C Hamano
2016-12-17  7:56               ` Duy Nguyen
2016-12-12 19:53         ` [PATCH 2/2] tmp-objdir: quote paths we add to alternates Jeff King
2016-12-12 20:53           ` Johannes Sixt
2016-12-13 11:44             ` Jeff King
2016-12-13 18:10               ` Junio C Hamano
2016-12-13 18:15                 ` Jeff King
2016-12-13 20:10                   ` Junio C Hamano
2016-12-13 19:09           ` [PATCH 3/2] t5547-push-quarantine: run the path separator test on Windows, too Johannes Sixt
2016-12-13 19:12             ` Jeff King
2016-12-13 19:23             ` Junio C Hamano
2016-12-21 21:33             ` [PATCH 4/2] t5615-alternate-env: double-quotes in file names do not work on Windows Johannes Sixt
2016-12-21 22:42               ` Jeff King
2016-12-22  6:13                 ` Johannes Sixt
2016-12-22 19:06                   ` Johannes Sixt
2016-12-22 21:38                     ` Johannes Schindelin
2016-12-22 22:19                     ` Jeff King
2016-12-12 22:37         ` [PATCH 0/2] handling alternates paths with colons Junio C Hamano
2016-12-13 11:50           ` Jeff King
2016-12-13 18:10             ` Junio C Hamano
2016-12-13 18:17               ` Jeff King
2016-12-13 18:42                 ` Junio C Hamano
2016-12-13 18:47                   ` Jeff King
2016-12-10  9:29     ` [BUG] Colon in remote urls Klaus Ethgen
2016-12-10 10:24       ` Jeff King
2016-12-10 10:46         ` Klaus Ethgen
2016-12-09 21:32   ` Johannes Sixt
2016-12-10  8:26     ` Jeff King
2016-12-10  9:41       ` Klaus Ethgen
2016-12-10 10:26         ` Jeff King
2016-12-10 10:51           ` Klaus Ethgen
2016-12-10  9:32     ` Klaus Ethgen
2016-12-10 18:18       ` Johannes Schindelin
2016-12-11 11:02         ` Klaus Ethgen
2016-12-11 14:51           ` Philip Oakley
2016-12-12 11:03           ` Johannes Schindelin
2016-12-12 11:50             ` Klaus Ethgen
2016-12-12 14:05               ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161210085133.2pnkz6eqlxoxdckg@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=Klaus@Ethgen.ch \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).