git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jonathan Nieder <jrnieder@gmail.com>,
	Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 1/1] mingw: only test index entries for backslashes, not tree entries
Date: Tue, 31 Dec 2019 23:51:35 +0100 (CET)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.1912312329480.46@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <xmqqblruk9lz.fsf@gitster-ct.c.googlers.com>

Hi Junio & Jonathan,

On Thu, 26 Dec 2019, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
> > Jonathan Nieder <jrnieder@gmail.com> writes:
> >
> >> Is there anything we can or should do to prevent people checking in
> >> new examples of paths with backslash in them (on all platforms)?
> >
> > I obviously won't dictate what should happen on Windows, but I think
> > the overall principle for paths recorded in a tree object that can
> > be problematic on some of the platforms ought to be:
> >
> >  * fsck and transfer.fsckobjects should be taught to notice
> >    offending characteristics (e.g. has a backslash in it, is one of
> >    the "reserved names" on some platform like LPT1).

Agree. This is on my radar, but so far not too-high priority, as the
`fsck` checks are not (yet?) standard practice in PR builds (and warnings
on the server side are prone to be ignored).

> >  * if paths with the offending characteristics are *so* obviously
> >    useless in real life and are possible only in a crafted path that
> >    is only useful to attack users, the check in fsck should default
> >    to "reject" to help the disease spread via hosting sites.

I don't think that reserved names such as `aux`, nor names containing
backslashes should be rejected _always_. While I cannot think of _any_
instance where I would want to have a backslash in a file name, I am sure
that just like `aux.c`, there _must_ be somebody out there who thought of
a file name that contains a backslash and makes at least some sort of
sense.

> >  * otherwise, the check should be to "warn" but not "reject", so
> >    that projects can keep using paths that may problematic on
> >    platforms that do not matter to them.

Yes, it should be "warn".

> > I think LPT1 and friends fall into the "warning is fine" category,
> > and a path component that contains a backslash would fall into the
> > "this is an attack, just reject" category.
>
> I guess I should have stepped back a bit.
>
> In the message I am responding to, I focused solely on how tree
> objects that originate elsewhere should be treated, but there are
> two more data paths we need to worry about:
>
>  * A new path gets introduced to the system, via "update-index",
>    "add", etc., to the index and then "write-tree" to form tree
>    objects in the local repository.

Right, that's what I had in mind when I wrote this patch. The path gets
added to the index, we detect a backslash, and on Windows (under
`core.protectNTFS`) fail with an error.

>  * A path in the index, either created locally via "update-index",
>    "add", etc., or read from a tree object, gets written to the
>    local filesystem, resulting in an attempt to create a path that
>    the local filesystem cannot represent (or worse---behaves badly,
>    like "sending random garbage to the printer").

Happily, my patch seems to catch this code path, too: when reading from a
`tree` object into the index, we use `add_index_entry()` (called via
various code paths in the `unpack_trees()` machinery). That's exactly the
patched function.

Or maybe you know of a code path in the `unpack_trees()` machinery that
does _not_ go through `add_index_entry()`? I would be very interested to
learn about such code paths.

> I think we should apply the same principle as the one I outlined for
> the tree objects.  The fsckobjects mechanism may not be reusable to
> catch violations in add_index_entry_with_check() as-is, but we need
> to aim for as much reuse of logic and code as possible so that our
> checks at various layers all behave consistently.

I am afraid that we won't be able to reuse code paths for checking the
backslash here, but for reserved names I am planning on refactoring the
code accordingly.

Thanks,
Dscho

  reply	other threads:[~2019-12-31 22:51 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-26 17:42 [PATCH 0/1] Disallow writing, but not fetching commits with file names containing backslashes Johannes Schindelin via GitGitGadget
2019-12-26 17:42 ` [PATCH 1/1] mingw: only test index entries for backslashes, not tree entries Johannes Schindelin via GitGitGadget
2019-12-26 18:56   ` Junio C Hamano
2019-12-26 21:16     ` Johannes Schindelin
2019-12-30 21:57       ` Junio C Hamano
2020-01-02 19:53         ` Johannes Schindelin
2019-12-26 20:03   ` Jonathan Nieder
2019-12-26 21:23     ` Johannes Schindelin
2019-12-26 21:42       ` Jonathan Nieder
2019-12-26 22:01         ` Junio C Hamano
2019-12-26 22:25           ` Junio C Hamano
2019-12-31 22:51             ` Johannes Schindelin [this message]
2020-01-02 19:58         ` Johannes Schindelin
2020-01-04  1:57           ` Jonathan Nieder
2020-01-04 21:29             ` Johannes Schindelin
2019-12-26 19:22 ` [PATCH 0/1] Disallow writing, but not fetching commits with file names containing backslashes Junio C Hamano
2019-12-26 21:19   ` Johannes Schindelin
2019-12-31 22:53 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
2019-12-31 22:53   ` [PATCH v2 1/1] mingw: only test index entries for backslashes, not tree entries Johannes Schindelin via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.1912312329480.46@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).