user/dev discussion of public-inbox itself
From: Eric Wong <e@yhbt.net>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: meta@public-inbox.org
Subject: [PATCH] t/import: test for nasty characters
Date: Sat, 4 Jul 2020 20:25:25 +0000
Message-ID: <20200704202525.GA19556@dcvr>
In-Reply-To: <20200703233032.GA5810@dcvr>

Eric Wong <e@yhbt.net> wrote:
> "Eric W. Biederman" <ebiederm@xmission.com> wrote:
> > -		$name =~ tr/<>//d;
> > +		$name =~ tr/\n\r<>$/     /d;
> 
> Is getting rid of '$' an effort to avoid double interpolation by Perl?
> Perl won't recursively expand variables AFAIK.

I'm not seeing the purpose in $ being grouped with the
characters (test below confirms it, I think).
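A minimal sketch of why '$' should be harmless here, assuming only
core Perl semantics: interpolation happens once, so a '$' that arrives
inside a variable's value is copied through as-is.  The variable names
and the sample address are made up for illustration and are not taken
from Import.pm.

```perl
use strict;
use warnings;

# Hypothetical global that a hostile name might try to reference.
our $badchars = "\n\0\r";

# Single quotes: the text '$main::badchars' is stored literally.
my $name = 'Bad $main::badchars';

# Double quotes expand $name exactly once.  The '$main::badchars'
# text inside its value is not scanned for variables again.
my $line = "author $name <spammer\@example.com>\n";

print $line;
```

The printed line still contains the literal text $main::badchars,
which is what the regexp in the test below looks for.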

> I agree with dropping \r and \n, though.

Actually, they've already been dropped in git since June.

> > -	print $w "commit $ref\nmark :$commit\n",
> > -		"author $name <$email> $at\n",
> > -		"committer $self->{ident} $ct\n" or wfail;
> > +	# Be very careful with the strings from the email
> > +	print $w "commit ", $ref, "\nmark :", $commit, "\n",
> > +		"author ", $name, " <", $email, "> ", $at, "\n",
> > +		"committer ", $self->{ident}, " ", $ct, "\n" or wfail;
> 
> I haven't tested, but I'm not seeing this hunk as necessary
> once \r\n<> are removed.  Thanks.

I've tested now, and those changes to print seem unnecessary.
So I think just testing for these cases is enough.
Thanks again.
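For reference, a small sketch of why the two print forms should be
equivalent, assuming $, (the output field separator) is left at its
default of undef.  The values are hypothetical stand-ins for what
Import.pm would pass; this is not the actual import code.

```perl
use strict;
use warnings;

# Hypothetical, already-sanitized values.
my ($name, $email, $at) =
	('Ba d $x', 'spammer@example.com', '1593894325 +0000');

# Form 1: one interpolated string, as in the existing code.
my $interp = "author $name <$email> $at\n";

# Form 2: a list of pieces, as in the proposed hunk.  With $, unset,
# print LIST writes the same bytes as join('', LIST).
my $joined = join('', "author ", $name, " <", $email, "> ", $at, "\n");

print $interp eq $joined ? "identical\n" : "different\n";
```

Both forms build the same string, so any protection has to come from
sanitizing $name and $email before they reach print.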

---------8<----------
Subject: [PATCH] t/import: test for nasty characters

Spammers may send emails with nasty characters which can throw
off git-fast-import.  Users with non-existent or weaker spam
filters may be susceptible to corruption in the fast-import
stream as a result.

This was actually quietly fixed in git on 2020-06-01 by
commit 9ab886546cc89f37819e1ef09cb49fd9325b3a41
("smsg: introduce ->populate method"), but no test case
was created.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Link: https://public-inbox.org/meta/87imf4qn87.fsf@x220.int.ebiederm.org/
Link: https://public-inbox.org/meta/20200601100657.14700-6-e@yhbt.net/
---
 t/import.t | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/import.t b/t/import.t
index abbc8229d..9491f3374 100644
--- a/t/import.t
+++ b/t/import.t
@@ -11,6 +11,7 @@ use PublicInbox::Spawn qw(spawn);
 use Fcntl qw(:DEFAULT SEEK_SET);
 use File::Temp qw/tempfile/;
 use PublicInbox::TestCommon;
+use MIME::Base64 3.05; # Perl 5.10.0 / 5.9.2
 my ($dir, $for_destroy) = tmpdir();
 
 my $git = PublicInbox::Git->new($dir);
@@ -103,4 +104,27 @@ eval {
 };
 ok($@, 'Import->add fails on non-existent dir');
 
+my @cls = qw(PublicInbox::Eml);
+SKIP: {
+	require_mods('PublicInbox::MIME', 1);
+	push @cls, 'PublicInbox::MIME';
+};
+
+$main::badchars = "\n\0\r";
+my $from = '=?UTF-8?B?'. encode_base64("B\ra\nd\0\$main::badchars", ''). '?=';
+for my $cls (@cls) {
+	my $eml = $cls->new(<<EOF);
+From: $from <spammer\@example.com>
+Message-ID: <$cls\@example.com>
+
+EOF
+	ok($im->add($eml), "added $cls message with nasty char in From");
+}
+$im->done;
+my $bref = $git->cat_file('HEAD');
+like($$bref, qr/^author Ba d \$main::badchars <spammer\@example\.com> /sm,
+	 'latest commit accepted by spammer');
+$git->qx(qw(fsck --no-progress --strict));
+is($?, 0, 'fsck reported no errors');
+
 done_testing();


Thread overview: 6+ messages
2020-07-03 21:53 [PATCH] Import: Be more careful with names in email Eric W. Biederman
2020-07-03 23:30 ` Eric Wong
2020-07-04 20:25   ` Eric Wong [this message]
2020-07-04 20:28     ` [PATCH] t/import: test for nasty characters Eric W. Biederman
2020-07-04 21:24       ` Eric Wong
2020-07-05 14:55       ` Leah Neukirchen

Code repositories for project(s) associated with this inbox:

	https://80x24.org/public-inbox.git
