From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00 shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id 82CA51F5AE; Sat, 4 Jul 2020 20:25:25 +0000 (UTC) Date: Sat, 4 Jul 2020 20:25:25 +0000 From: Eric Wong To: "Eric W. Biederman" Cc: meta@public-inbox.org Subject: [PATCH] t/import: test for nasty characters Message-ID: <20200704202525.GA19556@dcvr> References: <87imf4qn87.fsf@x220.int.ebiederm.org> <20200703233032.GA5810@dcvr> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20200703233032.GA5810@dcvr> List-Id: Eric Wong wrote: > "Eric W. Biederman" wrote: > > - $name =~ tr/<>//d; > > + $name =~ tr/\n\r<>$/ /d; > > Is getting rid of '$' an effort to avoid double interpolation by Perl? > Perl won't recursively expand variables AFAIK. I'm not seeing the purpose in $ being grouped with the characters (test below confirms it, I think). > I agree with dropping \r and \n, though. Actually, it's already dropped in git since June. > > - print $w "commit $ref\nmark :$commit\n", > > - "author $name <$email> $at\n", > > - "committer $self->{ident} $ct\n" or wfail; > > + # Be very careful with the strings from the email > > + print $w "commit ", $ref, "\nmark :", $commit, "\n", > > + "author ", $name, " <", $email, "> ", $at, "\n", > > + "committer ", $self->{ident}, " ", $ct, "\n" or wfail; > > I haven't tested, but I'm not seeing this hunk as necessary > once \r\n<> are removed. Thanks. I've tested now, and those changes to print seem unnecessary. So I think just testing for these cases is enough. Thanks again. ---------8<---------- Subject: [PATCH] t/import: test for nasty characters Spammers may send emails with nasty characters which can throw off git-fast-import. Users with non-existent or weaker spam filters may be susceptible to corruption in the fast-import stream as a result. This was actually quietly fixed in git on 2020-06-01 by commit 9ab886546cc89f37819e1ef09cb49fd9325b3a41 ("smsg: introduce ->populate method"), but no test case was created. Reported-by: Eric W. Biederman Link: https://public-inbox.org/meta/87imf4qn87.fsf@x220.int.ebiederm.org/ Link: https://public-inbox.org/meta/20200601100657.14700-6-e@yhbt.net/ --- t/import.t | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/t/import.t b/t/import.t index abbc8229d..9491f3374 100644 --- a/t/import.t +++ b/t/import.t @@ -11,6 +11,7 @@ use PublicInbox::Spawn qw(spawn); use Fcntl qw(:DEFAULT SEEK_SET); use File::Temp qw/tempfile/; use PublicInbox::TestCommon; +use MIME::Base64 3.05; # Perl 5.10.0 / 5.9.2 my ($dir, $for_destroy) = tmpdir(); my $git = PublicInbox::Git->new($dir); @@ -103,4 +104,27 @@ eval { }; ok($@, 'Import->add fails on non-existent dir'); +my @cls = qw(PublicInbox::Eml); +SKIP: { + require_mods('PublicInbox::MIME', 1); + push @cls, 'PublicInbox::MIME'; +}; + +$main::badchars = "\n\0\r"; +my $from = '=?UTF-8?B?'. encode_base64("B\ra\nd\0\$main::badchars", ''). '?='; +for my $cls (@cls) { + my $eml = $cls->new(< +Message-ID: <$cls\@example.com> + +EOF + ok($im->add($eml), "added $cls message with nasty char in From"); +} +$im->done; +my $bref = $git->cat_file('HEAD'); +like($$bref, qr/^author Ba d \$main::badchars /sm, + 'latest commit accepted by spammer'); +$git->qx(qw(fsck --no-progress --strict)); +is($?, 0, 'fsck reported no errors'); + done_testing();