user/dev discussion of public-inbox itself
From: ebiederm@xmission.com (Eric W. Biederman)
To: Eric Wong <e@80x24.org>
Cc: meta@public-inbox.org
Subject: [PATCH] Import: Don't copy nulls from emails into git
Date: Sat, 07 Jul 2018 13:22:28 -0500
Message-ID: <87d0vysy6z.fsf_-_@xmission.com>
In-Reply-To: <20180706222251.GA14747@dcvr> (Eric Wong's message of "Fri, 6 Jul 2018 22:22:51 +0000")


Recently I ran git --git-dir=lkml/git/1.git fsck
and it reported:
> warning in commit 299dbd50b6995c6debe2275f0df984ce697fb4cc: nulInCommit: NULL byte in the commit object body

I found that quite scary.  Nulls in the wrong place have a bad tendency
to make programs misbehave.

It turns out someone had placed "=?iso-8859-1?q?=00?=" at the end of
their subject line, which is the MIME encoded-word form of a NUL byte.
Email::MIME had correctly decoded the header, and then public-inbox had
simply copied the contents of the header into the subject line of the
git commit.

To prevent that from causing problems, replace nulls in such subject
lines with spaces.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
---
 lib/PublicInbox/Import.pm |  2 ++
 t/nulsubject.t            | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)
 create mode 100644 t/nulsubject.t
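
The sanitization step can be sketched in isolation.  This is only a
minimal illustration, assuming a decoded subject that may carry more
than one NUL; sanitize_subject and sanitize_first_only are hypothetical
helper names and are not part of public-inbox.  It compares a
substitution without /g against tr///, which covers every occurrence.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of public-inbox): replace every NUL
# byte in an already-decoded subject with a space.
sub sanitize_subject {
	my ($subject) = @_;
	$subject =~ tr/\0/ /;	# tr/// transliterates all occurrences
	return $subject;
}

# For comparison: s/// without /g stops after the first match.
sub sanitize_first_only {
	my ($subject) = @_;
	$subject =~ s/\0/ /;
	return $subject;
}

my $decoded = "two\0nulls\0here";
printf "all NULs replaced: %s\n",
	sanitize_subject($decoded) =~ /\0/ ? 'no' : 'yes';
printf "first-only leaves a NUL: %s\n",
	sanitize_first_only($decoded) =~ /\0/ ? 'yes' : 'no';
```

Either tr/\0/ / or s/\0/ /g handles repeated NULs; which one to use is
a matter of taste.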

diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index 250a2db31e97..8c1819209e58 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -405,6 +405,8 @@ sub add {
 		print $w "reset $ref\n" or wfail;
 	}
 
+	# MIME decoding can create nulls; replace them with spaces to protect git
+	$subject =~ s/\0/ /g;
 	utf8::encode($subject);
 	print $w "commit $ref\nmark :$commit\n",
 		"author $name <$email> $author_time_raw\n",
diff --git a/t/nulsubject.t b/t/nulsubject.t
new file mode 100644
index 000000000000..bb05be8589e7
--- /dev/null
+++ b/t/nulsubject.t
@@ -0,0 +1,33 @@
+# Copyright (C) 2016-2018 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use warnings;
+use Test::More;
+use File::Temp qw/tempdir/;
+
+use_ok 'PublicInbox::Import';
+use_ok 'PublicInbox::Git';
+my $tmpdir = tempdir('pi-nulsubject-XXXXXX', TMPDIR => 1, CLEANUP => 1);
+my $git_dir = "$tmpdir/a.git";
+
+{
+	is(system(qw(git init -q --bare), $git_dir), 0, 'git init ok');
+	my $git = PublicInbox::Git->new($git_dir);
+	my $im = PublicInbox::Import->new($git, 'testbox', 'test@example');
+	$im->add(Email::MIME->create(
+		header => [
+			From => 'a@example.com',
+			To => 'b@example.com',
+			'Content-Type' => 'text/plain',
+			Subject => ' A subject line with a null =?iso-8859-1?q?=00?= see!',
+			'Message-ID' => '<null-test.a@example.com>',
+		],
+		body => "hello world\n",
+	));
+	$im->done;
+	is(system(qw(git --git-dir), $git_dir, 'fsck', '--strict'), 0, 'git fsck ok');
+}
+
+done_testing();
+
+1;
-- 
2.17.1

