From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 25/24] tighten up digit matches to ASCII for git output
Date: Wed, 5 Jun 2019 02:18:48 +0000
Message-ID: <20190605021848.29258-1-e@80x24.org>
In-Reply-To: <20190604112748.23598-1-e@80x24.org>

While I don't expect git to suddenly start spewing non-ASCII
digits in places I'd expect ASCII, this would make things easier
for future hackers and reviewers.
---
 lib/PublicInbox/Git.pm      |  4 ++--
 lib/PublicInbox/Import.pm   | 10 +++++-----
 script/public-inbox-convert |  6 +++---
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index 9014e02..68445b3 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -141,7 +141,7 @@ again:
 		}
 		return;
 	}
-	$head =~ /^[0-9a-f]{40} \S+ (\d+)$/ or
+	$head =~ /^[0-9a-f]{40} \S+ ([0-9]+)$/ or
 		fail($self, "Unexpected result from git cat-file: $head");
 
 	my $size = $1;
@@ -319,7 +319,7 @@ sub modified ($) {
 	foreach my $oid (<$fh>) {
 		chomp $oid;
 		my $buf = cat_file($self, $oid) or next;
-		$$buf =~ /^committer .*?> (\d+) [\+\-]?\d+/sm or next;
+		$$buf =~ /^committer .*?> ([0-9]+) [\+\-]?[0-9]+/sm or next;
 		my $cmt_time = $1;
 		$modified = $cmt_time if $cmt_time > $modified;
 	}
diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index 81a38fb..2c4bad9 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -106,7 +106,7 @@ sub _cat_blob ($$$) {
 	local $/ = "\n";
 	my $info = <$r>;
 	defined $info or die "EOF from fast-import / cat-blob: $!";
-	$info =~ /\A[a-f0-9]{40} blob (\d+)\n\z/ or return;
+	$info =~ /\A[a-f0-9]{40} blob ([0-9]+)\n\z/ or return;
 	my $left = $1;
 	my $offset = 0;
 	my $buf = '';
@@ -493,9 +493,9 @@ sub clean_purge_buffer {
 
 	foreach my $i (0..$#$buf) {
 		my $l = $buf->[$i];
-		if ($l =~ /^author .* (\d+ [\+-]?\d+)$/) {
+		if ($l =~ /^author .* ([0-9]+ [\+-]?[0-9]+)$/) {
 			$buf->[$i] = "author <> $1\n";
-		} elsif ($l =~ /^data (\d+)/) {
+		} elsif ($l =~ /^data ([0-9]+)/) {
 			$buf->[$i++] = "data " . length($cmt_msg) . "\n";
 			$buf->[$i] = $cmt_msg;
 			last;
@@ -525,7 +525,7 @@ sub purge_oids {
 					@buf = ();
 				}
 				push @buf, "commit $tmp\n";
-			} elsif (/^data (\d+)/) {
+			} elsif (/^data ([0-9]+)/) {
 				# only commit message, so $len is small:
 				my $len = $1; # + 1 for trailing "\n"
 				push @buf, $_;
@@ -557,7 +557,7 @@ sub purge_oids {
 				@buf = ();
 			} elsif ($_ eq "done\n") {
 				$done = 1;
-			} elsif (/^mark :(\d+)$/) {
+			} elsif (/^mark :([0-9]+)$/) {
 				push @buf, $_;
 				$mark = $1;
 			} else {
diff --git a/script/public-inbox-convert b/script/public-inbox-convert
index bd8fb98..99480c3 100755
--- a/script/public-inbox-convert
+++ b/script/public-inbox-convert
@@ -103,7 +103,7 @@ while (<$rd>) {
 		$state = 'blob';
 	} elsif (/^commit /) {
 		$state = 'commit';
-	} elsif (/^data (\d+)/) {
+	} elsif (/^data ([0-9]+)/) {
 		my $len = $1;
 		$w->print($_) or $im->wfail;
 		while ($len) {
@@ -114,7 +114,7 @@ while (<$rd>) {
 		}
 		next;
 	} elsif ($state eq 'commit') {
-		if (m{^M 100644 :(\d+) (${h}{2}/${h}{38})}o) {
+		if (m{^M 100644 :([0-9]+) (${h}{2}/${h}{38})}o) {
 			my ($mark, $path) = ($1, $2);
 			$D{$path} = $mark;
 			if ($last && $last ne 'm') {
@@ -134,7 +134,7 @@ while (<$rd>) {
 			$last = 'd';
 			next;
 		}
-		if (m{^from (:\d+)}) {
+		if (m{^from (:[0-9]+)}) {
 			$prev = $from;
 			$from = $1;
 			# no next
-- 
EW
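For reviewers who want to see the difference this patch guards against, here is a minimal standalone sketch (not part of the patch). It assumes a line shaped like the `git cat-file --batch` header matched in Git.pm; the helper names `size_via_backslash_d` and `size_via_ascii_class` are hypothetical and exist only for this illustration. In Perl, `\d` can match any Unicode decimal digit in a character string, whereas `[0-9]` only matches ASCII digits.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; neither exists in public-inbox.
# Both mirror the shape of the cat-file header regexp touched in Git.pm.
sub size_via_backslash_d {
	my ($line) = @_;
	return $line =~ /^[0-9a-f]{40} \S+ (\d+)$/ ? $1 : undef;
}

sub size_via_ascii_class {
	my ($line) = @_;
	return $line =~ /^[0-9a-f]{40} \S+ ([0-9]+)$/ ? $1 : undef;
}

my $oid = 'a' x 40; # placeholder object ID, not a real one
my $ascii_line = "$oid blob 123";
# ARABIC-INDIC DIGIT ONE, TWO, THREE instead of ASCII "123":
my $odd_line = "$oid blob \x{0661}\x{0662}\x{0663}";

for my $t ([ 'ASCII digits', $ascii_line ], [ 'non-ASCII digits', $odd_line ]) {
	my ($desc, $line) = @$t;
	printf("%-16s \\d: %s  [0-9]: %s\n", $desc,
		defined(size_via_backslash_d($line)) ? 'match' : 'no match',
		defined(size_via_ascii_class($line)) ? 'match' : 'no match');
}
```

On Perl 5.14 and later, the `/a` regexp modifier would also restrict `\d` to ASCII; the patch spells out `[0-9]` instead, which makes the intent visible at each call site.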