user/dev discussion of public-inbox itself
* [PATCH 5/4] msgtime: avoid obviously out-of-range dates (for now)
From: Eric Wong @ 2019-12-01 22:04 UTC (permalink / raw)
  To: meta

Wacky dates show up in lore for valid messages.  Let's ignore
them and let future generations deal with Y10K and time-travel
problems.
---
 lib/PublicInbox/MsgTime.pm |  6 +++++-
 t/msgtime.t                | 14 ++++++++++++--
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/MsgTime.pm b/lib/PublicInbox/MsgTime.pm
index 479aaa4ecf132..9f4326442dd11 100644
--- a/lib/PublicInbox/MsgTime.pm
+++ b/lib/PublicInbox/MsgTime.pm
@@ -38,7 +38,7 @@ sub str2date_zone ($) {
 	if ($date =~ /(?:[A-Za-z]+,?\s+)? # day-of-week
 			([0-9]+),?\s+  # dd
 			([A-Za-z]+)\s+ # mon
-			([0-9]{2,})\s+ # YYYY or YY (or YYY :P)
+			([0-9]{2,4})\s+ # YYYY or YY (or YYY :P)
 			([0-9]+)[:\.] # HH:
 				((?:[0-9]{2})|(?:\s?[0-9])) # MM
 				(?:[:\.]((?:[0-9]{2})|(?:\s?[0-9])))? # :SS
@@ -67,6 +67,10 @@ sub str2date_zone ($) {
 
 		$ts = timegm($ss // 0, $mm, $hh, $dd, $mon, $yyyy);
 
+		# 4-digit dates in non-spam from 1900s and 1910s exist in
+		# lore archives
+		return if $ts < 0;
+
 		# Compute the time offset from [+-]HHMM
 		$tz //= 0;
 		my ($tz_hh, $tz_mm);
diff --git a/t/msgtime.t b/t/msgtime.t
index 1452dc97d5b0b..cecad775769e1 100644
--- a/t/msgtime.t
+++ b/t/msgtime.t
@@ -5,7 +5,7 @@ use warnings;
 use Test::More;
 use PublicInbox::MIME;
 use PublicInbox::MsgTime;
-
+our $received_date = 'Mon, 22 Jan 2007 13:16:24 -0500';
 sub datestamp ($) {
 	my ($date) = @_;
 	local $SIG{__WARN__} = sub {};  # Suppress warnings
@@ -17,7 +17,11 @@ sub datestamp ($) {
 			Subject => 'this is a subject',
 			'Message-ID' => '<a@example.com>',
 			Date => $date,
-			'Received' => '(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S932173AbXAVSQY (ORCPT <rfc822;w@1wt.eu>);\n\tMon, 22 Jan 2007 13:16:24 -0500',
+			'Received' => <<EOF,
+(majordomo\@vger.kernel.org) by vger.kernel.org via listexpand
+\tid S932173AbXAVSQY (ORCPT <rfc822;w@1wt.eu>);
+\t$received_date
+EOF
 		],
 		body => "hello world\n",
 	    );
@@ -104,4 +108,10 @@ for (qw(UT GMT Z)) {
 }
 is_datestamp('Fri, 02 Oct 1993 00:00:00 EDT', [ 749534400, '-0400']);
 
+# fallback to Received: header if Date: is out-of-range:
+is_datestamp('Fri, 1 Jan 1904 10:12:31 +0100',
+	PublicInbox::MsgTime::str2date_zone($received_date));
+is_datestamp('Fri, 9 Mar 71685 18:45:56 +0000', # Y10K is not my problem :P
+	PublicInbox::MsgTime::str2date_zone($received_date));
+
 done_testing();
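
For readers following along, here is a minimal, self-contained sketch of the
idea in the patch above: a Date: value whose year has more than four digits
never matches, a pre-1970 date yields a negative epoch and is rejected, and
the caller then falls back to the next candidate (the Received: date in the
tests).  The names sketch_str2ts and first_usable_ts are hypothetical and
exist only for this illustration.  The sketch ignores the zone offset and is
not the actual PublicInbox::MsgTime code.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MON = (jan => 0, feb => 1, mar => 2, apr => 3, may => 4, jun => 5,
           jul => 6, aug => 7, sep => 8, oct => 9, nov => 10, dec => 11);

# Hypothetical, simplified stand-in for str2date_zone: returns a UTC
# epoch, or nothing if the date is unusable.  Zone offsets are ignored.
sub sketch_str2ts {
	my ($date) = @_;
	# {2,4} means a 5-digit year such as "71685" cannot match at all
	my ($dd, $mon, $yyyy, $hh, $mm, $ss) = $date =~
		/([0-9]+)\s+([A-Za-z]+)\s+([0-9]{2,4})\s+
		 ([0-9]+):([0-9]{2}):([0-9]{2})/x
		or return;
	my $m = $MON{lc $mon};
	return unless defined $m;
	# timegm croaks on impossible values (e.g. day 32), so trap that
	my $ts = eval { timegm($ss, $mm, $hh, $dd, $m, $yyyy) };
	return unless defined $ts;
	# dates before 1970 come out negative; treat them as out-of-range
	return if $ts < 0;
	$ts;
}

# Return the epoch of the first candidate string that parses sanely
sub first_usable_ts {
	for my $candidate (@_) {
		my $ts = sketch_str2ts($candidate);
		return $ts if defined $ts;
	}
	return;
}

my $received = 'Mon, 22 Jan 2007 13:16:24 -0500';
for my $date ('Fri, 1 Jan 1904 10:12:31 +0100',
		'Fri, 9 Mar 71685 18:45:56 +0000') {
	my $ts = first_usable_ts($date, $received);
	print "$date => $ts\n";
}
```

Both loop iterations should print the epoch derived from $received, since
neither Date: value survives the checks.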


* Re: [PATCH 5/4] msgtime: avoid obviously out-of-range dates (for now)
From: Eric Wong @ 2019-12-12  3:42 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> diff --git a/t/msgtime.t b/t/msgtime.t
> index 1452dc97d5b0b..cecad775769e1 100644
> --- a/t/msgtime.t
> +++ b/t/msgtime.t
<snip>
> @@ -17,7 +17,11 @@ sub datestamp ($) {
>  			Subject => 'this is a subject',
>  			'Message-ID' => '<a@example.com>',
>  			Date => $date,
> -			'Received' => '(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S932173AbXAVSQY (ORCPT <rfc822;w@1wt.eu>);\n\tMon, 22 Jan 2007 13:16:24 -0500',
> +			'Received' => <<EOF,
> +(majordomo\@vger.kernel.org) by vger.kernel.org via listexpand
> +\tid S932173AbXAVSQY (ORCPT <rfc822;w@1wt.eu>);
> +\t$received_date
> +EOF

Oops, accidental interpolation of @1 :x  Will squash this before
pushing:

diff --git a/t/msgtime.t b/t/msgtime.t
index cecad775..98cf66e6 100644
--- a/t/msgtime.t
+++ b/t/msgtime.t
@@ -19,7 +19,7 @@ sub datestamp ($) {
 			Date => $date,
 			'Received' => <<EOF,
 (majordomo\@vger.kernel.org) by vger.kernel.org via listexpand
-\tid S932173AbXAVSQY (ORCPT <rfc822;w@1wt.eu>);
+\tid S932173AbXAVSQY (ORCPT <rfc822;w\@1wt.eu>);
 \t$received_date
 EOF
 		],
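
A standalone sketch of the pitfall fixed above, for anyone who has not been
bitten by it yet: an unquoted <<EOF heredoc interpolates like a double-quoted
string, so an "@" followed by a word character is treated as an array unless
it is escaped.  The variable names below are made up for the illustration;
only the header text is taken from the test.

```perl
use strict;
use warnings;

my $received_date = 'Mon, 22 Jan 2007 13:16:24 -0500';

# Unescaped: "@1" inside the address is parsed as an (empty) array,
# so the address in the resulting string is mangled.
my $unescaped = <<EOF;
\tid S932173AbXAVSQY (ORCPT <rfc822;w@1wt.eu>);
\t$received_date
EOF

# Escaped: "\@" keeps the literal "@", while $received_date and "\t"
# are still interpolated as intended.
my $escaped = <<EOF;
\tid S932173AbXAVSQY (ORCPT <rfc822;w\@1wt.eu>);
\t$received_date
EOF

print "unescaped:\n$unescaped";
print "escaped:\n$escaped";
```

A single-quoted heredoc (<<'EOF') would also keep the "@", but it would stop
$received_date and "\t" from being interpolated, which is why escaping is the
smaller change here.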


2019-11-29 12:25     [PATCH 0/4] drop Date::Parse dependency Eric Wong
2019-11-29 12:25     ` [PATCH 4/4] Date::Parse is now optional Eric Wong
2019-12-01 22:04  6%   ` [PATCH 5/4] msgtime: avoid obviously out-of-range dates (for now) Eric Wong
2019-12-12  3:42  7%     ` Eric Wong

Code repositories for project(s) associated with this inbox:

	https://80x24.org/public-inbox.git
