user/dev discussion of public-inbox itself
* Re: [PATCH] lei_to_mail+mbox_reader: fix handling of empty/bogus emails
  @ 2021-09-07 20:56  6%         ` Eric Wong
  0 siblings, 0 replies; 3+ results
From: Eric Wong @ 2021-09-07 20:56 UTC
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Sat, Sep 04, 2021 at 09:36:58PM +0000, Eric Wong wrote:
> > Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> > > Yep, that seems to work fine. Question -- I noticed that lei just issues a
> > > regular query, retrieves results with curl and then parses the output. Is
> > > there a danger of potentially running into issues with parsing the regular
> > > HTML output if it changes in the future?
> > 
> > It's actually parsing gzipped mboxrd (&x=m).  But you're right
> > we could use stronger safeguards in case we see gzipped HTML or
> > something else...
> 
> Ooh, okay, I guess I should actually look at the output of the curl call. :)
> The questions I have, then:
> 
> 1. this means that each "lei up" call will be increasingly larger and larger,
>    since when we init the search with rt:, it gets resolved into a datestamp
>    (e.g. rt:2.weeks.ago becomes rt:1625699031). I'm worried that this will be
>    increasingly hard on the server side, especially if someone
>    fires-and-forgets a cronjob that ends up downloading ever-growing mboxes
>    every 5 minutes.

"rt:2.weeks.ago" stays "rt:2.weeks.ago" in saved searches :>

It was one of my primary annoyances when I initially implemented
this and commit 2e4e4b0d6f30d9d4612066395ba694c7c7d61e6e solved it.
https://public-inbox.org/meta/20210416231035.31807-2-e@80x24.org/
("lei q: --save preserves relative time queries")
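
To illustrate why keeping the relative form matters, here is a minimal
sketch in plain POSIX sh.  It does not call lei at all: resolve_query is
a hypothetical stand-in for the date resolution done at query time, it
only understands the "rt:2.weeks.ago" term quoted above, and the epoch
values are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical stand-in for resolving relative dates when a query is
# sent; only the "rt:2.weeks.ago" term is handled in this sketch.
# usage: resolve_query NOW_EPOCH TERM...
resolve_query() {
	now=$1; shift
	out=
	for term in "$@"; do
		case $term in
		rt:2.weeks.ago) term="rt:$((now - 14 * 86400))" ;;
		esac
		out="${out:+$out }$term"
	done
	printf '%s\n' "$out"
}

# The saved search keeps the relative term as typed (assumed query).
saved='z:0.. rt:2.weeks.ago'

# Two simulated "lei up" runs, one day apart.  The relative term is
# resolved again on each run, so the lower bound follows the clock.
run1=$(resolve_query 1000000000 $saved)
run2=$(resolve_query 1000086400 $saved)
printf 'run 1: %s\nrun 2: %s\n' "$run1" "$run2"
```

Had the first resolved value been stored instead, every later run would
reuse the same lower bound and the result set would keep growing, which
is the situation described in question 1.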

> 2. is there some sanity limit on the server side that would prevent someone's
>    overly broad search query from gzipping and downloading gigabytes of mail?

Not right now.  With public-inbox-httpd, the actual git fetches
are handled fairly w.r.t. other requests (and I could
deprioritize them further, if needed...).  The Xapian query OTOH...


* [PATCH 1/9] lei q: --save preserves relative time queries
  2021-04-16 23:10  7% [PATCH 0/9] lei saved search usability improvements Eric Wong
@ 2021-04-16 23:10  6% ` Eric Wong
  0 siblings, 0 replies; 3+ results
From: Eric Wong @ 2021-04-16 23:10 UTC
  To: meta

Somebody may want a saved search which consistently asks for
messages within a rolling time window.  In other words, we want
to support "lei q --save dt:last.week.." and keep the
"dt:last.week.." relative to whenever "lei up" is run.  This
ensures relative date-time specifications get used in the future
rather than converting into an absolute date-time from the
initial "lei q" invocation.
---
 lib/PublicInbox/LeiQuery.pm       |  2 +-
 lib/PublicInbox/LeiSavedSearch.pm |  5 +++--
 t/lei-q-save.t                    | 25 +++++++++++++++++++++----
 3 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 7456f7f9..7ddba4cf 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -143,7 +143,7 @@ no query allowed on command-line with --stdin
 		PublicInbox::InputPipe::consume($self->{0}, \&qstr_add, $self);
 		return;
 	}
-	$mset_opt{q_raw} = \@argv;
+	$mset_opt{q_raw} = [ @argv ]; # copy
 	$mset_opt{qstr} =
 		$self->{lse}->query_argv_to_string($self->{lse}->git, \@argv);
 	_start_query($self);
diff --git a/lib/PublicInbox/LeiSavedSearch.pm b/lib/PublicInbox/LeiSavedSearch.pm
index 815008fd..e79cf76a 100644
--- a/lib/PublicInbox/LeiSavedSearch.pm
+++ b/lib/PublicInbox/LeiSavedSearch.pm
@@ -25,12 +25,13 @@ sub new {
 	} else { # new saved search "lei q --save"
 		my $saved_dir = $lei->store_path . '/../saved-searches/';
 		my (@name) = ($lei->{ovv}->{dst} =~ m{([\w\-\.]+)/*\z});
-		push @name, to_filename($lei->{mset_opt}->{qstr});
+		my $q = $lei->{mset_opt}->{q_raw} // die 'BUG: {q_raw} missing';
+		my $q_raw_str = ref($q) ? "@$q" : $q;
+		push @name, to_filename($q_raw_str);
 		$dir = $saved_dir . join('-', @name);
 		require File::Path;
 		File::Path::make_path($dir); # raises on error
 		$self->{'-f'} = "$dir/lei.saved-search";
-		my $q = $lei->{mset_opt}->{q_raw};
 		if (ref $q) {
 			cfg_set($self, '--add', 'lei.q', $_) for @$q;
 		} else {
diff --git a/t/lei-q-save.t b/t/lei-q-save.t
index a6d579cf..6cfac20b 100644
--- a/t/lei-q-save.t
+++ b/t/lei-q-save.t
@@ -2,24 +2,41 @@
 # Copyright (C) 2021 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
+use PublicInbox::Smsg;
 my $doc1 = eml_load('t/plack-qp.eml');
+$doc1->header_set('Date', PublicInbox::Smsg::date({ds => time - (86400 * 5)}));
 my $doc2 = eml_load('t/utf8.eml');
+$doc2->header_set('Date', PublicInbox::Smsg::date({ds => time - (86400 * 4)}));
+
 test_lei(sub {
 	my $home = $ENV{HOME};
-	lei_ok qw(import -q t/plack-qp.eml);
-	lei_ok qw(q -q --save z:0..), '-o', "$home/md/";
+	my $in = $doc1->as_string;
+	lei_ok [qw(import -q -F eml -)], undef, { 0 => \$in, %$lei_opt };
+	lei_ok qw(q -q --save z:0.. d:last.week..), '-o', "$home/md/";
 	my %before = map { $_ => 1 } glob("$home/md/cur/*");
 	is_deeply(eml_load((keys %before)[0]), $doc1, 'doc1 matches');
 
 	my @s = glob("$home/.local/share/lei/saved-searches/md-*");
 	is(scalar(@s), 1, 'got one saved search');
+	my $cfg = PublicInbox::Config->new("$s[0]/lei.saved-search");
+	is_deeply($cfg->{'lei.q'}, ['z:0..', 'd:last.week..'],
+		'store relative time, not parsed (absolute) timestamp');
 
 	# ensure "lei up" works, since it compliments "lei q --save"
-	lei_ok qw(import t/utf8.eml);
-	lei_ok qw(up), $s[0];
+	$in = $doc2->as_string;
+	lei_ok [qw(import -q -F eml -)], undef, { 0 => \$in, %$lei_opt };
+	lei_ok qw(up -q), $s[0];
 	my %after = map { $_ => 1 } glob("$home/md/cur/*");
 	is(delete $after{(keys(%before))[0]}, 1, 'original message kept');
 	is(scalar(keys %after), 1, 'one new message added');
 	is_deeply(eml_load((keys %after)[0]), $doc2, 'doc2 matches');
+
+	# check stdin
+	lei_ok [qw(q --save - -o), "mboxcl2:mbcl2" ],
+		undef, { -C => $home, %$lei_opt, 0 => \'d:last.week..'};
+	@s = glob("$home/.local/share/lei/saved-searches/mbcl2-*");
+	$cfg = PublicInbox::Config->new("$s[0]/lei.saved-search");
+	is_deeply $cfg->{'lei.q'}, 'd:last.week..',
+		'q --stdin stores relative time';
 });
 done_testing;


* [PATCH 0/9] lei saved search usability improvements
@ 2021-04-16 23:10  7% Eric Wong
  2021-04-16 23:10  6% ` [PATCH 1/9] lei q: --save preserves relative time queries Eric Wong
  0 siblings, 1 reply; 3+ results
From: Eric Wong @ 2021-04-16 23:10 UTC
  To: meta

Found a few bugfixes along the way, but after thinking it over,
I think "lei up /path/to/maildir/or/mbox/or/IMAP-URI" makes the
most sense.

Eric Wong (9):
  lei q: --save preserves relative time queries
  lei: expose share_path as a method
  lei: saved searches keyed only by path/URL and format
  lei_to_mail: cast to URIimap object early
  test_common: handle '-C' (chdir) spawn option properly
  lei: fix rel2abs
  lei up: support output destination as arg
  lei q --save: avoid lei.q.format
  lei q --save: clobber config file on repeats

 lib/PublicInbox/Config.pm         |  9 ++++
 lib/PublicInbox/LEI.pm            | 19 +++++----
 lib/PublicInbox/LeiQuery.pm       |  2 +-
 lib/PublicInbox/LeiSavedSearch.pm | 71 ++++++++++++++++++++++++-------
 lib/PublicInbox/LeiToMail.pm      | 12 +++---
 lib/PublicInbox/LeiUp.pm          |  5 +--
 lib/PublicInbox/Reply.pm          | 10 +----
 lib/PublicInbox/TestCommon.pm     |  7 +++
 t/lei-q-save.t                    | 36 ++++++++++++++--
 9 files changed, 126 insertions(+), 45 deletions(-)



2021-04-16 23:10  7% [PATCH 0/9] lei saved search usability improvements Eric Wong
2021-04-16 23:10  6% ` [PATCH 1/9] lei q: --save preserves relative time queries Eric Wong
2021-09-02 21:12     Showcasing lei at Linux Plumbers Konstantin Ryabitsev
2021-09-02 21:58     ` Eric Wong
2021-09-03 15:15       ` Konstantin Ryabitsev
2021-09-04 21:36         ` [PATCH] lei_to_mail+mbox_reader: fix handling of empty/bogus emails Eric Wong
2021-09-07 18:17           ` Konstantin Ryabitsev
2021-09-07 20:56  6%         ` Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
