user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download mbox.gz: |
* [PATCH 25/34] watch: remove {mdir} array
  2020-06-27 10:03  7% [PATCH 00/34] watch: add IMAP and NNTP support Eric Wong
@ 2020-06-27 10:03  6% ` Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2020-06-27 10:03 UTC (permalink / raw)
  To: meta

Since we store all watched directory names as keys in %mdmap,
there should be no need to keep an array of those directories
around.

t/watch_maildir*.t required changes to remove trained spam.
Once we've trained something as spam, there shouldn't be
a need to rescan it.
---
 lib/PublicInbox/WatchMaildir.pm | 22 ++++++++--------------
 t/watch_maildir.t               |  2 ++
 t/watch_maildir_v2.t            |  2 ++
 3 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/lib/PublicInbox/WatchMaildir.pm b/lib/PublicInbox/WatchMaildir.pm
index 621d41bd81d..8d2dc432684 100644
--- a/lib/PublicInbox/WatchMaildir.pm
+++ b/lib/PublicInbox/WatchMaildir.pm
@@ -40,8 +40,7 @@ sub compile_watchheaders ($) {
 
 sub new {
 	my ($class, $config) = @_;
-	my (%mdmap, @mdir, $spamc);
-	my %uniq; # directory => count
+	my (%mdmap, $spamc);
 	my %imap; # url => [inbox objects] or 'watchspam'
 
 	# "publicinboxwatch" is the documented namespace
@@ -54,10 +53,7 @@ sub new {
 		for my $dir (@$dirs) {
 			if (is_maildir($dir)) {
 				# skip "new", no MUA has seen it, yet.
-				my $cur = "$dir/cur";
-				push @mdir, $cur;
-				$uniq{$cur}++;
-				$mdmap{$cur} = 'watchspam';
+				$mdmap{"$dir/cur"} = 'watchspam';
 			} elsif (my $url = imap_url($dir)) {
 				$imap{$url} = 'watchspam';
 			} else {
@@ -83,8 +79,6 @@ sub new {
 				my ($new, $cur) = ("$watch/new", "$watch/cur");
 				my $cur_dst = $mdmap{$cur} //= [];
 				return if is_watchspam($cur, $cur_dst, $ibx);
-				push @mdir, $new unless $uniq{$new}++;
-				push @mdir, $cur unless $uniq{$cur}++;
 				push @{$mdmap{$new} //= []}, $ibx;
 				push @$cur_dst, $ibx;
 			} elsif (my $url = imap_url($watch)) {
@@ -96,17 +90,16 @@ sub new {
 			}
 		}
 	});
-	return unless scalar(@mdir) || scalar(keys %imap);
 
 	my $mdre;
-	if (@mdir) {
-		$mdre = join('|', map { quotemeta($_) } @mdir);
+	if (scalar keys %mdmap) {
+		$mdre = join('|', map { quotemeta($_) } keys %mdmap);
 		$mdre = qr!\A($mdre)/!;
 	}
+	return unless $mdre || scalar(keys %imap);
 	bless {
 		spamcheck => $spamcheck,
 		mdmap => \%mdmap,
-		mdir => \@mdir,
 		mdre => $mdre,
 		config => $config,
 		imap => scalar keys %imap ? \%imap : undef,
@@ -231,7 +224,8 @@ sub watch_fs_init ($) {
 		$self->{done_timer} //= PublicInbox::DS::requeue($done);
 	};
 	require PublicInbox::DirIdle;
-	PublicInbox::DirIdle->new($self->{mdir}, $cb); # EPOLL_CTL_ADD
+	# inotify_create + EPOLL_CTL_ADD
+	PublicInbox::DirIdle->new([keys %{$self->{mdmap}}], $cb);
 }
 
 # returns the git config section name, e.g [imap "imaps://user@example.com"]
@@ -688,7 +682,7 @@ sub fs_scan_step {
 		$opendirs->{$dir} = $dh if $n < 0;
 	}
 	if ($op && $op eq 'full') {
-		foreach my $dir (@{$self->{mdir}}) {
+		foreach my $dir (keys %{$self->{mdmap}}) {
 			next if $opendirs->{$dir}; # already in progress
 			my $ok = opendir(my $dh, $dir);
 			unless ($ok) {
diff --git a/t/watch_maildir.t b/t/watch_maildir.t
index c8658140cf2..c44273f0519 100644
--- a/t/watch_maildir.t
+++ b/t/watch_maildir.t
@@ -84,6 +84,7 @@ PublicInbox::WatchMaildir->new($config)->scan('full');
 is(scalar @list, 2, 'two revisions in rev-list');
 @list = $git->qx(qw(ls-tree -r --name-only refs/heads/master));
 is(scalar @list, 0, 'tree is empty');
+is(unlink(glob("$spamdir/cur/*")), 1, 'unlinked trained spam');
 
 # check with scrubbing
 {
@@ -105,6 +106,7 @@ More majordomo info at  http://vger.kernel.org/majordomo-info.html\n);
 	is(scalar @list, 0, 'tree is empty');
 	@list = $git->qx(qw(rev-list refs/heads/master));
 	is(scalar @list, 4, 'four revisions in rev-list');
+	is(unlink(glob("$spamdir/cur/*")), 1, 'unlinked trained spam');
 }
 
 {
diff --git a/t/watch_maildir_v2.t b/t/watch_maildir_v2.t
index 6cc8b6ff0e9..f5b8e932985 100644
--- a/t/watch_maildir_v2.t
+++ b/t/watch_maildir_v2.t
@@ -71,6 +71,7 @@ $write_spam->();
 is(unlink(glob("$maildir/new/*")), 1, 'unlinked old spam');
 PublicInbox::WatchMaildir->new($config)->scan('full');
 is(($srch->reopen->query(''))[0], 0, 'deleted file');
+is(unlink(glob("$spamdir/cur/*")), 1, 'unlinked trained spam');
 
 # check with scrubbing
 {
@@ -90,6 +91,7 @@ More majordomo info at  http://vger.kernel.org/majordomo-info.html\n);
 	PublicInbox::WatchMaildir->new($config)->scan('full');
 	($nr, $msgs) = $srch->reopen->query('');
 	is($nr, 0, 'inbox is empty again');
+	is(unlink(glob("$spamdir/cur/*")), 1, 'unlinked trained spam');
 }
 
 {

^ permalink raw reply related	[relevance 6%]

* [PATCH 00/34] watch: add IMAP and NNTP support
@ 2020-06-27 10:03  7% Eric Wong
  2020-06-27 10:03  6% ` [PATCH 25/34] watch: remove {mdir} array Eric Wong
  0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2020-06-27 10:03 UTC (permalink / raw)
  To: meta

Some fairly major changes to -watch.  Filesys::Notify::Simple is
no longer used, and -watch now uses inotify, signalfd or kevent
like the read-only daemons.

Credentials are handled via Net::Netrc (Perl standard library)
or "git-credential", so we do no password storage on our own.

NNTP (and non-IDLE IMAP) may allow more parallelization in the
future.

One significant project-wide change is getting rid of "use
fields".  It gets in my way more than it helps, and it's
probably alien to a fair amount of Perl hackers.  AFAIK, it's
never really been popular outside of Danga::Socket-based
projects.


Eric W. Biederman (1):
  IMAPTracker: Add a helper to track our place in reading imap mailboxes

Eric Wong (33):
  inboxwritable: ensure ssoma.lock exists on init
  inbox: warn on ->on_inbox_unlock exception
  imaptracker: use ~/.local/share/public-inbox/imap.sqlite3
  watchmaildir: hoist out compile_watchheaders
  watchmaildir: fix check for spam vs ham inbox conflicts
  URI IMAP support
  watch: preliminary IMAP support
  kqnotify|fake_inotify: detect Maildir write ops
  watch: remove Filesys::Notify::Simple dependency
  watch: use signalfd for Maildir watching
  ds: remove fields.pm usage
  watch: wire up IMAP IDLE reapers to DS
  watch: support IMAP polling
  config: support ->urlmatch method for -watch
  watch: stop importers before forking
  watch: use UID SEARCH to avoid empty UID FETCH
  ds: add_timer: allow passing arg to callback.
  imaptracker: add {url} field to reduce args
  imaptracker: drop {dbname} field
  watch: avoid long transaction when writing to IMAPTracker
  watch: support imap.fetchBatchSize parameter
  watch: imap: be quieter about disconnecting on quit
  watch: support multiple watch: directives per-inbox
  watch: remove {mdir} array
  watch: just use ->urlmatch
  testcommon: $ENV{TAIL} supports non-@ARGV redirects
  watch: add NNTP support
  watch: show user-specified URL consistently.
  watch: enable autoflush for STDOUT and STDERR
  watch: use our own "git credential" wrapper
  watch: support ~/.netrc via Net::Netrc
  imaptracker: use flock(2) around writes
  watch: simplify internal structures

 Documentation/public-inbox-watch.pod |   3 +-
 INSTALL                              |   8 -
 MANIFEST                             |  11 +
 Makefile.PL                          |   4 -
 ci/deps.perl                         |   1 -
 lib/PublicInbox/Config.pm            |  21 +-
 lib/PublicInbox/DS.pm                |  29 +-
 lib/PublicInbox/Daemon.pm            |  19 +-
 lib/PublicInbox/DirIdle.pm           |  49 ++
 lib/PublicInbox/FakeInotify.pm       |  56 +-
 lib/PublicInbox/GitAsyncCat.pm       |   4 +-
 lib/PublicInbox/GitCredential.pm     |  55 ++
 lib/PublicInbox/HTTP.pm              |  23 +-
 lib/PublicInbox/HTTPD/Async.pm       |  22 +-
 lib/PublicInbox/IMAP.pm              |  19 +-
 lib/PublicInbox/IMAPTracker.pm       |  82 +++
 lib/PublicInbox/In2Tie.pm            |  13 +
 lib/PublicInbox/Inbox.pm             |   1 +
 lib/PublicInbox/InboxIdle.pm         |  20 +-
 lib/PublicInbox/InboxWritable.pm     |   3 +
 lib/PublicInbox/KQNotify.pm          |  38 +-
 lib/PublicInbox/Listener.pm          |   8 +-
 lib/PublicInbox/NNTP.pm              |  12 +-
 lib/PublicInbox/NNTPdeflate.pm       |   5 +-
 lib/PublicInbox/ParentPipe.pm        |   8 +-
 lib/PublicInbox/Sigfd.pm             |  21 +-
 lib/PublicInbox/TestCommon.pm        |  40 +-
 lib/PublicInbox/URIimap.pm           | 113 +++
 lib/PublicInbox/WatchMaildir.pm      | 998 +++++++++++++++++++++++----
 script/public-inbox-watch            |  33 +-
 t/config.t                           |  18 +
 t/dir_idle.t                         |   6 +
 t/fake_inotify.t                     |  45 ++
 t/imap_tracker.t                     |  54 ++
 t/imapd.t                            |  74 ++
 t/kqnotify.t                         |  41 ++
 t/nntpd.t                            |  52 ++
 t/uri_imap.t                         |  65 ++
 t/watch_filter_rubylang.t            |   2 +-
 t/watch_imap.t                       |  21 +
 t/watch_maildir.t                    |  96 ++-
 t/watch_maildir_v2.t                 |   4 +-
 t/watch_multiple_headers.t           |   2 +-
 t/watch_nntp.t                       |  17 +
 xt/mem-imapd-tls.t                   |  18 +-
 45 files changed, 1944 insertions(+), 290 deletions(-)
 create mode 100644 lib/PublicInbox/DirIdle.pm
 create mode 100644 lib/PublicInbox/GitCredential.pm
 create mode 100644 lib/PublicInbox/IMAPTracker.pm
 create mode 100644 lib/PublicInbox/URIimap.pm
 create mode 100644 t/dir_idle.t
 create mode 100644 t/fake_inotify.t
 create mode 100644 t/imap_tracker.t
 create mode 100644 t/kqnotify.t
 create mode 100644 t/uri_imap.t
 create mode 100644 t/watch_imap.t
 create mode 100644 t/watch_nntp.t


^ permalink raw reply	[relevance 7%]

Results 1-2 of 2 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2020-06-27 10:03  7% [PATCH 00/34] watch: add IMAP and NNTP support Eric Wong
2020-06-27 10:03  6% ` [PATCH 25/34] watch: remove {mdir} array Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).