user/dev discussion of public-inbox itself

TODO items for public-inbox

(Not in any particular order.  Performance, ease of setup,
installation, maintainability, etc. all need to be considered
for everything we introduce.)

* general performance improvements, but without relying on
  XS or pre-built modules any more than we currently do.
  (Optional Inline::C and user-compiled re2c acceptable)

* mailmap support (same as git) for remapping expired email addresses
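
  A rough sketch of the lookup this item implies, assuming only a
  simplified subset of git's mailmap format: the "<new> <old>" and
  "Proper Name <new> <old>" forms.  The names parse_mailmap and remap
  are hypothetical and are not part of public-inbox; other mailmap
  forms are deliberately ignored here.

```python
import re

# Matches "[Proper Name] <new@addr> <old@addr>" (a subset of the mailmap format).
_LINE = re.compile(r'^(?P<name>[^<]*)<(?P<new>[^>]+)>\s*<(?P<old>[^>]+)>\s*$')


def parse_mailmap(text):
    """Return {lowercased old address: (proper name or None, new address)}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        m = _LINE.match(line)
        if m is None:
            continue  # forms outside this simplified subset are ignored
        name = m.group('name').strip() or None
        mapping[m.group('old').lower()] = (name, m.group('new'))
    return mapping


def remap(mapping, name, addr):
    """Map an expired (name, addr) pair to its current equivalent."""
    new_name, new_addr = mapping.get(addr.lower(), (None, addr))
    return (new_name or name, new_addr)
```

  Addresses are matched case-insensitively; an address with no entry
  is returned unchanged, so the remapping stays optional per message.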

* support remapping of expired URLs similar to mailmap
  (coordinate with git.git on this?)

* POP3 server, since some webmail providers support external POP3:
  https://public-inbox.org/meta/20160411034104.GA7817@dcvr.yhbt.net/
  Perhaps make this depend solely on the NNTP server and work as a
  proxy, so users can run it without needing a full copy of the
  archives in git repositories.

* HTTP, IMAP and NNTP proxy support.  Allow us to be a frontend for
  firewalled-off (or Tor-exclusive) instances.  The use case is
  offering a publicly accessible IP with a cheap VPS while
  storing large amounts of data on computers without a
  public IP behind a home Internet connection.

* support HTTP(S) CONNECT proxying to NNTP for users with
  firewall problems

* DHT (distributed hash table) for mapping Message-IDs to various
  archive locations to avoid SPOF.

* optional Cache::FastMmap support so production deployments won't
  need Varnish (Varnish doesn't protect NNTP or IMAP, either)

* dogfood and take advantage of new kernel APIs (while maintaining
  portability to older Linux, free BSDs and maybe Hurd).

* dogfood latest Xapian, Perl5, SQLite, git and various modules to
  ensure things continue working as they should (or better)
  while retaining compatibility with old versions.

* Support more of RFC 3977 (NNTP)
  Is there anything left for read-only support?

* Combined "super server" for NNTP/HTTP/POP3/IMAP to reduce memory,
  process, and FD overhead

* Configurable linkification for per-inbox shorthands:
  "$gmane/123456" could be configured to expand to the
  appropriate link pointing to the gmane.io list archives,
  likewise "[Bug #123456]" could be configured to expand to
  point to some project's bug tracker at http://example.com/bug/123456
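
  One possible shape for such a configuration is a list of
  (pattern, URL template) pairs.  The sketch below is only an
  illustration: SHORTHANDS and expand_shorthands are hypothetical
  names, and the gmane URL template is a placeholder rather than a
  real archive endpoint.

```python
import re

# Hypothetical per-inbox config: (compiled pattern, URL template) pairs.
# The gmane template is a placeholder, not a real archive URL.
SHORTHANDS = [
    (re.compile(r'\$gmane/(\d+)'), 'http://example.com/gmane/{0}'),
    (re.compile(r'\[Bug #(\d+)\]'), 'http://example.com/bug/{0}'),
]


def expand_shorthands(text, shorthands=SHORTHANDS):
    """Return [(matched text, URL)] for each shorthand, in order of appearance."""
    found = []
    for pattern, template in shorthands:
        for m in pattern.finditer(text):
            url = template.format(*m.groups())
            found.append((m.start(), m.group(0), url))
    found.sort()  # order by position in the message text
    return [(matched, url) for _, matched, url in found]
```

  Returning (text, URL) pairs instead of rewriting the text leaves the
  renderer free to decide how the link is presented.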

* configurable synonym and spelling support in Xapian

* Support optional "HTTPS Everywhere" for mapping old HTTP to HTTPS
  links if (and only if) the user wants to use HTTPS.  We may also
  be able to configure redirects for expired URLs.

  Note: message bodies rendered as HTML must NOT change themselves;
  instead, the links should point to an anchor within the same page
  which gives the user options.

* configurable constants (index limits, search results)

* handle messages with multiple Message-IDs (done for v2, doable for v1)

* handle broken double-bracketed References properly (maybe)
  and totally broken Message-IDs

  cf.  https://public-inbox.org/git/20160814012706.GA18784@starla/
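
  A minimal sketch of tolerant extraction, assuming the breakage is
  limited to repeated angle brackets such as "<<id@host>>".  The name
  extract_mids is hypothetical, and totally broken Message-IDs would
  need more than this.

```python
import re

# Capture the innermost <...> content, tolerating repeated brackets.
_MID = re.compile(r'<+([^<>\s]+)>+')


def extract_mids(header_value):
    """Return Message-IDs found in a References-style header value."""
    return _MID.findall(header_value)
```

  With this pattern "<<a@example.com>>" and "<a@example.com>" both
  yield "a@example.com", so threading sees them as the same message.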

* improve documentation

* linkify thread skeletons better
  https://public-inbox.org/git/6E3699DEA672430CAEA6DEFEDE6918F4@PhilipOakley/

* Further lower mail parser memory usage.  We still slurp entire
  message bodies into memory and incur 2-3x overhead on
  multipart messages.  Inline::C (and maybe gmime) could work.

* use REQUEST_URI properly for CGI / mod_perl2 compatibility
  with Message-IDs which include '%' (done?)
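
  The sketch below shows why the raw REQUEST_URI matters here: once a
  server has decoded the path, an escaped '/' or '%' in a Message-ID
  can no longer be told apart from a path separator or an escape.
  It assumes, purely for illustration, a URL layout where the
  Message-ID is the first path segment; mid_from_request_uri is a
  hypothetical name.

```python
from urllib.parse import quote, unquote


def mid_from_request_uri(request_uri):
    """Split the raw URI into segments first, then decode exactly once."""
    path = request_uri.split('?', 1)[0]          # drop the query string
    segment = path.lstrip('/').split('/', 1)[0]  # first raw path segment
    return unquote(segment)


# A Message-ID containing both a literal '%' and a literal '/'.
mid = 'a%2Fb/c@example.com'
uri = '/' + quote(mid, safe='@') + '/raw?x=1'

# Splitting a pre-decoded path (as PATH_INFO would be) truncates the ID.
decoded_path = unquote(uri.split('?', 1)[0])
naive = decoded_path.lstrip('/').split('/', 1)[0]
```

  Here mid_from_request_uri(uri) recovers the original Message-ID,
  while the naive value stops at the first decoded '/'.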

* more and better test cases (use git fast-import to speed up creation)

* large mbox/Maildir/MH/NNTP spool import (see PublicInbox::Import)

* Read-only WebDAV interface to the git repo so it can be mounted
  via davfs2 or fusedav to avoid full clones.
  davfs2 needs Range: request support for this to be feasible:
    https://savannah.nongnu.org/bugs/?33259
    https://savannah.nongnu.org/support/?107649

* Contribute something like IMAP IDLE for "git fetch".
  Inboxes (and any git repos) can be kept up-to-date without
  relying on polling.

* Improve bundle support in git to make it cheaper to host/clone
  with dumb HTTP(S) servers.

* Expose targeted reindexing of individual messages.
  Sometimes an indexing bug only affects a handful of messages,
  so it's not worth the trouble of doing a full reindex.

* code repository integration (cgit: done, TODO: gitweb, etc...)

* migration path to v2 without breaking v1 "git fetch" cronjobs

* imperfect scraper importers for obfuscated list archives
  (e.g. obfuscated Mailman stuff, Google Groups, etc...)

* extend public-inbox-watch to support IMAP, NNTP

* improve performance and avoid head-of-line blocking on slow storage

* HTTP(S) search API (likely JMAP, but GraphQL could be an option)
  It should support git-specific prefixes (dfpre:, dfpost:, dfn:, etc)
  as extensions.  If JMAP, it should have HTTP(S) analogues to
  various IMAP extensions.

* search across multiple inboxes, or admin-definable groups of inboxes

* scalability to tens/hundreds of thousands of inboxes

  - pagination for WwwListing

  - inotify-based manifest.js.gz updates

  - process/FD reduction (needs to be slow-storage friendly)

  ...

* command-line tool (similar to mairix/notmuch, but solver+git-aware)

* consider removing doc_data from Xapian, redundant with over.sqlite3

* share "git cat-file --batch" processes across inboxes to avoid
  bumping into /proc/sys/fs/pipe-user-pages-* limits

* make "git cat-file --batch" detect unlinked packfiles so we don't
  have to restart processes (very long-term)

* support searching based on `git-patch-id --stable` to improve
  bidirectional mapping of commits <=> emails

* linter to check validity of config file

* linter option and WWW endpoint to graph relationships and flows
  between inboxes, addresses, Maildirs, coderepos, newsgroups,
  IMAP mailboxes, etc...

* pygments support via a Python script which works similarly to
  `git cat-file --batch` to avoid the startup penalty.
  pygments.rb (Ruby) can be inspiration, too.

* highlighting + linkification for "git format-patch --interdiff" output

* highlighting + linkification for "git format-patch --range-diff" output
  (requires mirroring of git repos)

* parse and allow (semi)automatic-mirroring of "git request-pull" output
  for coderepos

* configurable diff output for solver-generated blobs

* figure out how search for messages with multiple Date: headers
  should work (some wacky examples out there...)

* support UUCP addresses for legacy archives