user/dev discussion of public-inbox itself
From: Eric Wong <>
Cc: Konstantin Ryabitsev <>
Subject: [PATCH] TODO: notes about v2 format for giant archives
Date: Tue, 16 Jan 2018 22:36:16 +0000
Message-ID: <>

Inspired by interest in LKML archival:
 TODO | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/TODO b/TODO
index 3163b8a..605013e 100644
--- a/TODO
+++ b/TODO
@@ -78,3 +78,34 @@ all need to be considered for everything we introduce)
 * more and better test cases (use git fast-import to speed up creation)
 * large mbox/Maildir/MH/NNTP spool import (see PublicInbox::Import)
+* Read-only WebDAV interface to the git repo so it can be mounted
+  via davfs2 or fusedav to avoid full clones.
+* Improve tree layout to help giant archives (v2 format):
+  * Must be optional; old ssoma users may continue using v1
+  * Xapian becomes a requirement when using v2; they
+    claim good scalability:
+  * Allow git to perform better deltafication for quoted messages
+  * Changing tree layout for deltafication means we need to handle
+    deletes for spam differently than we do now.
+  * Deal with duplicate Message-IDs (web UI, at least, not sure about NNTP)
+  * (Maybe) SQLite alternatives (MySQL/MariaDB/Pg) for NNTP article
+    number mapping:
+  * Ref rotation (splitting heads by YYYY or YYYY-MM)
+  * Support multiple git repos for a single archive?
+    This seems gross, but splitting large packs in git conflicts
+    with bitmaps and we want to use both features.  Perhaps this
+    limitation can be fixed in git instead of merely being documented:
+  * Optional history squashing to reduce commit and intermediate
+    tree objects

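To make the "Ref rotation" item above concrete, here is a minimal
sketch in Python of how a message could be assigned to a per-period
head.  The TODO does not say which timestamp drives the split or how
the refs would be named, so the use of the Date: header, the
normalization to UTC, the refs/heads/YYYY[-MM] naming, and the
function name rotated_ref() are all assumptions made for illustration;
none of this is public-inbox code.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def rotated_ref(date_header: str, monthly: bool = False) -> str:
    """Map an RFC 2822 Date: header to a per-period ref name.

    Hypothetical scheme: refs/heads/YYYY, or refs/heads/YYYY-MM when
    monthly=True.  The period is computed in UTC so that the same
    message always lands in the same ref regardless of sender timezone.
    """
    dt = parsedate_to_datetime(date_header)
    if dt.tzinfo is None:
        # A "-0000" offset parses as a naive datetime; treat it as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    dt = dt.astimezone(timezone.utc)
    period = dt.strftime("%Y-%m") if monthly else dt.strftime("%Y")
    return f"refs/heads/{period}"


if __name__ == "__main__":
    # The Date: header of this very message.
    print(rotated_ref("Tue, 16 Jan 2018 22:36:16 +0000"))
    print(rotated_ref("Tue, 16 Jan 2018 22:36:16 +0000", monthly=True))
```

Normalizing to UTC is one possible choice; a real implementation might
prefer the arrival time over the sender-supplied Date: header, since
the latter can be wrong or forged.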

Thread overview: 4+ messages
2018-01-16 22:36 Eric Wong [this message]
2018-02-08  3:09 ` Eric Wong
2018-02-08  4:05   ` Konstantin Ryabitsev
2018-02-08 17:08     ` Eric Wong