
=head1 NAME

lei-mail-formats - description of mail formats supported by lei

=head1 DESCRIPTION

L<lei-q(1)> supports writing to several existing mail formats
for interoperability with existing mail user agents (MUA);
below is an overview of them to help users choose.

=head1 Maildir

Maildir is the default output format when given a filesystem path.
It supports parallel read-write access.  Performance is acceptable
for smaller directories, but degrades as mailboxes get larger.
Speed and scalability are limited by kernel and filesystem
performance due to the use of small files and a large number of
syscalls.
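
As a rough illustration of why Maildir tolerates parallel writers,
the sketch below uses Python's standard C<mailbox> module (not lei
itself) to deliver a few messages.  The C<deliver> helper, the
addresses and the C<inbox> directory name are hypothetical.  Each
message becomes its own file, written under C<tmp/> and then moved
into C<new/>, so no mailbox-wide lock is taken.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def deliver(maildir_path, subject, body):
    """Add one message to a Maildir and return its key (hypothetical helper)."""
    md = mailbox.Maildir(maildir_path, create=True)
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "rcpt@example.com"
    msg["Subject"] = subject
    msg.set_content(body)
    # Maildir.add() writes the file under tmp/ first, then moves it
    # into new/, so readers never observe a partially written message.
    return md.add(msg)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "inbox")
        for i in range(3):
            deliver(path, f"message {i}", "hello\n")
        # One file per message: this is also where the syscall cost comes from.
        print(len(os.listdir(os.path.join(path, "new"))), "files in new/")
```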

See also: L<https://cr.yp.to/proto/maildir.html> and
L<https://wiki2.dovecot.org/MailboxFormat/Maildir>

=head1 Mbox family

The mbox family consists of several incompatible formats.
Locking for parallel access is supported, but may not be
compatible across tools.  With compression (e.g. L<gzip(1)>),
they require the least amount of space while offering good
read-only performance.

Keyword updates (C<Status:> and/or C<X-Status:> headers)
generally require rewriting the entire mbox.

See also:
L<https://www.loc.gov/preservation/digital/formats/fdd/fdd000383.shtml>,
L<mbox(5)>

=head2 mboxo

The traditional BSD format.  It quotes C<From > to C<E<gt>From >,
but lines already beginning with C<E<gt>From > do not get quoted,
thus automatic reversibility is not guaranteed.  MUAs which favor
L</mboxcl> or L</mboxcl2> may convert these automatically to their
preferred format.

Truncation is undetectable unless compressed with gzip or similar.
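
The quoting rule described above can be sketched in a few lines of
Python.  This is only an illustration of the rule, not code from lei;
the function names C<mboxo_quote> and C<mboxo_unquote> are
hypothetical.  It shows that a body line which already began with
C<E<gt>From > cannot be told apart from a quoted one after a
round trip.

```python
import re


def mboxo_quote(body: str) -> str:
    """Prefix '>' to body lines that begin with 'From ' (mboxo rule)."""
    return re.sub(r"^From ", ">From ", body, flags=re.M)


def mboxo_unquote(body: str) -> str:
    """Strip one '>' from body lines that begin with '>From '."""
    return re.sub(r"^>From ", "From ", body, flags=re.M)


original = "From here on\n>From was already quoted\n"
quoted = mboxo_quote(original)      # the second line is left untouched
restored = mboxo_unquote(quoted)    # both lines lose a '>'

# The round trip does not give the original text back.
print(restored == original)
```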

=head2 mboxrd

An evolution of L</mboxo> which quotes C<From > lines prefixed
with any number of C<E<gt>> characters and is thus fully
reversible.
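
A minimal sketch of the mboxrd rule in Python follows, assuming the
body is handled as a single string.  The function names are
hypothetical and this is not lei's implementation.  Because every
line matching C<E<gt>*From > gains exactly one C<E<gt>> on writing
and loses exactly one on reading, the round trip is lossless.

```python
import re


def mboxrd_quote(body: str) -> str:
    """Add one '>' to lines matching '>*From ' (mboxrd rule)."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.M)


def mboxrd_unquote(body: str) -> str:
    """Remove one '>' from lines matching '>+From '."""
    return re.sub(r"^>(>*From )", r"\1", body, flags=re.M)


original = "From here on\n>From was already quoted\n>>From twice\n"
quoted = mboxrd_quote(original)
restored = mboxrd_unquote(quoted)

# Unlike mboxo, the original text is recovered exactly.
print(restored == original)
```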

This format is emitted by L<PublicInbox::WWW(3pm)> with gzip.
Since git 2.10, C<git am --patch-format=mboxrd> reads this
format.  C<git log> and C<git format-patch --stdout> can also
generate this format with the C<--pretty=mboxrd> switch.

As with uncompressed L</mboxo>, uncompressed mboxrd files are vulnerable
to undetectable truncation.
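
The difference can be demonstrated with Python's standard C<gzip>
module.  In this sketch the sample mbox contents and the
C<is_complete_gzip> helper are hypothetical.  A plain mbox cut off at
a message boundary still looks like a valid (shorter) mbox, whereas a
cut-off gzip stream fails to decompress because its end-of-stream
marker and trailer are missing.

```python
import gzip

MBOX = (
    b"From a@example.com Thu Jan  1 00:00:00 1970\n"
    b"Subject: one\n\nfirst\n\n"
    b"From b@example.com Thu Jan  1 00:00:00 1970\n"
    b"Subject: two\n\nsecond\n\n"
)


def is_complete_gzip(data: bytes) -> bool:
    """Return False if the gzip stream ends early (hypothetical helper)."""
    try:
        gzip.decompress(data)
    except EOFError:
        # Raised when the data stops before the end-of-stream marker.
        return False
    return True


compressed = gzip.compress(MBOX)
truncated = compressed[: len(compressed) // 2]

print(is_complete_gzip(compressed))
print(is_complete_gzip(truncated))
```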

It gracefully degrades to being treated as L</mboxo> by MUAs
unaware of the format, since excessive C<E<gt>From > quoting is
recognizable to humans.

=head2 mboxcl

L</mboxo> with a C<Content-Length:> header.  C<From > lines
remain quoted to retain readability with L</mboxo> and L</mboxrd> MUAs.
However, it is easy to corrupt these files when using tools
which are not aware of C<Content-Length:> and write out updates
as L</mboxo>.

L<mutt(1)> will convert L</mboxo> and L</mboxrd> to mboxcl upon opening.

See also: L<https://www.jwz.org/doc/content-length.html>

=head2 mboxcl2

Like L</mboxcl>, but without any C<From > quoting.  It is wholly
incompatible with MUAs which only handle L</mboxo> and/or L</mboxrd>.
This format is generated by L<mutt(1)> when writing to a new
mbox.
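
To illustrate why readers unaware of C<Content-Length:> mishandle
these files, the following Python sketch builds a tiny mboxcl2-style
mailbox and splits it in two ways.  It is a simplified reader written
for this example, not lei's or mutt's parser; C<make_message>,
C<split_by_content_length> and the sample data are hypothetical.

```python
import re

SEPARATOR = b"From sender@example.com Thu Jan  1 00:00:00 1970\n"


def make_message(subject: bytes, body: bytes) -> bytes:
    """Build one mboxcl2-style message; the body is NOT From-quoted."""
    return (SEPARATOR + b"Subject: " + subject + b"\n"
            + b"Content-Length: %d\n\n" % len(body)
            + body + b"\n")


def split_by_content_length(data: bytes):
    """Yield each message body, trusting Content-Length for its size."""
    pos = 0
    while pos < len(data):
        if not data.startswith(b"From ", pos):
            raise ValueError("expected a 'From ' separator line")
        eol = data.index(b"\n", pos)          # end of the separator line
        hdr_end = data.index(b"\n\n", eol)    # blank line ends the headers
        length = None
        for line in data[eol + 1:hdr_end].splitlines():
            name, _, value = line.partition(b":")
            if name.lower() == b"content-length":
                length = int(value.decode("ascii").strip())
        if length is None:
            raise ValueError("missing Content-Length header")
        start = hdr_end + 2
        yield data[start:start + length]
        pos = start + length
        while data.startswith(b"\n", pos):    # skip separating newline(s)
            pos += 1


data = (make_message(b"one", b"hello\nFrom the start, unquoted\n")
        + make_message(b"two", b"bye\n"))

bodies = list(split_by_content_length(data))
# A reader that only looks for "From " lines sees an extra message.
naive_count = len(re.findall(rb"^From ", data, flags=re.M))
print(len(bodies), naive_count)
```

A tool which rewrites such a file without updating C<Content-Length:>
would break the length-based reader in the same way.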

=head1 MH

Preliminary support for reads as of 2.0.0.  Locking semantics differ
incompatibly amongst existing writers: Python and nmh appear
compatible with each other, while mutt appears racy and unsuitable
for parallel access due to rename(2) potentially clobbering the
C<.mh_sequences> file.  More info about other clients is greatly
appreciated.

Sequence numbers may be packed and reused by some writers, so lei
users may need to run L<lei-refresh-mail-sync(1)> if inotify|kevent
missed packing while L<lei-daemon(8)> wasn't running.
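
Packing can be observed with Python's standard C<mailbox.MH> class,
one of the writers mentioned above.  The sketch below is only a
demonstration of sequence number reuse, with a hypothetical
C<pack_demo> helper and made-up messages; it does not show how lei
tracks MH folders.

```python
import mailbox
import os
import tempfile


def pack_demo(path):
    """Return the message numbers before and after MH.pack()."""
    mh = mailbox.MH(path, create=True)
    keys = [mh.add(f"Subject: msg {i}\n\nbody\n") for i in range(3)]
    mh.remove(keys[1])            # leaves a gap in the numbering
    before = sorted(mh.keys())
    mh.pack()                     # renumbers files to close the gap
    after = sorted(mh.keys())
    return before, after


with tempfile.TemporaryDirectory() as tmp:
    before, after = pack_demo(os.path.join(tmp, "mh"))

# The last message now lives under a number previously used by another.
print(before, after)
```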

lei is safe for reading mlmmj archives as MH since mlmmj neither
packs nor uses a .mh_sequences file to store state.

=head1 MMDF

Not yet supported, and it's unclear if current usage/support makes
it worth supporting.

=head1 IMAP

Depending on the IMAP server software and configuration, IMAP
servers may use any (or a combination) of the aforementioned
formats or a non-standard database backend.  Currently, lei
uses L<Mail::IMAPClient> which has acceptable performance
over low-latency links.  Performance over high-latency links
is currently poor.

=head1 eml

A single raw message file.  C<eml> is not an output format for lei,
but is accepted as an C<--input-format> (C<-F>) for read-only
commands such as L<lei-tag(1)> and L<lei-import(1)>.

Since C<eml> is the suffix for the C<message/rfc822> MIME type
(according to the C<mime.types> file), lei will infer the type
based on the C<.eml> suffix if C<--input-format> is unspecified.

C<.patch>-suffixed files generated by L<git-format-patch(1)>
(without C<--stdout>) are C<eml> files with the addition of an
mbox C<From > header.  L<lei(1)> removes C<From > lines to treat
them as C<eml> when reading these for compatibility with
L<git-am(1)> and similar tools.
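
The idea of dropping the leading C<From > line can be sketched in
Python as follows.  This is an assumption-laden illustration rather
than lei's actual code: C<strip_mbox_from> is a hypothetical helper,
and the commit ID, author and subject in the sample are made up.

```python
import email
import email.policy


def strip_mbox_from(raw: bytes) -> bytes:
    """Drop a leading mbox 'From ' separator line, if present."""
    if raw.startswith(b"From "):
        return raw.partition(b"\n")[2]
    return raw


# Shaped like git-format-patch output; all values are placeholders.
patch = (
    b"From 0123456789abcdef0123456789abcdef01234567 Mon Sep 17 00:00:00 2001\n"
    b"From: A U Thor <author@example.com>\n"
    b"Subject: [PATCH] example change\n"
    b"\n"
    b"commit message body\n"
)

# What remains is a plain RFC 5322 message, i.e. an eml file.
msg = email.message_from_bytes(strip_mbox_from(patch),
                               policy=email.policy.default)
print(msg["Subject"])
```

Note that the C<From:> header (with a colon) is untouched; only the
separator line beginning with C<From > followed by a space is removed.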

=head1 COPYRIGHT

Copyright 2021 all contributors L<mailto:meta@public-inbox.org>

License: AGPL-3.0+ L<http://www.gnu.org/licenses/agpl-3.0.txt>

=head1 SEE ALSO

L<lei(1)>, L<lei-q(1)>, L<lei-convert(1)>, L<lei-overview(7)>
