Date | Commit message (Collapse) |
|
Call order will need to change a bit since this is going to be
tied to Xapian
|
|
We'll reuse this class in v2, but won't be utilizing
per-git-repository ssoma.lock files.
Meanwhile, stop treating ::Inbox objects as an afterthought
and allow importing name and email into them.
|
|
For machines which have never seen ssoma, they don't need the
index so stop creating it.
|
|
Using update-copyrights from gnulib
While we're at it, use the SPDX identifier for AGPL-3.0+ to
ease mechanical processing.
|
|
Sometimes an email is an innocent removal "rm" for a
misdirected, off-topic post, while most removed messages are
"spam". Allow anybody to look at history and easily distinguish
the reason for removing the message.
|
|
This seems to allow weirdly-encoded "raw" emails in
blade.nagaokaut.ac.jp/ruby/ruby-core/*
to be handled without difficulties.
|
|
This was necessary for the presence of the 0xa0 byte(*)
in the Subject: of the message at:
http://blade.nagaokaut.ac.jp/ruby/ruby-core/3220
(*) That is 0xa0, not 0x0a ("\n"), so I wonder if the
nibbles got swapped somehow.
|
|
This should fix problems with multipart messages where
text/plain parts lack a header.
cf. git clone --mirror https://github.com/rjbs/Email-MIME.git
refs/pull/28/head
In the future, we may still introduce as streaming
interface to reduce memory usage on large emails.
|
|
We should not completely kill a process if "git gc --auto"
errors out due to a warning or whatnot.
|
|
We need to prevent excessive repository growth for
public-inbox-watch and public-inbox-mda users.
|
|
We will be reusing this in the next commit, too.
|
|
This reduces duplication, slightly. We may be using it
yet again in a to-be-introduced function (or we may not
introduce it).
|
|
Not sure why or how I missed this before; but the common address
parsing routine we have should be more correct.
Add a test to ensure excessively quoted names don't make it
through, either.
|
|
We need to pass the Inbox object to SearchIdx to get altid
mappings properly for incremental imports.
TODO: use the Inbox object in more places where it makes sense
to do so.
|
|
This will allow us to release and re-acquire Xapian locks
due to the lack of FD_CLOEXEC on some FDs.
|
|
For reindexing, fresh Xapian DBs do not count as a reindex,
allowing users to blindly use --reindex on the first
run on a clean repo.
While we're at it, allow indexing to override HEAD ref for
multi-head git repos.
|
|
Callers may have localized $/ to something else, so make sure
we chomp the expected character(s) when calling chomp.
|
|
Mailing lists I watch and mirror may not have the best spam
filtering, and an extra layer should not hurt.
|
|
Because our WatchMaildir module is liberal about what
it accepts, we can potentially have messages without a
subject.
|
|
This prevents multiple update processes from stepping over
each other while called under the lock, and also allows the
new -watch process to update the index iff indexing was
desired.
|
|
git has stricter requirements for ident names (no '<>')
which Email::Address allows.
Even in 1.908, Email::Address also has an incomplete fix for
CVE-2015-7686 with a DoS-able regexp for comments. Since we
don't care for or need all the RFC compliance of Email::Address,
avoiding it entirely may be preferable.
Email::Address will still be installed as a requirement for
Email::MIME, but it is only used by the
Email::MIME::header_str_set which we do not use
|
|
We don't need to update-server-info (or read-tree) if fast
import was spawned for removals and no changes were made.
|
|
git doesn't handle '<' and '>' characters in the author
name at all regardless of quoting, not just matched pairs.
So fall back to using the email as the author name since
the commit info isn't critical, anyways (shallow clones
are fine).
|
|
Mbox formatters may add extra newlines at the end of the
message, and that's not relevant for comparing messages
for deletion.
|
|
We should update $GIT_DIR/info/refs for dumb HTTP clients
whenever we make changes to the repository. The best place
to update is immediately after making commits.
This fixes a bug where public-inbox-learn did not properly
update $GIT_DIR/info/refs after inserting or removing
messages.
|
|
This is probably trivial enough to be final?
|
|
By converting to using ourt git-fast-import-based Import
module. This should allow us to be more easily installed.
|
|
The read could fail entirely and leave $lf undefined.
|
|
It confuses the git ident parser and may not be a great
idea to fix in git since it could break interopability
with older versions.
|
|
git is byte-oriented and fast-import will not tolerate
miscalculations. This is necessary for wide characters
in commit messages (email Subjects).
|
|
Author names may have wide characters in them, so avoid warnings
as git favors UTF-8 for names and fast-import even requires them
for commit messages
|
|
This will allow us to write fast importers for existing
archives as well as eventually removing the ssoma dependency
for performance and ease-of-installation.
|