path: root/lib/PublicInbox/Import.pm
Date  Commit message
2016-05-25  remove Email::Address dependency
git has stricter requirements for ident names (no '<>') than Email::Address, which allows them. Even in 1.908, Email::Address also has an incomplete fix for CVE-2015-7686 with a DoS-able regexp for comments. Since we don't care for or need all the RFC compliance of Email::Address, avoiding it entirely may be preferable. Email::Address will still be installed as a requirement for Email::MIME, but it is only used by Email::MIME::header_str_set, which we do not use.
2016-05-21  import: avoid needless git update-server-info
We don't need to update-server-info (or read-tree) if fast import was spawned for removals and no changes were made.
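The idea can be sketched as a "changed" flag that is checked before refreshing the server info. This is a Python illustration under assumptions, not the module's Perl code: the class `Importer`, its methods `mark_changed` and `done`, and the injectable `run` hook are all hypothetical names.

```python
import subprocess


class Importer:
    """Hypothetical wrapper that remembers whether any commit was made."""

    def __init__(self, git_dir, run=subprocess.check_call):
        self.git_dir = git_dir
        self.run = run          # injectable so the sketch can be tried without git
        self.changed = False

    def mark_changed(self):
        # Call this whenever a message was actually added or removed.
        self.changed = True

    def done(self):
        # Nothing was committed: skip the refresh entirely.
        if not self.changed:
            return False
        self.run(["git", "--git-dir=" + self.git_dir, "update-server-info"])
        self.changed = False
        return True
```

With this shape, a removal run that matches no message never spawns the extra git process.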
2016-05-12  import: fallback to email if '<>' exists in author name
git doesn't handle '<' and '>' characters in the author name at all regardless of quoting, not just matched pairs. So fall back to using the email as the author name since the commit info isn't critical, anyways (shallow clones are fine).
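A minimal sketch of that fallback, written in Python for illustration rather than taken from the module; the function name `ident_name` is hypothetical.

```python
def ident_name(name, email):
    """Return a name safe for a git ident, or the email as a fallback."""
    # A single '<' or '>' is enough to confuse git, quoted or not,
    # so check for either character rather than for matched pairs.
    if "<" in name or ">" in name:
        return email
    return name
```

For example, `ident_name('Alice <work>', 'alice@example.com')` returns the email address, while a plain `'Alice'` is passed through unchanged.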
2016-05-12  import: normalize body by stripping trailing newlines
Mbox formatters may add extra newlines at the end of the message, and that's not relevant for comparing messages for deletion.
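The comparison could look roughly like the following Python sketch. The helper names `normalize_body` and `same_body` are hypothetical, and stripping only `\n` (not `\r`) is an assumption made for the example.

```python
def normalize_body(body: bytes) -> bytes:
    """Drop any run of trailing newlines before comparing bodies."""
    return body.rstrip(b"\n")


def same_body(a: bytes, b: bytes) -> bool:
    # Two bodies match for deletion purposes if they differ only
    # in the number of newlines at the very end.
    return normalize_body(a) == normalize_body(b)
```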
2016-04-28  import: run git-update-server-info when done
We should update $GIT_DIR/info/refs for dumb HTTP clients whenever we make changes to the repository. The best place to update is immediately after making commits. This fixes a bug where public-inbox-learn did not properly update $GIT_DIR/info/refs after inserting or removing messages.
2016-04-27  import: document API for public consumption
This is probably trivial enough to be final?
2016-04-25  remove ssoma dependency
By converting to our git-fast-import-based Import module. This should allow us to be more easily installed.
2016-04-25  import: extra check for final byte read
The read could fail entirely and leave $lf undefined.
2016-04-12  import: filter out [<>] from user names
These characters confuse the git ident parser, and fixing that in git may not be a great idea since it could break interoperability with older versions.
2016-04-11  import: use bytes::length for true data length in bytes
git is byte-oriented and fast-import will not tolerate miscalculations. This is necessary for wide characters in commit messages (email Subjects).
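To illustrate why the byte count matters, here is a small Python sketch that builds a fast-import `data` command. It is not the module's Perl code, and `data_command` is a hypothetical helper; the point is that the count must come from the encoded bytes, not from the number of characters.

```python
def data_command(text: str) -> bytes:
    """Build a fast-import 'data' command with an exact byte count."""
    raw = text.encode("utf-8")
    # len(raw) counts bytes; len(text) would count characters and
    # under-report any string containing multi-byte characters.
    return b"data %d\n" % len(raw) + raw + b"\n"
```

A subject such as `héllo` is five characters but six bytes in UTF-8, so a character-based count would make fast-import misread the stream.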
2016-04-11  import: set binmode before printing author names
Author names may have wide characters in them, so avoid warnings, as git favors UTF-8 for names and fast-import even requires it for commit messages.
2016-04-11  import: initial module + test case
This will allow us to write fast importers for existing archives as well as eventually removing the ssoma dependency for performance and ease-of-installation.