about summary refs log tree commit homepage
path: root/script/public-inbox-convert
DateCommit message (Collapse)
2020-02-06treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-02-02convert: fix --no-index switch
The (currently undocumented) "--no-index" flag did not trigger the V2Writable->done call necessary to make the import successful. Fixes: eea47b676127bcdb ("convert: preserve highwater mark from v1 msgmap")
2020-02-02convert: shift @ARGV explicitly
Relying on implicit "@_" for shift fails with TestCommon::_run_sub iff GetOptions modifies @ARGV.
2020-02-02convert: remove unused variables capturing :from
Looking at git history, they were never used.
2020-01-31convert: preserve highwater mark from v1 msgmap
If we're reusing the msgmap from a v1 inbox, we also need to ensure the highwater mark doesn't get doubled in the v1->v2 conversion by internally triggering the equivalent of "--reindex" on a fresh v2 inbox. This was needed to convert an indexed v1 inbox which featured messages with multiple Message-IDs in it. Fresh, unindexed clones of v1 inboxes would not have been affected by this.
2020-01-27inbox: add ->version method
This allows us to simplify version checking by avoiding "//" or "||" operators sprinkled around.
2020-01-06treewide: "require" + "use" cleanup and docs
There's a bunch of leftover "require" and "use" statements we no longer need and can get rid of, along with some excessive imports via "use". IO::Handle usage isn't always obvious, so add comments describing why a package loads it. Along the same lines, document the tmpdir support as the reason we depend on File::Temp 0.19, even though every Perl 5.10.1+ user has it. While we're at it, favor "use" over "require", since it it gives us extra compile-time checking.
2019-11-14convert: remove duplicated GetOptions() call
We only need to parse the command-line once.
2019-10-16config: support "inboxdir" in addition to "mainrepo"
"mainrepo" ws a bad name and artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-09-09run update-copyrights from gnulib for 2019
2019-06-05tighten up digit matches to ASCII for git output
While I don't expect git to suddenly start spewing non-ASCII digits in places I'd expect ASCII, this would make things easier for future hackers and reviewers.
2018-05-11convert+compact: fix when running without ~/.public-inbox/config
Some users may not have any public-inboxes configured, especially in tests.
2018-04-20convert: copy description and git config from v1 repo
I noticed I lost a $GIT_DIR/description in a conversion, so we should preserve it. While we're at it, we ought to copy any config in the old repo to the new one. We will need to warn about cloneurl since it's unfortunately not an automatic process to update. Oh well..
2018-04-07convert: support converting with altid defined
public-inbox-convert ought to be 100% lossless, now
2018-04-04v2: support incremental indexing + purge
This is important for people running mirrors via "git fetch", as they need to be kept up-to-date. Purging is also now supported in mirrors. The short-lived "--regenerate" option is gone and is now implicitly enabled as a result. It's still cheap when article number regeneration is unnecessary, as we track the range for each git repository.
2018-04-01v2: one file, really
We need to ensure there is only one file in the top-level tree at any commit so the "add; remove; add;" sequence on the same message is detected properly. Otherwise, git will not detect the second "add" unless a second message is added to history. Deletes are now stored in "d" (and not "D" or "_/D") at the top-level, now. There's no need to have a "_" to reduce churn as "m" and "d" should never co-exist. It's now lowercased to make it easier-to-distinguish from "D" in git-log output.
2018-03-30v2: respect core.sharedRepository in git configs
Ensure -convert and -compact do not make repositories unreadable on live servers.
2018-03-30convert: avoid redundant "done\n" statement for fast-import
This bug was hidden due to timing problems with eatmydata or running with tmpfs for TMPDIR.
2018-03-29public-inbox-convert: tool for converting old to new inboxes
This should make it easier to let users perform comparisons and migrate to v2 if needed.