Date | Commit message |
|
It turns out there's no point in having multiple instances of
this or having to worry about destruction or destruction
ordering.
This will make it easier to reuse the one instance we have
across different modules.
|
|
We'll want to use this to support highlighting syntax used by
Markdown and possibly other markup languages (while retaining
the raw plain-text layout and formatting).
|
|
Favor in-place utf8::decode since it's a bit faster without
method dispatch overhead; and don't care about validity just
yet.
HlMod->do_hl itself should return "utf8" strings, since other
parts of our code can use it, so it's not the job of ViewVCS to
post-process HlMod output.
|
|
Leaving out parentheses caused transitions to state="del" or
state="add" to be misidentified.
cf. https://public-inbox.org/meta/20190204105454.GG10587@szeder.dev/
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
|
|
This is the fallback for the normal WWW endpoint.
Adding this to the top-level seems to be alright, since lynx and
w3m both understand nntp://<HOSTNAME>/<Message-ID> anyways.
If newsgroup and inbox names conflict, then consider it the
fault of the original sender.
Since NewsWWW is intended to support buggy linkifiers in mail clients,
they can interpret nntp:// URLs as http://<HOSTNAME>/<Message-ID>
Inbox ordering from the config file is preserved since
commit cfa8ff7c256e20f3240aed5f98d155c019788e3b
("config: each_inbox iteration preserves config order"),
so admins can rely on that to configure how scanning
works.
Requested-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
cf. https://public-inbox.org/meta/20190107190719.GE9442@pure.paranoia.local/
nntp://news.public-inbox.org/20190107190719.GE9442@pure.paranoia.local
|
|
This is best-effort, but works well enough in practice for
projects which use shell-friendly filenames as well as the
long path names for some Linux kernel selftests.
|
|
For URLs we generate, we need to escape '&' in query parameters
for correctness.
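
As a rough illustration of the escaping described above (written in Python
rather than the project's Perl; `href_attr` is a hypothetical helper and not
a public-inbox function), a generated URL can be HTML-escaped before it is
placed in an attribute:

```python
from html import escape

def href_attr(url):
    """Return an HTML href attribute with '&', '<', '>' and quotes escaped."""
    # quote=True also escapes '"' and "'" so the value is safe inside quotes
    return 'href="' + escape(url, quote=True) + '"'

print(href_attr("/inbox/?q=foo&x=A"))
```

The '&' separating query parameters comes out as '&amp;' in the HTML source,
while browsers still see the original URL.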
|
|
Only to be pedantic...
|
|
Sometimes users will write "http://example.com" without the
trailing slash, which every browser and tool I've tested seems
to understand.
|
|
Perl "split" can capture and group in the regexp itself,
so rely on that to shorten our code.
Comparing the /T/ HTML output of a thread from hell (on LKML with
1356 messages) reveals no difference in the rendered result.
Only the HTML source differs in newline placement before/after
the closing </span>
This allows a minor speedup on my X32 Thinkpad @ 1.6GHz with
the aforementioned LKML thread from hell:
before: 3.67s
after: 3.55s
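
The capture behavior relied on here can be sketched in Python, whose
`re.split` also returns captured separators in the result list. The
whitespace pattern below is only an assumption for illustration; it is not
the regexp used in the commit:

```python
import re

def split_keep_delims(text):
    # The capturing group makes re.split return the separators as well,
    # so no second pass is needed to recover them.
    return re.split(r"(\s+)", text)

print(split_keep_delims("foo  bar\nbaz"))
```

Joining the returned pieces reproduces the input exactly, which is what makes
this suitable for rewriting text without disturbing its layout.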
|
|
We use absolute URLs in the Atom feeds (to ease
syndication/mirroring), so hunk headers need to point to the
solver URLs.
|
|
diffstat <-> ^diff anchors work within the same attachment or
message while in HTML views which display multiple messages.
|
|
This can be helpful for reviewing larger patches which span
across several files on the permalink (/$MESSAGE_ID/) HTML
page.
More work will be needed to get this working for the /T/ and /t/
pages which show multiple emails, as the filename-based anchors
will conflict at the moment.
|
|
We'll use HTML attributes + anchor links to link to filenames
in coming commits.
|
|
* origin/purge:
implement public-inbox-purge tool
v2writable: read epoch on purge
v2writable: cleanup processes when done
v2writable: purge ignores non-existent git epoch directories
v2writable: ->purge returns undef on no-op
import: purge: reap fast-export process
hoist out resolve_repo_dir from -index
|
|
|
|
|
|
This will become critical for future changes to display
git commits, diffs, and trees.
Use "qspawn.wcb" instead of "qspawn.response" to enhance
readability.
|
|
This will make it easier to make command-line tools
from SolverGit.
|
|
Forking off git-cat-file here for streaming large blobs is
reasonably efficient, at least no worse than using
git-http-backend for serving clones. So let our limiter
framework deal with it.
git itself isn't great for large files, and AFAIK there are no
stable/widely-available mechanisms for reading smaller chunks
of giant blobs in git itself.
Tested with some giant GPU headers in the Linux kernel.
|
|
We can rely on git to disambiguate, here; because sometimes
shorter OIDs can be unambiguous even if we only resolved the
longer one.
|
|
public-inbox can only index the abbreviated object_ids in
emails, not the full or even longer-than-necessary object_ids.
So retry failed object_ids if they're longer than 7 hex
characters.
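
A minimal sketch of such a retry, in Python for illustration. The message
does not say how the shorter candidates are chosen, so trimming one hex
character at a time is an assumption, and `abbrev_candidates` is a
hypothetical name:

```python
def abbrev_candidates(oid, min_len=7):
    """Yield oid itself, then ever-shorter prefixes down to min_len chars."""
    yield oid
    # Assumption: shorten by one hex character per retry
    for n in range(len(oid) - 1, min_len - 1, -1):
        yield oid[:n]

for candidate in abbrev_candidates("6aa8857a11"):
    print(candidate)
```

A caller would try each candidate against the index in turn and stop at the
first one that matches.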
|
|
Otherwise, long-running but idle git processes may keep unlinked
packs around indefinitely and waste disk space.
|
|
Xapian will interpret ".." as ranges, even inside quoted phrases.
So break up words on ".." since punctuation (AFAIK) is not
searchable, anyways.
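
One way to picture the word-breaking, sketched in Python. The function name
and the choice of a single space as replacement are assumptions; the message
only says that words are broken up on "..":

```python
import re

def break_ranges(query):
    # Replace runs of two or more dots with a space so that "a..b"
    # is treated as two words instead of a range.
    return re.sub(r"\.{2,}", " ", query)

print(break_ranges('"v1.0..v2.0"'))
```

Single dots, as in "v1.0", are left alone; only the ".." sequences are
affected.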
|
|
Using git worktrees was causing t/solver_git.t to fail on me.
|
|
* origin/viewvcs: (66 commits)
solvergit: deal with alternative diff prefixes
solvergit: extract mode from diff headers properly
solvergit: avoid "Wide character" warnings
solvergit: do not show full path names to "git apply"
css/216dark: add comments and tweak highlight colors
viewvcs: avoid segfault with highlight.pm at shutdown
solvergit: do not solve blobs twice
t/check-www-inbox: disable history
t/check-www-inbox: don't follow mboxes
t/check-www-inbox: replace IPC::Run with PublicInbox::Spawn
hval: add src_escape for highlight post-processing
viewvcs: wire up syntax-highlighting for blobs
hlmod: disable enclosing <pre> tag
t/hl_mod: extra check to ensure we escape HTML
wwwhighlight: read_in_full returns undef on errors
solver: crank up max patches to 9999
viewvcs: do not show final error message twice
qspawn: decode $? for user-friendliness
solver: reduce "git apply" invocations
solver: hold patches in temporary directory
...
|
|
Not needed since commit 956abe9ad5f13a0d1755262be412d6a54fda72e9
("view: depend on SearchMsg for Message-ID")
|
|
Removing 'psgix.input.buffered' could be a possibility in
the future.
|
|
At least, without extra directory levels, since
git-diff supports --src-prefix and --dst-prefix,
and /git/6aa8857a11/s/ uses it...
|
|
grep() won't set $1, so use "=~", instead.
|
|
Just quiet Perl down, since we don't know or care about the
encoding of the patch we hand off to git-apply.
|
|
"git apply" will warn about whitespace with the full path of the
patch, which will expose the $TMPDIR environment to users over
HTTP(S).
This change breaks compatibility with git pre-1.8.5, again,
but that was released in late 2013, so hopefully everybody
is on newer versions.
|
|
Overkill, but "highlight" supports single-line comments (slc)
independently of multi-line comments (com); but we'll use the
same color for that.
We'll also use #0f0 instead of #0ff for "kwb" (keyword class "b")
since blue shades are prevalent in <a> links and comments, while
green was unused.
|
|
Proper ordering of destruction seems required to avoid segfaults
at shutdown.
|
|
In some cases, a file may ping-pong between blob IDs in the same
message when reverts occur. So break out of this early.
This doesn't account for different abbreviations, but the
limited variations of abbreviations should alleviate the
problem.
|
|
Looking at git@vger history, several emails had broken
References and In-Reply-To headers which used <y>, <n> and
email addresses as Message-IDs.
This was causing too many unrelated messages to be linked
together in the same thread.
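
A hedged sketch of this kind of filtering, in Python. The message does not
say how email addresses are told apart from real Message-IDs; comparing
against the addresses found in the message's own From/To/Cc headers is an
assumption here, and `filter_references` is a hypothetical name:

```python
def filter_references(refs, participant_addrs):
    """Drop 'y', 'n', participant addresses and duplicates from refs."""
    # Assumption: anything equal to a participant's address is bogus
    bogus = {"y", "n"} | {addr.lower() for addr in participant_addrs}
    seen = set()
    kept = []
    for mid in refs:
        if mid.lower() in bogus or mid in seen:
            continue
        seen.add(mid)
        kept.append(mid)
    return kept

refs = ["y", "a@example.com", "20190101.1@host", "n", "20190101.1@host"]
print(filter_references(refs, ["A@example.com"]))
```

Only references that survive the filter would then be used for threading.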
|
|
We need to post-process "highlight" output to ensure it doesn't
contain odd bytes which cause "wide character" warnings or
require odd glyphs in source form.
|
|
And update 216dark.css to match a color scheme I'm used to;
which is fairly minimal and doesn't use all the classes
"highlight" provides.
|
|
We already have a <pre> tag in ViewVCS, and nesting <pre>
inside the pre-existing <pre> overrides the "white-space:pre"
we use to align line numbers.
|
|
The return value of "print" is not undef for Perl IO::Handle.
|
|
Might as well, since the only constraint is filesystem space
for temporary files for public-inbox-httpd users.
-httpd can fairly share work across clients with our use of
psgi_qx; and there's a recent patch series in git@vger with 64
patches in sequence.
|
|
SolverGit::ERR already writes the exception to the debug
log before calling {user_cb}, so there's no need for viewvcs
to append it.
|
|
The raw value of $? isn't very useful, generally.
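
For illustration, the raw wait status can be decoded as follows (Python
sketch; `describe_wait_status` and its wording are hypothetical and not the
output format used by qspawn). The layout is the one documented for Perl's
$?: the exit value in the high byte, the signal number in the low 7 bits,
and a core-dump flag in bit 0x80:

```python
def describe_wait_status(status):
    """Turn a 16-bit wait status into a human-readable string."""
    sig = status & 0x7F
    if sig:
        core = " (core dumped)" if status & 0x80 else ""
        return f"killed by signal {sig}{core}"
    return f"exit code {status >> 8}"

print(describe_wait_status(256))
print(describe_wait_status(139))
```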
|
|
"git apply" is capable of applying multiple patches in one
invocation, so give it multiple patches on the command-line
now that we no longer rely on anonymous file handles to hold
patches.
This cuts down a 64-patch series on git@vger from ~1s to ~800ms
with vfork spawn enabled using Inline::C.
|
|
We can avoid bumping up RLIMIT_NOFILE too much by storing
patches in a temporary directory. And we can share this
top-level directory with our temporary git repository.
Since we no longer rely on a working-tree for git, we are free
to rearrange the layout and avoid relying on the ".git"
convention and relying on "git -C" for chdir.
This may also ease porting public-inbox to older systems
where git does not support "-C" for chdir.
|
|
The psgi_qx routine in the now-abandoned "repobrowse" branch
allows us to break down blob-solving at each process execution
point. It reuses the Qspawn facility for git-http-backend(1),
allowing us to limit parallel subprocesses independently of Perl
worker count.
This is actually 2-3% slower than fully-synchronous execution;
but it is fair to other clients as it won't monopolize the server
for hundreds of milliseconds (or even seconds) at a time.
|
|
It makes no difference to browsers aside from saving a few
bytes; and this means we won't have to worry about extra
'%0D' showing up in links to solver.
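
To show where a stray '%0D' comes from, here is a small Python sketch. It
assumes the carriage return is stripped right before URL-escaping, which may
not be the step where the actual change normalizes line endings, and
`link_component` is a hypothetical helper:

```python
from urllib.parse import quote

def link_component(name):
    # Strip a trailing CR/LF so it is not percent-encoded into the link
    return quote(name.rstrip("\r\n"))

print(quote("lib/a.c\r"))           # the CR is encoded as %0D
print(link_component("lib/a.c\r"))  # the CR is gone before encoding
```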
|
|
This new asynchronous API will allow us to take
advantage of non-blocking I/O from even small commands,
as those may still need to wait for slow operations.
|
|
If an HTTP client disconnects while we're piping the output of a
process to them, break the pipe of the process to reclaim
resources as soon as possible.
|
|
|