user/dev discussion of public-inbox itself
* [PATCH 5/5] viewvcs: support streaming large blobs
  2019-01-31  4:27  6% [PATCH 0/5] a few more solver fixups and improvements Eric Wong
@ 2019-01-31  4:27  7% ` Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2019-01-31  4:27 UTC
  To: meta

Forking off git-cat-file here for streaming large blobs is
reasonably efficient, at least no worse than using
git-http-backend for serving clones.  So let our limiter
framework deal with it.

git itself isn't great for large files, and AFAIK there is no
stable/widely-available mechanism for reading smaller chunks
of giant blobs in git itself.

Tested with some giant GPU headers in the Linux kernel.
---
 lib/PublicInbox/ViewVCS.pm | 37 +++++++++++++++++++++++++++++++++----
 1 file changed, 33 insertions(+), 4 deletions(-)
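
The commit message above describes forking git-cat-file and streaming
its output rather than loading the whole blob into memory.  As a rough,
blocking illustration of that idea only (the patch itself goes through
PublicInbox::Qspawn and the limiter framework, which this sketch does
not use), a pipe can be read in bounded chunks.  The helper name
stream_cmd, the default chunk size, and the $git_dir and $oid
placeholders are assumptions made for this example and are not part of
public-inbox:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of public-inbox): run a command and hand
# its stdout to $on_chunk in pieces of at most $chunk_size bytes, so the
# whole blob never has to sit in memory at once.
sub stream_cmd {
	my ($cmd, $on_chunk, $chunk_size) = @_;
	$chunk_size ||= 65536; # assumed default, tune as needed
	# list-form pipe open, so the arguments are not parsed by a shell
	open(my $fh, '-|', @$cmd) or die "failed to spawn $cmd->[0]: $!";
	binmode $fh;
	my $total = 0;
	while (1) {
		my $r = read($fh, my $buf, $chunk_size);
		die "read failed: $!" unless defined $r;
		last if $r == 0; # EOF
		$total += $r;
		$on_chunk->(\$buf);
	}
	close($fh) or die "$cmd->[0] failed: \$?=$?";
	$total; # number of bytes streamed
}

# Usage sketch; $git_dir and $oid are placeholders:
# my $cmd = ['git', "--git-dir=$git_dir", 'cat-file', 'blob', $oid];
# stream_cmd($cmd, sub { print STDOUT ${$_[0]} });
```

Unlike the patch, this version blocks the calling process until the
child exits, so it would only suit a simple script, not an
event-driven server.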

diff --git a/lib/PublicInbox/ViewVCS.pm b/lib/PublicInbox/ViewVCS.pm
index 85edf22..63731e9 100644
--- a/lib/PublicInbox/ViewVCS.pm
+++ b/lib/PublicInbox/ViewVCS.pm
@@ -34,6 +34,7 @@ END { $hl = undef };
 my %QP_MAP = ( A => 'oid_a', B => 'oid_b', a => 'path_a', b => 'path_b' );
 my $max_size = 1024 * 1024; # TODO: configurable
 my $enc_utf8 = find_encoding('UTF-8');
+my $BIN_DETECT = 8000; # same as git
 
 sub html_page ($$$) {
 	my ($ctx, $code, $strref) = @_;
@@ -43,7 +44,33 @@ sub html_page ($$$) {
 		my ($nr, undef) =  @_;
 		$nr == 1 ? $$strref : undef;
 	});
-	$wcb->($res);
+	$wcb ? $wcb->($res) : $res;
+}
+
+sub stream_large_blob ($$$$) {
+	my ($ctx, $res, $logref, $fn) = @_;
+	my ($git, $oid, $type, $size, $di) = @$res;
+	my $cmd = ['git', "--git-dir=$git->{git_dir}", 'cat-file', $type, $oid];
+	my $qsp = PublicInbox::Qspawn->new($cmd);
+	my @cl = ('Content-Length', $size);
+	my $env = $ctx->{env};
+	$env->{'qspawn.response'} = delete $ctx->{-wcb};
+	$qsp->psgi_return($env, undef, sub {
+		my ($r, $bref) = @_;
+		if (!defined $r) { # error
+			html_page($ctx, 500, $logref);
+		} elsif (index($$bref, "\0") >= 0) {
+			my $ct = 'application/octet-stream';
+			[200, ['Content-Type', $ct, @cl ] ];
+		} else {
+			my $n = bytes::length($$bref);
+			if ($n >= $BIN_DETECT || $n == $size) {
+				my $ct = 'text/plain; charset=UTF-8';
+				return [200, ['Content-Type', $ct, @cl] ];
+			}
+			undef; # bref keeps growing
+		}
+	});
 }
 
 sub solve_result {
@@ -65,9 +92,13 @@ sub solve_result {
 	$ref eq 'ARRAY' or return html_page($ctx, 500, \$log);
 
 	my ($git, $oid, $type, $size, $di) = @$res;
+	my $path = to_filename($di->{path_b} || $hints->{path_b} || 'blob');
+	my $raw_link = "(<a\nhref=$path>raw</a>)";
 	if ($size > $max_size) {
+		return stream_large_blob($ctx, $res, \$log, $fn) if defined $fn;
 		# TODO: stream the raw file if it's gigantic, at least
-		$log = '<pre><b>Too big to show</b></pre>' . $log;
+		$log = "<pre><b>Too big to show, download available</b>\n" .
+			"$oid $type $size bytes $raw_link</pre>" . $log;
 		return html_page($ctx, 500, \$log);
 	}
 
@@ -86,8 +117,6 @@ sub solve_result {
 		return delete($ctx->{-wcb})->([200, $h, [ $$blob ]]);
 	}
 
-	my $path = to_filename($di->{path_b} || $hints->{path_b} || 'blob');
-	my $raw_link = "(<a\nhref=$path>raw</a>)";
 	if ($binary) {
 		$log = "<pre>$oid $type $size bytes (binary)" .
 			" $raw_link</pre>" . $log;
-- 
EW
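
The callback passed to psgi_return in the patch above picks a
Content-Type from the bytes buffered so far: a NUL byte means binary,
and otherwise it waits until either $BIN_DETECT bytes or the whole blob
have been seen.  A minimal restatement of that decision as a pure
function might look like the following.  The name guess_content_type is
hypothetical, and the sketch assumes the buffer holds raw octets:

```perl
use strict;
use warnings;

my $BIN_DETECT = 8000; # same constant as the patch

# Hypothetical restatement of the callback's decision (illustration only).
# $bref: reference to the bytes buffered so far
# $size: total blob size as reported by git
# Returns a Content-Type string, or undef if more data is needed.
sub guess_content_type {
	my ($bref, $size) = @_;
	# any NUL byte seen so far means we treat the blob as binary
	return 'application/octet-stream' if index($$bref, "\0") >= 0;
	my $n = length($$bref); # assumes $$bref holds raw octets
	# enough data (or all of it) without a NUL: treat as text
	return 'text/plain; charset=UTF-8'
		if $n >= $BIN_DETECT || $n == $size;
	undef; # keep buffering
}
```

Returning undef here mirrors the "bref keeps growing" branch in the
patch, where the caller is expected to call again with a larger buffer.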



* [PATCH 0/5] a few more solver fixups and improvements
@ 2019-01-31  4:27  6% Eric Wong
  2019-01-31  4:27  7% ` [PATCH 5/5] viewvcs: support streaming large blobs Eric Wong
  0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2019-01-31  4:27 UTC
  To: meta

All of these are going into master and seem to run OK on the main
public-inbox.org server.

Eric Wong (5):
  t/config.t: test PublicInbox::Git sharing between inboxes
  inbox: perform cleanup of Git objects for coderepos
  solvergit: allow searching on longer-than-needed OIDs
  solvergit: allow shorter-than-necessary OIDs from user
  viewvcs: support streaming large blobs

 lib/PublicInbox/Git.pm       | 18 ++++++++++++----
 lib/PublicInbox/Inbox.pm     | 17 +++++++++++++--
 lib/PublicInbox/SolverGit.pm | 41 +++++++++++++++++++++++++++++++++---
 lib/PublicInbox/ViewVCS.pm   | 37 ++++++++++++++++++++++++++++----
 t/config.t                   | 19 +++++++++++++++++
 t/git.t                      |  4 ++++
 t/solver_git.t               |  9 ++++++++
 7 files changed, 132 insertions(+), 13 deletions(-)

-- 
EW



Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
