git@vger.kernel.org mailing list mirror (one of many)
From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Cc: John 'Warthog9' Hawley <warthog9@kernel.org>,
	Petr Baudis <pasky@ucw.cz>,
	admin@repo.or.cz, Jakub Narebski <jnareb@gmail.com>
Subject: [PATCHv5 06/17] gitweb/lib - Simple select(FH) based output capture
Date: Thu,  7 Oct 2010 00:01:51 +0200
Message-ID: <1286402526-13143-7-git-send-email-jnareb@gmail.com>
In-Reply-To: <1286402526-13143-1-git-send-email-jnareb@gmail.com>

Add two packages: GitwebCache::Capture, which defines the interface, and
GitwebCache::Capture::SelectFH, which actually implements simple
capturing.  GitwebCache::Capture::SelectFH captures output by using
select(FILEHANDLE) to change the default filehandle for output.  This
means that the output of a "print" or a "printf" (or a "write") without
a filehandle is captured.

To correctly change the mode of the filehandle used for capturing,
  binmode select(), <mode>;
needs to be used in place of
  binmode STDOUT, <mode>;

Capturing is done using an in-memory file held in a Perl scalar.
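
The mechanism described above can be sketched in a few lines.  This is
only a minimal illustration, not the code from this patch: the helper
name capture_default_output() is made up for the example, and it does
not restore the old filehandle if the code block dies.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): run $code with the default
# output filehandle pointed at an in-memory file, return what was printed.
sub capture_default_output {
	my ($code) = @_;

	my $data = '';
	open my $data_fh, '>', \$data
		or die "Couldn't open in-memory file: $!";

	# from now on 'print' without a filehandle writes to $data
	my $oldfh = select($data_fh);
	$code->();
	# restore the previous default filehandle
	select($oldfh);

	close $data_fh
		or die "Couldn't close in-memory file: $!";
	return $data;
}

my $captured = capture_default_output(sub {
	print "captured\n";                # no filehandle: captured
	printf "%s\n", 'also captured';    # no filehandle: captured
	print STDOUT "not captured\n";     # explicit STDOUT bypasses the capture
});
```

The last print shows why the approach is fragile: anything written to an
explicit STDOUT goes straight to the real output and never reaches
$captured.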

Using select(FILEHANDLE) is a bit fragile as a method of capturing
output, as it assumes that we always use "print" or "printf" without
a filehandle, and that we use select(), which returns the default
filehandle for output, in place of an explicit STDOUT.  On the other
hand it has the advantage of being simple.  Alternate solutions include
using tie (like in CGI::Cache), or using PerlIO layers, but the latter
requires the non-standard PerlIO::Util module.
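
For comparison, a tie-based capture could look roughly like the sketch
below.  The CaptureHandle class is hypothetical and written only for
this example (it is not how CGI::Cache does it, and it ignores $, and
$\ for brevity).  Unlike select(FH), it also catches output sent to an
explicit STDOUT, but it stores the printed strings as they are, before
any I/O layer is applied.

```perl
use strict;
use warnings;

# Hypothetical tie class, for illustration only (not part of this series)
package CaptureHandle;

sub TIEHANDLE {
	my ($class, $bufref) = @_;
	return bless { buf => $bufref }, $class;
}

# Called for 'print' on the tied handle; $, and $\ are ignored for brevity
sub PRINT {
	my $self = shift;
	${ $self->{buf} } .= join('', @_);
	return 1;
}

# Called for 'printf' on the tied handle
sub PRINTF {
	my ($self, $format, @args) = @_;
	${ $self->{buf} } .= sprintf($format, @args);
	return 1;
}

package main;

my $data = '';
tie *STDOUT, 'CaptureHandle', \$data;
print "implicit, ";          # default filehandle is STDOUT: captured
print STDOUT "explicit";     # explicit STDOUT: captured as well
printf "%s", '!';
untie *STDOUT;
```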


Includes separate tests for capturing output.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
You can see alternate solutions for capturing output in 21/17 patch of
this series: "gitweb/lib - Alternate ways of capturing output" (an
appendix to this series).

Differences from v4:
* The capture interface tests are now invoked by a separate test
  script t/t9504-gitweb-capture-interface.sh, for 'prove' to work
  correctly (as test_external doesn't yet work as subtest).

* The t/t9504/test_capture_interface.pl uses GIT_BUILD_DIR rather than
  TEST_DIRECTORY, and respects GITWEBLIBDIR to make it possible to test
  an installed version of the module.

* Removed spurious changes and fixes to commits earlier in series
  (patch cleanup).


Differences from relevant parts of J.H. patch:
* Capturing gitweb output will be done without need to modify gitweb
  to either save generated output into $output variable, and then
  print it or save it in cache after it is generated in full (original
  J.H. patch in "Gitweb caching v2"), or changing all print statements
  to print to explicit filehandle which points to STDOUT if caching is
  disabled and to in-memory file if caching is enabled (modified
  J.H. patch in "Gitweb caching v5").

* Contrary to the '$output .= <sth>' solution, and similar to the
  'print {$out} <sth>' or 'print $out <sth>' (which can be thought of
  as explicit version of select($out)), this way of capturing output
  doesn't change gitweb behavior when caching is turned off; in
  particular it preserves streaming.

  Also the '$output .= <sth>' solution can affect performance because
  of repeated string concatenation.

* The most important issue is that I/O "layers" (PerlIO), like ':utf8'
  or ':raw', are *already applied* to the output that is captured.
  This means that captured output is *always* in binary (':raw') mode.
  In Perl 6 terms, the data returned by the capturing engine is the
  equivalent of a Buf, a collection of bytes, whether a Buf or a Str
  (a collection of logical characters) is printed.

  The overall result is that we would not need a separate code path for
  caching binary output, or separate naming conventions for cache
  files for binary data.

  The t9504 test is about checking if both ':utf8' and ':raw' output
  is captured correctly.
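
The "layers are already applied" point can be checked with a small
standalone script.  This is only a sketch using an in-memory file
directly, not the capture classes from this patch; the variable names
are made up for the example.

```perl
use strict;
use warnings;

# "zażółć" written with escapes: 6 characters, 4 of them non-ASCII
my $text = "za\x{17c}\x{f3}\x{142}\x{107}";

my $data = '';
open my $fh, '>', \$data
	or die "Couldn't open in-memory file: $!";
binmode $fh, ':utf8';        # the layer is applied on the way in
print {$fh} $text;
close $fh
	or die "Couldn't close in-memory file: $!";

# $data now holds UTF-8 encoded bytes, not characters:
# 2 ASCII bytes + 4 two-byte sequences = 10 bytes
my $nbytes = length($data);

# Decoding the bytes gives back the original characters,
# which is what the t9504 test does with utf8::decode($captured)
my $decoded = $data;
utf8::decode($decoded)
	or die "Captured data is not valid UTF-8";
```

Because what comes out is always bytes, the same data can be written to
a cache file in ':raw' mode regardless of which mode the generating code
asked for.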

 gitweb/lib/GitwebCache/Capture.pm          |   66 ++++++++++++++++++++++
 gitweb/lib/GitwebCache/Capture/SelectFH.pm |   82 ++++++++++++++++++++++++++++
 t/t9504-gitweb-capture-interface.sh        |   34 ++++++++++++
 t/t9504/test_capture_interface.pl          |   76 ++++++++++++++++++++++++++
 4 files changed, 258 insertions(+), 0 deletions(-)
 create mode 100644 gitweb/lib/GitwebCache/Capture.pm
 create mode 100644 gitweb/lib/GitwebCache/Capture/SelectFH.pm
 create mode 100755 t/t9504-gitweb-capture-interface.sh
 create mode 100755 t/t9504/test_capture_interface.pl

diff --git a/gitweb/lib/GitwebCache/Capture.pm b/gitweb/lib/GitwebCache/Capture.pm
new file mode 100644
index 0000000..3e9fe81
--- /dev/null
+++ b/gitweb/lib/GitwebCache/Capture.pm
@@ -0,0 +1,66 @@
+# gitweb - simple web interface to track changes in git repositories
+#
+# (C) 2010, Jakub Narebski <jnareb@gmail.com>
+#
+# This program is licensed under the GPLv2
+
+#
+# Output capturing for gitweb caching engine
+#
+
+# This is the abstract base class (a role) for capturing the output of
+# gitweb actions for the gitweb caching engine.
+#
+# Child (derived) concrete classes, which actually implement some method
+# of capturing STDOUT output, must implement the following methods:
+# * ->new(), to create new object of a capturing class
+# * ->start(), to start capturing output
+# * ->stop(), to stop capturing output and return it
+#
+# Before starting capture by using capture_block etc. subroutines,
+# one has to run <child class>->setup_capture().
+
+package GitwebCache::Capture;
+
+use strict;
+use warnings;
+
+use Exporter qw(import);
+our @EXPORT    = qw(capture_start capture_stop capture_block);
+our @EXPORT_OK = qw(setup_capture);
+our %EXPORT_TAGS = (all => [ @EXPORT, @EXPORT_OK ]);
+
+# Holds object used for capture (of child class)
+my $capture;
+
+sub setup_capture {
+	my $self = shift || __PACKAGE__;
+
+	$capture = $self->new(@_);
+}
+
+sub capture {
+	my ($self, $code) = @_;
+
+	$self->start();
+	$code->();
+	return $self->stop();
+}
+
+# Wrap caching data; capture only STDOUT
+sub capture_block (&) {
+	my $code = shift;
+	return $capture->capture($code);
+}
+
+sub capture_start {
+	$capture->start(@_);
+}
+
+sub capture_stop {
+	return $capture->stop(@_);
+}
+
+1;
+__END__
+# end of package GitwebCache::Capture;
diff --git a/gitweb/lib/GitwebCache/Capture/SelectFH.pm b/gitweb/lib/GitwebCache/Capture/SelectFH.pm
new file mode 100644
index 0000000..18ce5c3
--- /dev/null
+++ b/gitweb/lib/GitwebCache/Capture/SelectFH.pm
@@ -0,0 +1,82 @@
+# gitweb - simple web interface to track changes in git repositories
+#
+# (C) 2010, Jakub Narebski <jnareb@gmail.com>
+#
+# This program is licensed under the GPLv2
+
+#
+# Simple output capturing using select(FH);
+#
+
+# This module (class) captures output of 'print <sth>', 'printf <sth>'
+# and 'write <sth>' (without a filehandle) by using select(FILEHANDLE)
+# to change default filehandle for output, changing it to in-memory
+# file (saving output to scalar).
+#
+# Note that when using this simplest way of capturing, to change mode of
+# filehandle used for capturing correctly, "binmode STDOUT, <mode>;"
+# has to be changed to "binmode select(), <mode>;".  This has no effect
+# if we are not capturing output using GitwebCache::Capture::SelectFH.
+
+package GitwebCache::Capture::SelectFH;
+
+use PerlIO;
+
+use strict;
+use warnings;
+
+use base qw(GitwebCache::Capture);
+use GitwebCache::Capture qw(:all);
+
+use Exporter qw(import);
+our @EXPORT      = @GitwebCache::Capture::EXPORT;
+our @EXPORT_OK   = @GitwebCache::Capture::EXPORT_OK;
+our %EXPORT_TAGS = %GitwebCache::Capture::EXPORT_TAGS;
+
+# Constructor
+sub new {
+	my $proto = shift;
+
+	my $class = ref($proto) || $proto;
+	my $self  = {};
+	$self = bless($self, $class);
+
+	$self->{'oldfh'} = select();
+	$self->{'data'} = '';
+
+	return $self;
+}
+
+# Start capturing data (STDOUT)
+# (printed using 'print <sth>' or 'printf <sth>')
+sub start {
+	my $self = shift;
+
+	$self->{'data'}    = '';
+	$self->{'data_fh'} = undef;
+	
+	open $self->{'data_fh'}, '>', \$self->{'data'}
+		or die "Couldn't open in-memory file for capture: $!";
+	$self->{'oldfh'} = select($self->{'data_fh'});
+
+	# note: this does not cover all cases
+	binmode select(), ':utf8'
+		if ((PerlIO::get_layers($self->{'oldfh'}))[-1] eq 'utf8');
+}
+
+# Stop capturing data (required for die_error)
+sub stop {
+	my $self = shift;
+
+	# return if we didn't start capturing
+	return unless defined $self->{'data_fh'};
+
+	select($self->{'oldfh'});
+	close $self->{'data_fh'}
+		or die "Couldn't close in-memory file for capture: $!";
+	return $self->{'data'};
+}
+
+1;
+__END__
+# end of package GitwebCache::Capture::SelectFH;
diff --git a/t/t9504-gitweb-capture-interface.sh b/t/t9504-gitweb-capture-interface.sh
new file mode 100755
index 0000000..82623f1
--- /dev/null
+++ b/t/t9504-gitweb-capture-interface.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+#
+# Copyright (c) 2010 Jakub Narebski
+#
+
+test_description='gitweb capturing interface
+
+This test checks capturing interface used for capturing gitweb output
+in gitweb caching (GitwebCache::Capture* modules).'
+
+# for now we are running only cache interface tests
+. ./test-lib.sh
+
+# this test is present in gitweb-lib.sh
+if ! test_have_prereq PERL; then
+	skip_all='perl not available, skipping test'
+	test_done
+fi
+
+"$PERL_PATH" -MTest::More -e 0 >/dev/null 2>&1 || {
+	skip_all='perl module Test::More unavailable, skipping test'
+	test_done
+}
+
+# ----------------------------------------------------------------------
+
+# The external test will output its own plan
+test_external_has_tap=1
+
+test_external \
+	'GitwebCache::Capture Perl API (in gitweb/lib/)' \
+	"$PERL_PATH" "$TEST_DIRECTORY"/t9504/test_capture_interface.pl
+
+test_done
diff --git a/t/t9504/test_capture_interface.pl b/t/t9504/test_capture_interface.pl
new file mode 100755
index 0000000..55c402a
--- /dev/null
+++ b/t/t9504/test_capture_interface.pl
@@ -0,0 +1,76 @@
+#!/usr/bin/perl
+use lib (split(/:/, $ENV{GITPERLLIB}));
+
+use warnings;
+use strict;
+use utf8;
+
+use Test::More;
+
+# test source version
+use lib $ENV{GITWEBLIBDIR} || "$ENV{GIT_BUILD_DIR}/gitweb/lib";
+
+# ....................................................................
+
+# prototypes must be known at compile time, otherwise they do not work
+BEGIN { use_ok('GitwebCache::Capture::SelectFH', qw(:all)); }
+
+# Test setting up capture
+#
+my $capture = new_ok('GitwebCache::Capture::SelectFH' => [], 'The $capture');
+isa_ok($capture, 'GitwebCache::Capture', 'The $capture');
+ok(setup_capture('GitwebCache::Capture::SelectFH'),
+   'setup_capture with package name: GitwebCache::Capture::SelectFH');
+ok(setup_capture($capture),
+   'setup_capture with subclass object: $capture');
+
+# Test properties of capture_block
+#
+is(prototype('capture_block'), '&', 'capture_block has (&) prototype');
+
+# Test capturing
+#
+diag('Should not print anything except test results and diagnostic');
+my $test_data = 'Capture this';
+my $captured = capture_block {
+	print $test_data;
+};
+is($captured, $test_data, 'capture_block captures simple data');
+
+binmode STDOUT, ':utf8';
+$test_data = <<'EOF';
+Áéí óú
+ÄËÑÏÖ
+Ábçdèfg
+Zażółć gęsią jaźń
+山田 太郎
+ブレームのテストです。
+
+はれひほふ
+
+しているのが、いるので。
+濱浜ほれぷりぽれまびぐりろへ。
+EOF
+utf8::decode($test_data);
+$captured = capture_block {
+	binmode select(), ':utf8';
+
+	print $test_data;
+};
+utf8::decode($captured);
+is($captured, $test_data, 'capture_block captures utf8 data');
+
+$test_data = "|\x{fe}\x{ff}|\x{9F}|\000|"; # invalid utf-8
+$captured = capture_block {
+	binmode select(), ':raw';
+
+	print $test_data;
+};
+is($captured, $test_data, 'capture_block captures raw data');
+
+
+done_testing();
+
+# Local Variables:
+# encoding: utf-8
+# End:
-- 
1.7.3

