git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jakub Narebski <jnareb@gmail.com>
To: "J.H." <warthog9@eaglescrag.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 10/18] gitweb: Adding isBinaryAction() and isFeedAction() to determine the action type
Date: Fri, 10 Dec 2010 13:10:53 +0100	[thread overview]
Message-ID: <201012101310.55094.jnareb@gmail.com> (raw)
In-Reply-To: <4D01A103.3090900@eaglescrag.net>

On Fri, 10 Dec 2010, J.H. wrote:

>>> This is fairly self explanatory, these are here just to centralize the checking
>>> for these types of actions, as special things need to be done with regards to
>>> them inside the caching engine.
>>>
>>> isBinaryAction() returns true if the action deals with creating binary files
>>> (this needing :raw output)
>> 
>> Why do you need special case binary / :raw output?  It is not really
>> necessary if it is done in right way, as shown in my rewrite.
> 
> Because that's not how my caching engine does it, and the reason for
> that is I am mimicking how the rest of gitweb does it.

To shorten the explanation why treating binary (needing :raw) output in
a special way is not necessary: with the way gitweb code is structured
(with "binmode STDOUT, ':raw'" inside action subroutine), with the way
capturing output is done (by redirecting STDOUT), and even with the way
kernel.org caching code is structured the only thing that needs to be
done to support both text (:utf8, as set at beginning of gitweb) and
binary (:raw) output is to *dump cache to STDOUT in binary mode*:

	binmode $cache_fh, ':raw';
	binmode STDOUT, ':raw';
	File::Copy::copy($fh, \*STDOUT);

Nothing more.

Just dump cache file to STDOUT in binary mode.
 
> I attempted at one point to do as you were suggesting, and it became too
> cumbersome.  I eventually broke out the 'binary' packages into a special
> case (thus mimicking how gitweb is already doing things), which also
> gives me the advantage of being able to checksum the resulting binary
> out of band, as well as being able to more trivially calculate the file
> size being sent.

I don't see how it needs to be special-cased: the ordinary output would
also take advantage of this.  Note that plain 'blob' action can also
be quite large.

If there is to be done smarter, i.e. HTTP-aware, parsing and dumping of
cache entry file, e.g. by reading the HTTP header part to memory and
fiddling with HTTP headers (e.g. adding Content-Length header), it can be
done in a contents-agnostic way.

Note that with the way I do it in my rewrite, namely saving cached output
to temporary file to rename it to final destination later (atomic update),
we can do mungling of HTTP headers before/during this final copying to
final file, e.g. calculating Content-Length and perhaps Content-MD5 
headers.

> 
>>> isFeedAction() returns true if the action deals with a news feed of some sort,
>>> basically used to bypass the 'Generating...' message should it be a news reader
>>> as those will explode badly on that page.
>> 
>> Why blacklisting 'feed', instead of whitelisting HTML-output?
> 
> There are a limited number of feed types and their ilk (standard xml
> formatted feed and atom), there are lots of html-output like things.
> Easier to default and have things work, generally, than to have things
> not work the way you would expect.

Ah, I see from what you written in other subthreads of this thread that
you prefer to have "Generating..." page where it is not wanted that not
have it where it could be useful (i.e. blacklist approach), while I took
the opposite side (i.e. whitelist approach).

-- 
Jakub Narebski
Poland

  reply	other threads:[~2010-12-10 12:11 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-09 21:57 [PATCH 00/18] Gitweb caching v8 John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 01/18] gitweb: Prepare for splitting gitweb John 'Warthog9' Hawley
2010-12-09 23:30   ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 02/18] gitweb: add output buffering and associated functions John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 03/18] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 04/18] gitweb: Minimal testing of gitweb caching John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 05/18] gitweb: Regression fix concerning binary output of files John 'Warthog9' Hawley
2010-12-09 23:33   ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 06/18] gitweb: Add more explicit means of disabling 'Generating...' page John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 07/18] gitweb: Revert back to $cache_enable vs. $caching_enabled John 'Warthog9' Hawley
2010-12-09 23:38   ` Jakub Narebski
2010-12-10  2:38     ` J.H.
2010-12-10 13:48       ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 08/18] gitweb: Change is_cacheable() to return true always John 'Warthog9' Hawley
2010-12-09 23:46   ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 09/18] gitweb: Revert reset_output() back to original code John 'Warthog9' Hawley
2010-12-09 23:58   ` Jakub Narebski
2010-12-10  2:43     ` J.H.
2010-12-09 21:57 ` [PATCH 10/18] gitweb: Adding isBinaryAction() and isFeedAction() to determine the action type John 'Warthog9' Hawley
2010-12-10  0:06   ` Jakub Narebski
2010-12-10  3:39     ` J.H.
2010-12-10 12:10       ` Jakub Narebski [this message]
2010-12-10 12:25         ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 11/18] gitweb: add isDumbClient() check John 'Warthog9' Hawley
2010-12-10  0:12   ` Jakub Narebski
2010-12-10  4:00     ` J.H.
2010-12-11  0:07       ` Junio C Hamano
2010-12-11  0:15         ` Jakub Narebski
2010-12-11  1:15           ` J.H.
2010-12-11  1:40             ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 12/18] gitweb: Change file handles (in caching) to lexical variables as opposed to globs John 'Warthog9' Hawley
2010-12-10  0:16   ` Jakub Narebski
2010-12-10  0:32     ` Junio C Hamano
2010-12-10  0:47       ` Jakub Narebski
2010-12-10  5:56       ` J.H.
2010-12-09 21:57 ` [PATCH 13/18] gitweb: Add commented url & url hash to page footer John 'Warthog9' Hawley
2010-12-10  0:26   ` Jakub Narebski
2010-12-10  6:10     ` J.H.
2010-12-09 21:57 ` [PATCH 14/18] gitweb: add print_transient_header() function for central header printing John 'Warthog9' Hawley
2010-12-10  0:36   ` Jakub Narebski
2010-12-10  6:18     ` J.H.
2010-12-09 21:57 ` [PATCH 15/18] gitweb: Add show_warning() to display an immediate warning, with refresh John 'Warthog9' Hawley
2010-12-10  1:01   ` Jakub Narebski
2010-12-10  7:38     ` J.H.
2010-12-10 14:10       ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 16/18] gitweb: When changing output (STDOUT) change STDERR as well John 'Warthog9' Hawley
2010-12-10  1:36   ` Jakub Narebski
2010-12-12  5:25     ` J.H.
2010-12-12 15:17       ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 17/18] gitweb: Prepare for cached error pages & better error page handling John 'Warthog9' Hawley
2010-12-10  1:49   ` Jakub Narebski
2010-12-10  8:33     ` J.H.
2010-12-10 20:33       ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 18/18] gitweb: Add better error handling for gitweb caching John 'Warthog9' Hawley
2010-12-10  1:56   ` Jakub Narebski
2010-12-09 23:26 ` [PATCH 00/18] Gitweb caching v8 Jakub Narebski
2010-12-10  0:43   ` J.H.
2010-12-10  1:27     ` Jakub Narebski
2010-12-10  0:39 ` Junio C Hamano
2010-12-10  0:45   ` J.H.

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201012101310.55094.jnareb@gmail.com \
    --to=jnareb@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=warthog9@eaglescrag.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).