git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jakub Narebski <jnareb@gmail.com>
To: "John 'Warthog9' Hawley" <warthog9@eaglescrag.net>
Cc: git@vger.kernel.org, Jakub Narebski <jnareb@gmail.com>
Subject: Re: [PATCH 00/18] Gitweb caching v8
Date: Thu, 09 Dec 2010 15:26:59 -0800 (PST)	[thread overview]
Message-ID: <m3bp4u34vj.fsf@localhost.localdomain> (raw)
In-Reply-To: <1291931844-28454-1-git-send-email-warthog9@eaglescrag.net>

John, could you please in the future Cc me?  I am interested in gitweb
output caching development.  Thanks in advance.

"John 'Warthog9' Hawley" <warthog9@eaglescrag.net> writes:

> Afternoon everyone,
> 
> (Afternoon is like morning, right?)
>  
> This is the latest incarnation of gitweb w/ caching.  Per the general
> consensus and requests from the recent GitTogether I'm re-submitting 
> my patches.
> 
> Bunch of re-works in the code, and several requested features.  Sadly the
> patch series has balloned as I've been adding things.  It was 3-4 patches,
> it's now 18.  This is based on top of Jakub's v7.2 patch series, but
> it should be more or less clean now.

Could you please rebase it on top of v7.2 version?  The v7.2 patch
series contained a few bugs that needs to be corrected.

> 
> As such there was a bunch of changes that I needed to do to Jakub's tree
> which are indicated in the series.  Why did I do them up as separate things?
> Mainly there's a bunch of history that's getting lost right now between
> going back and forth, and I wanted to have clear patches to discuss
> should further discussion be needed.

I guess that in the final submission (i.e. the one that is to be
merged in into git.git repository) those changes would be squashed in,
isn't it?

> 
> This still differs, by two patches, from whats in production on kernel.org.
> It's missing the index page git:// link, and kernel.org and kernel.org also
> has the forced version matching.  As a note I'll probably let this stew
> another day or so on kernel.org and then I'll push it into the Fedora update
> stream, as there's a couple of things in this patch series that would be 
> good for them to have.

There was some discussion about git:// link in the past; nevertheless
this issue is independent on gitweb caching and can (and should) be
sent as a aeparate patch.

IIRC we agreed that because of backward compatibility forced versions
match is quite useless (in general)...

> 
> There is one additional script I've written that the Fedora folks are using,
> and that might be useful to include, which is an 'offline' cache file generator.
> It basically wraps gitweb.cgi and at the end moves the cache file into the right
> place.  The Fedora folks were finding it took hours to generate their front
> page, and that doing a background generation almost never completed (due to 
> process death).  This was a simple way to handle that.  If people would like
> I can add it in as an additional patch.

Are you detaching the background process?

It would be nice to have it as separate patch.

> 
> v8:
> 	- Reverting several changes from Jakub's change set that make no sense
>                 - is_cacheable changed to always return true - nothing special about
>                   blame or blame_incremental as far as the caching engine is concerned

'blame_incremental' is just another version of 'blame' view.  I have
disabled it when caching is enabled in my rewrite (you instead disabled
caching for 'blame_incremental' in your v7 and mine v7.x) because I
couldn't get it to work together with caching.  Did you check that it
works?

Besides, withou "tee"-ing, i.e. printing output as it is captured,
cached 'blame_data' means that 'blame_incremental' is not incremental,
and therefore it vanishes its advantage over 'blame'.

In the case data is in cache, then 'blame_inremental' doesn't have
advantage over 'blame' either.

>                 - Reverted config file change "caching_enabled" back to "cache_enable" as this
>                   config file option is already in the wild in production code, as are all
>                   current gitweb-caching configuration variables.

[Explitive deleted.]  I dislike strongly this $cache_enable.  I think it
would be better for backward compatibility (should we keep backward
compatibility with out-of-tree patches?) to use the same mechanism as
provided in 

  [PATCHv6/RFC 22/24] gitweb: Support legacy options used by kernel.org caching engine
  http://thread.gmane.org/gmane.comp.version-control.git/163052/focus=163058
  http://repo.or.cz/w/git/jnareb-git.git/commitdiff/27ec67ad90ecd56ac3d05f6a9ea49b6faabf7d0a

in my rewrite.  Just set $caching_enabled to true if $cache_enable is
defined and true.

>                 - Reverted change to reset_output as
>                         open STDOUT, ">&", \*STDOUT_REAL;
>                   causes assertion failures:
>                   Assertion !((((s->var)->sv_flags & (0x00004000|0x00008000)) == 0x00008000) && (((svtype)((s->var)->sv_flags & 0xff)) == SVt_PVGV || ((svtype)((s->var)->sv_flags & 0xff)) == SVt_PVLV)) failed: file "scalar.xs", line 49 at gitweb.cgi line 1221.
>                   if we encounter an error *BEFORE* we've ever changed the output.

Which Perl version are you using?  Because I think you found error in Perl.
Well, at least I have not happen on this bug.

I have nothing againts using

  open STDOUT, ">&STDOUT_REAL";

though I really prefer that you used lexical filehandles, instead of
"globs" which are global variables.

The following works:

  open STDOUT, '>&', fileno($fh);

Note that fileno(Symbol::qualify_to_ref($fh)) might be needed...

>         - Cleanups there were indirectly mentioned by Jakub
>                 - Elimination of anything even remotely looking like duplicate code
>                         - Creation of isBinaryAction() and isFeedAction()

Could you please do not use mixedCase names?


First, that is what %actions_info from

  [PATCH 16/24] gitweb: Introduce %actions_info, gathering information about actions
  http://thread.gmane.org/gmane.comp.version-control.git/163052/focus=163038
  http://repo.or.cz/w/git/jnareb-git.git/commitdiff/305a10339b33d56b4a50708d71e8f42453c8cb1f

I have invented for.

Second, why 'isBinaryAction()'?  there isn't something inherently
different between binary (':raw') and text (':utf8') output, as I have
repeatedly said before.  See my rewrite: there is no special case for
binary output (or perhaps binary output as in the case of 'blob_plain'
action).

>         - Adding in blacklist of "dumb" clients for purposes of downloading content
>         - Added more explicit disablement of "Generating..." page

Good, I'll check this.

>         - Added better error handling
>                 - Creation of .err file in the cache directory
>                 - Trap STDERR output into $output_err as this was spewing data prior
>                   to any header information being sent

Why it is needed?  We capture output of "die" via CGI::Util::set_message,
and "warn" output is captured to web server logs... unless you explicitely
use "print STDERR <sth>" -- don't do that instead.

>         - Added hidden field in footer for url & hash of url, which is extremely useful
>           for debugging

Nice idea, I'll see it.  Can it be disabled (information leakage)?

-- 
Jakub Narebski
Poland
ShadeHawk on #git

  parent reply	other threads:[~2010-12-09 23:27 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-09 21:57 [PATCH 00/18] Gitweb caching v8 John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 01/18] gitweb: Prepare for splitting gitweb John 'Warthog9' Hawley
2010-12-09 23:30   ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 02/18] gitweb: add output buffering and associated functions John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 03/18] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 04/18] gitweb: Minimal testing of gitweb caching John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 05/18] gitweb: Regression fix concerning binary output of files John 'Warthog9' Hawley
2010-12-09 23:33   ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 06/18] gitweb: Add more explicit means of disabling 'Generating...' page John 'Warthog9' Hawley
2010-12-09 21:57 ` [PATCH 07/18] gitweb: Revert back to $cache_enable vs. $caching_enabled John 'Warthog9' Hawley
2010-12-09 23:38   ` Jakub Narebski
2010-12-10  2:38     ` J.H.
2010-12-10 13:48       ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 08/18] gitweb: Change is_cacheable() to return true always John 'Warthog9' Hawley
2010-12-09 23:46   ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 09/18] gitweb: Revert reset_output() back to original code John 'Warthog9' Hawley
2010-12-09 23:58   ` Jakub Narebski
2010-12-10  2:43     ` J.H.
2010-12-09 21:57 ` [PATCH 10/18] gitweb: Adding isBinaryAction() and isFeedAction() to determine the action type John 'Warthog9' Hawley
2010-12-10  0:06   ` Jakub Narebski
2010-12-10  3:39     ` J.H.
2010-12-10 12:10       ` Jakub Narebski
2010-12-10 12:25         ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 11/18] gitweb: add isDumbClient() check John 'Warthog9' Hawley
2010-12-10  0:12   ` Jakub Narebski
2010-12-10  4:00     ` J.H.
2010-12-11  0:07       ` Junio C Hamano
2010-12-11  0:15         ` Jakub Narebski
2010-12-11  1:15           ` J.H.
2010-12-11  1:40             ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 12/18] gitweb: Change file handles (in caching) to lexical variables as opposed to globs John 'Warthog9' Hawley
2010-12-10  0:16   ` Jakub Narebski
2010-12-10  0:32     ` Junio C Hamano
2010-12-10  0:47       ` Jakub Narebski
2010-12-10  5:56       ` J.H.
2010-12-09 21:57 ` [PATCH 13/18] gitweb: Add commented url & url hash to page footer John 'Warthog9' Hawley
2010-12-10  0:26   ` Jakub Narebski
2010-12-10  6:10     ` J.H.
2010-12-09 21:57 ` [PATCH 14/18] gitweb: add print_transient_header() function for central header printing John 'Warthog9' Hawley
2010-12-10  0:36   ` Jakub Narebski
2010-12-10  6:18     ` J.H.
2010-12-09 21:57 ` [PATCH 15/18] gitweb: Add show_warning() to display an immediate warning, with refresh John 'Warthog9' Hawley
2010-12-10  1:01   ` Jakub Narebski
2010-12-10  7:38     ` J.H.
2010-12-10 14:10       ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 16/18] gitweb: When changing output (STDOUT) change STDERR as well John 'Warthog9' Hawley
2010-12-10  1:36   ` Jakub Narebski
2010-12-12  5:25     ` J.H.
2010-12-12 15:17       ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 17/18] gitweb: Prepare for cached error pages & better error page handling John 'Warthog9' Hawley
2010-12-10  1:49   ` Jakub Narebski
2010-12-10  8:33     ` J.H.
2010-12-10 20:33       ` Jakub Narebski
2010-12-09 21:57 ` [PATCH 18/18] gitweb: Add better error handling for gitweb caching John 'Warthog9' Hawley
2010-12-10  1:56   ` Jakub Narebski
2010-12-09 23:26 ` Jakub Narebski [this message]
2010-12-10  0:43   ` [PATCH 00/18] Gitweb caching v8 J.H.
2010-12-10  1:27     ` Jakub Narebski
2010-12-10  0:39 ` Junio C Hamano
2010-12-10  0:45   ` J.H.

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m3bp4u34vj.fsf@localhost.localdomain \
    --to=jnareb@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=warthog9@eaglescrag.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).