git@vger.kernel.org mailing list mirror (one of many)
* gitweb forgets to send utf8 header for raw blob views
@ 2008-05-28 18:04 Jan Engelhardt
  2008-05-29 11:32 ` Lea Wiemann
  2008-05-30  8:18 ` Jakub Narebski
  0 siblings, 2 replies; 12+ messages in thread
From: Jan Engelhardt @ 2008-05-28 18:04 UTC (permalink / raw)
  To: git

Hi,


I have configured gitweb to use utf8, and that works for text blob views 
like on
http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob;f=bin/git-forest;hb=HEAD
but it does not for raw blob views like
http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob_plain;f=bin/git-forest;hb=HEAD


Jan

* Re: gitweb forgets to send utf8 header for raw blob views
  2008-05-28 18:04 gitweb forgets to send utf8 header for raw blob views Jan Engelhardt
@ 2008-05-29 11:32 ` Lea Wiemann
  2008-05-30  8:18 ` Jakub Narebski
  1 sibling, 0 replies; 12+ messages in thread
From: Lea Wiemann @ 2008-05-29 11:32 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: git

Jan Engelhardt wrote:
> [utf8 does not work] for raw blob views like
> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob_plain;f=bin/git-forest;hb=HEAD

Gitweb should probably not be recoding blobs, so the best I can think of
is to check for UTF-8 validity and add charset=utf-8 in that case (and
leave the charset undeclared otherwise).

The drawback with that is that we cannot send plain blobs without 
reading them into memory (or reading them twice), since we have to check 
for UTF-8 validity of the whole blob before sending it.  (Gitweb is 
currently reading the whole blob into memory, but that's unnecessary and 
could be changed in the future.)

After my next refactoring, there *might* be some chance to easily 
implement something like "if it's smaller than x KB (e.g. 512), read it 
into memory, check for valid UTF-8 and optionally add charset=utf-8, 
otherwise don't read it into memory and send it without charset=utf-8 
[or perhaps check for BOM presence at the beginning]."  I'll remember 
if/when it comes up in my refactoring and get back to the mailing list 
about it.

-- Lea
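
The size-limited check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: charset_param_for_blob() is a hypothetical helper that does not exist in gitweb, and it takes the blob as an in-memory byte string rather than a file handle. It relies on the core Encode module's strict 'UTF-8' decoding to detect malformed input.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of gitweb.  Returns the parameter to append
# to the Content-Type value, or '' to leave the charset undeclared.
sub charset_param_for_blob {
	my ($bytes, $max_bytes) = @_;

	# Too large to check cheaply: send without charset.
	return '' if length($bytes) > $max_bytes;

	# Strict decoding croaks on malformed input; work on a copy because
	# decode() may modify its argument when a CHECK value is given.
	my $copy = $bytes;
	my $valid = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK()); 1 };

	return $valid ? '; charset=utf-8' : '';
}

my $limit = 512 * 1024;   # the "x KB (e.g. 512)" threshold from the message
print 'text/plain', charset_param_for_blob("caf\xc3\xa9\n", $limit), "\n";
print 'text/plain', charset_param_for_blob("caf\xe9\n", $limit), "\n";
```

Pure ASCII input also passes the check, since ASCII is a subset of UTF-8. The BOM check mentioned for large blobs is not covered by this sketch.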

* Re: gitweb forgets to send utf8 header for raw blob views
  2008-05-28 18:04 gitweb forgets to send utf8 header for raw blob views Jan Engelhardt
  2008-05-29 11:32 ` Lea Wiemann
@ 2008-05-30  8:18 ` Jakub Narebski
  2008-05-31 11:27   ` [PATCH] gitweb: Add charset info to "raw" blob output Jakub Narebski
  2008-05-31 15:04   ` gitweb forgets to send utf8 header for raw blob views Jan Engelhardt
  1 sibling, 2 replies; 12+ messages in thread
From: Jakub Narebski @ 2008-05-30  8:18 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: git

Jan Engelhardt <jengelh@medozas.de> writes:

> I have configured gitweb to use utf8, and that works for text blob views 
> like on
> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob;f=bin/git-forest;hb=HEAD
> but it does not for raw blob views like
> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob_plain;f=bin/git-forest;hb=HEAD

This can depend on configuration, both on gitweb configuration (you
can for example define $default_blob_plain_mimetype to 'text/plain;
charset=utf-8', and define $default_text_plain_charset to 'utf-8'),
and on your /etc/mime.types; gitweb does not add charset info if
mimetype is acquired from mime.types, which I guess is a mistake.

-- 
Jakub Narebski
Poland
ShadeHawk on #git
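
For illustration, the two settings named above would look roughly like this in a gitweb configuration file. Only the variable names and values come from the message; whether both are needed on a given installation is not established here, so treat this as a starting point rather than a recommended setup.

```perl
# Fragment for the gitweb configuration file (plain Perl).

# Charset appended to 'text/plain' output in the raw (a=blob_plain) view.
our $default_text_plain_charset = 'utf-8';

# Content type used when gitweb cannot determine one for a blob.
our $default_blob_plain_mimetype = 'text/plain; charset=utf-8';
```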

* [PATCH] gitweb: Add charset info to "raw" blob output
  2008-05-30  8:18 ` Jakub Narebski
@ 2008-05-31 11:27   ` Jakub Narebski
  2008-05-31 18:22     ` Junio C Hamano
  2008-05-31 15:04   ` gitweb forgets to send utf8 header for raw blob views Jan Engelhardt
  1 sibling, 1 reply; 12+ messages in thread
From: Jakub Narebski @ 2008-05-31 11:27 UTC (permalink / raw)
  To: git; +Cc: Jan Engelhardt, Lea Wiemann, Jakub Narebski

Always add charset info from $default_text_plain_charset (if it is
defined) to "raw" (a=blob_plain) output for 'text/plain' blobs.
Adding charset info in a special case was removed from blob_mimetype().

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Please note that to have utf-8 for 'text/plain' blobs in blob_plain
view ("raw" output) you still have to set $default_text_plain_charset
to 'utf-8' (in gitweb configuration file).

 gitweb/gitweb.perl |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 57a1905..dd0f0ac 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2481,8 +2481,7 @@ sub blob_mimetype {
 	return $default_blob_plain_mimetype unless $fd;
 
 	if (-T $fd) {
-		return 'text/plain' .
-		       ($default_text_plain_charset ? '; charset='.$default_text_plain_charset : '');
+		return 'text/plain';
 	} elsif (! $filename) {
 		return 'application/octet-stream';
 	} elsif ($filename =~ m/\.png$/i) {
@@ -4397,6 +4396,9 @@ sub git_blob_plain {
 		or die_error(undef, "Couldn't cat $file_name, $hash");
 
 	$type ||= blob_mimetype($fd, $file_name);
+	if ($type eq 'text/plain' && defined $default_text_plain_charset) {
+		$type .= "; charset=$default_text_plain_charset";
+	}
 
 	# save as filename, even when no $file_name is given
 	my $save_as = "$hash";

* Re: gitweb forgets to send utf8 header for raw blob views
  2008-05-30  8:18 ` Jakub Narebski
  2008-05-31 11:27   ` [PATCH] gitweb: Add charset info to "raw" blob output Jakub Narebski
@ 2008-05-31 15:04   ` Jan Engelhardt
  2008-05-31 22:39     ` Jakub Narebski
  1 sibling, 1 reply; 12+ messages in thread
From: Jan Engelhardt @ 2008-05-31 15:04 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git


On Friday 2008-05-30 10:18, Jakub Narebski wrote:
>Jan Engelhardt <jengelh@medozas.de> writes:
>
>> I have configured gitweb to use utf8, and that works for text blob views 
>> like on
>> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob;f=bin/git-forest;hb=HEAD
>> but it does not for raw blob views like
>> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob_plain;f=bin/git-forest;hb=HEAD
>
>This can depend on configuration, both on gitweb configuration (you
>can for example define $default_blob_plain_mimetype to 'text/plain;
>charset=utf-8', and define $default_text_plain_charset to 'utf-8'),
>and on your /etc/mime.types; gitweb does not add charset info if
>mimetype is acquired from mime.types, which I guess is a mistake.

Thanks for the hint. Setting 
	our $default_text_plain_charset  = "utf-8";
was all that was needed. I only had $fallback_encoding set to utf-8
for whatever reason...

* Re: [PATCH] gitweb: Add charset info to "raw" blob output
  2008-05-31 11:27   ` [PATCH] gitweb: Add charset info to "raw" blob output Jakub Narebski
@ 2008-05-31 18:22     ` Junio C Hamano
  2008-06-01 11:06       ` Jakub Narebski
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2008-05-31 18:22 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jan Engelhardt, Lea Wiemann

Jakub Narebski <jnareb@gmail.com> writes:

> Always add charset info from $default_text_plain_charset (if it is
> defined) to "raw" (a=blob_plain) output for 'text/plain' blobs.
> Adding charset info in a special case was removed from blob_mimetype().
>
> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
> ---

Looks Ok but it took a bit of digging on the list for me to figure out
that something like this was missing from the beginning of your commit log
message:

	Earlier "blob_plain" view sent "charset=utf-8" only when gitweb
	guessed the content type to be text by reading from it, and not
	when the MIME type was obtained from /etc/mime.types.

	This fixes the bug by always adding....

But I wonder if moving of this to the calling site is the right thing to
do.  Wouldn't it become much more contained and robust if you did it this
way?

 gitweb/gitweb.perl |   34 +++++++++++++++++++---------------
 1 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 57a1905..f5338e1 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2471,29 +2471,33 @@ sub mimetype_guess {
 sub blob_mimetype {
 	my $fd = shift;
 	my $filename = shift;
+	my $mime;
 
 	if ($filename) {
-		my $mime = mimetype_guess($filename);
-		$mime and return $mime;
-	}
-
-	# just in case
-	return $default_blob_plain_mimetype unless $fd;
-
-	if (-T $fd) {
-		return 'text/plain' .
-		       ($default_text_plain_charset ? '; charset='.$default_text_plain_charset : '');
+		$mime = mimetype_guess($filename);
+	} elsif (!defined $fd) {
+		$mime = $default_blob_plain_mimetype;
+	} elsif (-T $fd) {
+		$mime = 'text/plain';
 	} elsif (! $filename) {
-		return 'application/octet-stream';
+		$mime = 'application/octet-stream';
 	} elsif ($filename =~ m/\.png$/i) {
-		return 'image/png';
+		$mime = 'image/png';
 	} elsif ($filename =~ m/\.gif$/i) {
-		return 'image/gif';
+		$mime = 'image/gif';
 	} elsif ($filename =~ m/\.jpe?g$/i) {
-		return 'image/jpeg';
+		$mime = 'image/jpeg';
 	} else {
-		return 'application/octet-stream';
+		$mime = 'application/octet-stream';
 	}
+
+	# Type specific postprocessing can be added as needed...
+	if ($mime =~ /^text\//i &&
+	    $mime !~ /charset=/i && $default_text_plain_charset) {
+		$mime .=  '; charset='.$default_text_plain_charset;
+	}
+
+	return $mime;
 }
 
 ## ======================================================================

* Re: gitweb forgets to send utf8 header for raw blob views
  2008-05-31 15:04   ` gitweb forgets to send utf8 header for raw blob views Jan Engelhardt
@ 2008-05-31 22:39     ` Jakub Narebski
  2008-06-01  2:08       ` Jan Engelhardt
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Narebski @ 2008-05-31 22:39 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: git

On Sat, 31 May 2008, Jan Engelhardt wrote:
> On Friday 2008-05-30 10:18, Jakub Narebski wrote:
>>Jan Engelhardt <jengelh@medozas.de> writes:
>>
>>> I have configured gitweb to use utf8, and that works for text blob views 
>>> like on
>>> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob;f=bin/git-forest;hb=HEAD
>>> but it does not for raw blob views like
>>> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob_plain;f=bin/git-forest;hb=HEAD
>>
>> This can depend on configuration, both on gitweb configuration (you
>> can for example define $default_blob_plain_mimetype to 'text/plain;
>> charset=utf-8', and define $default_text_plain_charset to 'utf-8'),
>> and on your /etc/mime.types; gitweb does not add charset info if
>> mimetype is acquired from mime.types, which I guess is a mistake.
> 
> Thanks for the hint. Setting 
> 	our $default_text_plain_charset  = "utf-8";
> was all that was needed.

By the way, do you think that this should be the default for gitweb?

-- 
Jakub Narebski
Poland

* Re: gitweb forgets to send utf8 header for raw blob views
  2008-05-31 22:39     ` Jakub Narebski
@ 2008-06-01  2:08       ` Jan Engelhardt
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Engelhardt @ 2008-06-01  2:08 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git


On Sunday 2008-06-01 00:39, Jakub Narebski wrote:

>On Sat, 31 May 2008, Jan Engelhardt wrote:
>> On Friday 2008-05-30 10:18, Jakub Narebski wrote:
>>>Jan Engelhardt <jengelh@medozas.de> writes:
>>>
>>>> I have configured gitweb to use utf8, and that works for text blob views 
>>>> like on
>>>> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob;f=bin/git-forest;hb=HEAD
>>>> but it does not for raw blob views like
>>>> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob_plain;f=bin/git-forest;hb=HEAD
>>>
>>> This can depend on configuration, both on gitweb configuration (you
>>> can for example define $default_blob_plain_mimetype to 'text/plain;
>>> charset=utf-8', and define $default_text_plain_charset to 'utf-8'),
>>> and on your /etc/mime.types; gitweb does not add charset info if
>>> mimetype is acquired from mime.types, which I guess is a mistake.
>> 
>> Thanks for the hint. Setting 
>> 	our $default_text_plain_charset  = "utf-8";
>> was all that was needed.
>
>By the way, do you think that this should be the default for gitweb?

Definitely. I also made tbz2 the default for me over tgz, because that's
just how it is.

* Re: [PATCH] gitweb: Add charset info to "raw" blob output
  2008-05-31 18:22     ` Junio C Hamano
@ 2008-06-01 11:06       ` Jakub Narebski
  2008-06-01 12:15         ` Jan Engelhardt
  2008-06-03 14:47         ` [PATCH v2] gitweb: Add charset info to "raw" output of 'text/plain' blobs Jakub Narebski
  0 siblings, 2 replies; 12+ messages in thread
From: Jakub Narebski @ 2008-06-01 11:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jan Engelhardt, Lea Wiemann

On Sat, 31 May 2008, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Always add charset info from $default_text_plain_charset (if it is
>> defined) to "raw" (a=blob_plain) output for 'text/plain' blobs.
>> Adding charset info in a special case was removed from blob_mimetype().
>>
>> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
>> ---
> 
> Looks Ok but it took a bit of digging on the list for me to figure out
> that something like this was missing from the beginning of your commit log
> message:
> 
> 	Earlier "blob_plain" view sent "charset=utf-8" only when gitweb
> 	guessed the content type to be text by reading from it, and not
> 	when the MIME type was obtained from /etc/mime.types.
> 
> 	This fixes the bug by always adding....

I'm sorry that I forgot to put the "why" in the commit message.
I'll add it when resending v2 of this patch.

Thanks for the comment.

> But I wonder if moving of this to the calling site is the right thing to
> do.  Wouldn't it become much more contained and robust if you did it this
> way?
[...]
>  sub blob_mimetype {

This _might_ be better.  I didn't do this for the following two reasons:

First, from a purely theoretical point of view, the subroutine is named
blob_mimetype(), and I think that charset info has its place in
Content-Type but is not part of the MIME type itself.

Second, blob_mimetype() is used in two places: in git_blob_plain
(in "raw" blob view) to generate the correct Content-Type HTTP header, and
in git_blob to decide a) whether blame makes sense and b) whether to
redirect to the "raw" (a=blob_plain) view.  I'd rather not muck with
charset info in the second case, although I don't think that it matters
at all, at least for now.


So perhaps the best approach would be to create a thin wrapper subroutine,
blob_contenttype($fd, $file_name, $mimetype), where both $file_name and
(especially) $mimetype are optional parameters, and use it in the
git_blob_plain() subroutine...

> +	# Type specific postprocessing can be added as needed...
> +	if ($mime =~ /^text\//i &&
> +	    $mime !~ /charset=/i && $default_text_plain_charset) {
> +		$mime .=  '; charset='.$default_text_plain_charset;
> +	}
> +
> +	return $mime;

I'm not sure about it.  I worry a bit about text/html, which can, and
usually does, contain charset info inside the document.  I'm not sure
what happens when charset information from the HTTP headers contradicts
charset information from the presented file.  That's why I have limited
adding charset info purely to 'text/plain', not 'text/*' without
charset info already present.

-- 
Jakub Narebski
Poland
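
To make the difference between the two conditions concrete, here is a small standalone sketch. The subroutine names and the $charset variable are made up for this illustration and are not gitweb code; only the two conditions themselves are taken from the patches in this thread.

```perl
use strict;
use warnings;

# Assumed configured value; in gitweb this comes from the config file.
our $charset = 'utf-8';

# Review variant: any text/* type that has no charset parameter yet.
sub with_charset_any_text {
	my ($mime) = @_;
	if ($mime =~ m{^text/}i && $mime !~ /charset=/i && $charset) {
		$mime .= "; charset=$charset";
	}
	return $mime;
}

# Patch variant: exactly 'text/plain' and nothing else.
sub with_charset_plain_only {
	my ($mime) = @_;
	if ($mime eq 'text/plain' && defined $charset) {
		$mime .= "; charset=$charset";
	}
	return $mime;
}

for my $mime ('text/plain', 'text/html',
              'text/plain; charset=iso-8859-2', 'image/png') {
	printf "%-32s any-text: %-34s plain-only: %s\n",
		$mime, with_charset_any_text($mime), with_charset_plain_only($mime);
}
```

The two variants differ only for text/html and other non-plain text types: the first labels them with the configured charset, the second leaves them untouched.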

* Re: [PATCH] gitweb: Add charset info to "raw" blob output
  2008-06-01 11:06       ` Jakub Narebski
@ 2008-06-01 12:15         ` Jan Engelhardt
  2008-06-01 12:16           ` Jan Engelhardt
  2008-06-03 14:47         ` [PATCH v2] gitweb: Add charset info to "raw" output of 'text/plain' blobs Jakub Narebski
  1 sibling, 1 reply; 12+ messages in thread
From: Jan Engelhardt @ 2008-06-01 12:15 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Junio C Hamano, git, Lea Wiemann


On Sunday 2008-06-01 13:06, Jakub Narebski wrote:
>> +	# Type specific postprocessing can be added as needed...
>> +	if ($mime =~ /^text\//i &&
>> +	    $mime !~ /charset=/i && $default_text_plain_charset) {
>> +		$mime .=  '; charset='.$default_text_plain_charset;
>> +	}
>> +
>> +	return $mime;
>
>I'm not sure about it.  I worry a bit about text/html, which can, and
>usually does, contain charset info inside the document.  I'm not sure
>what happens when charset information from the HTTP headers contradicts
>charset information from the presented file.

The HTTP header takes -- as stupid as it looks -- precedence
over the HTML header. As such, a charset in the HTTP Response Header
should ONLY be sent if the file is guaranteed to be text/plain only.

>That's why I have limited
>adding charset info purely to 'text/plain', not 'text/*' without
>charset info already present.

* Re: [PATCH] gitweb: Add charset info to "raw" blob output
  2008-06-01 12:15         ` Jan Engelhardt
@ 2008-06-01 12:16           ` Jan Engelhardt
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Engelhardt @ 2008-06-01 12:16 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Junio C Hamano, git, Lea Wiemann


On Sunday 2008-06-01 14:15, Jan Engelhardt wrote:
>On Sunday 2008-06-01 13:06, Jakub Narebski wrote:
>>> +	# Type specific postprocessing can be added as needed...
>>> +	if ($mime =~ /^text\//i &&
>>> +	    $mime !~ /charset=/i && $default_text_plain_charset) {
>>> +		$mime .=  '; charset='.$default_text_plain_charset;
>>> +	}
>>> +
>>> +	return $mime;
>>
>>I'm not sure about it.  I worry a bit about text/html, which can, and
>>usually does, contain charset info inside the document.  I'm not sure
>>what happens when charset information from the HTTP headers contradicts
>>charset information from the presented file.
>
>The HTTP header takes -- as stupid as it looks -- precedence
>over the HTML header. As such, a charset in the HTTP Response Header
>should ONLY be sent if the file is guaranteed to be text/plain only.

Minor correction; s/file/output/. gitweb normally produces HTML for
all its normal views, so no Charset header here; but when it
outputs "raw", it should provide one.

* [PATCH v2] gitweb: Add charset info to "raw" output of 'text/plain' blobs
  2008-06-01 11:06       ` Jakub Narebski
  2008-06-01 12:15         ` Jan Engelhardt
@ 2008-06-03 14:47         ` Jakub Narebski
  1 sibling, 0 replies; 12+ messages in thread
From: Jakub Narebski @ 2008-06-03 14:47 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jan Engelhardt, Lea Wiemann

Earlier "blob_plain" view sent "charset=utf-8" only when gitweb
guessed the content type to be text by reading from it, and not when
the MIME type was obtained from /etc/mime.types, or when gitweb
couldn't guess mimetype and used $default_blob_plain_mimetype.

This fixes the bug by always adding charset info from
$default_text_plain_charset (if it is defined) to "raw" (a=blob_plain)
output for 'text/plain' blobs.

Generating the value of the Content-Type: header is now separated into
the blob_contenttype() subroutine; the special case that added charset
info was removed from blob_mimetype(), which should now return the
mimetype only.

While at it, clean up the code a bit: put subroutine parameter
initialization first, make the error message more robust (it no longer
depends on $file_name being defined), if more cryptic, and remove
unnecessary '"' around variables ("$var" -> $var).

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---

On Sun, 1 June 2008, Jakub Narebski wrote:
> On Sat, 31 May 2008, Junio C Hamano wrote:
>> Jakub Narebski <jnareb@gmail.com> writes:
>> 
>>> Always add charset info from $default_text_plain_charset (if it is
>>> defined) to "raw" (a=blob_plain) output for 'text/plain' blobs.
>>> Adding charset info in a special case was removed from blob_mimetype().
>> 
>> Looks Ok but it took a bit of digging on the list for me to figure out
>> that something like this was missing from the beginning of your commit log
>> message:
>> 
>> 	Earlier "blob_plain" view sent "charset=utf-8" only when gitweb
>> 	guessed the content type to be text by reading from it, and not
>> 	when the MIME type was obtained from /etc/mime.types.
>> 
>> 	This fixes the bug by always adding....
> 
> I'm sorry that I forgot to put the "why" in the commit message.
> I'll add it when resending v2 of this patch.

Added.

>> But I wonder if moving of this to the calling site is the right thing to
>> do.  Wouldn't it become much more contained and robust if you did it this
>> way?
> [...]
>>  sub blob_mimetype {
> 
> This _might_ be better.  I didn't do this for the following two reasons:
[...]
> So perhaps the best approach would be to create a thin wrapper subroutine,
> blob_contenttype($fd, $file_name, $mimetype), where both $file_name and
> (especially) $mimetype are optional parameters, and use it in the
> git_blob_plain() subroutine...

It is now done this way.  IMHO it is the best solution.

>> +	# Type specific postprocessing can be added as needed...
>> +	if ($mime =~ /^text\//i &&
>> +	    $mime !~ /charset=/i && $default_text_plain_charset) {
>> +		$mime .=  '; charset='.$default_text_plain_charset;
>> +	}
>> +
>> +	return $mime;
> 
> I'm not sure about it.  I worry a bit about text/html, which can, and
> usually does, contain charset info inside the document.  I'm not sure
> what happens when charset information from the HTTP headers contradicts
> charset information from the presented file.  That's why I have limited
> adding charset info purely to 'text/plain', not 'text/*' without
> charset info already present.

Currently for the above reason gitweb adds charset info _only_
for 'text/plain' mimetype.

 gitweb/gitweb.perl |   29 ++++++++++++++++++++---------
 1 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 57a1905..c6d43bf 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2481,8 +2481,7 @@ sub blob_mimetype {
 	return $default_blob_plain_mimetype unless $fd;
 
 	if (-T $fd) {
-		return 'text/plain' .
-		       ($default_text_plain_charset ? '; charset='.$default_text_plain_charset : '');
+		return 'text/plain';
 	} elsif (! $filename) {
 		return 'application/octet-stream';
 	} elsif ($filename =~ m/\.png$/i) {
@@ -2496,6 +2495,17 @@ sub blob_mimetype {
 	}
 }
 
+sub blob_contenttype {
+	my ($fd, $file_name, $type) = @_;
+
+	$type ||= blob_mimetype($fd, $file_name);
+	if ($type eq 'text/plain' && defined $default_text_plain_charset) {
+		$type .= "; charset=$default_text_plain_charset";
+	}
+
+	return $type;
+}
+
 ## ======================================================================
 ## functions printing HTML: header, footer, error page
 
@@ -4377,6 +4387,7 @@ sub git_heads {
 }
 
 sub git_blob_plain {
+	my $type = shift;
 	my $expires;
 
 	if (!defined $hash) {
@@ -4392,13 +4403,13 @@ sub git_blob_plain {
 		$expires = "+1d";
 	}
 
-	my $type = shift;
 	open my $fd, "-|", git_cmd(), "cat-file", "blob", $hash
-		or die_error(undef, "Couldn't cat $file_name, $hash");
+		or die_error(undef, "Open git-cat-file blob '$hash' failed");
 
-	$type ||= blob_mimetype($fd, $file_name);
+	# content-type (can include charset)
+	$type = blob_contenttype($fd, $file_name, $type);
 
-	# save as filename, even when no $file_name is given
+	# "save as" filename, even when no $file_name is given
 	my $save_as = "$hash";
 	if (defined $file_name) {
 		$save_as = $file_name;
@@ -4407,9 +4418,9 @@ sub git_blob_plain {
 	}
 
 	print $cgi->header(
-		-type => "$type",
-		-expires=>$expires,
-		-content_disposition => 'inline; filename="' . "$save_as" . '"');
+		-type => $type,
+		-expires => $expires,
+		-content_disposition => 'inline; filename="' . $save_as . '"');
 	undef $/;
 	binmode STDOUT, ':raw';
 	print <$fd>;
-- 
1.5.5.3
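
The wrapper added by the patch above can be tried outside gitweb with a stubbed-out type lookup. In this sketch, blob_mimetype_stub() is a made-up stand-in for blob_mimetype() (a tiny suffix table instead of the mime.types lookup and the -T test on the file handle), and the two configuration values are assumed; only the body of the wrapper follows the patch.

```perl
use strict;
use warnings;

# Assumed configuration values for this sketch.
our $default_text_plain_charset  = 'utf-8';
our $default_blob_plain_mimetype = 'text/plain';

# Stand-in for gitweb's blob_mimetype(); hypothetical, for illustration only.
sub blob_mimetype_stub {
	my ($file_name) = @_;
	return $default_blob_plain_mimetype unless defined $file_name;
	return 'image/png' if $file_name =~ m/\.png$/i;
	return 'text/html' if $file_name =~ m/\.html?$/i;
	return 'text/plain';
}

# Same shape as blob_contenttype() in the patch, minus the file handle.
sub blob_contenttype_sketch {
	my ($file_name, $type) = @_;

	# An explicitly passed type wins; otherwise fall back to the lookup.
	$type ||= blob_mimetype_stub($file_name);
	if ($type eq 'text/plain' && defined $default_text_plain_charset) {
		$type .= "; charset=$default_text_plain_charset";
	}
	return $type;
}

print blob_contenttype_sketch('bin/git-forest'), "\n";
print blob_contenttype_sketch('logo.png'), "\n";
print blob_contenttype_sketch('index.html'), "\n";
print blob_contenttype_sketch(undef, 'text/plain'), "\n";
```

Because the comparison is an exact string match, a type that already carries a charset parameter (for example a $default_blob_plain_mimetype of 'text/plain; charset=utf-8') is passed through unchanged and does not get a second charset appended.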
