user/dev discussion of public-inbox itself
From: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
To: Eric Wong <e@80x24.org>
Cc: meta@public-inbox.org
Subject: Re: Bug related to (maybe?) / in Message-Id
Date: Fri, 17 Feb 2023 09:52:55 +0100
Message-ID: <20230217085255.xcsaoozloz2yuxil@pengutronix.de>
In-Reply-To: <20230216213628.M187845@dcvr>

Hello Eric,

First of all, thanks for your quick answer.

On Thu, Feb 16, 2023 at 09:36:28PM +0000, Eric Wong wrote:
> Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote:
> > Hello,
> > 
> > The mail by Alexander Dahl that is (currently) the first hit on
> > https://lore.ptxdist.org/ptxdist/?q=ptxd_make_world_compile_commands_filter
> > results in a 404 when I follow the link.
> > 
> > The original mail has
> > 
> > 	Message-ID: <Y+07h0l/zJJAgs9s@falbala.internal.home.lespocky.de>
> > 
> > and the corresponding link is:
> > 
> > 	https://lore.ptxdist.org/ptxdist/Y+07h0l%2FzJJAgs9s@falbala.internal.home.lespocky.de/
> > 
> > I noticed this on public-inbox 1.8.0-1~bpo11+1 from Debian, upgrading to
> > 1.9.0-1~bpo11+1 didn't help.
> > 
> > Other mails with / in Message-Id are not accessible either, I tested
> > with:
> > 
> > 	YyHu/412LT8uQTy1@lenoch
> > 	Y0/5xdFZO3u0952+@lenoch
> 
> The TODO file has this:
> 
> 	* use REQUEST_URI properly for CGI / mod_perl2 compatibility
> 	  with Message-IDs which include '%' (done?)
> 
> So I guess it's not done...  To deal with '/' in the Message-ID,
> $env->{REQUEST_URI} really needs to be the raw, undecoded URI
> specified in the PSGI specs[1].
> 
> I'm not sure how to go about it with Apache+CGI or mod_perl2..
> 
> Fwiw, the recommended configuration is:
> (nginx|haproxy) -> varnish -> public-inbox-{httpd,netd}
> 
> Maybe Apache2 mpm_event reverse proxy can work in lieu of
> (nginx|haproxy), but /T/, /t/, /t.mbox.gz requests are a bit
> faster on -httpd/-netd since 1.6+ on SMP machines.
> 
> > I also wonder why these mails yield the webserver's 404 page and not the
> > one provided by the public-inbox cgi?!
> 
> This may be due to the small size of public-inbox's 404 page.  I don't
> know Apache configs well, but I know nginx did something
> similar.

> > Is this a problem in public-inbox, or is the apache configuration
> > somehow borked? Any hints welcome.
> 
> Do you have access to that server and can show us the configs?
> REQUEST_URI really needs to be raw in accordance to PSGI specs.
> 
> This can dump the request $env to stderr and show us
> REQUEST_URI, PATH_INFO, SCRIPT_NAME, and anything else
> which may enlighten us:
> 
> diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
> index 9ffcb879..f67fe8e6 100644
> --- a/lib/PublicInbox/WWW.pm
> +++ b/lib/PublicInbox/WWW.pm
> @@ -52,7 +52,8 @@ sub call {
>  		# none of the keys we care about will need escaping
>  		($k // '', uri_unescape($v // ''))
>  	} split(/[&;]+/, $env->{QUERY_STRING});
> -
> +	use Data::Dumper; $Data::Dumper::Useqq = 1;
> +	warn Dumper($env);
>  	my $path_info = path_info_raw($env);
>  	my $method = $env->{REQUEST_METHOD};

I applied that patch, but nothing was logged for the reported request,
which I assume means that public-inbox isn't called at all.
 
Playing around with slashes got my admin and me on the right trail:
https://httpd.apache.org/docs/current/mod/core.html#allowencodedslashes

We set that to "On" and now it (mostly) works. Maybe it's worth adding
this hint to the documentation even though Apache isn't the most
recommended setup? Maybe other servers have a similar security setting?

I wrote "mostly" because

	https://lore.ptxdist.org/ptxdist/Y+07h0l%2FzJJAgs9s@falbala.internal.home.lespocky.de/
	https://lore.ptxdist.org/ptxdist/Y+07h0l%2FzJJAgs9s@falbala.internal.home.lespocky.de
	https://lore.ptxdist.org/ptxdist/Y+07h0l/zJJAgs9s@falbala.internal.home.lespocky.de/

work as expected;

	https://lore.ptxdist.org/ptxdist/Y+07h0l/zJJAgs9s@falbala.internal.home.lespocky.de

however, does not: it yields a short "Not Found".

With the patch applied, the logged environment for these URLs is mostly
identical. REMOTE_PORT differs, which is expected. Otherwise only
PATH_INFO, PATH_TRANSLATED and REQUEST_URI differ. In the same order as
the URLs above, they are:

	"PATH_INFO" => "/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de/",
	"PATH_TRANSLATED" => "/usr/lib/cgi-bin/public-inbox.cgi/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de/",
	"REQUEST_URI" => "/ptxdist/Y+07h0l%2FzJJAgs9s\@falbala.internal.home.lespocky.de/",

	"PATH_INFO" => "/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de",
	"PATH_TRANSLATED" => "/usr/lib/cgi-bin/public-inbox.cgi/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de",
	"REQUEST_URI" => "/ptxdist/Y+07h0l%2FzJJAgs9s\@falbala.internal.home.lespocky.de",

	"PATH_INFO" => "/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de/",
	"PATH_TRANSLATED" => "/usr/lib/cgi-bin/public-inbox.cgi/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de/",
	"REQUEST_URI" => "/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de/",

	"PATH_INFO" => "/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de",
	"PATH_TRANSLATED" => "/usr/lib/cgi-bin/public-inbox.cgi/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de",
	"REQUEST_URI" => "/ptxdist/Y+07h0l/zJJAgs9s\@falbala.internal.home.lespocky.de",

which I think is all as expected. In all cases we have

	"SCRIPT_NAME" => "",

I am not sure whether making the last URL work is easily possible (or
worth the effort). Wouldn't the result always be ambiguous if a
Message-Id ends in "/T" or similar?
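
To illustrate the ambiguity, here is a minimal Python sketch. It is not
public-inbox code, and the helper name split_raw_path is made up for
this example. It assumes the application sees the raw, undecoded
REQUEST_URI and splits it on "/" before percent-decoding each segment:

```python
from urllib.parse import unquote


def split_raw_path(request_uri):
    """Split an undecoded request path on '/' first, then decode each segment.

    Hypothetical helper for illustration only.
    """
    path = request_uri.split("?", 1)[0]  # drop any query string
    return [unquote(seg) for seg in path.split("/") if seg]


encoded = "/ptxdist/Y+07h0l%2FzJJAgs9s@falbala.internal.home.lespocky.de/"
plain = "/ptxdist/Y+07h0l/zJJAgs9s@falbala.internal.home.lespocky.de/"

# With %2F the Message-Id survives as a single segment.
print(split_raw_path(encoded))
# With a literal '/' the Message-Id is split into two segments.
print(split_raw_path(plain))
# Once fully decoded (as in PATH_INFO), both requests look the same.
print(unquote(encoded) == unquote(plain))
```

With only the decoded path available, nothing distinguishes a "/" that
belongs to the Message-Id from a path separator, so a trailing "/T"
could be either.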

One thing I just noticed is:

$ curl https://lore.ptxdist.org/ptxdist/Y+07h0l/zJJAgs9s@falbala.internal.home.lespocky.de/T
Redirecting to https://lore.ptxdist.org/ptxdist/Y+07h0l/zJJAgs9s@falbala.internal.home.lespocky.de/T

which makes Firefox say: "The page isn’t redirecting properly". It works
fine with the / replaced by %2F:

$ curl https://lore.ptxdist.org/ptxdist/Y+07h0l%2fzJJAgs9s@falbala.internal.home.lespocky.de/T
Redirecting to https://lore.ptxdist.org/ptxdist/Y+07h0l%2fzJJAgs9s@falbala.internal.home.lespocky.de/T/#u
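
The difference between the two responses can be checked mechanically.
The following is only a sketch in Python; the function name
is_self_redirect is an assumption for this example and it works on the
URLs as strings, without making any request:

```python
from urllib.parse import urldefrag


def is_self_redirect(request_url, location):
    """Return True if Location points back at the requested URL.

    Fragments are ignored because they are never sent to the server.
    Hypothetical helper for illustration only.
    """
    return urldefrag(location).url == urldefrag(request_url).url


base = "https://lore.ptxdist.org/ptxdist/"
unencoded = base + "Y+07h0l/zJJAgs9s@falbala.internal.home.lespocky.de/T"
encoded = base + "Y+07h0l%2fzJJAgs9s@falbala.internal.home.lespocky.de/T"

# First case above: the redirect target is the request itself (a loop).
print(is_self_redirect(unencoded, unencoded))
# Second case above: the redirect goes on to ".../T/#u".
print(is_self_redirect(encoded, encoded + "/#u"))
```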

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |


Thread overview: 11+ messages
2023-02-16 21:05 Bug related to (maybe?) / in Message-Id Uwe Kleine-König
2023-02-16 21:36 ` Eric Wong
2023-02-17  8:52   ` Uwe Kleine-König [this message]
2023-02-17 10:28     ` Eric Wong
2023-02-17 10:32       ` [PATCH] TODO: handle more cases of unencoded slashes Eric Wong
2023-02-17 11:08       ` [PATCH] public-inbox.cgi(1): Mention AllowEncodedSlashes for Apache setups Uwe Kleine-König
2023-02-17 13:15         ` Eric Wong
2023-02-18 17:58       ` Bug related to (maybe?) / in Message-Id Thomas Weißschuh
2023-02-18 18:06         ` Thomas Weißschuh
2023-03-07  8:31           ` Eric Wong
2023-03-07 22:30             ` Eric Wong
