Commit 84d9e2d
gitweb: Allow UTF-8 encoded CGI query parameters and path_info
Gitweb forgot to decode query parameters as UTF-8.  As a result, one
cannot search for a string containing characters outside US-ASCII.  For
example, searching for "Michał Kiedrowicz" (containing the letter 'ł',
LATIN SMALL LETTER L WITH STROKE, Unicode codepoint U+0142, represented
by the bytes 0xc5 0x82 in UTF-8 and percent-encoded as %C5%82) results
in the following incorrect data in the search field:
MichaÅ\202 Kiedrowicz
This is caused by CGI treating the '0xc5 0x82' bytes by default as two
characters in Perl's legacy encoding latin-1 (iso-8859-1), because the
's' query parameter is not explicitly processed as a UTF-8 encoded string.
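The difference between the two interpretations can be reproduced outside gitweb. The following is a minimal sketch using only the core Encode module; the variable names are illustrative and do not come from gitweb.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw bytes as they arrive after percent-decoding "Micha%C5%82".
my $bytes = "Micha\xC5\x82";

# Interpreted as latin-1, each byte becomes its own character.
my $as_latin1 = decode('iso-8859-1', $bytes);

# Interpreted as UTF-8, the two bytes form one character, U+0142.
my $as_utf8 = decode('UTF-8', $bytes);

printf "latin-1: %d characters, last two are U+%04X U+%04X\n",
    length($as_latin1),
    ord(substr($as_latin1, -2, 1)),
    ord(substr($as_latin1, -1));
printf "UTF-8:   %d characters, last one is U+%04X\n",
    length($as_utf8),
    ord(substr($as_utf8, -1));
```

Under latin-1 the string has 7 characters ending in U+00C5 (Å) and U+0082 (a control character, shown as \202 in octal), which matches the garbled search field above. Under UTF-8 it has 6 characters ending in U+0142.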
The solution used here follows the "Using Unicode in a Perl CGI script"
article at http://www.lemoda.net/cgi/perl-unicode/index.html:
    use CGI;
    use Encode 'decode_utf8';
    my $cgi = CGI->new;
    my $value = $cgi->param('input');
    $value = decode_utf8($value);
Decoding UTF-8 is done when filling the %input_params hash and the
$path_info variable; the former requires moving from explicit
$cgi->param(<label>) to $input_params{<name>} in a few places, which is
a good idea anyway.
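The idea of decoding once, at the point where parameters are collected, can be sketched as follows. This is not the gitweb code: the %raw hash is a hypothetical stand-in for the byte strings a CGI object would return, and %symbol_for is an illustrative name-to-symbol mapping.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical stand-in for the raw (undecoded) values from the query
# string, keyed by the short query parameter symbol.
my %raw = (
    s  => "Micha\xC5\x82 Kiedrowicz",
    st => "author",
);

# Illustrative mapping from long parameter names to query symbols.
my %symbol_for = (searchtext => 's', searchtype => 'st');

# Decode each value exactly once while filling the hash, so the rest
# of the program only ever sees character strings.
my %input_params;
for my $name (keys %symbol_for) {
    my $value = $raw{ $symbol_for{$name} };
    next unless defined $value;
    $input_params{$name} = decode_utf8($value);
}

printf "searchtext has %d characters\n", length $input_params{searchtext};
```

Reading from the decoded hash everywhere, instead of asking the CGI object again, keeps any later code path from picking up the undecoded bytes.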
Also add the -override=>1 parameter to the $cgi->textfield() invocation in
the search form.  Otherwise CGI would use the value from the query string
if it is present, filling the field from $cgi->param... without
decode_utf8().  As we are passing the value of the appropriate parameter
anyway, -override=>1 doesn't change which value is shown, but makes gitweb
fill the search field correctly.
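The effect of -override can be seen in isolation. The sketch below assumes CGI.pm is installed (it is no longer bundled with recent perl releases) and builds the CGI object from a literal query string, so no web server is needed; the variable names are illustrative.

```perl
use strict;
use warnings;
use CGI;
use Encode qw(decode_utf8);

# Build a CGI object from a literal query string.
my $cgi = CGI->new('s=Micha%C5%82');

# The decoded value the application wants to show in the form.
my $searchtext = decode_utf8(scalar $cgi->param('s'));

# Without -override, CGI prefers its own undecoded copy of 's'
# over the supplied -value.
my $sticky = $cgi->textfield(-name => 's', -value => $searchtext);

# With -override => 1, the supplied (decoded) -value is used.
my $forced = $cgi->textfield(
    -name     => 's',
    -value    => $searchtext,
    -override => 1,
);

print "sticky field contains U+0142: ",
    (index($sticky, "\x{142}") >= 0 ? "yes" : "no"), "\n";
print "forced field contains U+0142: ",
    (index($forced, "\x{142}") >= 0 ? "yes" : "no"), "\n";
```

Only the field generated with -override => 1 carries the decoded character; the other one still holds the two separate latin-1 characters.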
We could simply use the '-utf8' pragma (via "use CGI '-utf8';") to solve
this, but according to CGI.pm documentation, it may cause problems with
POST requests containing binary files, and it requires CGI 3.31 (I think),
released with perl v5.8.9.
Reported-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Jakub Narębski <jnareb@gmail.com>
Tested-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 parent 828ea97, commit 84d9e2d
1 file changed, 8 insertions(+), 8 deletions(-)
Changed lines (each replaced in place): 55, 819, 821, 2768, 3874, 5283, 5995, 6198