String substitution on UTF-8 encoded strings works fine when the regexp contains only ASCII characters, but produces garbled output when the regexp contains non-ASCII characters.
my $str = "¿más?";
$str =~ s/[?]//g;
print "$str\n";
==> ¿más
$str =~ s/[¿]//g;
print "$str\n";
==> m�s
UPDATE: The answers above made it clear that my original question was framed poorly. The answers focused on STDOUT, but in my actual problem I am not printing to STDOUT (I only did that to simplify the problem statement). In the actual problem, I retrieve data from an SQLite store and use the data as filenames to search the file system. When I apply cleanup routines to the retrieved data, certain filenames get garbled.
One way to see this might be to simplify the example further:
my $str = "más";
$str =~ s/[?]//g;
print "$str\n";
==> más
$str =~ s/[¿]//g;
print "$str\n";
==> m�s
Now you can see that @ikegami's explanation does not apply. Something about the second s/// creates the problem. To be fair, both answers solved the problem as stated -- but any additional insights would be greatly appreciated!
UPDATE 2: As requested, I have added output from sprintf's vector flag. Note: I have also changed the target substitution character from ¿ to ¡ -- I now think that my code above (as @ikegami suggested) must have been copied incorrectly.
my $str = "más";
printf "%v02X\n", $str;
==> 6D.C3.A1.73
$str =~ s/[!]//g;
printf "%v02X\n", $str;
==> 6D.C3.A1.73
print "$str\n";
==> más
$str =~ s/[¡]//g;
printf "%v02X\n", $str;
==> 6D.C3.73
print "$str\n";
==> m�s
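The byte dumps above are consistent with the pattern being matched as bytes rather than characters: ¡ is U+00A1, which is C2 A1 in UTF-8, so without "use utf8" the class [¡] would behave like [\xC2\xA1] and strip the A1 byte out of á (C3 A1). A minimal sketch of that reading, assuming the data arrives as UTF-8 bytes and that the core Encode module is used to decode and re-encode it (the variable names are only illustrative):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes of "más" as UTF-8, written with escapes so the example does not
# depend on how this file itself is saved.
my $bytes = "m\xC3\xA1s";

# Assumption: without "use utf8", a literal [¡] in the source is the two
# bytes C2 A1, so the class matches either byte on its own.
(my $broken = $bytes) =~ s/[\xC2\xA1]//g;
printf "%v02X\n", $broken;   # 6D.C3.73 (the A1 half of á is gone)

# Decode to characters, match on characters, encode again at the boundary.
my $chars = decode('UTF-8', $bytes);
$chars =~ s/[\x{A1}]//g;     # U+00A1 is ¡; "más" contains none
my $clean = encode('UTF-8', $chars);
printf "%v02X\n", $clean;    # 6D.C3.A1.73 (unchanged, as expected)
```

If the script itself is saved as UTF-8, adding "use utf8;" lets a literal ¡ in the pattern be read as one character, but the data still has to be decoded first; otherwise the undecoded byte A1 in the string would still match.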