Symfony version(s) affected: 5.0.2
Description
We have used behat/transliterator until now, but because of issues with PHP 7.4 we are looking into replacing it. I switched our codebase to symfony/string and ran into some differences in the results of transliteration to ASCII.
| Input | behat/transliterator | symfony/string |
|---|---|---|
| 汉语 | yi-yu | han-yu |
| ភាសាខ្មែរ | bhaasaakhmaer | |
| ภาษาไทย | phaasaaaithy | phas-a-thiy |
| العَرَبِية | l-arabiy | al-arabit |
| 한국어 | hangugeo | hangug-eo |
| မြန်မာဘာသာ | m-n-maabhaasaa | |
| हिन्दी | hindii | hindi |
Notes:
- The result of `symfony/string` is backed by http://dzcpy.github.io/transliteration for 汉语 (han-yu)
- The result of `behat/transliterator` is backed by http://dzcpy.github.io/transliteration for ភាសាខ្មែរ (bhaasaakhmaer), 한국어 (hangugeo) and हिन्दी (hindii)
- For العَرَبِية and မြန်မာဘာသာ, http://dzcpy.github.io/transliteration disagrees with both libraries
- If an empty string was returned from the transliteration, that may be due to missing support for a language and/or wrong detection.
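A side-by-side comparison like the table above could be regenerated with a small script. The following is only a sketch: `compareTransliterations()` and `onlyDifferences()` are hypothetical helper names, and the wiring at the bottom assumes both packages were installed with Composer next to the script. It calls `Behat\Transliterator\Transliterator::transliterate()` and `AsciiSlugger::slug()` with their default separators.

```php
<?php
declare(strict_types=1);

use Behat\Transliterator\Transliterator;
use Symfony\Component\String\Slugger\AsciiSlugger;

/**
 * Runs every input through every named transliterator.
 *
 * @param string[] $inputs
 * @param array<string, callable> $transliterators name => fn(string): string
 * @return array<string, array<string, string>> input => [name => result]
 */
function compareTransliterations(array $inputs, array $transliterators): array
{
    $rows = [];
    foreach ($inputs as $input) {
        foreach ($transliterators as $name => $fn) {
            $rows[$input][$name] = $fn($input);
        }
    }

    return $rows;
}

/** Keeps only the inputs for which the transliterators disagree. */
function onlyDifferences(array $rows): array
{
    return array_filter(
        $rows,
        static fn(array $results): bool => count(array_unique($results)) > 1
    );
}

// Hypothetical wiring: only runs if a Composer autoloader sits next to this file.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require $autoload;
    $slugger = new AsciiSlugger();
    $rows = compareTransliterations(
        ['汉语', 'ภาษาไทย', '한국어', 'हिन्दी'],
        [
            'behat/transliterator' => static fn(string $s): string => Transliterator::transliterate($s),
            'symfony/string' => static fn(string $s): string => $slugger->slug($s)->lower()->toString(),
        ]
    );
    foreach (onlyDifferences($rows) as $input => $results) {
        echo $input, ': ', json_encode($results, JSON_UNESCAPED_UNICODE), PHP_EOL;
    }
}
```

Taking the transliterators as plain callables keeps the comparison independent of either library, so a third implementation can be added as one more array entry.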
How to reproduce
use Symfony\Component\String\Slugger\AsciiSlugger;

class TransliterationTest extends UnitTestCase
{
    // Each pair is [input, expected slug]; the expected values for the
    // inputs listed in the table match the behat/transliterator column.
    public function sourcesAndResults(): array
    {
        return [
            ['Überlandstraßen; adé', 'uberlandstrassen-ade'],
            ['TEST DRIVE: INFINITI Q50S 3.7', 'test-drive-infiniti-q50s-3-7'],
            ['汉语', 'yi-yu'],
            ['日本語', 'ri-ben-yu'],
            ['Việt', 'viet'],
            ['ភាសាខ្មែរ', 'bhaasaakhmaer'],
            ['ภาษาไทย', 'phaasaaaithy'],
            ['العَرَبِية', 'l-arabiy'],
            ['עברית', 'bryt'],
            ['한국어', 'hangugeo'],
            ['ελληνικά', 'ellenika'],
            ['မြန်မာဘာသာ', 'm-n-maabhaasaa'],
            [' हिन्दी', 'hindii'],
            ['Иван Иванович', 'ivan-ivanovic'],
            ['Löic & René', 'loic-rene'],
            ['Châteauneuf du Pape', 'chateauneuf-du-pape'],
            ['Žluťoučký kůň', 'zlutoucky-kun'],
            [' x- ', 'x'],
        ];
    }

    /**
     * @test
     * @dataProvider sourcesAndResults
     */
    public function transliterationWorksAsExpected($source, $result): void
    {
        // Keep the actual value in its own variable; overwriting $result
        // would make the assertion compare a value with itself.
        $actual = strtolower((new AsciiSlugger())->slug($source)->toString());
        self::assertEquals($result, $actual);
    }
}