Merged
4 changes: 2 additions & 2 deletions Doc/tools/static/switchers.js
@@ -110,7 +110,7 @@
// Returns the path segment of the language as a string, like 'fr/'
// or '' if not found.
function language_segment_from_url(url) {
- var language_regexp = '\.org/(' + Object.keys(all_languages).join('|') + '/)';
+ var language_regexp = '\.org/([a-z]{2}(?:-[a-z]{2})?/)';
Member

I would prefer to keep using all_languages and just add (?: xxx ) to the regex.

Member Author

At first I wanted to keep it too, but I thought that:

var language_regexp = '\.org/((?:' + Object.keys(all_languages).join('|') + '|[a-z]{2}(?:-[a-z]{2})?)/)';

was pretty unreadable and unmaintainable just to match a language tag in a URL.

Sure, we could improve it by building a list of "language tag regexes" from all_languages, appending [a-z]{2}(?:-[a-z]{2})? to the end of it, then concatenating them all with join('|'). Still a bit long, but more maintainable.
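For illustration, the list-based variant described above could look roughly like this. It is only a sketch: the contents of `all_languages` here are a stand-in, `generic_language_tag` and `build_language_regexp` are hypothetical names, and the dot is escaped with a double backslash the way `version_regexp` does in the same file.

```javascript
// Stand-in for the switcher's all_languages table (assumed contents).
var all_languages = { 'en': 'English', 'fr': 'French', 'ja': 'Japanese' };

// Generic fallback so that languages not yet in the switcher still match.
var generic_language_tag = '[a-z]{2}(?:-[a-z]{2})?';

// Hypothetical helper: known language tags first, generic pattern last.
function build_language_regexp(languages) {
  var alternatives = Object.keys(languages).concat([generic_language_tag]);
  return '\\.org/((?:' + alternatives.join('|') + ')/)';
}

// Returns the path segment of the language as a string, like 'fr/'
// or '' if not found.
function language_segment_from_url(url) {
  var match = url.match(build_language_regexp(all_languages));
  if (match !== null)
    return match[1];
  return '';
}
```

With this shape, a tag such as `pt-br/` that is missing from `all_languages` is still matched through the generic alternative, which also shows that the explicit list adds nothing to what the pattern accepts.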

In every case we have to keep the wildcard part [a-z]{2}(?:-[a-z]{2})? so that still-unknown languages keep matching (ones that are built but not yet in the switcher, which is a supported case, see: https://www.python.org/dev/peps/pep-0545/#add-translation-to-the-language-switcher).

Or... we could add an exhaustive list of languages, used not to build the switcher but to build the regex, containing every language tag we expect in the future, so that when a translation is built, the switcher already knows it.

In any case, I find ([a-z]{2}(?:-[a-z]{2})?/) easier to read and maintain, even if there is a small chance that it collides with a version in the future, for example if we introduce a version named like dev but with only two letters (or four letters separated by a dash). But I don't see this happening.
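To make the collision risk concrete, here is a small sketch of what the generic pattern accepts and rejects. The `xx/` segment is a made-up two-letter name used only to show the ambiguity, and the dot is escaped with a double backslash as in `version_regexp`.

```javascript
var language_regexp = '\\.org/([a-z]{2}(?:-[a-z]{2})?/)';

// Returns the path segment of the language as a string, like 'fr/'
// or '' if not found.
function language_segment_from_url(url) {
  var match = url.match(language_regexp);
  if (match !== null)
    return match[1];
  return '';
}

// Language tags, known to the switcher or not, are matched:
language_segment_from_url('https://docs.python.org/fr/3.6/');     // 'fr/'
language_segment_from_url('https://docs.python.org/pt-br/3.7/');  // 'pt-br/'

// Existing version segments are not mistaken for a language:
language_segment_from_url('https://docs.python.org/dev/');        // ''
language_segment_from_url('https://docs.python.org/3.6/');        // ''

// A hypothetical two-letter version name would be read as a language:
language_segment_from_url('https://docs.python.org/xx/');         // 'xx/'
```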

var match = url.match(language_regexp);
if (match !== null)
return match[1];
@@ -120,7 +120,7 @@
// Returns the path segment of the version as a string, like '3.6/'
// or '' if not found.
function version_segment_in_url(url) {
- var language_segment = '(?:(?:' + Object.keys(all_languages).join('|') + ')/)';
+ var language_segment = '(?:[a-z]{2}(?:-[a-z]{2})?/)';
var version_segment = '(?:(?:' + version_regexs.join('|') + ')/)';
var version_regexp = '\\.org/' + language_segment + '?(' + version_segment + ')';
var match = url.match(version_regexp);
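Since the hunk is cut off before the end of the function, here is a self-contained sketch of how the new `language_segment` composes with the version pattern. The `version_regexs` values are assumptions made for this example (the real list is not part of this diff), and the final `return ''` follows the "or '' if not found" comment.

```javascript
// Assumed version patterns, for illustration only.
var version_regexs = ['\\d', '\\d\\.\\d+', 'dev'];

// Returns the path segment of the version as a string, like '3.6/'
// or '' if not found.
function version_segment_in_url(url) {
  // Optional language prefix, matched generically rather than from all_languages.
  var language_segment = '(?:[a-z]{2}(?:-[a-z]{2})?/)';
  var version_segment = '(?:(?:' + version_regexs.join('|') + ')/)';
  var version_regexp = '\\.org/' + language_segment + '?(' + version_segment + ')';
  var match = url.match(version_regexp);
  if (match !== null)
    return match[1];
  return '';
}

version_segment_in_url('https://docs.python.org/fr/3.6/library/');  // '3.6/'
version_segment_in_url('https://docs.python.org/dev/');             // 'dev/'
```

Because the language prefix is optional and `dev/` does not fit the two-letter pattern, URLs without a language segment still resolve to their version.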