The segmenter currently used is ad hoc and not very sophisticated. Investigate what segmenters are available and what would be needed to use them in Wikispeech. The current method should remain as a fallback.
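A minimal sketch of how "keep the current method as a fallback" could look, regardless of which alternative is chosen. Everything here is hypothetical: `naive_segment` is only a stand-in for the current ad hoc method (not the actual Wikispeech code), and `segment_with_fallback` and the `(language, text)` callable shape are assumptions made for illustration.

```python
import re
from typing import Callable, Iterable, List, Optional

# Assumed shape of any segmenter: takes a language code and text,
# yields sentence strings. This is an illustrative convention only.
Segmenter = Callable[[str, str], Iterable[str]]


def naive_segment(language: str, text: str) -> List[str]:
    """Stand-in for the current ad hoc method (hypothetical).

    Splits on whitespace that follows '.', '!' or '?'.
    The language argument is ignored.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def segment_with_fallback(
    language: str,
    text: str,
    preferred: Optional[Segmenter],
    fallback: Segmenter = naive_segment,
) -> List[str]:
    """Try the preferred segmenter; use the fallback if it is missing,
    raises, or returns no sentences."""
    if preferred is not None:
        try:
            sentences = [s for s in preferred(language, text) if s.strip()]
            if sentences:
                return sentences
        except Exception:
            # Any failure in the preferred segmenter means we fall back.
            pass
    return list(fallback(language, text))
```

An external segmenter would be passed in as `preferred`, possibly wrapped in a small adapter if its call signature differs from the one assumed here.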
Alternatives
OpenNLP
This was looked at in T286984. It's written in Java, which may make it a bit harder for us to work with. There's also a note about the license of the Swedish model that makes it sound like anything generated using it needs to include a copyright notice. I'm not sure if that's correct.
sentencex
Developed by WMF. Uses language-specific rules to some extent. Uses a fallback when a language doesn't have its own implementation.