[WIP] add SEAME kaldi recipes #3063
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
|
This issue has been automatically closed by a bot strictly because of inactivity. This does not mean that we think that this issue is not important! If you believe it has been closed hastily, add a comment to the issue and mention @kkm000, and I'll gladly reopen it. |
|
@keli78 is there a reason this is WIP? I'd be OK to merge this without much checking, as it might be useful for someone in the future even if it's not perfect. |
|
@danpovey It's WIP because the data preparation scripts are not complete yet. Haihua's group provided me with the processed data, so I used that directly to build the system. However, no data splitting information was shared, so I couldn't complete the preparation scripts. Since we decided not to continue working on this topic at the time, I didn't put further effort into it. Several people have indeed asked me about this PR before. If you think it's necessary, I can spend some time figuring out the data splitting info and finishing the remaining scripts. |
These are recipes for the Mandarin-English code-switched corpus SEAME. Everything except the data preparation scripts is ready. The RNNLM rescoring script and results have been added. I will finish the data preparation part soon. To my knowledge, the results on this corpus are state of the art so far. (The prepared data are from Haihua Xu's lab at NTU; thanks for their help.)