This paper describes a general-purpose real-time speech recognition system, RTSRS(01). Building on previous work [1], the parameters of each spoken command are normalized in the time domain, and a binary spectrum is used as the final recognition parameter, which greatly reduces the memory needed to store each reference command. Together with a new method for calculating the separation between the two spoken commands being compared, this shortens recognition time enough for RTSRS(01) to identify single items from a 200-item vocabulary in real time. Recognition results for a specific speaker are as follows: 10 spoken Chinese digits, 99.7%; 20 sentences (7 syllables each), 99.7%; 100 phrases (4 syllables each), 99.5%; 150 phrases (4 syllables each), 99.3%; 200 phrases (4 syllables each), 98.8%; 400 phrases (4 syllables each), 97.7%. Informal experiments indicate that RTSRS(01) can also handle vocabularies whose items have different numbers of syllables, and that the correct recognition rate remains high for the first 20 English digits and for the names of BASIC statements.
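The pipeline summarized above (time normalization, binary spectrum, separation between two commands) can be illustrated with a minimal sketch. The abstract does not give the actual procedures, so everything here is an assumption made for illustration: linear resampling to a fixed number of frames, thresholding each frame at its own mean energy, and a Hamming distance as the separation measure. The function names, the 16-frame length, and the 16-channel filter bank are hypothetical and are not taken from RTSRS(01).

```python
import numpy as np


def time_normalize(frames: np.ndarray, n_frames: int = 16) -> np.ndarray:
    """Resample a (T, channels) spectrogram to exactly n_frames frames.

    Assumption: plain linear index resampling; the paper's own
    normalization rule is not stated in the abstract.
    """
    t = frames.shape[0]
    # Pick n_frames source indices spread evenly over the utterance.
    idx = np.round(np.linspace(0, t - 1, n_frames)).astype(int)
    return frames[idx]


def binarize(frames: np.ndarray) -> np.ndarray:
    """Set a bit where a channel exceeds its frame's mean energy (assumed rule)."""
    return (frames > frames.mean(axis=1, keepdims=True)).astype(np.uint8)


def make_template(frames: np.ndarray, n_frames: int = 16) -> np.ndarray:
    """Normalize, binarize, and pack the pattern into bytes for compact storage."""
    return np.packbits(binarize(time_normalize(frames, n_frames)))


def distance(a: np.ndarray, b: np.ndarray) -> int:
    """Hamming distance between two packed templates (assumed separation measure)."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


def recognize(frames: np.ndarray, references: dict, n_frames: int = 16) -> str:
    """Return the label of the stored reference closest to the spoken command."""
    probe = make_template(frames, n_frames)
    return min(references, key=lambda label: distance(probe, references[label]))
```

Under these assumed sizes, a template of 16 frames by 16 channels packs into 32 bytes, and comparing two templates reduces to an XOR followed by a bit count. That is one plausible reading of why binary spectra cut both storage and matching time, not a description of the system's actual implementation.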