|
The aim of the present work was to answer the question: to what extent can a multilayer perceptron be applied to automatic vowel recognition in arbitrary fragments of a particular speaker's utterances? Initial research was carried out using recordings of the speech of three adults. Vowel recognition was performed with a multilayer perceptron. The network input consisted of N-element vectors of sound level values obtained every 0.02 s by spectral analysis. Each network was trained to recognise six vowels (a, e, o, u, i, y) as well as one pattern covering all other fragments of an utterance, that is, consonants and pauses. The networks that achieved over 90% correct classifications across all time moments were then tested on a completely separate set of data. The best result in that part of the research was 92% vowel recognition. At the same time, only 50% of the time moments that made up these vowels were correctly recognised. The other half were recognised as other vowels or as a different fragment of the utterance. In addition, 15% of the time moments making up consonants or pauses were recognised incorrectly.
|
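The gap between the 92% vowel recognition rate and the 50% rate for individual time moments can be illustrated with a small sketch. It is not the authors' evaluation procedure: the abstract does not say how a vowel was counted as recognised, so the code assumes a plurality vote over the time moments of each vowel. The function names and the toy label sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Seven classes, as in the text: six vowels plus one class ("other")
# for consonants and pauses.
LABELS = ["a", "e", "o", "u", "i", "y", "other"]


def frame_accuracy(true_frames, predicted_frames):
    """Fraction of time moments whose predicted label equals the true label."""
    if len(true_frames) != len(predicted_frames):
        raise ValueError("label sequences must have equal length")
    if not true_frames:
        raise ValueError("label sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_frames, predicted_frames))
    return correct / len(true_frames)


def segments(true_frames):
    """Group consecutive equal true labels into (label, start, end) runs."""
    runs = []
    start = 0
    for i in range(1, len(true_frames) + 1):
        if i == len(true_frames) or true_frames[i] != true_frames[start]:
            runs.append((true_frames[start], start, i))
            start = i
    return runs


def vowel_recognition_rate(true_frames, predicted_frames):
    """Fraction of vowel segments whose most frequent predicted label is correct.

    ASSUMPTION: plurality vote per segment. The source does not state
    the criterion that was actually used.
    """
    vowel_runs = [r for r in segments(true_frames) if r[0] != "other"]
    if not vowel_runs:
        raise ValueError("no vowel segments found")
    hits = 0
    for label, start, end in vowel_runs:
        votes = Counter(predicted_frames[start:end])
        winner, _count = votes.most_common(1)[0]
        hits += winner == label
    return hits / len(vowel_runs)


# Toy data (hypothetical): two vowels, each four time moments long.
true = ["other", "a", "a", "a", "a", "other",
        "other", "e", "e", "e", "e", "other"]
pred = ["other", "a", "a", "o", "other", "other",
        "a", "e", "e", "i", "other", "other"]

# Restrict the per-moment score to the moments that belong to vowels.
vowel_idx = [i for i, t in enumerate(true) if t != "other"]
vowel_moment_acc = frame_accuracy([true[i] for i in vowel_idx],
                                  [pred[i] for i in vowel_idx])

print("vowel segments recognised:", vowel_recognition_rate(true, pred))
print("vowel time moments recognised:", vowel_moment_acc)
```

In this toy case both vowels are recognised by the vote although only half of their time moments are labelled correctly, which mirrors the kind of discrepancy reported above. A stricter criterion, such as requiring a majority of moments to be correct, would give a lower vowel-level figure.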