Samoa Digital Library

Fusion of Audio and Visual Information for Implementing Improved Speech Recognition System

dc.contributor.author Acharya, Vikrant Satish
dc.date.accessioned 2021-11-25T08:49:02Z
dc.date.available 2021-11-25T08:49:02Z
dc.date.issued 2018
dc.identifier.citation Acharya, Vikrant Satish, "Fusion of Audio and Visual Information for Implementing Improved Speech Recognition System" (2018). Masters Theses. 884. https://scholarworks.gvsu.edu/theses/884 en_US
dc.identifier.uri ${sadil.baseUrl}/handle/123456789/592
dc.description 171 p. ; PDF (Masters Thesis) en_US
dc.description.abstract Speech recognition is a useful technology because it enables applications suited to a wide range of user needs. This research attempts to improve the performance of a speech recognition system by combining visual features (lip movement) with audio features. Results were computed from utterances of numerals collected from both male and female participants. Discrete Cosine Transform (DCT) coefficients were used as visual features and Mel Frequency Cepstral Coefficients (MFCC) as audio features. Classification was then carried out using a Support Vector Machine (SVM). The results obtained from the combined (fused) system were compared with the recognition rates of two standalone systems (audio-only and visual-only). en_US
dc.language.iso en en_US
dc.publisher Grand Valley State University en_US
dc.title Fusion of Audio and Visual Information for Implementing Improved Speech Recognition System en_US
dc.title.alternative A Thesis Submitted to the Graduate Faculty of GRAND VALLEY STATE UNIVERSITY In Partial Fulfillment of the Requirements For the Degree of Master of Science in Engineering Electrical Engineering en_US
dc.type Thesis en_US
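
The abstract names DCT coefficients as the visual features and describes combining them with audio features. The following is a minimal sketch of that idea only, not the thesis implementation. It assumes feature-level fusion by simple concatenation (the record does not say which fusion strategy was used), uses an unnormalized DCT-II, and treats the audio vector as a placeholder for MFCCs computed elsewhere. The function names `dct2_1d`, `dct2_2d`, `visual_features`, and `fuse`, and the `keep` parameter, are hypothetical. The SVM classification step is omitted.

```python
import math


def dct2_1d(signal):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct2_2d(block):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct2_1d(row) for row in block]
    cols = [dct2_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the result is indexed [vertical freq][horizontal freq].
    return [list(r) for r in zip(*cols)]


def visual_features(block, keep=2):
    """Keep the keep x keep lowest-frequency DCT coefficients of an image block.

    `block` stands in for a grayscale lip-region crop (assumption).
    """
    coeffs = dct2_2d(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


def fuse(audio_vec, visual_vec):
    """Feature-level fusion by concatenation (an assumed strategy)."""
    return list(audio_vec) + list(visual_vec)


if __name__ == "__main__":
    lip_block = [[1.0] * 4 for _ in range(4)]   # toy 4x4 "image"
    audio_placeholder = [0.5, -0.25]            # stands in for MFCC values
    print(fuse(audio_placeholder, visual_features(lip_block)))
```

In a full pipeline, one fused vector per utterance (or per frame) would be passed to a classifier such as an SVM, and the same classifier trained on the audio-only and visual-only vectors would give the standalone baselines the abstract mentions.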

