
Voice-Based Emotion Classification

senaka

Note: this article was written more than a year ago, so its content may be out of date.

[Figure: Flow of emotion classification]

Recent advances in computer and Internet technology enable increasingly seamless interaction between humans and machines. Applications such as Internet games, 3D collaborative virtual environments (e.g., Second Life), and web-based teaching all involve human-machine interaction.

Emotionless communication between humans is dull. Enriching communication with emotion makes it more successful in both personal and business environments. Voice is a reasonable biometric signal from which to extract a speaker's emotional state.

The voice-based emotion classification process can be divided into two major parts: feature extraction and emotion classification. In the feature extraction stage, features that carry the emotional cues of the voice are computed. In the classification stage, those features are compared against pre-defined emotion models.
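
To make that split concrete, here is a minimal runnable sketch of the two-stage structure. The feature and model used here are deliberately toy stand-ins (short-time log-energy and a per-emotion mean template), invented purely for illustration; the next section swaps in the choices commonly used in practice.

```python
import numpy as np

# Toy stand-ins, invented for illustration only: short-time log-energy as
# the "feature", and a single template value per emotion as the "model".
def extract_features(signal, frame_len=400):
    # Stage 1: compute one feature value per frame of the voice signal.
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    return np.array([np.log(np.sum(f ** 2) + 1e-10) for f in frames])

def compare_with_models(features, emotion_models):
    # Stage 2: pick the emotion whose template is closest to the observation.
    mean = features.mean()
    return min(emotion_models, key=lambda e: abs(emotion_models[e] - mean))

def classify_emotion(signal, emotion_models):
    return compare_with_models(extract_features(signal), emotion_models)
```

Here `emotion_models` would simply be a dictionary mapping labels to template values, e.g. `{"angry": -2.1, "neutral": -5.3}`; real systems replace both stages with far richer features and statistical models.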

Mel-Frequency Cepstral Coefficients (MFCCs) are popular in the research community as a successful feature set. Once features are extracted, emotion recognition becomes a pattern classification problem, and classifiers such as Hidden Markov Models (HMMs) and Artificial Neural Networks (ANNs) can be deployed. The diagram above illustrates an emotion classification process that uses Two-Layered Cascaded Cepstral Coefficients as features and HMMs as the classifier.
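
As a sketch of how such a system might be assembled with common open-source tools, one could compute MFCCs with librosa and train one Gaussian HMM per emotion with hmmlearn, then label a new utterance with the emotion whose model scores it highest. This is not the exact system from the diagram (which uses Two-Layered Cascaded Cepstral Coefficients; plain MFCCs are substituted here), and the file paths and emotion labels are hypothetical.

```python
import numpy as np
import librosa
from hmmlearn.hmm import GaussianHMM

def extract_mfcc(path, sr=16000, n_mfcc=13):
    # Load one utterance and return its MFCC sequence, shape (frames, n_mfcc).
    y, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def train_models(utterances_by_emotion, n_states=5):
    # Train one HMM per emotion on that emotion's training utterances.
    models = {}
    for emotion, paths in utterances_by_emotion.items():
        sequences = [extract_mfcc(p) for p in paths]
        X = np.vstack(sequences)            # stack all frames
        lengths = [len(s) for s in sequences]  # per-utterance frame counts
        model = GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=100)
        model.fit(X, lengths)
        models[emotion] = model
    return models

def classify(path, models):
    # Label an utterance with the emotion whose HMM gives the highest
    # log-likelihood for its feature sequence.
    features = extract_mfcc(path)
    return max(models, key=lambda e: models[e].score(features))
```

`train_models` would be called with something like `{"happy": ["happy_01.wav", ...], "sad": [...]}` (hypothetical file names), after which classification reduces to a maximum-likelihood comparison across the per-emotion HMMs.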
