Acoustic Methods in Identifying Symptoms of Emotional States

Zuzanna PIĄTEK; Maciej KŁACZYŃSKI

doi:10.24425/aoa.2021.136580

Authors

Zuzanna PIĄTEK AGH University of Science and Technology, Poland
Maciej KŁACZYŃSKI AGH University of Science and Technology, Poland

Abstract

The study investigates the use of the speech signal to recognise speakers’ emotional states. The introduction covers the definition and categorization of emotions and the channels through which they are expressed, including facial expressions, speech and physiological signals. For the purpose of this work, a proprietary resource of emotionally marked speech recordings was created. The collected recordings come from the media, including live journalistic broadcasts, which show spontaneous emotional reactions to real-time stimuli. For the speech signal analysis, a dedicated script was written in Python. Its algorithm includes the parameterization of speech recordings and the determination of features correlated with emotional content in speech. After the parameterization process, data clustering was performed to group the speakers’ feature vectors into larger collections that correspond to specific emotional states. Using Student’s t-test for dependent samples, descriptors were identified whose values differed significantly between emotional states. Potential applications of this research were proposed, as well as directions for future studies of the topic.
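The dependent-samples comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' script: the function name `paired_t_statistic`, the choice of mean fundamental frequency as the descriptor, and all numeric values are hypothetical and serve only to show how a paired t statistic is computed for one feature measured on the same speakers in two emotional states.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for a dependent-samples t-test.

    a and b hold one descriptor measured on the same speakers in two
    conditions, so a[i] and b[i] form a pair.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up illustrative values (Hz), NOT data from the study: mean F0 of
# five hypothetical speakers in a neutral and in an aroused state.
f0_neutral = [118.0, 205.0, 131.0, 190.0, 124.0]
f0_aroused = [141.0, 236.0, 150.0, 221.0, 139.0]

t_stat, dof = paired_t_statistic(f0_aroused, f0_neutral)
print(f"t = {t_stat:.4f}, df = {dof}")
```

The resulting statistic would then be compared against the critical value of the t distribution for the given degrees of freedom and the chosen significance level; a descriptor whose statistic exceeds that value would be treated as differing significantly between the two states.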

Keywords:

emotion recognition, speech signal processing, clustering analysis, Sammon mapping

