Abstract
This article presents and compares recent approaches that use speech and voice analysis as biomarkers for screening and monitoring selected diseases. It covers metabolic, respiratory, cardiovascular, endocrine, and nervous system disorders. Articles were selected to identify studies that quantitatively assess voice features in these disorders through acoustic and linguistic voice analysis. Information was extracted from each paper to compare datasets, speech parameters, analysis methods, and reported results. In total, 110 research papers were reviewed and 47 databases were summarized. Speech analysis is a promising method for the early diagnosis of certain disorders. Advanced computer voice analysis with machine learning algorithms, combined with the widespread availability of smartphones, allows diagnostic analysis to be conducted during the patient’s visit to the doctor or at the patient’s home during a telephone conversation. Speech analysis is a simple, low-cost, non-invasive, and easy-to-provide method of medical diagnosis. These are remarkable advantages, but there are also disadvantages: the reported effectiveness of disease diagnosis varies from 65% to 99%. For that reason, speech analysis should be treated as a medical screening test that indicates the need for classic medical tests.
Keywords:
speech analysis, speech features, acoustic parameters, linguistic analysis, voice biomarkers, screening testsReferences
1. Afshan A., Guo J., Park S.J., Ravi V., Flint J., Alwan A. (2018), Effectiveness of voice quality features in detecting depression, [in:] Interspeech, pp. 1676–1680, https://doi.org/10.21437/Interspeech.2018-1399
2. Al Hanai T., Ghassemi M.M., Glass J.R. (2018), Detecting depression with audio/text sequence modeling of interviews, [in:] Interspeech, pp. 1716–1720, https://doi.org/10.21437/Interspeech.2018-2522
3. Alghowinem S., Goecke R., Epps J., Wagner M., Cohn J.F. (2016), Cross-cultural depression recognition from vocal biomarkers, [in:] Interspeech, pp. 1943–1947, https://doi.org/10.21437/Interspeech.2016-1339
4. Alghowinem S., Goecke R., Wagner M., Epps J., Breakspear M., Parker G. (2012), From joyous to clinically depressed: Mood detection using spontaneous speech, [in:] Proceedings of the Twenty-Fifth International Florida Artificial Intelligence Research Society Conference, Youngblood G.M., McCarthy P.M. [Eds.], pp. 141–146, https://www.aaai.org/ocs/index.php/FLAIRS/FLAIRS12/paper/view/ 4478/4782.
5. Antolík T.K., Fougeron C. (2013), Consonant distortions in dysarthria due to Parkinson’s disease, amyotrophic lateral sclerosis and cerebellar ataxia, [in:] Interspeech, pp. 2152–2156, https://doi.org/10.21437/Interspeech.2013-509
6. Aydin K. et al. (2016), Voice characteristics associated with polycystic ovary syndrome, The Laryngoscope, 126(9): 2067–2072, https://doi.org/10.1002/lary.25818
7. Barsties B., Verfaillie R., Roy N., Maryn Y. (2013), Do body mass index and fat volume influence vocal quality, phonatory range, and aerodynamics in females?, CoDAS, 25(4): 310–318, https://doi.org/10.1590/s2317-17822013000400003
8. Bedi G. et al. (2015), Automated analysis of free speech predicts psychosis onset in high-risk youths, npj Schizophrenia, 1(1): 15030, https://doi.org/10.1038/npjschz.2015.30
9. Bozkurt E., Toledo-Ronen O., Sorin A., Hoory R. (2014), Exploring modulation spectrum features for speech-based depression level classification, [in:] Interspeech, https://doi.org/10.21437/Interspeech.2014-312
10. Celebi S. et al. (2013). Acoustic, perceptual and aerodynamic voice evaluation in an obese population, The Journal of Laryngology and Otology, 127(10): 987–990, https://doi.org/10.1017/s0022215113001916
11. Chitkara D., Sharma R.K. (2016), Voice based detection of type 2 diabetes mellitus, [in:] 2016 2nd International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), pp. 83–87, https://doi.org/10.1109/AEEICB.2016.7538402
12. Cummins N., Epps J., Sethu V., Breakspear M., Goecke R. (2013), Modeling spectral variability for the classification of depressed speech, [in:] Interspeech, pp. 857–861, https://doi.org/10.21437/Interspeech.2013-242
13. Cummins N., Scherer S., Krajewski J., Schnieder S., Epps J., Quatieri T.F. (2015a), A review of depression and suicide risk assessment using speech analysis, Speech Communication, 71: 10–49, https://doi.org/10.1016/j.specom.2015.03.004
14. Cummins N., Sethu V., Epps J., Krajewski J. (2015b), Relevance vector machine for depression prediction, [in:] Interspeech, pp. 110–114, https://doi.org/10.21437/Interspeech.2015-37
15. Cummins N., Sethu V., Epps J., Schnieder S., Krajewski J. (2015c), Analysis of acoustic space variability in speech affected by depression, Speech Communication, 75: 27–49, https://doi.org/10.1016/j.specom.2015.09.003
16. Da Cunha M.G.B., Passerotti G.H., Weber R., Zilberstein B., Cecconello I. (2011), Voice feature characteristic in morbid obese population, Obesity Surgery, 21(3): 340–344, https://doi.org/10.1007/s11695-009-9959-7
17. Dassie-Leite A.P., Behlau M., Nesi-França S., Lima M.N., de Lacerda L. (2018), Vocal evaluation of children with congenital hypothyroidism, Journal of Voice, 32(6): 11–19, https://doi.org/10.1016/j.jvoice.2017.08.006
18. Deshpande G., Schuller B. (2020), An overview on audio, signal, speech, & language processing for COVID-19, arXic preprint, https://doi.org/10.48550/arXiv.2005.08579
19. Despotovic V., Ismael M., Cornil M., Mc Call R., Fagherazzi G. (2021), Detection of COVID-19 from voice, cough and breathing patterns: Dataset and preliminary results, Computers in Biology and Medicine, 138: 104944, https://doi.org/10.1016/j.compbiomed.2021.104944
20. DeVault D. et al. (2014), SimSensei kiosk: A virtual human interviewer for healthcare decision support, [in:] AAMAS ’14: Proceedings of the 2014 International Conference on Autonomous Agents and Multiagent Systems, pp. 1061–1068.
21. Dogan E., Sander C., Wagner X., Hegerl U., Kohls E. (2017), Smartphone-based monitoring of objective and subjective data in affective disorders: Where are we and where are we going? Systematic review, Journal of Medical Internet Research, 19(7): e262, https://doi.org/10.2196/jmir.7006
22. Ekblad L.L. et al. (2015), Insulin resistance is associated with poorer verbal fluency performance in women, Diabetologia, 58(11): 2545–2553, https://doi.org/10.1007/s00125-015-3715-4
23. Faurholt-Jepsen M. et al. (2016), Voice analysis as an objective state marker in bipolar disorder, Translational psychiatry, 6(7): e856–e856, https://doi.org/10.1038/tp.2016.123
24. Gosztolya G., Bagi A., Szalóki S., Szendi I., Hoffmann I. (2018), Identifying schizophrenia based on temporal parameters in spontaneous speech, https://doi.org/10.13140/RG.2.2.10884.78721
25. Gosztolya G., Vincze V., Tóth L., Pákáski M., Kálmán J., Hoffmann I. (2019), Identifying mild cognitive impairment and mild alzheimer’s disease based on spontaneous speech using ASR and linguistic features, Computer Speech & Language, 53: 181–197, https://doi.org/10.1016/j.csl.2018.07.007
26. Gratch J. et al. (2014), The distress analysis interview corpus of human and computer interviews, [in:] Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pp. 3123–3128, http://www.lrec-conf.org/proceedings/lrec2014/pdf/508_Paper.pdf
27. Grósz T., Busa-Fekete R., Gosztolya G., Tóth L. (2015), Assessing the degree of nativeness and Parkinson’s condition using Gaussian processes and deep rectifier neural networks, [in:] Interspeech, pp. 919–923, https://doi.org/10.21437/Interspeech.2015-195
28. Grünerbl A. et al. (2014), Smartphone-based recognition of states and state changes in bipolar disorder patients, IEEE Journal of Biomedical and Health Informatics, 19(1): 140–148, https://doi.org/10.1109/jbhi.2014.2343154
29. Gugatschka M. et al. (2013), Subjective and objective vocal parameters in women with polycystic ovary syndrome, Journal of Voice, 27(1): 98–100, https://doi.org/10.1016/j.jvoice.2012.07.007
30. Guidi A., Schoentgen J., Bertschy G., Gentili C., Scilingo E.P., Vanello N. (2017), Features of vocal frequency contour and speech rhythm in 1250 bipolar disorder, Biomedical Signal Processing and Control, 37: 23–31, https://doi.org/10.1016/j.bspc.2017.01.017
31. Guidi A., Scilingo E. P., Gentili C., Bertschy G., Landini L., Vanello N. (2015), Analysis of running speech for the characterization of mood state in bipolar patients, [in:] 2015 AEIT International Annual Conference (AEIT), pp. 1–6, https://doi.org/10.1109/AEIT.2015.7415275
32. Hamdan A.-l., Jabbour J., Nassar J., Dahouk I., Azar S.T. (2012), Vocal characteristics in patients with type 2 diabetes mellitus, European Archives of Oto-Rhino-Laryngology, 269(5): 1489–1495, https://doi.org/10.1016/j.amjoto.2012.03.008
33. Hamdan A.-L., Safadi B., Chamseddine G., Kasty M., Turfe Z.A., Ziade G. (2014), Effect of weight loss on voice after bariatric surgery, Journal of Voice, 28(5): 618–623, https://doi.org/10.1016/j.jvoice.2014.03.004
34. Han J. et al. (2020), An early study on intelligent analysis of speech under COVID-19: Severity, sleep quality, fatigue, and anxiety, arXiv preprint, https://doi.org/10.48550/arXiv.2005.00096
35. Hannoun A., Zreik T., Husseini S.T., Mahfoud L., Sibai A., Hamdan A.-l. (2011), Vocal changes in patients with polycystic ovary syndrome, Journal of Voice, 25(4): 501–504, https://doi.org/10.1016/j.jvoice.2009.12.005
36. Hassan A., Shahin I., Alsabek M.B. (2020), COVID-19 detection system using recurrent neural networks, [in:] 2020 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI), pp. 1–5, https://doi.org/10.1109/CCCI49893.2020.9256562
37. Helfer B.S., Quatieri T.F., Williamson J.R., Mehta D.D., Horwitz R., Yu B. (2013), Classification of depression state based on articulatory precision, [in:] Interspeech, pp. 2172–2176, https://doi.org/10.21437/Interspeech.2013-513
38. Hemmerling, D., Orozco-Arroyave J.R., Skalski A., Gajda J., Nöth E. (2016), Automatic detection of Parkinson’s disease based on modulated vowels, [in:] Interspeech, pp. 1190–1194, https://doi.org/10.21437/Interspeech.2016-1062
39. Hönig F., Batliner A., Nöth E., Schnieder S., Krajewski J. (2014), Automatic modelling of depressed speech: relevant features and relevance of gender, [in:] Interspeech, pp. 1248–1252, https://doi.org/10.21437/Interspeech.2014-313
40. Horwitz-Martin R.L. et al. (2016), Relation of automatically extracted formant trajectories with intelligibility loss and speaking rate decline in amyotrophic lateral sclerosis, [in:] Interspeech, pp. 1205–1209, https://doi.org/10.21437/Interspeech.2016-403
41. Huang G., Pencina K.M., Coady J.A., Beleva Y.M., Bhasin S., Basaria S. (2015), Functional voice testing detects early changes in vocal pitch in women during testosterone administration, The Journal of Clinical Endocrinology Metabolism, 100(6): 2254–2260, https://doi.org/10.1210/jc.2015-1669
42. Huang K.-Y., Wu C.-H., Kuo Y.-T., Jang F.-L. (2016), Unipolar depression vs. bipolar disorder: An elicitation-based approach to short-term detection of mood disorder, [in:] Interspeech, pp. 1452–1456, https://doi.org/10.21437/Interspeech.2016-620
43. Junuzovic-Žunic L., Ibrahimagic A., Altumbabic S. (2019), Voice characteristics in patients with thyroid disorders, The Eurasian Journal of Medicine, 51(2): 101–105, https://doi.org/10.5152/eurasianjmed.2018.18331
44. Khorram S., Gideon J., McInnis M.G., Provost E.M. (2016), Recognition of depression in bipolar disorder: Leveraging cohort and person specific knowledge, [in:] Interspeech, pp. 1215–1219, https://doi.org/10.21437/Interspeech.2016-837
45. Kiss G., Sztahó D., Tulics M.G. (2021), Application for detecting depression, Parkinson’s disease and dysphonic speech, [in:] Interspeech, pp. 956–957.
46. Klumpp P., Janu T., Arias-Vergara T., Vásquez-Correa J.C., Orozco-Arroyave J.R., Nöth E. (2017), Apkinson – A mobile monitoring solution for Parkinson’s disease, [in:] Interspeech, pp. 1839–1843, https://doi.org/10.21437/Interspeech.2017-416
47. Kones R., Rumana U. (2017), Cardiometabolic diseases of civilization: History and maturation of an evolving global threat. An update and call to action, Annals of Medicine, 49(3): 260–274, https://doi.org/10.1080/07853890.2016.1271957
48. Kopp W. (2019), How western diet and lifestyle drive the pandemic of obesity and civilization diseases, Diabetes, Metabolic Ayndrome and Obesity: Targets and Therapy, 12: 2221–2236, https://doi.org/10.2147/DMSO.S216791
49. Laguarta J., Hueto F., Subirana B. (2020), COVID-19 artificial intelligence diagnosis using only cough recordings, IEEE Open Journal of Engineering in Medicine and Biology, 1: 275–281, https://doi.org/10.1109/OJEMB.2020.3026928
50. Lechien J. et al. (2020), Features of mild-to-moderate COVID-19 patients with dysphonia, Journal of Voice, https://doi.org/10.1016/j.jvoice.2020.05.012
51. Lopez-Otero P., Docio-Fernandez L.D., Abad A., Garcia-Mateo C. (2017), Depression detection using automatic transcriptions of de-identified speech, [in:] Interspeech, pp. 3157–3161, https://doi.org/10.21437/Interspeech.2017-1201
52. Low D.M., Bentley K.H., Ghosh S.S. (2020), Automated assessment of psychiatric disorders using speech: A systematic review, Laryngoscope Investigative Otolaryngology, 5(1): 96–116, https://doi.org/10.1002/lio2.354
53. Mallela J. et al. (2020), Raw speech waveform based classification of patients with ALS, Parkinson’s disease and healthy controls using CNN-BLSTM, [in:] Interspeech, pp. 4586–4590, https://doi.org/10.21437/Interspeech.2020-2221
54. Maor E., Sara J.D., Orbelo D.M., Lerman L.O., Levanon Y., Lerman A. (2018), Voice signal characteristics are independently associated with coronary artery disease, Mayo Clinic Proceedings, pp. 840–847, https://doi.org/10.1016/j.mayocp.2017.12.025
55. McGinnis E.W. et al. (2019), Giving voice to vulnerable children: Machine learning analysis of speech detects anxiety and depression in early childhood, IEEE Journal of Biomedical and Health Informatics, 23(6): 2294–2301, https://doi.org/10.1109/JBHI.2019.2913590
56. Mirheidari B., Blackburn D., Walker T., Venneri A., Reuber M., Christensen H. (2018), Detecting signs of dementia using word vector representations, [in:] Interspeech, pp. 1893–1897, https://doi.org/10.21437/Interspeech.2018-1764
57. Mohammadzadeh A., Heydari E., Azizi F. (2011), Speech impairment in primary hypothyroidism, Journal of Endocrinological Investigation, 34(6): 431–433, https://doi.org/10.1007/BF03346708
58. Moro-Velazquez L., Gomez-Garcia J.A., Arias-Londoño J.D., Dehak N., Godino-Llorente J.I. (2021), Advances in Parkinson’s disease detection and assessment using voice and speech: A review of the articulatory and phonatory aspects, Biomedical Signal Processing and Control, 66: 102418, https://doi.org/10.1016/j.bspc.2021.102418
59. Mota N.B. et al. (2012), Speech graphs provide a quantitative measure of thought disorder in psychosis, PLOS ONE, 7(4): e34928. https://doi.org/10.1371/journal.pone.0034928
60. Mundt J.C., Vogel A.P., Feltner D.E., Lenderking W.R. (2012), Vocal acoustic biomarkers of depression severity and treatment response, Biological psychiatry, 72(7): 580–587, https://doi.org/10.1016/j.biopsych.2012.03.015
61. Orozco-Arroyave J.R., Arias-Londoño J.F., Vargas-Bonilla J.F., Gonzalez-Rativa M.C., Nöth E. (2014a), New spanish speech corpus database for the analysis of people suffering from Parkinson’s disease, [in:] LREC, pp. 342–347.
62. Orozco-Arroyave J.R. et al. (2014b), Automatic detection of Parkinson’s disease from words uttered in three different languages, [in:] Interspeech, https://doi.org/10.21437/Interspeech.2014-375
63. Pan Y., Mirheidari B., Reuber M., Venneri A., Blackburn D., Christensen H. (2020), Improving detection of Alzheimer’s disease using automatic speech recognition to identify high-quality segments for more robust feature extraction, [in:] Interspeech, pp. 4961–4965, https://doi.org/10.21437/Interspeech.2020-2698
64. Pareek V., Sharma R.K. (2016), Coronary heart disease detection from voice analysis, [in:] 2016 IEEE Students’ Conference on Electrical, Electronics and Computer Science (SCEECS), pp. 1–6, https://doi.org/10.1109/SCEECS.2016.7509344
65. Pettorino M., Gu W., Półrola P., Fan P. (2017), Rhythmic characteristics of Parkinsonian speech: A study on Mandarin and Polish, [in:] Interspeech, pp. 3172–3176, https://doi.org/10.21437/Interspeech.2017-850
66. Pinheiro A.P., Niznikiewicz M. (2019), Altered attentional processing 1395 of happy prosody in schizophrenia, Schizophrenia Research, 206: 217–224, https://doi.org/10.1016/j.schres.2018.11.024
67. Pinkas G., Karny Y., Malachi A., Barkai G., Bachar G., Aharonson V. (2020), SARS-CoV-2 detection from voice, IEEE Open Journal of Engineering in Medicine and Biology, 1: 268–274, https://doi.org/10.1109/ojemb.2020.3026468
68. Pinto S. et al. (2016), Dysarthria in individuals with Parkinson’s disease: A protocol for a binational, cross-sectional, case-controlled study in French and European Portuguese (FraLusoPark), BMJ Open, 6(11): https://doi.org/10.1136/bmjopen-2016-012885
69. Pinyopodjanard S., Suppakitjanusant P., Lomprew P., Kasemkosin N., Chailurkit L., Ongphiphadhanakul B. (2019), Instrumental acoustic voice characteristics in adults with type 2 diabetes, Journal of Voice, 35(1): 116–121, https://doi.org/10.1016/j.jvoice.2019.07.003
70. Pompili A. et al. (2020), Assessment of Parkinson’s disease medication state through automatic speech analysis, arXiv preprint, https://doi.org/10.48550/arXiv.2005.14647
71. Rohanian M., Hough J., Purver M. (2021), Alzheimer’s dementia recognition using acoustic, lexical, disfluency and speech pause features robust to noisy inputs, [in:] Interspeech, pp. 3820–3824, https://doi.org/10.21437/Interspeech.2021-1633
72. Rusz J. et al. (2018), Smartphone allows capture of speech abnormalities associated with high risk of developing Parkinson’s disease, IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(8): 1495–1507, https://doi.org/10.1109/TNSRE.2018.2851787
73. Sadeghian R., Schaffer J.D., Zahorian S.A. (2017), Speech processing approach for diagnosing dementia in an early stage, [in:] Interspeech, pp. 2705–2709, https://doi.org/10.21437/Interspeech.2017-1712
74. Sahu S., Espy-Wilson C.Y. (2016), Speech features for depression detection, [in:] Interspeech, pp. 1928–1932, https://doi.org/10.21437/Interspeech.2016-1566
75. Sattler C. et al. (2017), Interdisciplinary longitudinal study on adult development and aging (ILSE), [in:] Encyclopedia of Geropsychology, Pachana N.A. [Ed.], pp. 1213–1222, Springer, https://doi.org/10.1007/978-981-287-082-7_238
76. Scherer S., Stratou G., Gratch J., Morency L.-P. (2013a), Investigating 1435 voice quality as a speaker-independent indicator of depression and PTSD, [in:] Interspeech, pp. 847–851, https://doi.org/10.21437/Interspeech.2013-240
77. Scherer S. et al. (2013b), Automatic behavior descriptors for psychological disorder analysis, [in:] 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), pp. 1–8, https://doi.org/10.1109/FG.2013.6553789
78. Seneviratne N., Williamson J.R., Lammert A.C., Quatieri T.F., Espy-Wilson C. (2020), Extended study on the use of vocal tract variables to quantify neuromotor coordination in depression, [in:] Interspeech, pp. 4551–4555, https://doi.org/10.21437/Interspeech.2020-2758
79. Sharma N. et al. (2020), Coswara – A database of breathing, cough, and voice sounds for COVID-19 diagnosis, arXiv preprint, pp. 4811–4815, https://doi.org/10.21437/Interspeech.2020-2768
80. Simantiraki O., Charonyktakis P., Pampouchidou A., Tsiknakis M., Cooke M. (2017), Glottal source features for automatic speech-based depression assessment, [in:] Interspeech, pp. 2700–2704, https://doi.org/10.21437/Interspeech.2017-1251
81. Sirmans S.M., Pate K.A. (2014), Epidemiology, diagnosis, and management of polycystic ovary syndrome, Clinical Epidemiology, 6: 1–13, https://doi.org/10.2147/clep.s37559
82. Skodda S., Grönheit W., Schlegel U. (2011), Intonation and speech rate in Parkinson’s disease: General and dynamic aspects and responsiveness to levodopa admission, Journal of Voice, 25(4): e199–e205, https://doi.org/10.1016/j.jvoice.2010.04.007
83. Solomon N.P., Helou L.B., Dietrich-Burns K., Stojadinovic A. (2011), Do obesity and weight loss affect vocal function?, [in:] Seminars in Speech and Language, 31(1): 31–42, https://doi.org/10.1055/s-0031-1271973
84. de Souza L.B.R., Pereira R.M., dos Santos M.M., Godoy C.M.A. (2014), Fundamental frequency, phonation maximum time and vocal complaints in morbidly obese women, ABCD. Arquivos Brasileiros de Cirurgia Digestiva, 27(1): 43–46. doi: 10.1590/ s0102-67202014000100011.
85. de Souza L.B.R., dos Santos M.M. (2018), Body mass index and acoustic voice parameters: Is there a relationship?, Brazilian Journal of Otorhinolaryngology, 84(4): 410–415, https://doi.org/10.1016/j.bjorl.2017.04.003
86. Stasak B., Epps J., Cummins N., Goecke R. (2016), An investigation of emotional speech in depression classification, [in:] Interspeech, pp. 485–489, https://doi.org/10.21437/Interspeech.2016-867
87. Stasak B., Epps J., Goecke R. (2017), Elicitation design for acoustic depression classification: An investigation of articulation effort, linguistic complexity, and word affect, [in:] Interspeech, pp. 834–838, https://doi.org/10.21437/Interspeech.2017-1223
88. Stasak B., Huang Z., Razavi S., Joachim D., Epps J. (2021). Automatic detection of COVID-19 based on short-duration acoustic smartphone speech analysis, Journal of Healthcare Informatics Research, 5(2): 201–217, https://doi.org/10.1007/s41666-020-00090-4
89. Stogowska E., Kamnski K.A., Ziółko B., Kowalska I. (2022), Voice changes in reproductive disorders, thyroid disorders and diabetes: A review, Endocrine Connections, 11(3): e201505, https://doi.org/10.1530/EC-21-0505
90. Subirana B. et al. (2020), Hi sigma, do I have the Coronavirus?: Call for a new artificial intelligence approach to support health care professionals dealing with the COVID-19 pandemic, arXiv preprint, https://doi.org/10.48550/arXiv.2004.06510
91. Sztahó D., Kiss G., Vicsi K. (2015), Estimating the severity of Parkinson’s disease from speech using linear regression and database partitioning, [in:] Interspeech, pp. 498–502, https://doi.org/10.21437/Interspeech.2015-183
92. Ujiro T. et al. (2018), Detection of dementia from responses to atypical questions asked by embodied conversational agents, [in:] Interspeech, pp. 1691–1695, https://doi.org/10.21437/Interspeech.2018-1514
93. Valstar M. et al. (2013), Avec 2013: The continuous audio/visual emotion and depression recognition challenge, [in:] AVEC’13 Proceedings of the 3rd ACM International Workshop on Audio/Visual Emotion Challenge, pp. 3–10, https://doi.org/10.1145/2512530.2512533
94. Vásquez-Correa J.C., Arias-Vergara T., Orozco-Arroyave J.R., Nöth E. (2018), A multitask learning approach to assess the dysarthria severity in patients with Parkinson’s disease, [in:] Interspeech, pp. 456–460, https://doi.org/10.21437/interspeech.2018-1988
95. Vásquez-Correa J.C., Arias-Vergara T., Orozco-Arroyave J.R., Vargas-Bonilla J.F., Arias-Londoño J.D., Nöth E. (2015), Automatic detection of Parkinson’s disease from continuous speech recorded in non-controlled noise conditions, [in:] Interspeech, pp. 105–109, https://doi.org/10.21437/Interspeech.2015-36
96. Vásquez-Correa J.C., Orozco-Arroyave J.R., Nöth E. (2017), Convolutional neural network to model articulation impairments in patients with Parkinson’s disease, [in:] Interspeech, pp. 314–318, https://doi.org/10.21437/Interspeech.2017-1078
97. Villa-Cañas T., Arias-Londoño J.D., Orozco-Arroyave J.R., Vargas-Bonilla J.F., Nöth E.
(2015), Low-frequency components analysis in running speech for the automatic detection of Parkinson’s disease, [in:] Interspeech, pp. 100–104, https://doi.org/10.21437/Interspeech.2015-35
98. Villatoro-Tello E., Dubagunta P., Fritsch J., Ramírez-de-la Rosa G., Motlicek P., Magimai-Doss M. (2021), Late fusion of the available lexicon and raw waveform-based acoustic modeling for depression and dementia recognition, [in:] Interspeech, pp. 1927-1931, https://doi.org/10.21437/Interspeech.2021-1288
99. Wang J., Kothalkar P.V., Cao B., Heitzman D. (2016), Towards automatic detection of amyotrophic lateral sclerosis from speech acoustic and articulatory samples, [in:] Interspeech, pp. 1195–1199, https://doi.org/10.21437/Interspeech.2016-1542
100. Wankerl S., Nöth E., Evert S. (2017), An n-gram based approach to the automatic diagnosis of Alzheimer’s disease from spoken language, [in:] Interspeech, pp. 3162–3166, https://doi.org/10.21437/Interspeech.2017-1572
101. Warnita T., Inoue N., Shinoda K. (2018), Detecting Alzheimer’s disease using gated convolutional neural network from audio data, [in:] Interspeech, pp. 1706–1710, https://doi.org/10.21437/Interspeech.2018-1713
102. Wei W.,Wang J., Ma J., Cheng N., Xiao J. (2020), A real-time robot-based auxiliary system for risk evaluation of COVID-19 infection, arXiv preprint, https://doi.org/10.48550/arXiv.2008.07695
103. Weiner J., Angrick M., Umesh S., Schultz T. (2018), Investigating the effect of audio duration on dementia detection using acoustic features, [in:] Interspeech, pp. 2324–2328, https://doi.org/10.21437/Interspeech.2018-57
104. Weiner J., Herff C., Schultz T. (2016), Speech-based detection of Alzheimer’s disease in conversational German, [in:] Interspeech, pp. 1938–1942, https://doi.org/10.21437/Interspeech.2016-100
105. Wodzinski M., Skalski A., Hemmerling D., Orozco-Arroyave J.R., Nöth E. (2019), Deep learning approach to Parkinson’s disease detection using voice recordings and convolutional neural network dedicated to image classification, [in:] 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 717–720, https://doi.org/10.1109/embc.2019.8856972
106. Xezonaki D., Paraskevopoulos G., Potamianos A., Narayanan S. (2020), Affective conditioning on hierarchical attention networks applied to depression detection from transcribed clinical interviews, [in:] Interspeech, pp. 4556–4560, https://doi.org/10.21437/Interspeech.2020-2819
107. Yang Y., Fairbairn C., Cohn J.F. (2012), Detecting depression severity from vocal prosody, IEEE Transactions on Affective Computing, 4(2): 142–150, https://doi.org/10.1109/T-AFFC.2012.38
108. Zhan A. et al. (2016), High frequency remote monitoring of Parkinson’s disease via smartphone: Platform overview and medication response detection, arXiv preprint, https://doi.org/10.48550/arXiv.1601.00960
109. Zhao Z. et al. (2020), Hybrid network feature extraction for depression assessment from speech, [in:] Interspeech, pp. 4956–4960, https://doi.org/10.21437/Interspeech.2020-2396
110. Zlotnik A., Montero J.M., San-Segundo R., Gallardo-Antolín A. (2015), Random forest-based prediction of Parkinson’s disease progression using acoustic, ASR and intelligibility features, [in:] Interspeech, pp. 503–507, https://doi.org/10.21437/Interspeech.2015-184
2. Al Hanai T., Ghassemi M.M., Glass J.R. (2018), Detecting depression with audio/text sequence modeling of interviews, [in:] Interspeech, pp. 1716–1720, https://doi.org/10.21437/Interspeech.2018-2522
3. Alghowinem S., Goecke R., Epps J., Wagner M., Cohn J.F. (2016), Cross-cultural depression recognition from vocal biomarkers, [in:] Interspeech, pp. 1943–1947, https://doi.org/10.21437/Interspeech.2016-1339
4. Alghowinem S., Goecke R., Wagner M., Epps J., Breakspear M., Parker G. (2012), From joyous to clinically depressed: Mood detection using spontaneous speech, [in:] Proceedings of the Twenty-Fifth International Florida Artificial Intelligence Research Society Conference, Youngblood G.M., McCarthy P.M. [Eds.], pp. 141–146, https://www.aaai.org/ocs/index.php/FLAIRS/FLAIRS12/paper/view/ 4478/4782.
5. Antolík T.K., Fougeron C. (2013), Consonant distortions in dysarthria due to Parkinson’s disease, amyotrophic lateral sclerosis and cerebellar ataxia, [in:] Interspeech, pp. 2152–2156, https://doi.org/10.21437/Interspeech.2013-509
6. Aydin K. et al. (2016), Voice characteristics associated with polycystic ovary syndrome, The Laryngoscope, 126(9): 2067–2072, https://doi.org/10.1002/lary.25818
7. Barsties B., Verfaillie R., Roy N., Maryn Y. (2013), Do body mass index and fat volume influence vocal quality, phonatory range, and aerodynamics in females?, CoDAS, 25(4): 310–318, https://doi.org/10.1590/s2317-17822013000400003
8. Bedi G. et al. (2015), Automated analysis of free speech predicts psychosis onset in high-risk youths, npj Schizophrenia, 1(1): 15030, https://doi.org/10.1038/npjschz.2015.30
9. Bozkurt E., Toledo-Ronen O., Sorin A., Hoory R. (2014), Exploring modulation spectrum features for speech-based depression level classification, [in:] Interspeech, https://doi.org/10.21437/Interspeech.2014-312
10. Celebi S. et al. (2013). Acoustic, perceptual and aerodynamic voice evaluation in an obese population, The Journal of Laryngology and Otology, 127(10): 987–990, https://doi.org/10.1017/s0022215113001916
11. Chitkara D., Sharma R.K. (2016), Voice based detection of type 2 diabetes mellitus, [in:] 2016 2nd International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), pp. 83–87, https://doi.org/10.1109/AEEICB.2016.7538402
12. Cummins N., Epps J., Sethu V., Breakspear M., Goecke R. (2013), Modeling spectral variability for the classification of depressed speech, [in:] Interspeech, pp. 857–861, https://doi.org/10.21437/Interspeech.2013-242
13. Cummins N., Scherer S., Krajewski J., Schnieder S., Epps J., Quatieri T.F. (2015a), A review of depression and suicide risk assessment using speech analysis, Speech Communication, 71: 10–49, https://doi.org/10.1016/j.specom.2015.03.004
14. Cummins N., Sethu V., Epps J., Krajewski J. (2015b), Relevance vector machine for depression prediction, [in:] Interspeech, pp. 110–114, https://doi.org/10.21437/Interspeech.2015-37
15. Cummins N., Sethu V., Epps J., Schnieder S., Krajewski J. (2015c), Analysis of acoustic space variability in speech affected by depression, Speech Communication, 75: 27–49, https://doi.org/10.1016/j.specom.2015.09.003
16. Da Cunha M.G.B., Passerotti G.H., Weber R., Zilberstein B., Cecconello I. (2011), Voice feature characteristic in morbid obese population, Obesity Surgery, 21(3): 340–344, https://doi.org/10.1007/s11695-009-9959-7
17. Dassie-Leite A.P., Behlau M., Nesi-França S., Lima M.N., de Lacerda L. (2018), Vocal evaluation of children with congenital hypothyroidism, Journal of Voice, 32(6): 11–19, https://doi.org/10.1016/j.jvoice.2017.08.006
18. Deshpande G., Schuller B. (2020), An overview on audio, signal, speech, & language processing for COVID-19, arXic preprint, https://doi.org/10.48550/arXiv.2005.08579
19. Despotovic V., Ismael M., Cornil M., Mc Call R., Fagherazzi G. (2021), Detection of COVID-19 from voice, cough and breathing patterns: Dataset and preliminary results, Computers in Biology and Medicine, 138: 104944, https://doi.org/10.1016/j.compbiomed.2021.104944
20. DeVault D. et al. (2014), SimSensei kiosk: A virtual human interviewer for healthcare decision support, [in:] AAMAS ’14: Proceedings of the 2014 International Conference on Autonomous Agents and Multiagent Systems, pp. 1061–1068.
21. Dogan E., Sander C., Wagner X., Hegerl U., Kohls E. (2017), Smartphone-based monitoring of objective and subjective data in affective disorders: Where are we and where are we going? Systematic review, Journal of Medical Internet Research, 19(7): e262, https://doi.org/10.2196/jmir.7006
22. Ekblad L.L. et al. (2015), Insulin resistance is associated with poorer verbal fluency performance in women, Diabetologia, 58(11): 2545–2553, https://doi.org/10.1007/s00125-015-3715-4
23. Faurholt-Jepsen M. et al. (2016), Voice analysis as an objective state marker in bipolar disorder, Translational Psychiatry, 6(7): e856, https://doi.org/10.1038/tp.2016.123
24. Gosztolya G., Bagi A., Szalóki S., Szendi I., Hoffmann I. (2018), Identifying schizophrenia based on temporal parameters in spontaneous speech, https://doi.org/10.13140/RG.2.2.10884.78721
25. Gosztolya G., Vincze V., Tóth L., Pákáski M., Kálmán J., Hoffmann I. (2019), Identifying mild cognitive impairment and mild Alzheimer’s disease based on spontaneous speech using ASR and linguistic features, Computer Speech & Language, 53: 181–197, https://doi.org/10.1016/j.csl.2018.07.007
26. Gratch J. et al. (2014), The distress analysis interview corpus of human and computer interviews, [in:] Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pp. 3123–3128, http://www.lrec-conf.org/proceedings/lrec2014/pdf/508_Paper.pdf
27. Grósz T., Busa-Fekete R., Gosztolya G., Tóth L. (2015), Assessing the degree of nativeness and Parkinson’s condition using Gaussian processes and deep rectifier neural networks, [in:] Interspeech, pp. 919–923, https://doi.org/10.21437/Interspeech.2015-195
28. Grünerbl A. et al. (2014), Smartphone-based recognition of states and state changes in bipolar disorder patients, IEEE Journal of Biomedical and Health Informatics, 19(1): 140–148, https://doi.org/10.1109/jbhi.2014.2343154
29. Gugatschka M. et al. (2013), Subjective and objective vocal parameters in women with polycystic ovary syndrome, Journal of Voice, 27(1): 98–100, https://doi.org/10.1016/j.jvoice.2012.07.007
30. Guidi A., Schoentgen J., Bertschy G., Gentili C., Scilingo E.P., Vanello N. (2017), Features of vocal frequency contour and speech rhythm in bipolar disorder, Biomedical Signal Processing and Control, 37: 23–31, https://doi.org/10.1016/j.bspc.2017.01.017
31. Guidi A., Scilingo E.P., Gentili C., Bertschy G., Landini L., Vanello N. (2015), Analysis of running speech for the characterization of mood state in bipolar patients, [in:] 2015 AEIT International Annual Conference (AEIT), pp. 1–6, https://doi.org/10.1109/AEIT.2015.7415275
32. Hamdan A.-L., Jabbour J., Nassar J., Dahouk I., Azar S.T. (2012), Vocal characteristics in patients with type 2 diabetes mellitus, European Archives of Oto-Rhino-Laryngology, 269(5): 1489–1495, https://doi.org/10.1016/j.amjoto.2012.03.008
33. Hamdan A.-L., Safadi B., Chamseddine G., Kasty M., Turfe Z.A., Ziade G. (2014), Effect of weight loss on voice after bariatric surgery, Journal of Voice, 28(5): 618–623, https://doi.org/10.1016/j.jvoice.2014.03.004
34. Han J. et al. (2020), An early study on intelligent analysis of speech under COVID-19: Severity, sleep quality, fatigue, and anxiety, arXiv preprint, https://doi.org/10.48550/arXiv.2005.00096
35. Hannoun A., Zreik T., Husseini S.T., Mahfoud L., Sibai A., Hamdan A.-L. (2011), Vocal changes in patients with polycystic ovary syndrome, Journal of Voice, 25(4): 501–504, https://doi.org/10.1016/j.jvoice.2009.12.005
36. Hassan A., Shahin I., Alsabek M.B. (2020), COVID-19 detection system using recurrent neural networks, [in:] 2020 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI), pp. 1–5, https://doi.org/10.1109/CCCI49893.2020.9256562
37. Helfer B.S., Quatieri T.F., Williamson J.R., Mehta D.D., Horwitz R., Yu B. (2013), Classification of depression state based on articulatory precision, [in:] Interspeech, pp. 2172–2176, https://doi.org/10.21437/Interspeech.2013-513
38. Hemmerling D., Orozco-Arroyave J.R., Skalski A., Gajda J., Nöth E. (2016), Automatic detection of Parkinson’s disease based on modulated vowels, [in:] Interspeech, pp. 1190–1194, https://doi.org/10.21437/Interspeech.2016-1062
39. Hönig F., Batliner A., Nöth E., Schnieder S., Krajewski J. (2014), Automatic modelling of depressed speech: relevant features and relevance of gender, [in:] Interspeech, pp. 1248–1252, https://doi.org/10.21437/Interspeech.2014-313
40. Horwitz-Martin R.L. et al. (2016), Relation of automatically extracted formant trajectories with intelligibility loss and speaking rate decline in amyotrophic lateral sclerosis, [in:] Interspeech, pp. 1205–1209, https://doi.org/10.21437/Interspeech.2016-403
41. Huang G., Pencina K.M., Coady J.A., Beleva Y.M., Bhasin S., Basaria S. (2015), Functional voice testing detects early changes in vocal pitch in women during testosterone administration, The Journal of Clinical Endocrinology & Metabolism, 100(6): 2254–2260, https://doi.org/10.1210/jc.2015-1669
42. Huang K.-Y., Wu C.-H., Kuo Y.-T., Jang F.-L. (2016), Unipolar depression vs. bipolar disorder: An elicitation-based approach to short-term detection of mood disorder, [in:] Interspeech, pp. 1452–1456, https://doi.org/10.21437/Interspeech.2016-620
43. Junuzovic-Žunic L., Ibrahimagic A., Altumbabic S. (2019), Voice characteristics in patients with thyroid disorders, The Eurasian Journal of Medicine, 51(2): 101–105, https://doi.org/10.5152/eurasianjmed.2018.18331
44. Khorram S., Gideon J., McInnis M.G., Provost E.M. (2016), Recognition of depression in bipolar disorder: Leveraging cohort and person specific knowledge, [in:] Interspeech, pp. 1215–1219, https://doi.org/10.21437/Interspeech.2016-837
45. Kiss G., Sztahó D., Tulics M.G. (2021), Application for detecting depression, Parkinson’s disease and dysphonic speech, [in:] Interspeech, pp. 956–957.
46. Klumpp P., Janu T., Arias-Vergara T., Vásquez-Correa J.C., Orozco-Arroyave J.R., Nöth E. (2017), Apkinson – A mobile monitoring solution for Parkinson’s disease, [in:] Interspeech, pp. 1839–1843, https://doi.org/10.21437/Interspeech.2017-416
47. Kones R., Rumana U. (2017), Cardiometabolic diseases of civilization: History and maturation of an evolving global threat. An update and call to action, Annals of Medicine, 49(3): 260–274, https://doi.org/10.1080/07853890.2016.1271957
48. Kopp W. (2019), How western diet and lifestyle drive the pandemic of obesity and civilization diseases, Diabetes, Metabolic Syndrome and Obesity: Targets and Therapy, 12: 2221–2236, https://doi.org/10.2147/DMSO.S216791
49. Laguarta J., Hueto F., Subirana B. (2020), COVID-19 artificial intelligence diagnosis using only cough recordings, IEEE Open Journal of Engineering in Medicine and Biology, 1: 275–281, https://doi.org/10.1109/OJEMB.2020.3026928
50. Lechien J. et al. (2020), Features of mild-to-moderate COVID-19 patients with dysphonia, Journal of Voice, https://doi.org/10.1016/j.jvoice.2020.05.012
51. Lopez-Otero P., Docio-Fernandez L.D., Abad A., Garcia-Mateo C. (2017), Depression detection using automatic transcriptions of de-identified speech, [in:] Interspeech, pp. 3157–3161, https://doi.org/10.21437/Interspeech.2017-1201
52. Low D.M., Bentley K.H., Ghosh S.S. (2020), Automated assessment of psychiatric disorders using speech: A systematic review, Laryngoscope Investigative Otolaryngology, 5(1): 96–116, https://doi.org/10.1002/lio2.354
53. Mallela J. et al. (2020), Raw speech waveform based classification of patients with ALS, Parkinson’s disease and healthy controls using CNN-BLSTM, [in:] Interspeech, pp. 4586–4590, https://doi.org/10.21437/Interspeech.2020-2221
54. Maor E., Sara J.D., Orbelo D.M., Lerman L.O., Levanon Y., Lerman A. (2018), Voice signal characteristics are independently associated with coronary artery disease, Mayo Clinic Proceedings, pp. 840–847, https://doi.org/10.1016/j.mayocp.2017.12.025
55. McGinnis E.W. et al. (2019), Giving voice to vulnerable children: Machine learning analysis of speech detects anxiety and depression in early childhood, IEEE Journal of Biomedical and Health Informatics, 23(6): 2294–2301, https://doi.org/10.1109/JBHI.2019.2913590
56. Mirheidari B., Blackburn D., Walker T., Venneri A., Reuber M., Christensen H. (2018), Detecting signs of dementia using word vector representations, [in:] Interspeech, pp. 1893–1897, https://doi.org/10.21437/Interspeech.2018-1764
57. Mohammadzadeh A., Heydari E., Azizi F. (2011), Speech impairment in primary hypothyroidism, Journal of Endocrinological Investigation, 34(6): 431–433, https://doi.org/10.1007/BF03346708
58. Moro-Velazquez L., Gomez-Garcia J.A., Arias-Londoño J.D., Dehak N., Godino-Llorente J.I. (2021), Advances in Parkinson’s disease detection and assessment using voice and speech: A review of the articulatory and phonatory aspects, Biomedical Signal Processing and Control, 66: 102418, https://doi.org/10.1016/j.bspc.2021.102418
59. Mota N.B. et al. (2012), Speech graphs provide a quantitative measure of thought disorder in psychosis, PLOS ONE, 7(4): e34928, https://doi.org/10.1371/journal.pone.0034928
60. Mundt J.C., Vogel A.P., Feltner D.E., Lenderking W.R. (2012), Vocal acoustic biomarkers of depression severity and treatment response, Biological Psychiatry, 72(7): 580–587, https://doi.org/10.1016/j.biopsych.2012.03.015
61. Orozco-Arroyave J.R., Arias-Londoño J.F., Vargas-Bonilla J.F., Gonzalez-Rativa M.C., Nöth E. (2014a), New Spanish speech corpus database for the analysis of people suffering from Parkinson’s disease, [in:] LREC, pp. 342–347.
62. Orozco-Arroyave J.R. et al. (2014b), Automatic detection of Parkinson’s disease from words uttered in three different languages, [in:] Interspeech, https://doi.org/10.21437/Interspeech.2014-375
63. Pan Y., Mirheidari B., Reuber M., Venneri A., Blackburn D., Christensen H. (2020), Improving detection of Alzheimer’s disease using automatic speech recognition to identify high-quality segments for more robust feature extraction, [in:] Interspeech, pp. 4961–4965, https://doi.org/10.21437/Interspeech.2020-2698
64. Pareek V., Sharma R.K. (2016), Coronary heart disease detection from voice analysis, [in:] 2016 IEEE Students’ Conference on Electrical, Electronics and Computer Science (SCEECS), pp. 1–6, https://doi.org/10.1109/SCEECS.2016.7509344
65. Pettorino M., Gu W., Półrola P., Fan P. (2017), Rhythmic characteristics of Parkinsonian speech: A study on Mandarin and Polish, [in:] Interspeech, pp. 3172–3176, https://doi.org/10.21437/Interspeech.2017-850
66. Pinheiro A.P., Niznikiewicz M. (2019), Altered attentional processing of happy prosody in schizophrenia, Schizophrenia Research, 206: 217–224, https://doi.org/10.1016/j.schres.2018.11.024
67. Pinkas G., Karny Y., Malachi A., Barkai G., Bachar G., Aharonson V. (2020), SARS-CoV-2 detection from voice, IEEE Open Journal of Engineering in Medicine and Biology, 1: 268–274, https://doi.org/10.1109/ojemb.2020.3026468
68. Pinto S. et al. (2016), Dysarthria in individuals with Parkinson’s disease: A protocol for a binational, cross-sectional, case-controlled study in French and European Portuguese (FraLusoPark), BMJ Open, 6(11), https://doi.org/10.1136/bmjopen-2016-012885
69. Pinyopodjanard S., Suppakitjanusant P., Lomprew P., Kasemkosin N., Chailurkit L., Ongphiphadhanakul B. (2019), Instrumental acoustic voice characteristics in adults with type 2 diabetes, Journal of Voice, 35(1): 116–121, https://doi.org/10.1016/j.jvoice.2019.07.003
70. Pompili A. et al. (2020), Assessment of Parkinson’s disease medication state through automatic speech analysis, arXiv preprint, https://doi.org/10.48550/arXiv.2005.14647
71. Rohanian M., Hough J., Purver M. (2021), Alzheimer’s dementia recognition using acoustic, lexical, disfluency and speech pause features robust to noisy inputs, [in:] Interspeech, pp. 3820–3824, https://doi.org/10.21437/Interspeech.2021-1633
72. Rusz J. et al. (2018), Smartphone allows capture of speech abnormalities associated with high risk of developing Parkinson’s disease, IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(8): 1495–1507, https://doi.org/10.1109/TNSRE.2018.2851787
73. Sadeghian R., Schaffer J.D., Zahorian S.A. (2017), Speech processing approach for diagnosing dementia in an early stage, [in:] Interspeech, pp. 2705–2709, https://doi.org/10.21437/Interspeech.2017-1712
74. Sahu S., Espy-Wilson C.Y. (2016), Speech features for depression detection, [in:] Interspeech, pp. 1928–1932, https://doi.org/10.21437/Interspeech.2016-1566
75. Sattler C. et al. (2017), Interdisciplinary longitudinal study on adult development and aging (ILSE), [in:] Encyclopedia of Geropsychology, Pachana N.A. [Ed.], pp. 1213–1222, Springer, https://doi.org/10.1007/978-981-287-082-7_238
76. Scherer S., Stratou G., Gratch J., Morency L.-P. (2013a), Investigating voice quality as a speaker-independent indicator of depression and PTSD, [in:] Interspeech, pp. 847–851, https://doi.org/10.21437/Interspeech.2013-240
77. Scherer S. et al. (2013b), Automatic behavior descriptors for psychological disorder analysis, [in:] 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), pp. 1–8, https://doi.org/10.1109/FG.2013.6553789
78. Seneviratne N., Williamson J.R., Lammert A.C., Quatieri T.F., Espy-Wilson C. (2020), Extended study on the use of vocal tract variables to quantify neuromotor coordination in depression, [in:] Interspeech, pp. 4551–4555, https://doi.org/10.21437/Interspeech.2020-2758
79. Sharma N. et al. (2020), Coswara – A database of breathing, cough, and voice sounds for COVID-19 diagnosis, [in:] Interspeech, pp. 4811–4815, https://doi.org/10.21437/Interspeech.2020-2768
80. Simantiraki O., Charonyktakis P., Pampouchidou A., Tsiknakis M., Cooke M. (2017), Glottal source features for automatic speech-based depression assessment, [in:] Interspeech, pp. 2700–2704, https://doi.org/10.21437/Interspeech.2017-1251
81. Sirmans S.M., Pate K.A. (2014), Epidemiology, diagnosis, and management of polycystic ovary syndrome, Clinical Epidemiology, 6: 1–13, https://doi.org/10.2147/clep.s37559
82. Skodda S., Grönheit W., Schlegel U. (2011), Intonation and speech rate in Parkinson’s disease: General and dynamic aspects and responsiveness to levodopa admission, Journal of Voice, 25(4): e199–e205, https://doi.org/10.1016/j.jvoice.2010.04.007
83. Solomon N.P., Helou L.B., Dietrich-Burns K., Stojadinovic A. (2011), Do obesity and weight loss affect vocal function?, [in:] Seminars in Speech and Language, 31(1): 31–42, https://doi.org/10.1055/s-0031-1271973
84. de Souza L.B.R., Pereira R.M., dos Santos M.M., Godoy C.M.A. (2014), Fundamental frequency, phonation maximum time and vocal complaints in morbidly obese women, ABCD. Arquivos Brasileiros de Cirurgia Digestiva, 27(1): 43–46, https://doi.org/10.1590/s0102-67202014000100011
85. de Souza L.B.R., dos Santos M.M. (2018), Body mass index and acoustic voice parameters: Is there a relationship?, Brazilian Journal of Otorhinolaryngology, 84(4): 410–415, https://doi.org/10.1016/j.bjorl.2017.04.003
86. Stasak B., Epps J., Cummins N., Goecke R. (2016), An investigation of emotional speech in depression classification, [in:] Interspeech, pp. 485–489, https://doi.org/10.21437/Interspeech.2016-867
87. Stasak B., Epps J., Goecke R. (2017), Elicitation design for acoustic depression classification: An investigation of articulation effort, linguistic complexity, and word affect, [in:] Interspeech, pp. 834–838, https://doi.org/10.21437/Interspeech.2017-1223
88. Stasak B., Huang Z., Razavi S., Joachim D., Epps J. (2021), Automatic detection of COVID-19 based on short-duration acoustic smartphone speech analysis, Journal of Healthcare Informatics Research, 5(2): 201–217, https://doi.org/10.1007/s41666-020-00090-4
89. Stogowska E., Kamiński K.A., Ziółko B., Kowalska I. (2022), Voice changes in reproductive disorders, thyroid disorders and diabetes: A review, Endocrine Connections, 11(3): e201505, https://doi.org/10.1530/EC-21-0505
90. Subirana B. et al. (2020), Hi sigma, do I have the Coronavirus?: Call for a new artificial intelligence approach to support health care professionals dealing with the COVID-19 pandemic, arXiv preprint, https://doi.org/10.48550/arXiv.2004.06510
91. Sztahó D., Kiss G., Vicsi K. (2015), Estimating the severity of Parkinson’s disease from speech using linear regression and database partitioning, [in:] Interspeech, pp. 498–502, https://doi.org/10.21437/Interspeech.2015-183
92. Ujiro T. et al. (2018), Detection of dementia from responses to atypical questions asked by embodied conversational agents, [in:] Interspeech, pp. 1691–1695, https://doi.org/10.21437/Interspeech.2018-1514
93. Valstar M. et al. (2013), AVEC 2013: The continuous audio/visual emotion and depression recognition challenge, [in:] AVEC’13 Proceedings of the 3rd ACM International Workshop on Audio/Visual Emotion Challenge, pp. 3–10, https://doi.org/10.1145/2512530.2512533
94. Vásquez-Correa J.C., Arias-Vergara T., Orozco-Arroyave J.R., Nöth E. (2018), A multitask learning approach to assess the dysarthria severity in patients with Parkinson’s disease, [in:] Interspeech, pp. 456–460, https://doi.org/10.21437/interspeech.2018-1988
95. Vásquez-Correa J.C., Arias-Vergara T., Orozco-Arroyave J.R., Vargas-Bonilla J.F., Arias-Londoño J.D., Nöth E. (2015), Automatic detection of Parkinson’s disease from continuous speech recorded in non-controlled noise conditions, [in:] Interspeech, pp. 105–109, https://doi.org/10.21437/Interspeech.2015-36
96. Vásquez-Correa J.C., Orozco-Arroyave J.R., Nöth E. (2017), Convolutional neural network to model articulation impairments in patients with Parkinson’s disease, [in:] Interspeech, pp. 314–318, https://doi.org/10.21437/Interspeech.2017-1078
97. Villa-Cañas T., Arias-Londoño J.D., Orozco-Arroyave J.R., Vargas-Bonilla J.F., Nöth E. (2015), Low-frequency components analysis in running speech for the automatic detection of Parkinson’s disease, [in:] Interspeech, pp. 100–104, https://doi.org/10.21437/Interspeech.2015-35
98. Villatoro-Tello E., Dubagunta P., Fritsch J., Ramírez-de-la Rosa G., Motlicek P., Magimai-Doss M. (2021), Late fusion of the available lexicon and raw waveform-based acoustic modeling for depression and dementia recognition, [in:] Interspeech, pp. 1927–1931, https://doi.org/10.21437/Interspeech.2021-1288
99. Wang J., Kothalkar P.V., Cao B., Heitzman D. (2016), Towards automatic detection of amyotrophic lateral sclerosis from speech acoustic and articulatory samples, [in:] Interspeech, pp. 1195–1199, https://doi.org/10.21437/Interspeech.2016-1542
100. Wankerl S., Nöth E., Evert S. (2017), An n-gram based approach to the automatic diagnosis of Alzheimer’s disease from spoken language, [in:] Interspeech, pp. 3162–3166, https://doi.org/10.21437/Interspeech.2017-1572
101. Warnita T., Inoue N., Shinoda K. (2018), Detecting Alzheimer’s disease using gated convolutional neural network from audio data, [in:] Interspeech, pp. 1706–1710, https://doi.org/10.21437/Interspeech.2018-1713
102. Wei W., Wang J., Ma J., Cheng N., Xiao J. (2020), A real-time robot-based auxiliary system for risk evaluation of COVID-19 infection, arXiv preprint, https://doi.org/10.48550/arXiv.2008.07695
103. Weiner J., Angrick M., Umesh S., Schultz T. (2018), Investigating the effect of audio duration on dementia detection using acoustic features, [in:] Interspeech, pp. 2324–2328, https://doi.org/10.21437/Interspeech.2018-57
104. Weiner J., Herff C., Schultz T. (2016), Speech-based detection of Alzheimer’s disease in conversational German, [in:] Interspeech, pp. 1938–1942, https://doi.org/10.21437/Interspeech.2016-100
105. Wodzinski M., Skalski A., Hemmerling D., Orozco-Arroyave J.R., Nöth E. (2019), Deep learning approach to Parkinson’s disease detection using voice recordings and convolutional neural network dedicated to image classification, [in:] 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 717–720, https://doi.org/10.1109/embc.2019.8856972
106. Xezonaki D., Paraskevopoulos G., Potamianos A., Narayanan S. (2020), Affective conditioning on hierarchical attention networks applied to depression detection from transcribed clinical interviews, [in:] Interspeech, pp. 4556–4560, https://doi.org/10.21437/Interspeech.2020-2819
107. Yang Y., Fairbairn C., Cohn J.F. (2012), Detecting depression severity from vocal prosody, IEEE Transactions on Affective Computing, 4(2): 142–150, https://doi.org/10.1109/T-AFFC.2012.38
108. Zhan A. et al. (2016), High frequency remote monitoring of Parkinson’s disease via smartphone: Platform overview and medication response detection, arXiv preprint, https://doi.org/10.48550/arXiv.1601.00960
109. Zhao Z. et al. (2020), Hybrid network feature extraction for depression assessment from speech, [in:] Interspeech, pp. 4956–4960, https://doi.org/10.21437/Interspeech.2020-2396
110. Zlotnik A., Montero J.M., San-Segundo R., Gallardo-Antolín A. (2015), Random forest-based prediction of Parkinson’s disease progression using acoustic, ASR and intelligibility features, [in:] Interspeech, pp. 503–507, https://doi.org/10.21437/Interspeech.2015-184

