Speech Analysis as a Tool for Detection and Monitoring of Medical Conditions: A Review

Authors

  • Magdalena IGRAS-CYBULSKA 1) Techmo sp. z o.o. 2) AGH University of Science and Technology, Poland ORCID ID 0000-0001-5621-7901
  • Daria HEMMERLING 1) Techmo sp. z o.o. 2) AGH University of Science and Technology, Poland ORCID ID 0000-0002-2193-7690
  • Mariusz ZIÓŁKO Techmo sp. z o.o., Poland ORCID ID 0000-0001-6260-7850
  • Wojciech DATKA 1) Medical University of Bialystok 2) Jagiellonian University, Poland
  • Ewa STOGOWSKA Medical University of Bialystok, Poland
  • Michał KUCHARSKI Techmo sp. z o.o., Poland
  • Rafał RZEPKA Hokkaido University
  • Bartosz ZIÓŁKO 1) Techmo sp. z o.o. 2) Hokkaido University, Poland ORCID ID 0000-0001-5485-8879

Abstract

The goal of this article is to present and compare recent approaches that use speech and voice analysis as a source of biomarkers for screening and monitoring selected diseases. The article covers metabolic, respiratory, cardiovascular, endocrine, and nervous system disorders. Articles were selected to identify studies that quantitatively assess voice features in these disorders through acoustic and linguistic analysis. Information was extracted from each paper to compare the datasets, speech parameters, analysis methods, and reported results. In total, 110 research papers were reviewed and 47 databases were summarized. Speech analysis is a promising method for early diagnosis of certain disorders. Advanced computer voice analysis with machine learning algorithms, combined with the widespread availability of smartphones, allows diagnostic analysis to be conducted during the patient’s visit to the doctor or at the patient’s home during a telephone conversation. Speech analysis is a simple, low-cost, non-invasive, and easy-to-provide method of medical diagnosis. These are remarkable advantages, but there are also disadvantages: the reported effectiveness of disease diagnosis varies from 65% to 99%. For that reason, speech analysis should be treated as a medical screening test and as an indication of the need for classic medical tests.

Keywords:

speech analysis, speech features, acoustic parameters, linguistic analysis, voice biomarkers, screening tests
