Abstract
This study aims to evaluate a method for distinguishing between healthy and pathological voices. The evaluation was carried out using acoustic parameters implemented in COVAREP (collaborative voice analysis repository for speech technologies), the auditory-perceptual RBH (roughness, breathiness, hoarseness) scale, and the AVQI (acoustic voice quality index). Classifiers were then trained using machine learning algorithms from the WEKA (Waikato Environment for Knowledge Analysis) platform. The study group comprised 75 voice recordings of individuals affected by vocal fold paralysis, and the control group consisted of 49 voice recordings of healthy individuals. The results indicate that the voice quality of the study group differs significantly from that of the control group. The acoustic parameters implemented in COVAREP and the RBH scale proved to be reliable methods for assessing voice quality. In addition, every classifier achieved a classification accuracy of over 90%.
Keywords:
voice quality, AVQI, COVAREP, RBH scale, vocal fold paralysis

