A Symmetric Approach in the Three-Dimensional Digital Waveguide Modeling of the Vocal Tract

Tahir MUSHTAQ; Ahmad KAMRAN; Muhammad Zubair QURESHI; Zafar IQBAL

doi:10.24425/aoa.2023.146641

Authors

Tahir MUSHTAQ, COMSATS University Islamabad, Pakistan
Ahmad KAMRAN, COMSATS University Islamabad, Pakistan
Muhammad Zubair QURESHI, Air University, Pakistan
Zafar IQBAL, Government Graduate College of Science, Pakistan

Abstract

Simulating wave propagation in three-dimensional (3D) models of the vocal tract has shown significant promise for improving the accuracy of speech production modeling. Recent 3D waveguide models of the vocal tract achieve better accuracy but at a high computational cost, which motivates work on reducing that cost while retaining accuracy and performance. In the current work, we divide the geometry of the vocal tract into four equal symmetric parts by introducing two mutually perpendicular axial planes, and the simulation is performed on only one part. A novel strategy is defined to implement the symmetry conditions in the mesh. The complete standard 3D digital waveguide model serves as the benchmark. The proposed model is compared with the benchmark in terms of formant frequencies and efficiency. For the demonstration, the vowels /O/, /i/, /E/, /A/, and /u/ are simulated. According to the results, the benchmark and proposed models are nearly identical in terms of frequency profiles and formant frequencies, yet the proposed model is three times more efficient than the benchmark.
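The quarter-domain idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a straight rectangular duct instead of a vocal tract geometry, a rectilinear mesh updated in Kirchhoff (pressure) form, pressure-release outer walls, and symmetry planes lying midway between two node layers, so that a ghost node on a symmetry plane simply mirrors the first interior node. The names `step`, `simulate`, `m`, and `length` are hypothetical. The sketch runs a full mesh and a one-quarter mesh from the same symmetric excitation and checks that the quarter mesh reproduces the corresponding quadrant of the full one.

```python
import numpy as np


def step(p_now, p_prev, mirror_x=False, mirror_y=False):
    """One Kirchhoff-form update of a 3D rectilinear waveguide mesh."""
    # Zero ghost layer on every side: pressure-release outer walls (assumed).
    g = np.pad(p_now, 1)
    if mirror_x:
        # Symmetry plane on the low-x side: ghost layer copies the first layer.
        g[0, :, :] = g[1, :, :]
    if mirror_y:
        # Symmetry plane on the low-y side.
        g[:, 0, :] = g[:, 1, :]
    neighbours = (g[2:, 1:-1, 1:-1] + g[:-2, 1:-1, 1:-1]
                  + g[1:-1, 2:, 1:-1] + g[1:-1, :-2, 1:-1]
                  + g[1:-1, 1:-1, 2:] + g[1:-1, 1:-1, :-2])
    # Six-port scattering junction: p[n+1] = (2/6) * sum(neighbours) - p[n-1].
    return neighbours / 3.0 - p_prev


def simulate(p0, n_steps, **mirrors):
    """Advance the mesh n_steps from initial pressure p0 (previous step = 0)."""
    p_prev, p_now = np.zeros_like(p0), p0.copy()
    for _ in range(n_steps):
        p_now, p_prev = step(p_now, p_prev, **mirrors), p_now
    return p_now


m, length, n_steps = 4, 12, 40          # illustrative sizes only

# Full mesh: 2m x 2m cross-section, excited symmetrically about both planes.
full0 = np.zeros((2 * m, 2 * m, length))
full0[m - 1:m + 1, m - 1:m + 1, 1] = 1.0

# Quarter mesh: m x m cross-section, same excitation restricted to one quadrant.
quarter0 = np.zeros((m, m, length))
quarter0[0, 0, 1] = 1.0

full = simulate(full0, n_steps)
quarter = simulate(quarter0, n_steps, mirror_x=True, mirror_y=True)

max_diff = np.max(np.abs(full[m:, m:, :] - quarter))
print("nodes (full, quarter):", full.size, quarter.size)
print("max |full quadrant - quarter|:", max_diff)
```

In this toy setting the quarter mesh has one quarter of the nodes, and the two pressure fields agree up to rounding error. The match holds only because both the geometry and the excitation are symmetric about the two planes; the sketch makes no claim about run time.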

Keywords:

symmetric, digital waveguide, vocal tract, delay lines, rectilinear uniform grid


