Archives of Acoustics, 48, 3, pp. 317–324, 2023
10.24425/aoa.2023.146641

A Symmetric Approach in the Three-Dimensional Digital Waveguide Modeling of the Vocal Tract

Tahir MUSHTAQ
COMSATS University Islamabad
Pakistan

Ahmad KAMRAN
COMSATS University Islamabad
Pakistan

Muhammad Zubair QURESHI
Air University
Pakistan

Zafar IQBAL
Government Graduate College of Science
Pakistan

Simulating wave propagation in three-dimensional (3D) models of the vocal tract has shown significant promise for improving the accuracy of speech production modeling. Recent 3D waveguide models of the vocal tract achieve better accuracy but are computationally demanding. This high cost motivates work on reducing computation while retaining accuracy and performance. In the current work, we divide the geometry of the vocal tract into four equal symmetric parts by introducing two perpendicular axial planes, and the simulation is performed on only one part. A novel strategy is defined to implement the symmetry conditions in the mesh. The complete standard 3D digital waveguide model serves as the benchmark. The proposed model is compared with the benchmark model in terms of formant frequencies and efficiency. For demonstration, the vowels /O/, /i/, /E/, /A/, and /u/ are simulated. According to the results, the benchmark and proposed models are nearly identical in terms of frequency profiles and formant frequencies, while the proposed model is three times more efficient than the benchmark model.
Keywords: symmetric; digital waveguide; vocal tract; delay lines; rectilinear uniform grid
Copyright © 2024 The Author(s). This work is licensed under the Creative Commons Attribution 4.0 International CC BY 4.0.



