Abstract
The virtual reality (VR) market is growing rapidly, and numerous users and producers are emerging in the hope of steering VR towards mainstream adoption. Although most solutions focus on delivering high-resolution, high-quality video, acoustics in VR are as important as visual cues for maintaining consistency with the natural world. We therefore investigate one of the most important audio solutions for VR applications: ambisonics. Several VR producers, such as Google, HTC, and Facebook, support the ambisonic audio format. Binaural ambisonics builds a virtual loudspeaker array over a VR headset to provide immersive sound. The configuration of the virtual loudspeaker array influences listening perception, as has been widely discussed in the literature. However, few studies have investigated the influence of the array's orientation: the same loudspeaker array can produce different spatial effects when oriented differently. This paper introduces a VR audio technique with an optimal design and proposes a dual-mode audio solution. Both an objective measurement and a subjective listening test show that the proposed solution effectively enhances spatial audio quality.
Keywords:
audio quality, ambisonics, immersive sound, loudspeaker array, spatial effect, virtual reality.

