Archives of Acoustics, 46, 1, pp. 67–78, 2021

Acoustic Source Localization Using Kernel-based Extreme Learning Machine in Distributed Microphone Array

Dalian University of Technology

Dalian University of Technology

Fuliang YIN
Dalian University of Technology

Acoustic source localization using a distributed microphone array is a challenging task due to the influence of noise and reverberation. In this paper, an acoustic source localization method based on a kernel-based extreme learning machine (K-ELM) for distributed microphone arrays is proposed. Specifically, the space of interest is divided into a set of labeled positions, and the candidate generalized cross correlation function at each node is treated as the feature that is mapped into the hidden nodes of the extreme learning machine. During the training phase, a kernel function is used so that the output weights of the classifier are calculated directly and do not need to be tuned. After the K-ELM is trained, the measured generalized cross correlation data are fed into the K-ELM classifier, and the output is the estimated acoustic source position. The proposed method needs less human intervention for both training and testing, and it does not require the nodes to be calibrated in advance. Simulation and real-world experimental results show that the proposed method has extremely fast training and testing speeds and achieves better localization performance than the steered response power, K-nearest neighbor, and support vector machine methods.
Keywords: extreme learning machine; acoustic source localization; distributed microphone array; generalized cross correlation function
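
The training step described in the abstract, where the output weights are computed directly rather than tuned, can be illustrated with a minimal sketch of a kernel ELM classifier in the standard closed form beta = (I/C + Omega)^{-1} T, with Omega the kernel matrix of the training features and T the one-hot position labels. This is not the authors' implementation: the Gaussian (RBF) kernel, the values of C and gamma, the class name KernelELM, and the random vectors standing in for generalized cross correlation features are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


class KernelELM:
    """Kernel ELM classifier: output weights come from one linear solve."""

    def __init__(self, C=100.0, gamma=0.1):
        self.C = C          # regularization constant (assumed value)
        self.gamma = gamma  # RBF kernel width (assumed value)

    def fit(self, X, labels):
        self.X_train = np.asarray(X, dtype=float)
        self.classes_ = np.unique(labels)
        # One-hot targets: one column per labeled position.
        T = (np.asarray(labels)[:, None] == self.classes_[None, :]).astype(float)
        omega = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = omega.shape[0]
        # Closed form: beta = (I / C + Omega)^{-1} T, with no iterative tuning.
        self.beta = np.linalg.solve(np.eye(n) / self.C + omega, T)
        return self

    def predict(self, X):
        k = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        scores = k @ self.beta
        # The predicted position is the class with the largest output score.
        return self.classes_[np.argmax(scores, axis=1)]


# Toy data: random vectors stand in for per-node GCC features (hypothetical).
rng = np.random.default_rng(0)
n_positions, n_per_position, n_features = 4, 30, 16
centers = rng.normal(size=(n_positions, n_features)) * 3.0
X_train = np.vstack(
    [c + rng.normal(size=(n_per_position, n_features)) for c in centers]
)
y_train = np.repeat(np.arange(n_positions), n_per_position)
X_test = np.vstack([c + rng.normal(size=(10, n_features)) for c in centers])
y_test = np.repeat(np.arange(n_positions), 10)

model = KernelELM(C=100.0, gamma=0.05).fit(X_train, y_train)
accuracy = np.mean(model.predict(X_test) == y_test)
print(f"toy accuracy: {accuracy:.2f}")
```

In a real system the rows of X_train would be generalized cross correlation features measured at known source positions, and C and gamma would be chosen by cross-validation rather than fixed as here.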
Copyright © The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).



DOI: 10.24425/aoa.2021.136561