A Signal Subspace Speech Enhancement Approach Based on Joint Low-Rank and Sparse Matrix Decomposition

Chengli SUN; Jianxiao XIE; Yan LENG

doi:10.1515/aoa-2016-0024

Authors

Chengli SUN, Nanchang Hangkong University, China
Jianxiao XIE, Nanchang Hangkong University, China
Yan LENG, Shandong Normal University, China

Abstract

Subspace-based methods have been used effectively to estimate enhanced speech from noisy speech samples. In traditional subspace approaches, a critical step is separating the two invariant subspaces associated with signal and noise via subspace decomposition, which is usually performed by singular value decomposition or eigenvalue decomposition. However, these decomposition algorithms are highly sensitive to large corruptions, which leaves a large amount of residual noise in the enhanced speech at low signal-to-noise ratios (SNR). In this paper, a subspace method based on joint low-rank and sparse matrix decomposition (JLSMD) is proposed for speech enhancement. In the proposed method, we first arrange the corrupted data as a Toeplitz matrix and estimate the effective rank of the underlying clean speech matrix. The subspace decomposition is then performed by means of JLSMD, where the low-rank part of the decomposition corresponds to the enhanced speech and the sparse part corresponds to the noise. Extensive experiments were carried out for both white Gaussian noise and real-world noise. The results show that the proposed method performs better than conventional methods under many types of strong noise, yielding less residual noise and lower speech distortion.
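The pipeline outlined in the abstract (Toeplitz structuring, then a low-rank plus sparse split, then reconstruction) can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' JLSMD algorithm: the abstract gives neither the objective function nor the rank estimator, so the sketch substitutes a simple GoDec-style alternating scheme (truncated SVD for the low-rank part, hard thresholding for the sparse part) and takes the rank as known. All function names, the frame length, and the parameters `rank`, `n_sparse`, and `n_iter` are hypothetical.

```python
import numpy as np


def _diagonal_index(n_rows, n_cols):
    # Sample index stored at entry (i, j); it depends only on i - j.
    return (n_cols - 1) + np.arange(n_rows)[:, None] - np.arange(n_cols)[None, :]


def frame_to_toeplitz(frame, n_cols):
    """Arrange a 1-D frame as a Toeplitz matrix (constant along diagonals)."""
    frame = np.asarray(frame, dtype=float)
    n_rows = frame.size - n_cols + 1
    return frame[_diagonal_index(n_rows, n_cols)]


def toeplitz_to_frame(matrix):
    """Map a matrix back to a 1-D frame by averaging each diagonal."""
    n_rows, n_cols = matrix.shape
    idx = _diagonal_index(n_rows, n_cols).ravel()
    return np.bincount(idx, weights=matrix.ravel()) / np.bincount(idx)


def low_rank_plus_sparse(matrix, rank, n_sparse, n_iter=30):
    """Stand-in decomposition (assumption): alternate a rank-`rank`
    projection with keeping the `n_sparse` largest residual entries."""
    sparse = np.zeros_like(matrix)
    for _ in range(n_iter):
        # Low-rank step: truncated SVD of the matrix minus the sparse part.
        u, s, vt = np.linalg.svd(matrix - sparse, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Sparse step: keep only the largest-magnitude residual entries.
        residual = matrix - low_rank
        sparse = np.zeros_like(matrix)
        if n_sparse > 0:
            keep = np.argpartition(np.abs(residual), -n_sparse, axis=None)[-n_sparse:]
            sparse.flat[keep] = residual.flat[keep]
    return low_rank, sparse


# Toy frame: one sinusoid (its Toeplitz matrix has rank 2) plus a few
# large, isolated corruptions standing in for noise.
n = np.arange(64)
clean = np.cos(0.3 * n)
noisy = clean.copy()
noisy[[10, 30, 50]] += 5.0

observed = frame_to_toeplitz(noisy, n_cols=16)
low_rank, sparse = low_rank_plus_sparse(observed, rank=2, n_sparse=48)
enhanced = toeplitz_to_frame(low_rank)  # estimate of the clean frame
```

In this toy setup the rank is fixed at 2 because a single real sinusoid yields a rank-2 Toeplitz matrix; the paper instead estimates the effective rank from the corrupted data, and that step is not reproduced here. Note that one corrupted sample fills an entire diagonal of the Toeplitz matrix, which is why `n_sparse` is chosen much larger than the number of corrupted samples.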

Keywords:

subspace speech enhancement, singular value decomposition, joint low-rank and sparse matrix decomposition.

