This paper studies a speech enhancement method that combines Hilbert spectrum and wavelet packet analysis. Independent subspace analysis (ISA) is used to separate the speech and interfering signals from a single mixture, and a wavelet packet based soft-thresholding algorithm then enhances the quality of the target speech. The mixed signal is projected onto the time-frequency (TF) space using the empirical mode decomposition (EMD) based Hilbert spectrum (HS). A finite set of independent basis vectors is then derived from the TF space by applying principal component analysis (PCA) and independent component analysis (ICA) sequentially. The vectors are grouped by hierarchical clustering to represent the independent subspaces corresponding to the component sources in the mixture. However, the speech recovered by the separation stage is not of sufficient quality and still contains residual noise. Therefore, in the next stage, the target speech is enhanced using a wavelet packet decomposition (WPD) method in which speech activity is monitored by updating the statistics of the noise or unwanted signals. The mode mixing problem of traditional EMD is addressed using ensemble EMD. The proposed algorithm is also tested with a short-time Fourier transform (STFT) based spectrogram in place of the Hilbert spectrum. Simulation results show a noticeable improvement in both audio source separation and speech enhancement.
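The enhancement stage described above can be illustrated with a minimal sketch of wavelet packet soft-thresholding. This is not the authors' implementation: the Haar filter, the fixed universal threshold, the median-based noise estimate, and every function name below are assumptions chosen to keep the example dependency-free. The paper's adaptive threshold, which updates noise statistics by monitoring speech activity, is not reproduced here.

```python
import math
import random
import statistics


def haar_split(x):
    """One Haar analysis step: return (approximation, detail) half-length lists."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_merge(approx, detail):
    """Inverse of haar_split: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def wp_decompose(x, depth):
    """Full wavelet packet tree: every node is split, not only the approximation."""
    nodes = [list(x)]
    for _ in range(depth):
        next_level = []
        for node in nodes:
            a, d = haar_split(node)
            next_level.extend([a, d])
        nodes = next_level
    return nodes  # 2**depth leaf nodes in natural (tree) order


def wp_reconstruct(nodes):
    """Merge sibling leaves pairwise until a single signal remains."""
    while len(nodes) > 1:
        nodes = [haar_merge(nodes[i], nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]


def soft_threshold(c, t):
    """Shrink a coefficient toward zero by t; magnitudes below t become zero."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def denoise(x, depth=3, threshold=None):
    """Soft-threshold all wavelet packet leaves except the lowest-frequency one."""
    if len(x) % (2 ** depth) != 0:
        raise ValueError("signal length must be a multiple of 2**depth")
    if threshold is None:
        # Assumption: noise level from the median absolute value of the finest
        # Haar detail coefficients, combined with the universal threshold.
        _, finest = haar_split(x)
        sigma = statistics.median(abs(c) for c in finest) / 0.6745
        threshold = sigma * math.sqrt(2.0 * math.log(len(x)))
    nodes = wp_decompose(x, depth)
    shrunk = [nodes[0]] + [[soft_threshold(c, threshold) for c in node]
                           for node in nodes[1:]]
    return wp_reconstruct(shrunk)


if __name__ == "__main__":
    random.seed(0)
    clean = [math.sin(2.0 * math.pi * n / 64.0) for n in range(256)]
    noisy = [v + random.gauss(0.0, 0.2) for v in clean]
    enhanced = denoise(noisy, depth=3)
    print(len(enhanced))
```

Soft-thresholding shrinks every retained coefficient by the threshold rather than only zeroing the small ones, which avoids the abrupt discontinuities of hard-thresholding. A practical system would likely use a smoother wavelet and a per-node, time-varying threshold rather than the single global value assumed here.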
Published in | Science Journal of Circuits, Systems and Signal Processing (Volume 4, Issue 1) |
DOI | 10.11648/j.cssp.20150401.12 |
Page(s) | 1-8 |
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright | Copyright © The Author(s), 2015. Published by Science Publishing Group |
Keywords | Speech Enhancement, Ensemble Empirical Mode Decomposition, Source Separation, Independent Subspace Analysis, Hilbert Spectrum, Wavelet Packet Decomposition |
APA Style
Md. Ekramul Hamid, Md. Khademul Islam Molla, Md. Iqbal Aziz Khan, Takayoshi Nakai. (2015). Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding. Science Journal of Circuits, Systems and Signal Processing, 4(1), 1-8. https://doi.org/10.11648/j.cssp.20150401.12
ACS Style
Md. Ekramul Hamid; Md. Khademul Islam Molla; Md. Iqbal Aziz Khan; Takayoshi Nakai. Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding. Sci. J. Circuits Syst. Signal Process. 2015, 4(1), 1-8. doi: 10.11648/j.cssp.20150401.12
AMA Style
Md. Ekramul Hamid, Md. Khademul Islam Molla, Md. Iqbal Aziz Khan, Takayoshi Nakai. Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding. Sci J Circuits Syst Signal Process. 2015;4(1):1-8. doi: 10.11648/j.cssp.20150401.12
@article{10.11648/j.cssp.20150401.12,
  author = {Md. Ekramul Hamid and Md. Khademul Islam Molla and Md. Iqbal Aziz Khan and Takayoshi Nakai},
  title = {Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding},
  journal = {Science Journal of Circuits, Systems and Signal Processing},
  volume = {4},
  number = {1},
  pages = {1-8},
  doi = {10.11648/j.cssp.20150401.12},
  url = {https://doi.org/10.11648/j.cssp.20150401.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.cssp.20150401.12},
  year = {2015}
}
TY - JOUR
T1 - Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding
AU - Md. Ekramul Hamid
AU - Md. Khademul Islam Molla
AU - Md. Iqbal Aziz Khan
AU - Takayoshi Nakai
Y1 - 2015/04/29
PY - 2015
N1 - https://doi.org/10.11648/j.cssp.20150401.12
DO - 10.11648/j.cssp.20150401.12
T2 - Science Journal of Circuits, Systems and Signal Processing
JF - Science Journal of Circuits, Systems and Signal Processing
JO - Science Journal of Circuits, Systems and Signal Processing
SP - 1
EP - 8
PB - Science Publishing Group
SN - 2326-9073
UR - https://doi.org/10.11648/j.cssp.20150401.12
VL - 4
IS - 1
ER -