
Summary and Conclusions


In this chapter we have provided an overview of a number of different neural network approaches to speech enhancement. We can summarize the techniques and their assumptions as follows:





While considerable progress has been made with these techniques, a number of key areas must still be addressed before we can expect widespread acceptance. Most important is the establishment of consistent evaluations to allow proper benchmarking between different approaches. Standardized databases should be used, with a variety of noise sources that include real-world examples and go beyond the simple white Gaussian noise assumption. Performance should be determined from established metrics (improvement in SNR, segmental SNR, Itakura distance, weighted spectral slope measures, mean opinion scores, recognition accuracy, etc.). In addition, the basic techniques presented here must evolve to better incorporate perceptually relevant metrics for optimization. This is an area where research on neural networks still lags considerably behind the traditional speech community.
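As an illustration of the simplest of these metrics, the following is a minimal sketch of how improvement in SNR and segmental SNR might be computed when the clean reference signal is available. It is not taken from any of the systems discussed in this chapter. The function names, the frame length of 256 samples, and the clamping of per-frame values to the range -10 dB to 35 dB are assumptions chosen for illustration; published evaluations differ in these details, which is part of why standardized benchmarks are needed.

```python
import math


def snr_db(clean, estimate):
    """Global SNR in dB: energy of the clean signal over energy of the error."""
    signal_energy = sum(s * s for s in clean)
    noise_energy = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    if noise_energy == 0.0:
        return math.inf   # estimate matches the clean signal exactly
    if signal_energy == 0.0:
        return -math.inf  # silent reference with a non-zero error
    return 10.0 * math.log10(signal_energy / noise_energy)


def snr_improvement_db(clean, noisy, enhanced):
    """Output SNR minus input SNR, both measured against the clean signal."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)


def segmental_snr_db(clean, estimate, frame_len=256,
                     floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNRs over non-overlapping frames.

    Each frame value is clamped to [floor_db, ceil_db] (assumed limits)
    so that silent or perfectly reconstructed frames do not dominate.
    """
    frame_values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        frame_snr = snr_db(clean[start:start + frame_len],
                           estimate[start:start + frame_len])
        frame_values.append(min(max(frame_snr, floor_db), ceil_db))
    if not frame_values:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_values) / len(frame_values)
```

Because the segmental measure averages in the log domain frame by frame, it weights low-energy speech segments more heavily than the global SNR does. Both measures require the clean signal, so they apply only to artificially corrupted test data.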

Finally, the accurate estimation of the corrupting noise statistics remains a weak link in the algorithms that require these estimates as inputs. Research must be conducted to improve these estimates, or new techniques must be developed that avoid the need for explicit knowledge of the noise statistics.

Despite the decades of work that have gone into understanding speech signals and the problems of speech enhancement, the seemingly simple task of removing noise remains a formidable challenge. While it is still too early to draw definite conclusions, neural networks appear to offer an appropriate and powerful tool for further progress on this problem.



