Extraction of Weighted Saliency Maps in Modelling Bottom-Up Auditory Attention

Document Type: Original Article

Authors

1. Department of Biomedical Engineering, Faculty of Electrical and Computer Engineering, Tabriz University

2. Tabriz University

Abstract

Hearing is an important part of human daily life. Although humans are exposed to many sounds from different sources while the number of receptors in the neural system is limited, they can process complex auditory scenes well. One reason for this ability is the phenomenon of attention. Auditory attention can be divided into two categories: bottom-up attention and top-down attention. In this paper, a model for simulating bottom-up attention in the auditory system using weighted saliency maps is proposed. The dataset used in this work is obtained by combining different background noises with sounds from the ESC database, which serve as the salient regions, at different SNRs. To evaluate the model, the mean-error criterion was used, defined as the time difference between the actual salient point and the salient point detected by the model. The weighted combination of the conspicuity maps of the features, with weights optimized by a genetic algorithm, allows the proposed model to achieve an average error of 0.92 seconds, outperforming the baseline model, whose average error is 1.91 seconds.
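The pipeline the abstract describes can be illustrated in a few lines. The following is a minimal Python sketch, not the authors' implementation: it mixes a salient sound into background noise at a target SNR, fuses per-feature conspicuity maps with weights assumed to come from the genetic algorithm, and computes the mean-error criterion. All names (mix_at_snr, fuse_saliency, frame_rate, and so on) are illustrative assumptions.

import numpy as np

def mix_at_snr(background, salient, snr_db, offset):
    """Embed a salient clip in background noise at a target SNR (dB).

    Assumes the salient clip fits within the background at `offset`.
    """
    # Scale the salient sound so its power relative to the noise hits snr_db.
    gain = np.sqrt(np.mean(background ** 2) / np.mean(salient ** 2)
                   * 10.0 ** (snr_db / 10.0))
    mixture = background.copy()
    mixture[offset:offset + len(salient)] += gain * salient
    return mixture

def fuse_saliency(conspicuity_maps, weights):
    """Weighted sum of per-feature conspicuity maps (weights from the GA)."""
    maps = np.stack(conspicuity_maps)        # (n_features, n_freq, n_frames)
    w = np.asarray(weights)[:, None, None]   # one weight per feature map
    return (w * maps).sum(axis=0)

def detected_salient_time(saliency, frame_rate):
    """Time (s) of the most salient frame in the fused map."""
    return np.argmax(saliency.max(axis=0)) / frame_rate

def mean_error(true_times, detected_times):
    """Mean absolute time difference between actual and detected salient points."""
    return np.abs(np.asarray(true_times) - np.asarray(detected_times)).mean()

Under these assumptions about map shapes and frame rate, averaging mean_error over a set of test mixtures corresponds to the evaluation the abstract reports (0.92 s for the weighted model versus 1.91 s for the baseline).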

Keywords

