Nava: A Persian Traditional Music Database for the Dastgah and Instrument Recognition Tasks

Document Type: Original Article

Authors

School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran

Abstract

Extensive research in music signal processing has targeted content-based music information retrieval. Unfortunately, research on the computational processing of traditional Persian music is rare, largely due to the lack of standard databases. In this paper, a database named Nava is introduced for two basic tasks in traditional Persian music: Dastgah classification and instrument recognition. Nava is sufficiently comprehensive and varied in terms of instrument, Dastgah, and artist: it contains recordings of five common traditional instruments played by 40 artists in seven Dastgahs. To address the two tasks, a system is proposed that extracts a sequence of Mel-frequency cepstral coefficient (MFCC) feature vectors from the input music signal and then converts it to a fixed-length feature vector using the i-vector technique. In the classification stage, the extracted i-vector is fed into a support vector machine classifier. The best accuracies obtained on the Nava database for Dastgah classification and instrument recognition are about 34% and 98%, respectively, which indicates how much more difficult the former task is than the latter.
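
The pipeline described in the abstract (frame-level features, a fixed-length i-vector, then a support vector machine) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes NumPy and scikit-learn, replaces real MFCC sequences with synthetic variable-length frames, and uses a random total variability matrix `T` as a placeholder where a real system would train it with EM. All sizes and names (`fake_mfcc_sequence`, `extract_ivector`, `DIM`, `N_COMP`, `IVEC_DIM`) are hypothetical. Only the extraction step follows the standard posterior-mean formula for the model M = m + Tw.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)
DIM, N_COMP, IVEC_DIM = 5, 4, 8  # toy sizes (assumed), not the paper's settings


def fake_mfcc_sequence(label):
    """Synthetic stand-in for an MFCC sequence: variable length, class-shifted mean."""
    n_frames = rng.integers(60, 120)
    return rng.normal(loc=1.5 * label, scale=1.0, size=(n_frames, DIM))


def extract_ivector(frames, ubm, T):
    """Posterior-mean estimate of w in the total variability model M = m + T w."""
    post = ubm.predict_proba(frames)               # frame-to-component posteriors
    N = post.sum(axis=0)                           # zero-order statistics, shape (C,)
    F = post.T @ frames - N[:, None] * ubm.means_  # centered first-order statistics, (C, D)
    inv_cov = 1.0 / ubm.covariances_               # diagonal precisions, (C, D)
    T3 = T.reshape(N_COMP, DIM, IVEC_DIM)          # one (D, R) block per component
    # Posterior precision: I + sum_c N_c * T_c^T Sigma_c^-1 T_c
    precision = np.eye(IVEC_DIM) + np.einsum("c,cdr,cd,cds->rs", N, T3, inv_cov, T3)
    # Right-hand side: sum_c T_c^T Sigma_c^-1 F_c
    rhs = np.einsum("cdr,cd,cd->r", T3, inv_cov, F)
    return np.linalg.solve(precision, rhs)


# Forty toy "recordings" from two classes, split into train and test.
labels = np.array([0, 1] * 20)
recordings = [fake_mfcc_sequence(y) for y in labels]
train_idx, test_idx = np.arange(0, 30), np.arange(30, 40)

# Universal background model: a diagonal-covariance GMM on pooled training frames.
ubm = GaussianMixture(n_components=N_COMP, covariance_type="diag", random_state=0)
ubm.fit(np.vstack([recordings[i] for i in train_idx]))

# Placeholder total variability matrix (random here; trained with EM in practice).
T = rng.normal(scale=0.1, size=(N_COMP * DIM, IVEC_DIM))

# Every recording, whatever its length, maps to one IVEC_DIM-dimensional vector.
X = np.array([extract_ivector(r, ubm, T) for r in recordings])

clf = SVC(kernel="linear").fit(X[train_idx], labels[train_idx])
accuracy = clf.score(X[test_idx], labels[test_idx])
print(X.shape, accuracy)
```

The point of the sketch is the shape change: recordings with different numbers of frames all become vectors of the same length, which is what lets a standard classifier such as an SVM be applied directly.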
