QDAT: A data set for Reciting the Quran
Abstract
Datasets are an essential resource for audio research and speech processing, yet building a complete, high-quality dataset takes considerable time and effort. Few public datasets are available for the Arabic language. This paper presents "QDAT", a dataset of Arabic speech audio files. The audio files are manually annotated by experts to indicate whether the Quran is recited correctly with Tajwid, according to three rules of Quran recitation. The dataset can be used to train classification models based on machine learning and deep learning algorithms.