A first study on age classification of Costa Rican speakers based on acoustic vowel analysis
Abstract
According to several studies, children’s speech is more dynamic and inconsistent than adult speech. This aspect can be exploited in the task of recognizing a speaker’s age, which is of great importance in many applications, such as human-computer interaction, Internet security, and educational assistants. These applications depend on language and accent, due to the different sounds and styles that characterize the speakers. This paper presents initial results on the identification of Costa Rican children’s speech, using a database created for this purpose that consists of words pronounced by adults and children of several ages. For this first study we chose the most common vowel of the language and extracted a set of common acoustic features to determine their applicability in distinguishing between adults and children within an age range. The results show promising classification performance using a single vowel, which improves with the number of vowels used to extract the acoustic features. This means that an automatic system could improve its capacity to identify age as more speech information is received and transcribed, but may not be very accurate in short interactions.
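As a rough illustration of the kind of pipeline the abstract describes, the sketch below extracts a few common acoustic features (mean pitch and the first two formants) from pre-segmented vowel recordings and trains a simple adult/child classifier. It is only a minimal example under assumed conditions: the file names and labels are hypothetical, the feature set and the random-forest classifier are illustrative choices, and it is not the authors’ exact method. It uses the Praat bindings in the parselmouth package and scikit-learn.

```python
# Minimal sketch: vowel-level acoustic features + a baseline age-group classifier.
# File paths, labels, features and classifier are illustrative assumptions.
import numpy as np
import parselmouth
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def vowel_features(wav_path):
    """Return mean F0, F1 and F2 for a short, pre-segmented vowel recording."""
    snd = parselmouth.Sound(wav_path)

    # Fundamental frequency (pitch), ignoring unvoiced frames (reported as 0 Hz).
    pitch = snd.to_pitch()
    f0 = pitch.selected_array['frequency']
    f0_mean = f0[f0 > 0].mean() if np.any(f0 > 0) else 0.0

    # First two formants, sampled at the midpoint of the segment.
    formants = snd.to_formant_burg()
    mid = snd.duration / 2
    f1 = formants.get_value_at_time(1, mid)
    f2 = formants.get_value_at_time(2, mid)
    return [f0_mean, f1, f2]


# Hypothetical dataset: (path to a single-vowel WAV file, age-group label).
samples = [
    ("data/adult_0001_a.wav", "adult"),
    ("data/child_0001_a.wav", "child"),
    # ... many more segments would be needed in practice ...
]

X = np.array([vowel_features(path) for path, _ in samples])
y = np.array([label for _, label in samples])

# Simple cross-validated baseline; with more vowels per speaker the features
# could be averaged over segments before classification.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print("Mean accuracy: %.2f" % scores.mean())
```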
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The authors retain copyright and grant the journal the right of first publication, including the right to edit, reproduce, distribute, exhibit, and communicate the work in print and electronic media, both nationally and abroad. They also assume responsibility for any litigation or claim related to intellectual property rights, releasing the Editorial Tecnológica de Costa Rica from liability. In addition, authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the version of the article published in this journal (e.g., depositing it in an institutional repository or publishing it in a book), provided that they clearly indicate that the work was first published in this journal.