Spectrogram Window Comparison: Cough Sound Recognition using Convolutional Neural Network
Dzikri Rahadian Fudholi(1*), Muhammad Auzan(2), Novia Arum Sari(3)
(1) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(2) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(3) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(*) Corresponding Author
Abstract
Cough is one of the most common symptoms of disease, especially respiratory disease. Quick cough detection can be key to managing the current COVID-19 pandemic. Good cough recognition relies on non-intrusive tools such as a mobile phone microphone, which, unlike stick-on sensors, does not hinder daily activities. For sound-only detection, this work uses a Convolutional Neural Network (CNN), a current state-of-the-art deep learning method. However, a CNN expects two-dimensional image input, whereas a sound signal is one-dimensional. An extra step is therefore needed: converting the sound data into image data as a spectrogram. Building a spectrogram raises the question of which size works best. This research compares spectrogram sizes, called the Spectrogram Window, by recognition performance. Windows of 4 seconds achieve the highest F1-score, at 92.9%. This suggests that a window of around 4 seconds performs better for sound recognition problems of this kind.
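The conversion described in the abstract, cutting a one-dimensional recording into fixed-length windows and turning each window into a two-dimensional spectrogram, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the 8 kHz sample rate, the FFT size of 256, the hop of 128 samples, the non-overlapping windows with zero padding, the candidate durations of 1, 2, and 4 seconds, and the helper names `split_into_windows` and `magnitude_spectrogram` are all assumptions made for the example. A synthetic 440 Hz tone stands in for a cough recording, and only NumPy is used.

```python
import numpy as np


def split_into_windows(signal, sample_rate, window_seconds):
    """Cut a 1-D signal into non-overlapping fixed-duration windows.

    The final window is zero-padded (an assumption of this sketch).
    """
    samples_per_window = int(round(window_seconds * sample_rate))
    n_windows = int(np.ceil(len(signal) / samples_per_window))
    padded = np.zeros(n_windows * samples_per_window, dtype=float)
    padded[:len(signal)] = signal
    return padded.reshape(n_windows, samples_per_window)


def magnitude_spectrogram(window, n_fft=256, hop=128):
    """Short-time Fourier magnitude of one window.

    Rows are frequency bins, columns are time frames, so the result can
    be treated as a single-channel image.
    """
    taper = np.hanning(n_fft)
    starts = range(0, len(window) - n_fft + 1, hop)
    frames = np.stack([window[s:s + n_fft] * taper for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Hypothetical input: 10 seconds of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(10 * sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)

# A longer window yields a wider spectrogram image for the CNN.
for seconds in (1, 2, 4):
    windows = split_into_windows(signal, sample_rate, seconds)
    spec = magnitude_spectrogram(windows[0])
    print(seconds, "s:", windows.shape, "->", spec.shape)
```

The frequency axis stays the same height for every window duration, while the time axis grows with the window length. That width is the "size" being compared when different Spectrogram Windows are evaluated.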
DOI: https://doi.org/10.22146/ijccs.75697
Copyright (c) 2022 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.