Audio-Visual CNN using Transfer Learning for TV Commercial Break Detection

https://doi.org/10.22146/ijccs.76058

Muhammad Zha'farudin Pudya Wardana(1*), Moh. Edi Wibowo(2)

(1) Master Program in Computer Science, FMIPA UGM, Yogyakarta
(2) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(*) Corresponding Author

Abstract


Detecting TV commercials is challenging because programs and TV channels vary widely. Deep learning methods have shown good results on this problem, but they need many training epochs, and therefore a long training time, to reach high accuracy.

This research uses transfer learning to reduce training time and limits training to 20 epochs. From the video data, the audio feature is extracted as a Mel-spectrogram, and the visual features are taken from a video frame. The datasets were gathered by recording programs from various TV channels in Indonesia. Pre-trained CNN models such as MobileNetV2, InceptionV3, and DenseNet169 are re-trained and used to detect commercials at the shot level. Post-processing then groups the shots into commercial and non-commercial segments.
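
The abstract does not say how the post-processing step groups shots, so the following is only a minimal sketch of one plausible approach: smooth the shot-level predictions with a majority vote over neighbouring shots, then merge runs of equal labels into segments. The function names, the window size of 3, and the sample predictions are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import groupby
from typing import List, Tuple


def smooth_labels(labels: List[int], window: int = 3) -> List[int]:
    """Majority-vote each shot label over a centered window of neighbours.

    Labels are 1 (commercial) or 0 (non-commercial). The window size is a
    hypothetical choice; a tie keeps the shot's original label.
    """
    half = window // 2
    smoothed = []
    for i, label in enumerate(labels):
        votes = labels[max(0, i - half): i + half + 1]
        ones = sum(votes)
        zeros = len(votes) - ones
        if ones > zeros:
            smoothed.append(1)
        elif zeros > ones:
            smoothed.append(0)
        else:
            smoothed.append(label)
    return smoothed


def to_segments(labels: List[int]) -> List[Tuple[int, int, int]]:
    """Merge runs of identical labels into (first_shot, last_shot, label)."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((start, start + length - 1, label))
        start += length
    return segments


# Made-up shot-level predictions with a few isolated errors.
shot_predictions = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0]
smoothed = smooth_labels(shot_predictions)
segments = to_segments(smoothed)
print(segments)
```

In this toy input the isolated prediction at shot 2 and the dip at shot 7 are voted away, leaving one commercial segment between two non-commercial ones. A larger window removes longer runs of errors but can also erase genuinely short segments.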

The best result is achieved by the Audio-Visual CNN with transfer learning, which reaches an accuracy of 93.26% after only 20 training epochs. It needs fewer epochs and is more accurate than the CNN model without transfer learning, which reaches an accuracy of 88.17% after 77 training epochs. Adding post-processing increases the accuracy of the Audio-Visual CNN with transfer learning to 96.42%.


Keywords


Commercial, TV, CNN, Transfer Learning, InceptionV3, MobileNetV2, DenseNet169, Video











Copyright (c) 2023 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.





