Crack Detection on Concrete Surfaces Using Deep Encoder-Decoder Convolutional Neural Network: A Comparison Study Between U-Net and DeepLabV3+

Patrick Nicholas Hadinata(1*), Djoni Simanta(2), Liyanto Eddy(3), Kohei Nagai(4)

(1) Universitas Katolik Parahyangan
(2) Universitas Katolik Parahyangan
(3) Universitas Katolik Parahyangan
(4) The University of Tokyo
(*) Corresponding Author


Maintaining infrastructure is crucial for ensuring safety, and detecting cracks in concrete structures is a central part of that work. However, crack detection is still mostly carried out manually, which is unsafe, highly subjective, and time-consuming. A more accurate and efficient system based on artificial intelligence is therefore needed. In this study, convolutional neural networks (CNNs), a subset of artificial intelligence, are used to detect cracks on concrete surfaces through semantic image segmentation. The purpose of this research is to compare the effectiveness of two cutting-edge encoder-decoder architectures in detecting cracks on concrete surfaces: U-Net, which has shown potential in biomedical image segmentation, and DeepLabV3+, which has shown potential in sparse multiscale image segmentation. Both networks were trained on a cloud computing platform with an NVIDIA Tesla V100 graphics processing unit (GPU) and 27.4 GB of RAM. This study used internal and external data. The internal data consisted of simple cracks and were used for training and validation, while the external data consisted of more complex cracks and were used for further testing. The two architectures were compared on four evaluation metrics: accuracy, F1, precision, and recall. U-Net achieved a segmentation accuracy of 96.57%, an F1 of 87.55%, a precision of 88.15%, and a recall of 88.94%, while DeepLabV3+ achieved a segmentation accuracy of 96.47%, an F1 of 85.29%, a precision of 92.07%, and a recall of 81.84%. Experimental results on both the internal and external data indicated that both architectures were accurate and effective in segmenting cracks. Additionally, U-Net and DeepLabV3+ outperformed a previously tested architecture, the fully convolutional network (FCN).
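
As an illustration of the four evaluation metrics named above, the following minimal sketch computes pixel-wise accuracy, precision, recall, and F1 for a pair of binary crack masks. It assumes the standard confusion-matrix definitions; the abstract does not state how the authors aggregated scores across images, so this should not be read as their exact procedure. The function name segmentation_metrics and the toy masks are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# A mask is a 2D grid of pixels: 1 = crack, 0 = background (assumed encoding).
Mask = List[List[int]]


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Pixel-wise accuracy, precision, recall, and F1 for two binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1  # crack pixel correctly predicted
            elif p == 1 and t == 0:
                fp += 1  # background predicted as crack
            elif p == 0 and t == 1:
                fn += 1  # crack pixel missed
            else:
                tn += 1  # background correctly predicted

    total = tp + fp + fn + tn
    # Guard each ratio against an empty denominator (e.g. no crack pixels).
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy 3x4 example: 2 true positives, 1 false positive, 1 false negative.
truth_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
pred_mask = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
metrics = segmentation_metrics(pred_mask, truth_mask)
```

Because crack pixels are typically a small fraction of an image, accuracy is dominated by the background class, which is one reason precision, recall, and F1 are reported alongside it.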


Convolutional Neural Network; U-Net; DeepLabV3+; Crack Detection; Maintenance of Infrastructures.




Copyright (c) 2022 The Author(s)

The content of this website is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
ISSN 5249-5925 (online) | ISSN 2581-1037 (print)
Jl. Grafika No.2 Kampus UGM, Yogyakarta 55281