In:
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, The Royal Society, Vol. 381, No. 2254 (2023-09-04)
Abstract:
Prior convolution-based road crack detectors typically learn increasingly abstract visual representations with a growing receptive field via an encoder–decoder architecture. Despite their promising accuracy, the progressive reduction in spatial resolution blurs semantic features, leading to coarse and non-contiguous distress detection. To address this, an alternative sequence-to-sequence perspective with a transformer network, termed TransCrack, is introduced for road crack detection. Specifically, an image is decomposed into a grid of fixed-size crack patches, which is flattened with position embedding into a sequence. We further propose a pure transformer-based encoder with multi-head reduced self-attention modules and feed-forward networks for explicitly modelling long-range dependencies from the sequential input in a global receptive field. More importantly, a simple decoder with a cross-layer aggregation architecture is developed to combine global and local attention across different regions for detailed feature recovery and pixel-wise crack mask prediction. Empirical studies are conducted on three publicly available damage detection benchmarks. The proposed TransCrack achieves state-of-the-art performance over all counterparts by a substantial margin, and qualitative results further demonstrate its superiority in contiguous crack recognition and fine-grained profile extraction. This article is part of the theme issue ‘Artificial intelligence in failure analysis of transportation infrastructure and materials’.
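The patch-sequence step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of a fixed sinusoidal position embedding (the abstract does not say whether the embedding is learned or fixed), and the omission of any learned linear projection are all assumptions made for illustration only.

```python
import math


def image_to_patch_sequence(image, patch_size):
    """Split a 2-D grayscale image (list of rows) into flattened,
    non-overlapping patches, ordered row-major over the patch grid.
    Hypothetical helper, not taken from the paper."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    sequence = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size patch into a vector.
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            sequence.append(patch)
    return sequence


def sinusoidal_position_embedding(num_positions, dim):
    """Fixed sinusoidal embedding table of shape (num_positions, dim).
    Assumption: the paper may use a learned embedding instead."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def embed_patches(image, patch_size):
    """Return the patch sequence with position embeddings added element-wise."""
    patches = image_to_patch_sequence(image, patch_size)
    dim = patch_size * patch_size
    positions = sinusoidal_position_embedding(len(patches), dim)
    return [[value + offset for value, offset in zip(patch, pos)]
            for patch, pos in zip(patches, positions)]
```

For a 4 x 4 image and a patch size of 2, this yields a sequence of four tokens of length four each. A transformer encoder would then operate on that sequence rather than on the pixel grid.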
Type of Medium:
Online Resource
ISSN:
1364-503X, 1471-2962
DOI:
10.1098/rsta.2022.0172
Language:
English
Publisher:
The Royal Society
Publication Date:
2023
ZDB-ID:
208381-4
ZDB-ID:
1462626-3
SSG:
11
SSG:
5,1
SSG:
5,21