Automatic concrete dam defect image description generation method based on graph attention network

US12493752B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12493752-B2
Application number: US-202318327074-A
Country: US
Kind code: B2
Filing date: Jun 1, 2023
Priority date: Jun 13, 2022
Publication date: Dec 9, 2025
Grant date: Dec 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An automatic concrete dam defect image description generation method based on a graph attention network, including: 1) extracting the local grid features and whole-image features of the defect image and encoding the image using a multi-layer convolutional neural network; 2) constructing the grid feature interaction graph, and fusing and encoding the grid visual features and global image features of the defect image; 3) updating and optimizing the global and local features through the graph attention network, and using the improved visual features for defect description. The invention constructs the grid feature interaction graph, updates the node information using the graph attention network, and treats the feature extraction task as a graph node classification task. The invention can capture the global image information of the defect image and the potential interactions among local grid features, so the generated description text describes the defect information accurately and coherently.
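Step 2 of the abstract can be illustrated with a minimal sketch of the adjacency matrix for the grid feature interaction graph, following the rule given in claim 3 on this page: one global node connected to every node, and local nodes connected only when their grids are adjacent. The function name `build_grid_interaction_graph`, the choice of index 0 for the global node, and the reading of "adjacent" as sharing an edge (4-neighbourhood) are assumptions made for illustration; the patent text shown here does not fix them.

```python
import numpy as np


def build_grid_interaction_graph(rows: int, cols: int) -> np.ndarray:
    """Return the (n+1) x (n+1) adjacency matrix A for a rows x cols grid.

    Node 0 is the global node (assumed index); nodes 1..n are grid nodes
    in row-major order. A[i, j] = 1 means direct interaction, 0 means none.
    Assumption: two grids are adjacent when they share an edge.
    """
    n = rows * cols
    A = np.zeros((n + 1, n + 1), dtype=np.int8)

    # The global node acts as a virtual center linked to every local node.
    A[0, 1:] = 1
    A[1:, 0] = 1

    for r in range(rows):
        for c in range(cols):
            i = 1 + r * cols + c
            if c + 1 < cols:          # neighbour to the right
                j = i + 1
                A[i, j] = A[j, i] = 1
            if r + 1 < rows:          # neighbour below
                j = i + cols
                A[i, j] = A[j, i] = 1
    return A


# Example: a 2 x 3 grid gives 6 local nodes plus 1 global node.
A = build_grid_interaction_graph(2, 3)
```

In this sketch the matrix is symmetric and has no self-loops; whether a node also attends to itself is left open by the claim text and would be a separate design choice.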

First claim

Opening claim text (preview).

What is claimed is:

1. An automatic concrete dam defect image description generation method based on graph attention network, characterized by including the following steps: 1) extracting global features and grid features of a defect image respectively by using a multi-layer convolutional neural network; 2) constructing a grid feature interaction graph, and inputting the global features and grid features as nodes; 3) updating and optimizing information of the nodes in the grid feature interaction graph constructed in Step 2) by using the graph attention network to get updated global features and grid features; 4) automatically generating an image description by using a sequence of the updated global features and grid features based on a Transformer decoding module.

2. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 1), a Faster R-CNN model pre-trained on a Visual Genome data set is used to extract the global features and grid features, and uses a convolutional layer C5 with a stride of 1 and a 1×1 RoIPool with two FC layers as detection heads, and the output of the convolutional layer C5 is used as the grid features of the defect image.

3. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 2), the dependency among grid features and the global features are introduced, the grid feature interaction graph is established by means of a global node mechanism, and the grid feature interaction graph is constructed as follows: the global features and grid features obtained in Step 1) are used as node inputs of the grid feature interaction graph to obtain one global node and multiple local nodes; the global node serves as a virtual center and is connected to all nodes in the grid feature interaction graph; the local nodes are connected according to the relative center coordinates of grids, namely, a value (i, j) of two adjacent grid nodes i and j in an adjacency matrix A is assigned as 1, indicating direct interaction, while a value of non-adjacent nodes is assigned as 0, indicating no interaction; the global node collects and distributes general information from the local nodes.

4. The automatic concrete dam defect image description generation method based on graph attention network according to claim 2, characterized in that in Step 1), the grid features and global features of the defect image are extracted for image coding by using the multi-layer convolutional neural network by: adding the global features of the grid feature interaction graph on the basis of fusing the grid features, and extracting the global features and grid features of the defect image; defining the input as a defect image $p_0 = \text{full\_image}$ and $n$ fixed size $\text{Grids} = (p_1, p_2, \ldots, p_n)$, extracting defect features by using the Faster R-CNN model pre-trained on the Visual Genome data set, and using a convolutional layer C5 with a stride of 1 and a 1×1 RoIPool with two FC layers as detection heads, in which the output of the convolutional layer C5 is embedded in $IE_1 = \mathrm{CNN}(p_{0:n}; \oplus_{CNN})$ as the extracted defect image; $\oplus_{CNN}$ represents the parameters of the Faster R-CNN model, $IE$ includes global image embedding $IE_{Global} = IE_0$ and local image embedding $IE_{Local} = [IE_1, IE_2, \ldots, IE_n]$, and $p_{0:n}$ means that $p_0 = \text{full\_image}$ and $\text{Grids} = (p_1, p_2, \ldots, p_n)$ are connected together, representing the input of the full_image and $n$ grids.

5. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 3), the nodes of the graph attention network based on the grid feature interaction graph correspond to grids of the defect image according to the grid feature interaction graph and the graph attention network, the features of the nodes are local image embeddings, the edges of the graph attention network correspond to the edges of the grid feature interaction graph, and a multi-head self-attention mechanism is used to fuse and update defect information of adjacent nodes in the grid feature interaction graph.

6. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 4), the Transformer decoding module comprises a reference decoding module and an optimized decoding module, the training of the reference decoding module and the optimized decoding module is divided into two stages: cross-entropy loss optimization stage and reinforcement learning stage, in which the cross-entropy loss optimization stage is based on the loss function of the negative log-likelihood estimation, and the reinforcement learning stage is based on the reinforcement learning optimization strategy and takes CIDEr score as a reward function, where CIDEr refers to Consensus-based Image Description Evaluation, a metric for evaluating image description quality.

7. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 3), the steps for updating the nodes by using the graph attention network are as follows:

(3.1) the grid features obtained by defining the multi-layer convolutional neural network are expressed as $h = (h_1, h_2, \ldots, h_n)$, $h_i \in \mathbb{R}^{F}$, where $n$ represents the number of grids, and $F$ is a feature dimension outputted by a CNN hidden layer;

(3.2) feature vectors of two grids are connected according to an adjacency matrix A, and a self-attention calculation is conducted for each grid through a nonlinear layer of a LeakyReLU function, as shown in Equation (1):

$$e_{ij} = \mathrm{LeakyReLU}\left(V^{T}\left[W h_i \oplus W h_j\right]\right) \quad (1)$$

where $e_{ij}$ represents the importance of the features of a grid $j$ to a grid $i$, $V$ and $W$ are learnable parameter matrices, and $\oplus$ represents connection;

(3.3) a softmax function is used to normalize all neighborhood grid features of the grid $i$ to obtain an attention coefficient $\alpha_{ij}$ so that it is easy to compare the coefficients between different nodes, as shown in Equation (2):

$$\alpha_{ij} = \operatorname{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k \in N_i} \exp(e_{ik})} \quad (2)$$

…
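Equations (1) and (2) of claim 7 can be sketched in NumPy as follows. This is an illustrative reading, not the patented implementation: the function name `attention_coefficients`, the LeakyReLU negative slope of 0.2, and treating V as a single vector of length 2·F_out are all assumptions, and the sketch assumes every node has at least one neighbour in A. The claim preview ends before the feature update step, so only the attention coefficients are computed.

```python
import numpy as np


def leaky_relu(x: np.ndarray, slope: float = 0.2) -> np.ndarray:
    # slope=0.2 is an assumed value; the claim text does not state it.
    return np.where(x > 0, x, slope * x)


def attention_coefficients(h, A, W, v, slope=0.2):
    """Compute alpha[i, j] over the neighbours j of each node i.

    h: (n, F) node features      A: (n, n) 0/1 adjacency matrix
    W: (F_out, F) projection     v: (2 * F_out,) attention vector (assumed shape)
    """
    Wh = h @ W.T                          # (n, F_out), row i holds W h_i
    f_out = Wh.shape[1]

    # v^T [W h_i (+) W h_j] splits into one term for i and one for j.
    term_i = Wh @ v[:f_out]               # (n,)
    term_j = Wh @ v[f_out:]               # (n,)
    e = leaky_relu(term_i[:, None] + term_j[None, :], slope)   # Eq. (1)

    # Restrict the softmax to neighbours: non-edges get zero weight.
    e = np.where(A > 0, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)  # shift for numerical stability
    weights = np.exp(e)
    return weights / weights.sum(axis=1, keepdims=True)        # Eq. (2)


# Example on a fully connected 3-node graph without self-loops.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
h = rng.normal(size=(3, 4))
W = rng.normal(size=(2, 4))
v = rng.normal(size=4)
alpha = attention_coefficients(h, A, W, v)
```

Each row of `alpha` sums to 1 over that node's neighbours, which is what makes the coefficients comparable across nodes as the claim describes.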

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12493752B2 cover?
An automatic concrete dam defect image description generation method based on a graph attention network, including: 1) extracting the local grid features and whole-image features of the defect image and encoding the image using a multi-layer convolutional neural network; 2) constructing the grid feature interaction graph, and fusing and encoding the grid visual features and global image features of the …
Who is the assignee on this patent?
Huaneng Lancang River Hydropower Inc, Univ Hohai, Huaneng Group R&D Center Co Ltd, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F40/40. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 9, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).