Self-Supervised System for Learning a User Interface Language
US-2023305863-A1 · Sep 28, 2023 · US
US12493752B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12493752-B2 |
| Application number | US-202318327074-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 1, 2023 |
| Priority date | Jun 13, 2022 |
| Publication date | Dec 9, 2025 |
| Grant date | Dec 9, 2025 |
Official abstract text for this publication.
An automatic method for generating descriptions of concrete dam defect images based on a graph attention network, including: 1) extracting the local grid features and whole-image features of the defect image and encoding the image with a multi-layer convolutional neural network; 2) constructing a grid feature interaction graph, and fusing and encoding the grid visual features and global image features of the defect image; 3) updating and optimizing the global and local features through the graph attention network, and using the improved visual features for defect description. The invention constructs the grid feature interaction graph, updates the node information by using the graph attention network, and treats the feature extraction task as a graph node classification task. The invention can capture the global image information of the defect image and the potential interactions among local grid features, so that the generated description text describes the defect information accurately and coherently.
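As an illustration of the grid feature interaction graph named in the abstract and detailed in claim 3, the sketch below builds an adjacency matrix with one global node connected to every grid node, and grid nodes connected to their neighbours. It is a minimal sketch, not the patented implementation: the function name `build_interaction_graph`, the row-major node numbering, and the choice of 4-neighbour adjacency (up, down, left, right) are assumptions, since the claims say only that "adjacent" grid nodes are connected.

```python
from __future__ import annotations


def build_interaction_graph(rows: int, cols: int) -> list[list[int]]:
    """Return the adjacency matrix A for 1 global node plus rows*cols grid nodes.

    Node 0 is the global node; grid cell (r, c) is node 1 + r*cols + c.
    A[i][j] == 1 means direct interaction, 0 means none.
    """
    n = rows * cols
    A = [[0] * (n + 1) for _ in range(n + 1)]

    # The global node acts as a virtual centre connected to every grid node.
    for k in range(1, n + 1):
        A[0][k] = A[k][0] = 1

    # Assumption: "adjacent" means sharing an edge (4-neighbourhood).
    for r in range(rows):
        for c in range(cols):
            i = 1 + r * cols + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down; symmetry covers the rest
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = 1 + rr * cols + cc
                    A[i][j] = A[j][i] = 1
    return A


if __name__ == "__main__":
    for row in build_interaction_graph(2, 2):
        print(row)
```

For a 2×2 grid this yields a 5×5 symmetric matrix in which diagonal neighbours, such as cells (0, 0) and (1, 1), are not directly connected and can exchange information only through the global node or a shared neighbour.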
Opening claim text (preview).
What is claimed is:

1. An automatic concrete dam defect image description generation method based on graph attention network, characterized by including the following steps: 1) extracting global features and grid features of a defect image respectively by using a multi-layer convolutional neural network; 2) constructing a grid feature interaction graph, and inputting the global features and grid features as nodes; 3) updating and optimizing information of the nodes in the grid feature interaction graph constructed in Step 2) by using the graph attention network to get updated global features and grid features; 4) automatically generating an image description by using a sequence of the updated global features and grid features based on a Transformer decoding module.

2. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 1), a Faster R-CNN model pre-trained on a Visual Genome data set is used to extract the global features and grid features, and uses a convolutional layer C5 with a stride of 1 and a 1×1 RoIPool with two FC layers as detection heads, and the output of the convolutional layer C5 is used as the grid features of the defect image.

3. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 2), the dependency among grid features and the global features are introduced, the grid feature interaction graph is established by means of a global node mechanism, and the grid feature interaction graph is constructed as follows: the global features and grid features obtained in Step 1) are used as node inputs of the grid feature interaction graph to obtain one global node and multiple local nodes; the global node serves as a virtual center and is connected to all nodes in the grid feature interaction graph; the local nodes are connected according to the relative center coordinates of grids, namely, a value $(i, j)$ of two adjacent grid nodes $i$ and $j$ in an adjacency matrix $A$ is assigned as 1, indicating direct interaction, while a value of non-adjacent nodes is assigned as 0, indicating no interaction; the global node collects and distributes general information from the local nodes.

4. The automatic concrete dam defect image description generation method based on graph attention network according to claim 2, characterized in that in Step 1), the grid features and global features of the defect image are extracted for image coding by using the multi-layer convolutional neural network by: adding the global features of the grid feature interaction graph on the basis of fusing the grid features, and extracting the global features and grid features of the defect image; defining the input as a defect image $p_0$ = full_image and $n$ fixed size Grids = $(p_1, p_2, \ldots, p_n)$, extracting defect features by using the Faster R-CNN model pre-trained on the Visual Genome data set, and using a convolutional layer C5 with a stride of 1 and a 1×1 RoIPool with two FC layers as detection heads, in which the output of the convolutional layer C5 is embedded in $IE_1 = \mathrm{CNN}(p_{0:n}; \oplus_{CNN})$ as the extracted defect image; $\oplus_{CNN}$ represents the parameters of the Faster R-CNN model, $IE$ includes global image embedding $IE_{Global} = IE_0$ and local image embedding $IE_{Local} = [IE_1, IE_2, \ldots, IE_n]$, and $p_{0:n}$ means that $p_0$ = full_image and Grids = $(p_1, p_2, \ldots, p_n)$ are connected together, representing the input of the full_image and $n$ grids.

5. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 3), the nodes of the graph attention network based on the grid feature interaction graph correspond to grids of the defect image according to the grid feature interaction graph and the graph attention network, the features of the nodes are local image embeddings, the edges of the graph attention network correspond to the edges of the grid feature interaction graph, and a multi-head self-attention mechanism is used to fuse and update defect information of adjacent nodes in the grid feature interaction graph.

6. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 4), the Transformer decoding module comprises a reference decoding module and an optimized decoding module, the training of the reference decoding module and the optimized decoding module is divided into two stages: cross-entropy loss optimization stage and reinforcement learning stage, in which the cross-entropy loss optimization stage is based on the loss function of the negative log-likelihood estimation, and the reinforcement learning stage is based on the reinforcement learning optimization strategy and takes CIDEr score as a reward function, where CIDEr refers to Consensus-based Image Description Evaluation, a metric for evaluating image description quality.

7. The automatic concrete dam defect image description generation method based on graph attention network according to claim 1, characterized in that in Step 3), the steps for updating the nodes by using the graph attention network are as follows:

(3.1) the grid features obtained by defining the multi-layer convolutional neural network are expressed as $h = (h_1, h_2, \ldots, h_n)$, $h_i \in \mathbb{R}^F$, where $n$ represents the number of grids, and $F$ is a feature dimension outputted by a CNN hidden layer;

(3.2) feature vectors of two grids are connected according to an adjacency matrix $A$, and a self-attention calculation is conducted for each grid through a nonlinear layer of a LeakyReLU function, as shown in Equation (1):

$$e_{ij} = \mathrm{LeakyReLU}\left(V^{T}\left[W h_i \oplus W h_j\right]\right) \qquad (1)$$

where $e_{ij}$ represents the importance of the features of a grid $j$ to a grid $i$, $V$ and $W$ are learnable parameter matrices, and $\oplus$ represents connection;

(3.3) a softmax function is used to normalize all neighborhood grid features of the grid $i$ to obtain an attention coefficient $\alpha_{ij}$ so that it is easy to compare the coefficients between different nodes, as shown in Equation (2):

$$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k \in N_i} \exp(e_{ik})} \qquad (2)$$
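Steps (3.2) and (3.3) of claim 7 can be sketched in a few lines of plain Python. This is a hedged illustration under stated assumptions, not the patented implementation: the names `attention_coefficients`, `matvec`, and `leaky_relu` are hypothetical, $V$ is treated as a single vector of length $2F'$ (where $F'$ is the output dimension of $W$), the LeakyReLU negative slope of 0.2 is an assumed default the claim does not specify, and the neighbourhood $N_i$ is taken to be the nodes $j$ with $A_{ij} = 1$.

```python
import math


def leaky_relu(x: float, slope: float = 0.2) -> float:
    # slope=0.2 is an assumption; the claim does not give a value.
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def attention_coefficients(h, A, W, v, slope: float = 0.2):
    """Compute alpha[i][j] following Equations (1) and (2).

    h: node feature vectors, A: adjacency matrix (0/1),
    W: shared linear map, v: attention vector of length 2 * len(W).
    Non-neighbours (A[i][j] == 0) keep a coefficient of 0.
    """
    Wh = [matvec(W, hi) for hi in h]
    n = len(h)
    alpha = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in range(n) if A[i][j]]
        if not nbrs:
            continue
        # Equation (1): e_ij = LeakyReLU(v^T [W h_i (+) W h_j]); list "+" concatenates.
        e = [leaky_relu(sum(vk * x for vk, x in zip(v, Wh[i] + Wh[j])), slope)
             for j in nbrs]
        # Equation (2): softmax over the neighbourhood (max-shifted for stability).
        m = max(e)
        exps = [math.exp(x - m) for x in e]
        total = sum(exps)
        for j, x in zip(nbrs, exps):
            alpha[i][j] = x / total
    return alpha


if __name__ == "__main__":
    h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    W = [[1.0, 0.0], [0.0, 1.0]]
    v = [1.0, 1.0, 1.0, 1.0]
    for row in attention_coefficients(h, A, W, v):
        print([round(a, 4) for a in row])
```

Each row of the result sums to 1 over that node's neighbours, which is what makes the coefficients comparable across nodes. Subtracting the row maximum before exponentiating does not change the softmax value and only guards against overflow.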
using neural networks · CPC title
Artificial neural networks [ANN] · CPC title
Infrastructure · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Training; Learning · CPC title