Sensor data fusion for prognostics and health monitoring
US-2018217585-A1 · Aug 2, 2018 · US
US12293273B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12293273-B2 |
| Application number | US-202016928094-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 14, 2020 |
| Priority date | Jul 14, 2020 |
| Publication date | May 6, 2025 |
| Grant date | May 6, 2025 |
Official abstract text for this publication.
A method, a computer program product, and a computer system fuse features for multi-modal classifications for a plurality of modality inputs. The method includes receiving a request indicative of the modality inputs to be selected. The method includes performing an embeddings level fusion operation to concatenate features from the modality inputs. The method includes performing a multi-modal discriminative feature level fusion operation that integrates feature representations learned by applying different network structures on the modality inputs. The method includes determining weights of the concatenated features and the feature representations based on a measure of the concatenated features and the feature representations indicative of affecting a final prediction performance. The method includes generating fused features for the modality inputs based on the concatenated features, the feature representations, and the weights. The method includes generating a response to the request based on the fused features. The method includes transmitting the response.
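The fusion steps in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function names, the softmax weighting, and the `scores` input (a stand-in for the abstract's "measure ... indicative of affecting a final prediction performance") are all assumptions introduced here, and the feature representations are supplied directly rather than learned by network structures.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def embeddings_level_fusion(embeddings: Dict[str, List[float]]) -> List[float]:
    """Concatenate per-modality embeddings in a fixed (sorted-by-name) order."""
    concatenated: List[float] = []
    for name in sorted(embeddings):
        concatenated.extend(embeddings[name])
    return concatenated


def fuse(
    concatenated: List[float],
    representations: List[List[float]],
    scores: List[float],
) -> Tuple[List[float], List[float]]:
    """Scale each branch by its weight and join the branches.

    `scores` holds one hypothetical importance score per branch: the first
    for the concatenated features, the rest for each feature representation.
    """
    branches = [concatenated] + representations
    if len(scores) != len(branches):
        raise ValueError("need exactly one score per branch")
    weights = softmax(scores)
    fused: List[float] = []
    for weight, branch in zip(weights, branches):
        fused.extend(weight * value for value in branch)
    return fused, weights


# Toy inputs: the embedding values and scores are made up for illustration.
embeddings = {"text": [1.0, 2.0], "audio": [3.0], "video": [4.0, 5.0]}
concatenated = embeddings_level_fusion(embeddings)
representations = [[0.5, 0.5]]  # would come from a CNN/RNN branch in practice
fused, weights = fuse(concatenated, representations, scores=[0.0, 0.0])
```

With equal scores the two branches receive equal weight; in a trained system the scores would presumably be learned jointly with the classifier rather than fixed by hand.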
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for fusing data for multi-modal classifications for a plurality of modality inputs, the computer-implemented method comprising: receiving, by a system operatively coupled to a processor, via the Internet, from an entity employing a service client at a first location remote from a second location, a request indicative of the modality inputs to be selected, the modality inputs including text, audio, and video; performing, by the system, via a multi-modal embeddings level fusion of automatic weighted deep fusion (AWD) process, features from the modality inputs employing a similarity matrix by aligning representative feature maps across the modality inputs; applying, by the system, different network structures on the modality inputs; integrating, by the system, via a multi-modal discriminative feature level fusion of the AWD process, feature representations learned by the applying the different network structures on the modality inputs; generating, by the system, weights of the concatenated features and the feature representations based on a measure of the concatenated features and the feature representations indicative of affecting a final prediction performance; generating, by the system, fused features for the modality inputs based on the concatenated features, the feature representations, and the weights; generating, by the system, a response to the request based on the fused features; and transmitting, by the system, via the Internet, the response to the service client associated with the entity.

2. The computer-implemented method of claim 1, wherein the modality inputs have a deep architecture including a convolution neural network, a recurrent neural network, or a combination thereof.

3. The computer-implemented method of claim 1, wherein the concatenated features in the modality inputs are concatenated based on a distribution, an embedding, or a combination thereof of the feature in the modality inputs.

4. The computer-implemented method of claim 1, wherein the multi-modal discriminative level feature fusion operation includes a deep correlation fusion operation that determines contributions of correlations of the feature representations.

5. The computer-implemented method of claim 4, wherein the deep correlation fusion operation determines a degree of correlation of a first one of the feature representations to a second one of the feature representations.

6. The computer-implemented method of claim 5, wherein the deep correlation fusion operation determines a corresponding contribution for each of the modality inputs through a weighted sum of each degree of correlation of the feature representations.

7. The computer-implemented method of claim 1, wherein the multi-modal discriminative level feature fusion operation includes a pair-wise matching fusion operation is indicative of a pair-wise matching degree of the feature representations according to embeddings obtained for different modality inputs.

8. A computer program product for fusing data for multi-modal classifications for a plurality of modality inputs, the computer program product comprising: one or more non-transitory computer-readable storage media and program instructions stored on the one or more non-transitory computer-readable storage media capable of performing a method, the method comprising: receiving, via the Internet, from an entity employing a service client at a first location remote from a second location of the computer program product, a request indicative of the modality inputs to be selected, the modality inputs including text, audio, and video; concatenating, via a multi-modal embeddings level fusion of automatic weighted deep fusion (AWD) process, features from the modality inputs employing a similarity matrix by aligning representative feature maps across the modality inputs; applying different network structures on the modality inputs; integrating, via a multi-modal discriminative feature level fusion of the AWD process, feature representations learned by the applying the different network structures on the modality inputs; generating weights of the concatenated features and the feature representations based on a measure of the concatenated features and the feature representations indicative of affecting a final prediction performance; generating fused features for the modality inputs based on the concatenated features, the feature representations, and the weights; generating a response to the request based on the fused features; and transmitting, via the Internet, the response to the service client associated with the entity.

9. The computer program product of claim 8, wherein the modality inputs have a deep architecture including a convolution neural network, a recurrent neural network, or a combination thereof.

10. The computer program product of claim 8, wherein the concatenated features in the modality inputs are concatenated based on a distribution, an embedding, or a combination thereof of the feature in the modality inputs.

11. The computer program product of claim 8, wherein the multi-modal discriminative level feature fusion operation includes a deep correlation fusion operation that determines contributions of correlations of the feature representations.

12. The computer program product of claim 11, wherein the deep correlation fusion operation determines a degree of correlation of a first one of the feature representations to a second one of the feature representations.

13. The computer program product of claim 12, wherein the deep correlation fusion operation determines a corresponding contribution for each of the modality inputs through a weighted sum of each degree of correlation of the feature representations.

14. The computer program product of claim 8, wherein the multi-modal discriminative level feature fusion operation includes a pair-wise matching fusion operation is indicative of a pair-wise matching degree of the feature representations according to embeddings obtained for different modality inputs.

15. A computer system for fusing data for multi-modal classifications for a plurality of modality inputs, the computer system comprising: one or more computer processors, one or more computer-readable storage media, and program instructions stored on the one or more of the computer-readable storage media for execution by at least one of the one or more computer processors capable of performing a method, the method comprising: receiving via the Internet, from an entity employing a service client at a first location remote from a second location, a request indicative of the modality inputs to be selected, the modality inputs including text, audio, and video; performing, via a multi-modal embeddings level fusion operation to concatenate of automatic weighted deep fusion (AWD) process, features from the modality inputs based on a similarity matrix that aligns representative feature maps across the modality inputs; applying, by the system, different network structures on the modality inputs; integrating, via a multi-modal discriminative feature level fusion of the AWD process operation that integrates feature representations learned by the applying the different network structures on the modality inputs; generating weights of the concatenated features and the feature representations based on a measure of the concatenated features and the feature representations indicative of affecting a final prediction performance; generating fused features for the modality inputs based on the concatenated features, the feature representations, and the we
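The deep correlation fusion recited in claims 4 to 6 (a degree of correlation between pairs of feature representations, then a per-modality contribution formed as a weighted sum of those degrees) could look roughly like the sketch below. The claims do not say how correlation is measured; cosine similarity is an assumption made here, as are the function names, the uniform default pair weights, and the toy vectors.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity, used here as an assumed 'degree of correlation'."""
    if len(a) != len(b):
        raise ValueError("representations must share one dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_correlation(
    reps: Dict[str, List[float]],
) -> Dict[str, Dict[str, float]]:
    """Degree of correlation of each representation to every other one."""
    return {
        a: {b: cosine(reps[a], reps[b]) for b in reps if b != a}
        for a in reps
    }


def contributions(
    corr: Dict[str, Dict[str, float]],
    pair_weights: Optional[Dict[Tuple[str, str], float]] = None,
) -> Dict[str, float]:
    """Per-modality contribution as a weighted sum of its correlations.

    `pair_weights` is hypothetical; missing pairs default to weight 1.0.
    """
    weights = pair_weights or {}
    return {
        a: sum(weights.get((a, b), 1.0) * c for b, c in row.items())
        for a, row in corr.items()
    }


# Toy representations, already projected to a shared 2-D space.
reps = {"text": [1.0, 0.0], "audio": [1.0, 0.0], "video": [0.0, 1.0]}
corr = pairwise_correlation(reps)
contrib = contributions(corr)
```

In this toy case the text and audio representations coincide and the video representation is orthogonal to both, so text and audio each get a contribution of 1.0 and video gets 0.0. The resulting `corr` table also plays the role of a simple similarity matrix across modalities, although the alignment of feature maps named in the independent claims is not modeled here.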
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Learning methods · CPC title
Interfaces, programming languages or software development kits, e.g. for simulating neural networks · CPC title