Method for multi-modal retrieval and clustering using deep CCA and active pairwise queries

US2021056127A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2021056127-A1
Application number: US-202016996110-A
Country: US
Kind code: A1
Filing date: Aug 18, 2020
Priority date: Aug 21, 2019
Publication date: Feb 25, 2021
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for embedding learning and clustering for paired multi-modal data using deep canonical correlation analysis and active learning with pairwise queries is presented. The method includes collecting time-series data from a plurality of sensors, training, in an unsupervised manner, a cross-modal retrieval system by using the time-series data and relevant comment texts, depending on a modality of a query, retrieving the relevant comment texts from a time-series segment of the time-series data, the relevant comment texts used as human-readable explanations of a query segment, retrieving relevant time-series segments given a sentence or a set of keywords such that the relevant time-series segments match the sentence or set of keywords, and retrieving the relevant time-series segments given the time-series segment and the sentence or set of keywords such that a first subset of attributes match the set of keywords and a second subset of attributes resembles the time-series segment.
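The three retrieval modes in the abstract can be pictured as nearest-neighbor lookups in a shared embedding space. The following is a minimal sketch under stated assumptions, not the patented implementation: the hand-written vectors stand in for the outputs of trained encoders, the function names `top_k` and `top_k_joint` are hypothetical, cosine similarity is an assumed distance, and summing the two similarities for the joint query is one simple illustrative choice.

```python
import numpy as np

def l2_normalize(x):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def top_k(query, bank, k=1):
    """Return indices of the k rows of `bank` most cosine-similar to `query`."""
    scores = l2_normalize(bank) @ l2_normalize(query)
    return np.argsort(-scores)[:k]

def top_k_joint(ts_query, text_query, bank, k=1):
    """Joint query: rank rows by the sum of their similarities to both
    query embeddings (an assumption made for illustration)."""
    combined = l2_normalize(ts_query) + l2_normalize(text_query)
    scores = l2_normalize(bank) @ combined
    return np.argsort(-scores)[:k]

# Hypothetical embeddings standing in for trained encoder outputs.
text_bank = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])    # comment texts
series_bank = np.array([[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]])  # time-series segments

segment_query = np.array([0.95, 0.05])  # an embedded time-series segment
keyword_query = np.array([0.0, 1.0])    # an embedded sentence or keyword set

explanations = top_k(segment_query, text_bank)    # segment -> comment texts
matches = top_k(keyword_query, series_bank)       # keywords -> segments
joint = top_k_joint(segment_query, keyword_query, series_bank)  # both -> segments
```

In this toy data the joint query selects the segment that lies between the two query directions, which is the intuition behind matching some attributes to the keywords and others to the example segment.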

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method executed on a processor for embedding learning and clustering for paired multi-modal data using deep canonical correlation analysis (CCA) and active learning with pairwise queries, the method comprising: collecting time-series data from a plurality of sensors; training, in an unsupervised manner, a cross-modal retrieval system by using the time-series data and relevant comment texts; depending on a modality of a query: retrieving the relevant comment texts from a time-series segment of the time-series data, the relevant comment texts used as human-readable explanations of a query segment; retrieving relevant time-series segments given a sentence or a set of keywords such that the relevant time-series segments match the sentence or set of keywords; and retrieving the relevant time-series segments given the time-series segment and the sentence or set of keywords such that a first subset of attributes match the set of keywords and a second subset of attributes resembles the time-series segment.

2. The method of claim 1, wherein the time-series segment and the relevant comment texts are transformed into points in a common latent space.

3. The method of claim 2, wherein the cross-modal retrieval system finds nearest neighbors of the query in the common latent space.

4. The method of claim 1, wherein the cross-modal retrieval system uses multi-modal neural networks to encode the time-series data and the relevant comment texts into vector representations.

5. The method of claim 4, wherein the multi-modal neural networks are trained by a two-stage training algorithm employing examples from a user-provided database of time-series text pairs.

6. The method of claim 5, wherein the first stage of the training algorithm is a deep CCA-based pre-training.

7. The method of claim 6, wherein the second stage of the training algorithm is active clustering.

8. The method of claim 7, wherein the active clustering includes query pair selection based on Gaussian mixture modeling (GMM) and query-based selection using active spectral clustering.

9. A non-transitory computer-readable storage medium comprising a computer-readable program for embedding learning and clustering for paired multi-modal data using deep canonical correlation analysis (CCA) and active learning with pairwise queries, wherein the computer-readable program when executed on a computer causes the computer to perform the steps of: collecting time-series data from a plurality of sensors; training, in an unsupervised manner, a cross-modal retrieval system by using the time-series data and relevant comment texts; depending on a modality of a query: retrieving the relevant comment texts from a time-series segment of the time-series data, the relevant comment texts used as human-readable explanations of a query segment; retrieving relevant time-series segments given a sentence or a set of keywords such that the relevant time-series segments match the sentence or set of keywords; and retrieving the relevant time-series segments given the time-series segment and the sentence or set of keywords such that a first subset of attributes match the set of keywords and a second subset of attributes resembles the time-series segment.

10. The non-transitory computer-readable storage medium of claim 9, wherein the time-series segment and the relevant comment texts are transformed into points in a common latent space.

11. The non-transitory computer-readable storage medium of claim 10, wherein the cross-modal retrieval system finds nearest neighbors of the query in the common latent space.

12. The non-transitory computer-readable storage medium of claim 9, wherein the cross-modal retrieval system uses multi-modal neural networks to encode the time-series data and the relevant comment texts into vector representations.

13. The non-transitory computer-readable storage medium of claim 12, wherein the multi-modal neural networks are trained by a two-stage training algorithm employing examples from a user-provided database of time-series text pairs.

14. The non-transitory computer-readable storage medium of claim 13, wherein the first stage of the training algorithm is a deep CCA-based pre-training.

15. The non-transitory computer-readable storage medium of claim 14, wherein the second stage of the training algorithm is active clustering.

16. The non-transitory computer-readable storage medium of claim 15, wherein the active clustering includes query pair selection based on Gaussian mixture modeling (GMM) and query-based selection using active spectral clustering.

17. A system for embedding learning and clustering for paired multi-modal data using deep canonical correlation analysis (CCA) and active learning with pairwise queries, the system comprising: a memory; and one or more processors in communication with the memory configured to: collect time-series data from a plurality of sensors; train, in an unsupervised manner, a cross-modal retrieval system by using the time-series data and relevant comment texts; depending on a modality of a query: retrieve the relevant comment texts from a time-series segment of the time-series data, the relevant comment texts used as human-readable explanations of a query segment; retrieve relevant time-series segments given a sentence or a set of keywords such that the relevant time-series segments match the sentence or set of keywords; and retrieve the relevant time-series segments given the time-series segment and the sentence or set of keywords such that a first subset of attributes match the set of keywords and a second subset of attributes resembles the time-series segment.

18. The system of claim 17, wherein the time-series segment and the relevant comment texts are transformed into points in a common latent space.

19. The system of claim 18, wherein the cross-modal retrieval system finds nearest neighbors of the query in the common latent space.

20. The system of claim 17, wherein the cross-modal retrieval system uses multi-modal neural networks to encode the time-series data and the relevant comment texts into vector representations.
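The claims name deep CCA-based pre-training without giving a formula on this page. As background, classical CCA measures how strongly two views of the same samples are linearly related, and deep CCA as usually described in the literature maximizes that quantity over the outputs of two neural networks. The sketch below computes the total canonical correlation for fixed feature matrices with NumPy. It illustrates the general objective only: the function names, the regularization constant `reg`, and the synthetic data are assumptions, and nothing here is taken from the patent's own training procedure.

```python
import numpy as np

def inv_sqrt(mat):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def total_canonical_correlation(h1, h2, reg=1e-6):
    """Sum of canonical correlations between two views (rows are samples).

    `reg` is a small ridge term (assumed value) that keeps the covariance
    matrices invertible.
    """
    n = h1.shape[0]
    h1 = h1 - h1.mean(axis=0)
    h2 = h2 - h2.mean(axis=0)
    s11 = h1.T @ h1 / (n - 1) + reg * np.eye(h1.shape[1])
    s22 = h2.T @ h2 / (n - 1) + reg * np.eye(h2.shape[1])
    s12 = h1.T @ h2 / (n - 1)
    # Singular values of the whitened cross-covariance are the correlations.
    t = inv_sqrt(s11) @ s12 @ inv_sqrt(s22)
    return np.linalg.svd(t, compute_uv=False).sum()

# Synthetic stand-ins for "time-series features" and "text features".
rng = np.random.default_rng(0)
shared = rng.normal(size=(500, 2))
view_a = shared @ np.array([[1.0, 0.5], [0.0, 1.0]])   # one view of the signal
view_b = shared @ np.array([[0.5, -1.0], [1.0, 0.5]])  # another view of it
noise = rng.normal(size=(500, 2))                      # unrelated features

corr_related = total_canonical_correlation(view_a, view_b)
corr_unrelated = total_canonical_correlation(view_a, noise)
```

Two views derived from the same underlying signal score near the maximum of 2 (one unit per dimension), while an unrelated view scores near 0. In a deep variant, the negative of this score would typically serve as the loss for the two encoders.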

Assignees

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • Combinations of networks · CPC title

  • G06N3/088 · Primary

    Non-supervised learning, e.g. competitive learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021056127A1 cover?
A method for embedding learning and clustering for paired multi-modal data using deep canonical correlation analysis and active learning with pairwise queries is presented. The method includes collecting time-series data from a plurality of sensors, training, in an unsupervised manner, a cross-modal retrieval system by using the time-series data and relevant comment texts, depending on a modali…
Who is the assignee on this patent?
Nec Lab America Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/088. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 25, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).