Unsupervised feature learning for relational data

US11416469B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11416469-B2
Application number: US-202017102707-A
Country: US
Kind code: B2
Filing date: Nov 24, 2020
Priority date: Nov 24, 2020
Publication date: Aug 16, 2022
Grant date: Aug 16, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In an approach to unsupervised feature learning for relational data, a computer trains one or more entity aware autoencoders on one or more tables in a relational database, where each of the one or more entity aware autoencoders corresponds to one of the one or more tables in the relational database, and where each of the one or more entity aware autoencoders are comprised of an encoder and a decoder. A computer transforms each of the one or more tables in the relational database with the encoder of the corresponding trained entity aware autoencoder. A computer joins a first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database to form one or more joined tables. A computer aggregates the one or more joined tables. A computer outputs one or more feature representations.
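The transform, join, and aggregate steps in the abstract can be sketched with pandas. This is a minimal illustration, not the patented implementation: the `encode` helper is a hypothetical stand-in that applies a fixed random linear projection instead of a trained entity aware autoencoder, and the `customers` and `orders` tables, their column names, and the left join on `customer_id` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def encode(table: pd.DataFrame, key_cols: list[str], dim: int,
           prefix: str, seed: int) -> pd.DataFrame:
    """Hypothetical stand-in for a trained encoder.

    Projects the non-key columns onto `dim` random directions. A real
    encoder would be learned from the table; this one is not.
    """
    rng = np.random.default_rng(seed)
    feats = table.drop(columns=key_cols).to_numpy(dtype=float)
    weights = rng.normal(size=(feats.shape[1], dim))
    codes = pd.DataFrame(
        feats @ weights,
        columns=[f"{prefix}_z{i}" for i in range(dim)],
        index=table.index,
    )
    # Keep the key columns so the encoded table can still be joined.
    return pd.concat([table[key_cols], codes], axis=1)


# Two illustrative tables connected by the customer_id column.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "age": [34.0, 51.0, 27.0],
    "income": [48.0, 72.0, 39.0],
})
orders = pd.DataFrame({
    "order_id": [10, 11, 12, 13],
    "customer_id": [1, 1, 2, 3],
    "amount": [20.0, 35.0, 12.5, 80.0],
    "quantity": [1.0, 2.0, 1.0, 4.0],
})

# Transform: one encoder per table.
enc_customers = encode(customers, ["customer_id"], dim=2, prefix="cust", seed=0)
enc_orders = encode(orders, ["order_id", "customer_id"], dim=2, prefix="ord", seed=1)

# Join: the first (entity) table with the remaining transformed table.
joined = enc_customers.merge(
    enc_orders.drop(columns=["order_id"]), on="customer_id", how="left"
)

# Aggregate: collapse back to one feature row per entity.
features = joined.groupby("customer_id").mean()
print(features)
```

The result has one row per customer and one column per encoded dimension from each table. The mean is used here only because the example has no timestamp column.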

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: training, by one or more computer processors, one or more entity aware autoencoders on one or more tables corresponding to an entity in a relational database, wherein each of the one or more entity aware autoencoders corresponds to one of the one or more tables in the relational database, wherein each of the one or more entity aware autoencoders are comprised of an encoder and a decoder, and wherein each of the one or more entity aware autoencoders uses a group of rows of the one or more tables as an input to predict each row in the group; transforming, by one or more computer processors, each of the one or more tables in the relational database with the encoder of the corresponding trained entity aware autoencoder; joining, by one or more computer processors, a first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database to form one or more joined tables; aggregating, by one or more computer processors, the one or more joined tables; and outputting, by one or more computer processors, one or more feature representations.

2. The computer-implemented method of claim 1, wherein joining the first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database further comprises: performing, by one or more computer processors, a groupby operation on the one or more joined tables.

3. The computer-implemented method of claim 1, wherein joining the first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database further comprises: receiving, by one or more computer processors, a specified maximum depth for a joining operation; and joining, by one or more computer processors, the first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database along one or more joining paths, wherein the one or more joining paths traverse a depth less than or equal to the maximum depth, and wherein the maximum depth is a number of steps between the first transformed table and another transformed table of the one or more tables.

4. The computer-implemented method of claim 1, wherein aggregating the one or more joined tables further comprises: responsive to determining the one or more joined tables include a timestamp column, using, by one or more computer processors, a most-recent aggregation function.

5. The computer-implemented method of claim 1, wherein aggregating the one or more joined tables further comprises: responsive to determining the one or more joined tables do not include a timestamp column, using, by one or more computer processors, a mean aggregation function.

6. The computer-implemented method of claim 1, wherein the relational database includes structured data.

7. The computer-implemented method of claim 1, wherein the one or more joined tables are connected by a column of data.

8. A computer program product comprising: one or more computer readable storage media and program instructions collectively stored on the one or more computer readable storage media, the stored program instructions comprising: program instructions to train one or more entity aware autoencoders on one or more tables corresponding to an entity in a relational database, wherein each of the one or more entity aware autoencoders corresponds to one of the one or more tables in the relational database, wherein each of the one or more entity aware autoencoders are comprised of an encoder and a decoder, and wherein each of the one or more entity aware autoencoders uses a group of rows of the one or more tables to predict each row in the group; program instructions to transform each of the one or more tables in the relational database with the encoder of the corresponding trained entity aware autoencoder; program instructions to join a first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database to form one or more joined tables; program instructions to aggregate the one or more joined tables; and program instructions to output one or more feature representations.

9. The computer program product of claim 8, wherein the program instructions to join the first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database comprise: program instructions to perform a groupby operation on the one or more joined tables.

10. The computer program product of claim 8, wherein the program instructions to join the first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database comprise: program instructions to receive a specified maximum depth for a joining operation; and program instructions to join the first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database along one or more joining paths, wherein the one or more joining paths traverse a depth less than or equal to the maximum depth, wherein the one or more joining paths traverse a depth less than or equal to the maximum depth, and wherein the maximum depth is a number of steps between the first transformed table and another transformed table of the one or more tables.

11. The computer program product of claim 8, wherein the program instructions to aggregate the one or more joined tables comprise: responsive to determining the one or more joined tables include a timestamp column, program instructions to use a most-recent aggregation function.

12. The computer program product of claim 8, wherein the program instructions to aggregate the one or more joined tables comprise: responsive to determining the one or more joined tables do not include a timestamp column, program instructions to use a mean aggregation function.

13. The computer program product of claim 8, wherein the relational database includes structured data.

14. The computer program product of claim 8, wherein the one or more joined tables are connected by a column of data.

15. A computer system comprising: one or more computer processors; one or more computer readable storage media; program instructions collectively stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the stored program instructions comprising: program instructions to train one or more entity aware autoencoders on one or more tables corresponding to an entity in a relational database, wherein each of the one or more entity aware autoencoders corresponds to one of the one or more tables in the relational database, wherein each of the one or more entity aware autoencoders are comprised of an encoder and a decoder, and wherein each of the one or more entity aware autoencoders uses a group of rows of the one or more tables to predict each row in the group; program instructions to transform each of the one or more tables in the relational database with the encoder of the corresponding trained entity aware autoencoder; program instructions to join a first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database to fo
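The depth-limited joining paths (claims 3 and 10) and the timestamp-dependent choice of aggregation function (claims 4, 5, 11, and 12) can be illustrated with a short sketch. Everything named here is an assumption for illustration: the `schema` adjacency map, the function names `join_paths` and `aggregate`, the `timestamp` column name, and the reading of "most-recent aggregation" as keeping the latest row per entity. The claims do not specify any of these details.

```python
import pandas as pd


def join_paths(schema: dict[str, list[str]], start: str,
               max_depth: int) -> list[list[str]]:
    """Enumerate joining paths from `start` with at most `max_depth` steps.

    `schema` maps each table name to the tables it shares a key column
    with. A table is never revisited within a single path.
    """
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) > 1:
            paths.append(path)
        # Number of steps taken so far is len(path) - 1.
        if len(path) - 1 < max_depth:
            for nxt in schema.get(path[-1], []):
                if nxt not in path:
                    stack.append(path + [nxt])
    return sorted(paths)


def aggregate(joined: pd.DataFrame, key: str,
              timestamp_col: str = "timestamp") -> pd.DataFrame:
    """Pick the aggregation based on whether a timestamp column exists."""
    if timestamp_col in joined.columns:
        # Assumed meaning of "most recent": keep the latest row per key.
        latest = joined.sort_values(timestamp_col).groupby(key).tail(1)
        return latest.drop(columns=[timestamp_col]).set_index(key).sort_index()
    # No timestamp column: fall back to the per-key mean.
    return joined.groupby(key).mean()


# Hypothetical three-table schema: customers - orders - items.
schema = {
    "customers": ["orders"],
    "orders": ["customers", "items"],
    "items": ["orders"],
}
paths_depth_1 = join_paths(schema, "customers", max_depth=1)
paths_depth_2 = join_paths(schema, "customers", max_depth=2)

joined = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "timestamp": pd.to_datetime(["2020-01-01", "2020-03-01", "2020-02-01"]),
    "z0": [0.1, 0.9, 0.5],
})
most_recent = aggregate(joined, key="customer_id")
mean_agg = aggregate(joined.drop(columns=["timestamp"]), key="customer_id")
```

With a maximum depth of 1 only the `customers` to `orders` path is produced, while a depth of 2 also reaches `items` through `orders`.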

Assignees

IBM

Inventors

Classifications

  • Probabilistic or stochastic networks

  • Combinations of networks

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning

  • Auto-encoder networks; Encoder-decoder networks

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11416469B2 cover?
In an approach to unsupervised feature learning for relational data, a computer trains one or more entity aware autoencoders on one or more tables in a relational database, where each of the one or more entity aware autoencoders corresponds to one of the one or more tables in the relational database, and where each of the one or more entity aware autoencoders are comprised of an encoder and a d…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/2282. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 16, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).