Who is the assignee on this patent?

American Express Travel Related Services Co Inc

What technology area does this patent fall under?

Primary CPC classification G06F40/30. Mapped technology areas include Physics.

When was this patent published?

Publication date Mar 29, 2022 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Identifying data of interest using machine learning

US11288456B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11288456-B2
Application number	US-201816215961-A
Country	US
Kind code	B2
Filing date	Dec 11, 2018
Priority date	Dec 11, 2018
Publication date	Mar 29, 2022
Grant date	Mar 29, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for identifying data of interest are disclosed. The system may retrieve unstructured data from an internet data source via an alert system or RSS feed. The system may input the unstructured data into various models and scoring systems to determine whether the data is of interest. The models and scoring systems may be executed in order or in parallel. For example, the system may input the unstructured data into a Naïve Bayes machine learning model, a long short-term memory (LSTM) machine learning model, a named entity recognition (NER) model, a semantic role labeling (SRL) model, a sentiment scoring algorithm, and/or a gradient boosted regression tree (GBRT) machine learning model. Based on determining that the unstructured data is of interest, a data alert may be generated and transmitted for manual review or as part of an automated decisioning process.
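The flow described in the abstract (score the text with several models, run them in order or in parallel, raise an alert when the data looks interesting) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the two scorers, the word lists, the 0.2 threshold, and the alert fields are hypothetical placeholders, not the models or values disclosed in the patent.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the trained models named in the abstract.
# Each maps raw text to a score in [0, 1]; real models would replace them.
NEGATIVE_WORDS = {"bankruptcy", "lawsuit", "fraud", "default", "layoffs"}
KEYWORDS = {"bankruptcy", "acquisition", "merger", "fraud"}


def _tokens(text):
    """Split on whitespace, strip trailing punctuation, lowercase."""
    return [w.strip(".,;:!?").lower() for w in text.split()]


def keyword_score(text):
    """Fraction of watch-list keywords present (classifier placeholder)."""
    return len(set(_tokens(text)) & KEYWORDS) / len(KEYWORDS)


def sentiment_score(text):
    """Share of negative tokens, scaled by 5 and capped at 1.0."""
    toks = _tokens(text)
    if not toks:
        return 0.0
    hits = sum(1 for t in toks if t in NEGATIVE_WORDS)
    return min(1.0, 5 * hits / len(toks))


SCORERS = {"keyword": keyword_score, "sentiment": sentiment_score}


def score_in_series(text):
    """Run every scorer one after another."""
    return {name: fn(text) for name, fn in SCORERS.items()}


def score_in_parallel(text):
    """Run every scorer concurrently and collect the results by name."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, text) for name, fn in SCORERS.items()}
        return {name: f.result() for name, f in futures.items()}


def make_alert(text, link, threshold=0.2):
    """Return an alert dict if at least one score reaches the threshold."""
    scores = score_in_parallel(text)
    if any(s >= threshold for s in scores.values()):
        return {"text": text, "link": link, "scores": scores}
    return None
```

Both execution modes return the same scores; the parallel version only matters when the individual models are slow enough to benefit from running concurrently.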

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: training, by a processor, at least a first machine learning model and a second machine learning model to identify data of interest from unstructured datasets based on a training dataset, generated by filtering public business data based on a training keyword, identified from a dataset that is known to be of interest; retrieving, by the processor, unstructured data from an internet data source, wherein the unstructured data is retrieved directly from a subscriber database or from a web link hosting the unstructured data; inputting, by the processor, the unstructured data into the first machine learning model, the second machine learning model, a named entity recognition (NER) model, and a semantic role labeling (SRL) model; calculating, by the processor, a sentiment score indicating whether the unstructured data qualifies as data of interest by inputting the unstructured data into a sentiment scoring algorithm; identifying, by the processor, the unstructured data to be of interest to a business in response to an output of at least one of the first machine learning model, the second machine learning model, the NER model, the SRL model, or the sentiment score indicating that the unstructured data has a probability of being of interest; generating, by the processor, a data alert in response to identifying the unstructured data to be of interest, wherein the data alert comprises at least one of the unstructured data, the web link, or the output of at least one of the first machine learning model, the second machine learning model, the NER model, the SRL model, a gradient boosted regression tree (GBRT) machine learning model, or the sentiment score; and transmitting, by the processor, the data alert to a financial decisioning system to be used in a financial decisioning process of an account of the business, wherein the financial decisioning process comprises: closing or limiting credit accounts, extending lines of credit, opening transaction accounts, or closing transaction accounts.

2. The method of claim 1, further comprising: inputting, by the processor, the output of at least one of the first machine learning model, the second machine learning model, the NER model, the SRL model, or the sentiment score into the GBRT machine learning model; and identifying, by the processor, the unstructured data to be of interest based on a final output from the GBRT machine learning model.

3. The method of claim 1, further comprising preprocessing, by the processor, the unstructured data by performing a part-of-speech tagging process or by removing at least one of embedded web links, email links, or numbers.

4. The method of claim 1, wherein the first machine learning model comprises a Naïve Bayes machine learning model and the second machine learning model comprises a long short-term memory (LSTM) machine learning model.

5. The method of claim 1, wherein the training keyword is identified by analyzing prefiltered training data using at least one of a latent Dirichlet allocation (LDA) model, a correlated topic model, a word2vec processing algorithm, a word frequency analysis, or a phrase frequency analysis.

6. The method of claim 1, wherein the training dataset is prefiltered by at least one of a parts-of-speech tagging process, a lemmatization process, removing stop words, generating n-grams, normalizing or filtering email IDs, numbers, and URLs, or replacing proper nouns with common nouns.

7. The method of claim 1, wherein the unstructured data is input in the first machine learning model, the second machine learning model, the NER model, and the SRL model in series.

8. The method of claim 1, wherein the unstructured data is input in the first machine learning model, the second machine learning model, the NER model, and the SRL model in parallel.

9. A system comprising: a processor; and a tangible, non-transitory memory configured to communicate with the processor, the tangible, non-transitory memory having instructions stored thereon that, in response to execution by the processor, cause the processor to perform operations comprising: training at least a first machine learning model and a second model learning model to identify data of interest from unstructured datasets based on a training dataset, wherein the training dataset is generated by filtering public business data based on a training keyword, wherein the training keyword is generated by identifying keywords from a dataset that is known to be of interest; retrieving unstructured data from an internet data source, wherein the unstructured data is retrieved directly from a subscriber database or from a web link hosting the unstructured data; inputting the unstructured data into the first machine learning model, the second machine learning model, a named entity recognition (NER) model, and a semantic role labeling (SRL) model; calculating a sentiment score indicating whether the unstructured data qualifies as data of interest by inputting the unstructured data into a sentiment scoring algorithm; inputting an output of at least one of the first machine learning model, the second machine learning model, the NER model, the SRL model, or the sentiment score into a gradient boosted regression tree (GBRT) machine learning model; identifying the unstructured data to be of interest to a business based on the output of at least one of the first machine learning model, the second machine learning model, the NER model, the SRL model, the sentiment score, or the GBRT machine learning model; generating a data alert in response to identifying the unstructured data to be of interest, wherein the data alert comprises at least one of the unstructured data, the web link, or the output of at least one of the first machine learning model, the second machine learning model, the NER model, the SRL model, the sentiment score, or the GBRT machine learning model; and transmitting, by the processor, the data alert to a financial decisioning system to be used in a financial decisioning process of an account of the business, wherein the financial decisioning process comprises: closing or limiting credit accounts, extending lines of credit, opening transaction accounts, or closing transaction accounts.

10. The system of claim 9, wherein the first machine learning model comprises a Naïve Bayes machine learning model and the second machine learning model comprises a long short-term memory (LSTM) machine learning model.

11. The system of claim 9, wherein the training keyword is identified by analyzing prefiltered training data using at least one of a latent Dirichlet allocation (LDA) model, a correlated topic model, a word2vec processing algorithm, a word frequency analysis, or a phrase frequency analysis.

12. The system of claim 9, wherein the training dataset is prefiltered by at least one of a parts-of-speech tagging process, a lemmatization process, removing stop words, generating n-grams, normalizing or filtering email IDs, numbers, and URLs, or replacing proper nouns with common nouns.

13. The system of claim 9, wherein the unstructured data is input in the first machine learning model, the second machine learning model, the NER model, and the SRL model in series.

14. The system of claim 9, wherein the unstructured data is input in the first machine learning model, the second machine learning model, the NER model, and the SRL model in parallel.

15. An article of manufacture including a non-transitory, tangible computer readable storage medium having instructions stored thereon that, in response to execution by a computer-based system
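Several dependent claims name concrete text-cleaning steps: removing web links, email links, and numbers (claim 3), removing stop words and generating n-grams (claims 6 and 12), and picking training keywords by word frequency analysis (claims 5 and 11). The sketch below shows one simple way such steps could look in Python using only the standard library. The regular expressions, the stop-word list, and all function names are illustrative assumptions; the claims do not specify any of them.

```python
import re
from collections import Counter

# Illustrative patterns; a production system would likely need stricter ones.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
NUMBER_RE = re.compile(r"\b\d+(?:[.,]\d+)*\b")

# Tiny hypothetical stop-word list, for demonstration only.
STOP_WORDS = {"a", "an", "the", "of", "for", "to", "and", "in", "on", "at", "is"}


def strip_noise(text):
    """Remove URLs, e-mail addresses, and numbers, then collapse whitespace."""
    for pattern in (URL_RE, EMAIL_RE, NUMBER_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.split())


def tokenize(text):
    """Lowercase alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def ngrams(tokens, n=2):
    """Contiguous n-token phrases, joined with single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_keywords(documents, k=3):
    """Word frequency analysis: the k most common tokens across documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(strip_noise(doc)))
    return [word for word, _ in counts.most_common(k)]
```

URLs are stripped before e-mail addresses and numbers so that digits or `@` characters embedded in a link are removed together with the link rather than leaving fragments behind.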

Assignees

American Express Travel Related Services Co Inc

Inventors

Classifications

G06N3/044: Recurrent networks, e.g. Hopfield networks
G06N5/01: Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
G06N7/01: Probabilistic graphical models, e.g. probabilistic networks
G06N3/0442: characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
G06N3/09: Supervised learning

Patent family

Related publications grouped by family.

View patent family 70971009

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11288456B2 cover?: Systems and methods for identifying data of interest are disclosed. The system may retrieve unstructured data from an internet data source via an alert system or RSS feed. The system may input the unstructured data into various models and scoring systems to determine whether the data is of interest. The models and scoring systems may be executed in order or in parallel. For example, the system …