Identifying data of interest using machine learning

US11714968B2 · US · B2

Patent metadata
Publication number: US-11714968-B2
Application number: US-202217679498-A
Country: US
Kind code: B2
Filing date: Feb 24, 2022
Priority date: Dec 11, 2018
Publication date: Aug 1, 2023
Grant date: Aug 1, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for identifying data of interest are disclosed. The system may retrieve unstructured data from an internet data source via an alert system or RSS feed. The system may input the unstructured data into various models and scoring systems to determine whether the data is of interest. The models and scoring systems may be executed in order or in parallel. For example, the system may input the unstructured data into a Naïve Bayes machine learning model, a long short-term memory (LSTM) machine learning model, a named entity recognition (NER) model, a semantic role labeling (SRL) model, a sentiment scoring algorithm, and/or a gradient boosted regression tree (GBRT) machine learning model. Based on determining that the unstructured data is of interest, a data alert may be generated and transmitted for manual review or as part of an automated decisioning process.
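The abstract's idea of feeding the same text to several scorers, in order or in parallel, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the training sentences, the sentiment lexicon, and all function names are hypothetical, and only two of the listed components (a Naïve Bayes model and a sentiment score) are shown, using the Python standard library.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy training data: (text, label), where label 1 = of interest.
TRAINING = [
    ("company files for bankruptcy protection", 1),
    ("regulator fines firm over fraud", 1),
    ("company opens new office downtown", 0),
    ("firm sponsors local charity run", 0),
]

# Hypothetical sentiment lexicon; a real scoring engine would be far richer.
NEGATIVE = {"bankruptcy", "fraud", "fines", "lawsuit"}
POSITIVE = {"charity", "award", "growth"}
LEXICON = NEGATIVE | POSITIVE


def tokenize(text):
    return text.lower().split()


def train_naive_bayes(examples):
    """Count words per class for a multinomial Naive Bayes model."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def naive_bayes_probability(model, text):
    """Return P(of interest | text) using Laplace (add-one) smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        log_p = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                log_p += math.log((word_counts[label][word] + 1) / denom)
        log_scores[label] = log_p
    # Normalise the two log scores into a probability for class 1.
    peak = max(log_scores.values())
    exp = {k: math.exp(v - peak) for k, v in log_scores.items()}
    return exp[1] / (exp[0] + exp[1])


def sentiment_score(text):
    """Return a score in [-1, 1]; negative values mean negative sentiment."""
    hits = [1 if t in POSITIVE else -1 for t in tokenize(text) if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def score_in_parallel(text, model):
    """Run both scorers concurrently and collect their outputs by name."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        nb = pool.submit(naive_bayes_probability, model, text)
        sent = pool.submit(sentiment_score, text)
        return {"naive_bayes": nb.result(), "sentiment": sent.result()}


MODEL = train_naive_bayes(TRAINING)
scores = score_in_parallel("firm files for bankruptcy after fraud probe", MODEL)
```

Running the scorers sequentially would give the same `scores` dictionary; the thread pool only illustrates the "in parallel" option the abstract mentions.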

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: retrieving, by a processor, unstructured data from an internet data source, wherein the retrieval is performed as a parallel process to evaluate data from various data sources; preprocessing, by the processor, the unstructured data by performing a part-of-speech tagging process; inputting, by the processor, the preprocessed unstructured data into a machine learning model and a sentiment scoring engine, wherein the machine learning model and the sentiment scoring engine are trained to identify data of interest to be used in a decisioning process; identifying, by the processor, the data of interest from the preprocessed unstructured data in response to an output of the machine learning model and the sentiment scoring engine indicating that the preprocessed unstructured data has a probability of being of interest; and generating, by the processor, a data alert in response to identifying the data of interest, wherein the data alert comprises at least one of the preprocessed unstructured data, a web link, or an output of at least one of the machine learning model or the sentiment scoring engine.

2. The method of claim 1, further comprising transmitting, by the processor, the data alert to a financial decisioning system to be used in a financial decisioning process of an account of a business, wherein the financial decisioning process comprises: closing or limiting credit accounts, extending lines of credit, opening transaction accounts, or closing transaction accounts.

3. The method of claim 1, further comprising: inputting, by the processor, the preprocessed unstructured data into a named entity recognition (NER) model, wherein the NER model is trained to identify the data of interest; identifying, by the processor, the data of interest from the preprocessed unstructured data in response to an output of the NER model indicating that the preprocessed unstructured data has a probability of being of interest; and generating, by the processor, the data alert in response to identifying the data of interest, wherein the data alert comprises at least one of the preprocessed unstructured data, the web link, or the output of at least one of the machine learning model, the sentiment scoring engine, or the NER model.

4. The method of claim 3, further comprising: inputting, by the processor, the output of the machine learning model, the sentiment scoring engine, and the NER model into a gradient boosted regression tree (GBRT) machine learning model; and identifying, by the processor, the data of interest based on an output of the GBRT machine learning model indicating that the preprocessed unstructured data has a probability of being of interest.

5. The method of claim 1, further comprising: inputting, by the processor, the preprocessed unstructured data into a semantic role labeling (SRL) model, wherein the SRL model is trained to identify the data of interest; identifying, by the processor, the data of interest from the preprocessed unstructured data in response to an output of the SRL indicating that the preprocessed unstructured data has a probability of being of interest; and generating, by the processor, the data alert in response to identifying the data of interest, wherein the data alert comprises at least one of the preprocessed unstructured data, the web link, or the output of at least one of the machine learning model, the sentiment scoring engine, or the SRL model.

6. The method of claim 5, further comprising: inputting, by the processor, the output of the machine learning model, the sentiment scoring engine, and the SRL model into a gradient boosted regression tree (GBRT) machine learning model; and identifying, by the processor, the data of interest based on an output of the GBRT machine learning model indicating that the preprocessed unstructured data has a probability of being of interest.

7. The method of claim 1, further comprising: inputting, by the processor, the output of the machine learning model and the sentiment scoring engine into a gradient boosted regression tree (GBRT) machine learning model; and identifying, by the processor, the data of interest based on an output of the GBRT machine learning model indicating that the preprocessed unstructured data has a probability of being of interest.

8. A non-transitory computer readable medium including instructions for causing a computing system to perform operations comprising: retrieving, by a processor, unstructured data from an internet data source, wherein the retrieval is performed as a parallel process to evaluate data from various data sources; preprocessing, by the processor, the unstructured data by performing a part-of-speech tagging process; inputting, by the processor, the preprocessed unstructured data into a machine learning model and a sentiment scoring engine, wherein the machine learning model and the sentiment scoring engine are trained to identify data of interest to be used in a decisioning process; identifying, by the processor, the data of interest from the preprocessed unstructured data in response to an output of the machine learning model and the sentiment scoring engine indicating that the preprocessed unstructured data has a probability of being of interest; generating, by the processor, a data alert in response to identifying the data of interest, wherein the data alert comprises at least one of the preprocessed unstructured data, a web link, or an output of at least one of the machine learning model or the sentiment scoring engine; and transmitting, by the processor, the data alert to a financial decisioning system to be used in a financial decisioning process of an account of a business, wherein the financial decisioning process comprises: closing or limiting credit accounts, extending lines of credit, opening transaction accounts, or closing transaction accounts.

9. The non-transitory computer readable medium of claim 8, the operations further comprising transmitting, by the processor, the data alert to a financial decisioning system to be used in a financial decisioning process of an account of a business, wherein the financial decisioning process comprises: closing or limiting credit accounts, extending lines of credit, opening transaction accounts, or closing transaction accounts.

10. The non-transitory computer readable medium of claim 8, the operations further comprising: inputting, by the processor, the preprocessed unstructured data into a named entity recognition (NER) model, wherein the NER model is trained to identify the data of interest; identifying, by the processor, the data of interest from the preprocessed unstructured data in response to an output of the NER model indicating that the preprocessed unstructured data has a probability of being of interest; and generating, by the processor, the data alert in response to identifying the data of interest, wherein the data alert comprises at least one of the preprocessed unstructured data, the web link, or the output of at least one of the machine learning model, the sentiment scoring engine, or the NER model.

11. The non-transitory computer readable medium of claim 10, the operations further comprising: inputting, by the processor, the output of the machine learning model, the sentiment scoring engine, and the NER model into a gradient boosted regression tree (GBRT) machine learning model; and identifying, by the processor, the data of interest based on an output of the GBRT machine learning model indicating that the preprocessed unstructured data has a probability of being of interest.

12. The non-transitory computer readable medium of claim 8,
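The alert-generation step recited in claim 1 can be illustrated with a short sketch. Everything here is an assumption made for illustration: the `DataAlert` and `generate_alert` names, the 0.5 threshold, the rule that every output must meet the threshold, and the example URL are hypothetical and are not taken from the patent, which only requires that the outputs indicate a probability of being of interest.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DataAlert:
    """Hypothetical alert record; the claim requires at least one of these parts."""
    text: Optional[str] = None          # the preprocessed unstructured data
    web_link: Optional[str] = None      # link to the source document
    outputs: Dict[str, float] = field(default_factory=dict)  # model outputs


def generate_alert(text, web_link, outputs, threshold=0.5):
    """Return a DataAlert when every output meets the threshold, else None.

    `outputs` maps a scorer name to a probability-like value in [0, 1].
    The all-outputs-must-pass rule is an assumed decision rule.
    """
    if not outputs or any(p < threshold for p in outputs.values()):
        return None
    return DataAlert(text=text, web_link=web_link, outputs=dict(outputs))


# Example with made-up values.
alert = generate_alert(
    "firm files for bankruptcy",
    "https://example.com/article",
    {"machine_learning_model": 0.9, "sentiment_engine": 0.7},
)
no_alert = generate_alert(
    "firm sponsors charity run",
    "https://example.com/other",
    {"machine_learning_model": 0.9, "sentiment_engine": 0.2},
)
```

A downstream consumer, such as the financial decisioning system named in claim 2, would receive the `DataAlert` object; how it is transmitted is outside this sketch.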

Assignees

Inventors

Classifications

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Supervised learning · CPC title

  • G06F40/30 · Primary

    Semantic analysis · CPC title

  • Retrieval from the web · CPC title

  • Named entity recognition · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11714968B2 cover?
Systems and methods for identifying data of interest are disclosed. The system may retrieve unstructured data from an internet data source via an alert system or RSS feed. The system may input the unstructured data into various models and scoring systems to determine whether the data is of interest. The models and scoring systems may be executed in order or in parallel. For example, the system …
Who is the assignee on this patent?
American Express Travel Related Services Co Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 1, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).