Identifying data of interest using machine learning

US12210836B2 · US · B2

Patent metadata
Publication number: US-12210836-B2
Application number: US-202318206296-A
Country: US
Kind code: B2
Filing date: Jun 6, 2023
Priority date: Dec 11, 2018
Publication date: Jan 28, 2025
Grant date: Jan 28, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for identifying data of interest are disclosed. The system may retrieve unstructured data from an internet data source via an alert system or RSS feed. The system may input the unstructured data into various models and scoring systems to determine whether the data is of interest. The models and scoring systems may be executed in order or in parallel. For example, the system may input the unstructured data into a Naïve Bayes machine learning model, a long short-term memory (LSTM) machine learning model, a named entity recognition (NER) model, a semantic role labeling (SRL) model, a sentiment scoring algorithm, and/or a gradient boosted regression tree (GBRT) machine learning model. Based on determining that the unstructured data is of interest, a data alert may be generated and transmitted for manual review or as part of an automated decisioning process.
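The abstract notes that the models and scoring systems may be executed in order or in parallel. The following is a minimal Python sketch of that idea, not the patented implementation. The two scorers (`keyword_score`, `negativity_score`), their word lists, and the `SCORERS` registry are hypothetical stand-ins for the Naïve Bayes, LSTM, NER, SRL, and sentiment components named above; each simply maps text to a confidence in [0, 1].

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in scorers. Real components (Naive Bayes, LSTM, NER,
# SRL, sentiment scoring) would be trained models; these toy functions only
# illustrate the calling pattern.
def keyword_score(text: str) -> float:
    """Confidence based on how many watch-list keywords appear in the text."""
    lowered = text.lower()
    hits = sum(word in lowered for word in ("bankruptcy", "lawsuit", "default"))
    return min(1.0, hits / 2)

def negativity_score(text: str) -> float:
    """Confidence based on the share of negative words (assumed word list)."""
    negative = {"loss", "fraud", "decline", "lawsuit"}
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(w.strip(".,") in negative for w in words)
    return min(1.0, 5 * hits / len(words))

SCORERS = {"keyword": keyword_score, "negativity": negativity_score}

def score_in_order(text: str) -> dict:
    """Run each scorer one after another."""
    return {name: fn(text) for name, fn in SCORERS.items()}

def score_in_parallel(text: str) -> dict:
    """Submit every scorer to a thread pool and collect the results."""
    with ThreadPoolExecutor(max_workers=len(SCORERS)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in SCORERS.items()}
        return {name: future.result() for name, future in futures.items()}
```

Both functions return the same per-scorer dictionary, so downstream logic does not need to know which execution mode was used. A thread pool is only one possible choice here; CPU-heavy models would more plausibly run in separate processes or services.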

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: retrieving, by a processor, unstructured data from an internet data source, wherein the retrieval is performed as a parallel process to evaluate data from various data sources; preprocessing, by the processor, the unstructured data by performing a part-of-speech tagging process; inputting, by the processor, the preprocessed unstructured data into a plurality of machine learning models and a sentiment scoring engine in parallel to process the preprocessed unstructured data to identify a given set of topics; generating, a plurality of confidence scores as outputs of the plurality of machine learning models and the sentiment scoring engine, representing probabilities indicating whether the preprocessed unstructured data is of interest to an entity; combining the plurality of confidence scores to obtain a combined confidence score representing a probability indicating whether the preprocessed unstructured data is of interest to the entity; and generating, by the processor, a data alert in response to identifying the data of interest.

2. The method of claim 1, further comprising transmitting, by the processor, the data alert to a financial decisioning system to be used in a financial decisioning process of an account of a business.

3. The method of claim 2, wherein the financial decisioning process comprises: closing or limiting credit accounts, extending lines of credit, opening transaction accounts, or closing transaction accounts.

4. The method of claim 1, wherein the internet data source comprises: news articles, blogs, social media, or forums.

5. The method of claim 1, wherein the plurality of machine learning models comprise: a Naïve Bayes model, a long-short term memory (LSTM) model, a name entity recognition (NER) model, or a semantic role labeling (SRL) model.

6. The method of claim 1, further comprising: inputting, by the processor, the preprocessed unstructured data into the plurality of machine learning models and the sentiment scoring engine in a stage wise manner.

7. A non-transitory computer readable medium that when executed by one or more processors or a computing system, cause the computing system to perform operations comprising: retrieving unstructured data from an internet data source, wherein the retrieval is performed as a parallel process to evaluate data from various data sources; preprocessing the unstructured data by performing a part-of-speech tagging process; inputting the preprocessed unstructured data into a plurality of machine learning models and a sentiment scoring engine in parallel to process the preprocessed unstructured data to identify a given set of topics; generating, a plurality of confidence scores as outputs of the plurality of machine learning models and the sentiment scoring engine, representing probabilities indicating whether the preprocessed unstructured data is of interest to an entity; combining the plurality of confidence scores to obtain a combined confidence score representing a probability indicating whether the preprocessed unstructured data is of interest to the entity; and generating a data alert in response to identifying the data of interest.

8. The non-transitory computer readable medium of claim 7, wherein the operations further comprise transmitting the data alert to a financial decisioning system to be used in a financial decisioning process of an account of a business.

9. The non-transitory computer readable medium of claim 8, wherein the financial decisioning process comprises: closing or limiting credit accounts, extending lines of credit, opening transaction accounts, or closing transaction accounts.

10. The non-transitory computer readable medium of claim 7, wherein the internet data source comprises: news articles, blogs, social media, or forums.

11. The non-transitory computer readable medium of claim 7, wherein the plurality of machine learning models comprise: a Naïve Bayes model, a long-short term memory (LSTM) model, a name entity recognition (NER) model, or a semantic role labeling (SRL) model.

12. The non-transitory computer readable medium of claim 7, wherein the operations further comprise: inputting the preprocessed unstructured data into the plurality of machine learning models and the sentiment scoring engine in a stage wise manner.

13. A computing system comprising: a memory storing instructions; and one or more processors, coupled to the memory, and configured to process the stored instructions to: retrieve unstructured data from an internet data source, wherein the retrieval is performed as a parallel process to evaluate data from various data sources; preprocess the unstructured data by performing a part-of-speech tagging process; input the preprocessed unstructured data into a plurality of machine learning models and a sentiment scoring engine in parallel to process the preprocessed unstructured data to identify a given set of topics; generate, a plurality of confidence scores as outputs of the plurality of machine learning models and the sentiment scoring engine, representing probabilities indicating whether the preprocessed unstructured data is of interest to an entity; combine the plurality of confidence scores to obtain a combined confidence score representing a probability indicating whether the preprocessed unstructured data is of interest to the entity; and generate a data alert in response to identifying the data of interest.

14. The computing system of claim 13, wherein the one or more processors are further configured to: transmit the data alert to a financial decisioning system to be used in a financial decisioning process of an account of a business, and wherein the financial decisioning process comprises: closing or limiting credit accounts, extending lines of credit, opening transaction accounts, or closing transaction accounts.

15. The computing system of claim 13, wherein the internet data source comprises: news articles, blogs, social media, or forums.

16. The computing system of claim 13, wherein the plurality of machine learning models comprise: a Naïve Bayes model, a long-short term memory (LSTM) model, a name entity recognition (NER) model, or a semantic role labeling (SRL) model.

17. The computing system of claim 13, wherein the one or more processors are further configured to: input the preprocessed unstructured data into the plurality of machine learning models and the sentiment scoring engine in a stage wise manner.
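The independent claims recite combining a plurality of confidence scores into a combined confidence score and generating a data alert when data of interest is identified. The claims do not state how the scores are combined, so the sketch below assumes a weighted average and a fixed threshold purely for illustration. The names `DataAlert`, `combine_scores`, `maybe_alert`, the default threshold of 0.7, and the example URL are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class DataAlert:
    """Hypothetical alert record passed on for review or decisioning."""
    source_url: str
    combined_score: float
    scores: Dict[str, float]

def combine_scores(scores: Dict[str, float],
                   weights: Optional[Dict[str, float]] = None) -> float:
    """Combine per-model confidences with a weighted average.

    This is an assumed combination rule; the claims only require that
    some combined confidence score is produced.
    """
    if not scores:
        raise ValueError("at least one confidence score is required")
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight

def maybe_alert(source_url: str,
                scores: Dict[str, float],
                threshold: float = 0.7,
                weights: Optional[Dict[str, float]] = None) -> Optional[DataAlert]:
    """Return a DataAlert when the combined score meets the assumed threshold."""
    combined = combine_scores(scores, weights)
    if combined >= threshold:
        return DataAlert(source_url, combined, dict(scores))
    return None

# Usage with made-up scores from three hypothetical components.
alert = maybe_alert(
    "https://example.com/article",
    {"naive_bayes": 0.9, "lstm": 0.8, "sentiment": 0.7},
)
```

A weighted average keeps the combined value in [0, 1] whenever the inputs are, which matches the claims' description of the combined score as a probability. A learned combiner, such as the gradient boosted regression tree model mentioned in the abstract, could plausibly take the place of `combine_scores`.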

Assignees

American Express Travel Related Services Co Inc

Inventors

Classifications

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Supervised learning · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Named entity recognition · CPC title

  • Machine learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12210836B2 cover?
Systems and methods for identifying data of interest are disclosed. The system may retrieve unstructured data from an internet data source via an alert system or RSS feed. The system may input the unstructured data into various models and scoring systems to determine whether the data is of interest. The models and scoring systems may be executed in order or in parallel. For example, the system …
Who is the assignee on this patent?
American Express Travel Related Services Co Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 28, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).