Deep User Modeling by Behavior
US-2021231449-A1 · Jul 29, 2021 · US
US12062059B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12062059-B2 |
| Application number | US-202016930279-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 15, 2020 |
| Priority date | May 25, 2020 |
| Publication date | Aug 13, 2024 |
| Grant date | Aug 13, 2024 |
Official abstract text for this publication.
The disclosure herein describes a system for generating embeddings representing sequential human activity by self-supervised, deep learning models capable of being utilized by a variety of machine learning prediction models to create predictions and recommendations. An encoder-decoder is provided to create user-specific journeys, including sequenced events, based on human activity data from a plurality of tables, a customer data platform, or other sources. Events are represented by sequential feature vectors. A user-specific embedding representing user activities in relationship to activities of one or more other users is created for each user in a plurality of users. The embeddings are updated in real-time as new activity data is received. The embeddings can be fine-tuned using labeled data to customize the embeddings for a specific predictive model. The embeddings are utilized by predictive models to create product recommendations and predictions, such as customer churn, next steps in a customer journey, etc.
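The journey-building step described in the abstract can be illustrated with a minimal sketch. It assumes events arrive as `(user_id, timestamp, event_type)` records and that each event is encoded as a one-hot vector; the record layout, the `EVENT_TYPES` vocabulary, and the helper names `one_hot` and `build_journeys` are hypothetical and are not taken from the patent.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw events, as might be pulled from several tables or a
# customer data platform: (user_id, ISO timestamp, event_type).
EVENTS = [
    ("u1", "2020-05-02T10:00:00", "add_to_cart"),
    ("u2", "2020-05-01T09:30:00", "page_view"),
    ("u1", "2020-05-01T08:00:00", "page_view"),
    ("u1", "2020-05-03T12:15:00", "purchase"),
    ("u2", "2020-05-04T17:45:00", "support_call"),
]

# Assumed event vocabulary; the order fixes the one-hot positions.
EVENT_TYPES = ["page_view", "add_to_cart", "purchase", "support_call"]


def one_hot(event_type):
    """Encode an event type as a simple sequential feature vector."""
    return [1.0 if event_type == t else 0.0 for t in EVENT_TYPES]


def build_journeys(events):
    """Group events by user and order each group by its time indicator."""
    by_user = defaultdict(list)
    for user_id, timestamp, event_type in events:
        by_user[user_id].append((datetime.fromisoformat(timestamp), event_type))
    journeys = {}
    for user_id, user_events in by_user.items():
        user_events.sort(key=lambda pair: pair[0])  # oldest event first
        journeys[user_id] = [one_hot(event_type) for _, event_type in user_events]
    return journeys


journeys = build_journeys(EVENTS)
```

Each value in `journeys` is a time-ordered list of feature vectors for one user, which is the kind of input an encoder could consume. A real system would likely use richer event features than a one-hot event type.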
Opening claim text (preview).
What is claimed is:

1. A system for generating embeddings representing sequential human activity, the system comprising:
a plurality of data sources associated with at least one data storage device storing unlabeled activity data associated with a plurality of users and a set of time indicators, the unlabeled activity data describing human activity-related events having order over time, and unlabeled non-sequential data associated with the plurality of users, the unlabeled non-sequential data representing non-sequential user-specific data;
a computer-readable medium storing instructions that are operative upon execution by a processor to:
create a plurality of non-sequential feature vectors based on the unlabeled non-sequential data;
create, by a sequencing component associated with a neural network, a plurality of user-specific journeys based on the unlabeled activity data, a user-specific journey in the plurality of user-specific journeys comprising a plurality of sequential feature vectors corresponding to a set of events associated with selected user placed into a sequence in accordance with the set of time indicators;
combine, by an embedding component, the plurality of sequential feature vectors in the plurality of user-specific journeys and the plurality of non-sequential feature vectors; and
generate, by the embedding component, a plurality of embeddings based on the combination of the plurality of sequential feature vectors in the plurality of user-specific journeys and the plurality of non-sequential feature vectors and a set of weights, an embedding in the plurality of embeddings comprising a set of fixed length vectors representing sequential human activity of the selected user, wherein the plurality of embeddings are suitable for utilization by a plurality of prediction models configured to generate user-specific activity predictions and recommendations;
generate, by a decoder component, a plurality of regenerated sequential feature vectors and a plurality of regenerated non-sequential feature vectors based on the plurality of embeddings;
compare, by a comparison component, the plurality of regenerated sequential feature vectors to the plurality of sequential feature vectors and the plurality of regenerated non-sequential feature vectors to the plurality of non-sequential feature vectors to identify a set of errors, wherein the set of errors are used to update the set of weights; and
generate explainability data, the explainability data indicating contribution of a dimension value of the embedding to the user-specific predictions generated by the plurality of prediction models.

2. The system of claim 1, further comprises:
an encoder-decoder framework of the neural network providing a self-supervised activity sequencing model configured to generate the plurality of embeddings based on the unlabeled activity data obtained from the plurality of data sources, wherein the sequencing component and the embedding component are part of the self-supervised activity sequencing model.

3. The system of claim 2, wherein the instructions are further operative to:
analyze, by an encoder component, unlabeled input data for training the self-supervised activity sequencing model, wherein the unlabeled input data comprises historical human activity data;
generate, by the encoder component, an embedding representing sequenced human activity; and
perform, by the comparison component, back propagation.

4. The system of claim 1, wherein the selected user is a first user and wherein instructions are further operative to:
receive, from a customer data platform, updated activity data representing a set of new activities by a second user in the plurality of users in real-time;
generate, by the sequencing component, an updated user-specific journey for the second user; and
create, by the embedding component, an updated user-specific embedding for the second user, wherein the updated user-specific embedding comprises a set of fixed length embeddings representing human activities associated with the second user, including the set of new activities described in the updated activity data.

5. The system of claim 1, wherein the explainability data is presented as a separate output from the plurality of embeddings.

6. The system of claim 1, wherein the embedding component comprises at least one long short-term memory (LSTM) artificial recurrent neural network architecture for generating the plurality of embeddings.

7. The system of claim 1, wherein the instructions are further operative to:
fine-tune the embedding component using labeled input data to generate embeddings for a predictive model selected from a set of machine learning (ML) prediction models.

8. A method of generating embeddings representing sequential human activity, the method comprising:
creating, by a sequencing component, a plurality of user-specific journeys based on unlabeled activity data obtained from a plurality of data sources, a user-specific journey in the plurality of user-specific journeys comprising a plurality of sequential feature vectors corresponding to a set of events associated with selected user placed into a sequence in accordance with a set of time indicators;
creating a plurality of non-sequential feature vectors based on unlabeled non-sequential data obtained from the plurality of data sources, the unlabeled non-sequential data representing non-sequential user-specific data;
combining, by an embedding component, the plurality of sequential feature vectors in the plurality of user-specific journeys and the plurality of non-sequential feature vectors;
generating, by the embedding component, a plurality of embeddings based on the combination of the plurality of sequential feature vectors in the plurality of user-specific journeys and the plurality of non-sequential feature vectors and a set of weights, an embedding in the plurality of embeddings comprising a set of fixed length vectors representing sequential human activity of the selected user;
outputting the plurality of embeddings to a set of machine learning prediction models for generating user-specific activity predictions and recommendations based on the unlabeled activity data and the unlabeled non-sequential data associated with a plurality of users;
generating, by a decoder component, a plurality of regenerated sequential feature vectors and a plurality of regenerated non-sequential feature vectors based on the plurality of embeddings;
comparing, by a comparison component, the plurality of regenerated sequential feature vectors to the plurality of sequential feature vectors and the plurality of regenerated non-sequential feature vectors to the plurality of non-sequential feature vectors to identify a set of errors, wherein the set of errors are used to update the set of weights; and
generating explainability data, the explainability data indicating contribution of a dimension value of the embedding to the user-specific predictions generated by the plurality of prediction models.

9. The method of claim 8, further comprising:
analyzing, by an encoder component, input data for training a self-supervised activity sequencing model, wherein the input data comprises unlabeled historical human activity data;
generating, by the encoder component, an embedding representing sequenced human activity; and
performing, by the comparison component, back propagation.

10. The method of claim 8, further comprising:
receiving, from a customer data platform, updated activity data representing a set of new activities by a second user in the plurality of users in real-time;
generating, by the sequencing component, an updated user-specific journey for the second user; and
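The combine, embed, regenerate, compare, and update steps recited in claims 1 and 8 can be sketched as a small self-supervised autoencoder. This is a toy illustration under stated assumptions, not the patented implementation: the data is synthetic, each journey is mean-pooled into a single vector (claim 6 instead names an LSTM), the encoder and decoder are single linear layers, and the decoder regenerates the pooled features rather than the full sequence. All names (`W_enc`, `W_dec`, `forward`, `EMBED_DIM`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_USERS, N_EVENT_TYPES, N_STATIC, EMBED_DIM = 8, 4, 2, 3

# Synthetic stand-in data (assumption): each journey is a time-ordered array of
# one-hot event vectors; each user also has a non-sequential feature vector.
journeys = [
    np.eye(N_EVENT_TYPES)[rng.integers(0, N_EVENT_TYPES, size=int(rng.integers(2, 6)))]
    for _ in range(N_USERS)
]
static = rng.random((N_USERS, N_STATIC))

# Combine: mean-pool each journey (a simplification; claim 6 names an LSTM)
# and concatenate the result with the non-sequential features.
pooled = np.stack([journey.mean(axis=0) for journey in journeys])
X = np.concatenate([pooled, static], axis=1)  # shape (N_USERS, 6)

# The "set of weights": one linear encoder and one linear decoder.
W_enc = rng.normal(scale=0.1, size=(X.shape[1], EMBED_DIM))
W_dec = rng.normal(scale=0.1, size=(EMBED_DIM, X.shape[1]))


def forward(X, W_enc, W_dec):
    """Encode inputs to fixed-length embeddings, then regenerate the inputs."""
    Z = X @ W_enc       # embeddings, shape (N_USERS, EMBED_DIM)
    X_hat = Z @ W_dec   # regenerated feature vectors
    return Z, X_hat


LEARNING_RATE = 0.1
losses = []
for _ in range(1000):
    Z, X_hat = forward(X, W_enc, W_dec)
    error = X_hat - X                        # compare regenerated vs. original
    losses.append(float(np.mean(error ** 2)))
    grad_out = 2.0 * error / error.size      # d(mean squared error)/d(X_hat)
    grad_W_dec = Z.T @ grad_out              # back propagation through decoder
    grad_W_enc = X.T @ (grad_out @ W_dec.T)  # back propagation through encoder
    W_dec -= LEARNING_RATE * grad_W_dec      # errors update the weights
    W_enc -= LEARNING_RATE * grad_W_enc

embeddings, _ = forward(X, W_enc, W_dec)
```

After training, each row of `embeddings` is a fixed-length vector for one user that could be passed to a downstream prediction model. No labels are used: the training signal comes only from the reconstruction error, which is what makes the setup self-supervised.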
Auto-encoder networks; Encoder-decoder networks · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Supervised learning · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Backpropagation, e.g. using gradient descent · CPC title