Data imputation of unknown-unknown data and use thereof

US2025086501A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2025086501-A1
Application number: US-202318541768-A
Country: US
Kind code: A1
Filing date: Dec 15, 2023
Priority date: Sep 12, 2023
Publication date: Mar 13, 2025
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure provide for improved data imputation and use of imputed data in processing of downstream models. Some embodiments specially train a model that performs improved data imputation utilizing a specially-configured attention mechanism. Some embodiments train a model utilizing stratified masking. Some embodiments train a particular pre-processing layer of a downstream task-specific model to adaptively learn threshold values for imputing particular data. The pre-processing layer is usable to improve accuracy training and/or use of a downstream task-specific model based at least in part on the imputed data.
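
The abstract names stratified masking but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: parameters are grouped into frequency strata, and each record has one parameter masked by first picking a stratum and then a parameter within it, so that rare parameters are masked about as often as common ones. The function names, the number of strata, and the sample records are all hypothetical and are not taken from the patent.

```python
import random
from collections import Counter


def build_strata(records, n_strata=3):
    """Assign each parameter to a frequency stratum (0 = rarest).

    Assumption: strata are equal-sized groups of parameters ranked by
    how often they occur across all records.
    """
    counts = Counter(p for rec in records for p in rec)
    ranked = sorted(counts, key=lambda p: (counts[p], p))
    size = max(1, -(-len(ranked) // n_strata))  # ceiling division
    return {p: i // size for i, p in enumerate(ranked)}


def stratified_mask(record, strata, rng):
    """Mask one parameter: choose a stratum uniformly, then a parameter in it."""
    by_stratum = {}
    for p in sorted(record):
        by_stratum.setdefault(strata[p], []).append(p)
    stratum = rng.choice(sorted(by_stratum))
    masked = rng.choice(by_stratum[stratum])
    # The visible part is the model input; the masked parameter is the target.
    return record - {masked}, masked


# Hypothetical truth source: each record is a set of observed parameters.
records = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "D"},
    {"A", "B", "E"},
]
strata = build_strata(records)
rng = random.Random(0)
augmented = [stratified_mask(rec, strata, rng) for rec in records]
```

Each pair in `augmented` holds the visible parameters and the parameter that was hidden, which a model could then be trained to recover.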

First claim

Opening claim text (preview).

1. A computer-implemented method comprising: identifying, by one or more processors, a truth source data set associated with a plurality of data parameters; generating, by the one or more processors, an updated truth source data set by augmenting the truth source data set utilizing a stratified masking algorithm that masks at least one data parameter from the truth source data set; generating, by the one or more processors and using a trained model that comprises at least one attention layer, a probability data set that comprises a particular probability that a particular data parameter should be present in the updated truth source data set; and generating, by the one or more processors and based at least in part on the probability data set, a probability threshold set corresponding to each data parameter represented in the probability data set.

2. The computer-implemented method of claim 1, wherein generating the probability threshold set comprises: training, by the one or more processors, a task-specific model to generate at least the probability threshold set, wherein the task-specific model comprises at least one pre-processing layer that learns a particular probability threshold for each data parameter of the plurality of data parameters.

3. The computer-implemented method of claim 2, further comprising: determining, by the one or more processors, that the particular probability corresponding to the particular data parameter satisfies the particular probability threshold corresponding to the particular data parameter; and in response to determining that the particular probability threshold is satisfied, skipping, by the one or more processors, updating of the particular probability data in an updated probability data set.

4. The computer-implemented method of claim 2, further comprising: determining, by the one or more processors, that the particular probability corresponding to the particular data parameter does not satisfy the particular probability threshold corresponding to the particular data parameter; and generating, by the one or more processors, updated probability data corresponding to the particular probability threshold by updating the particular probability data corresponding to the particular probability threshold to zero in response to determining that the particular probability will not satisfy.

5. The computer-implemented method of claim 4, further comprising: applying, by the one or more processors, a second data set to the task-specific model, wherein the task-specific model is configured to ignore at least one non-imputed data parameter based at least in part on the updated probability data.

6. The computer-implemented method of claim 2, wherein the task-specific model is determined based at least in part on a machine learning task determined to be performed.

7. The computer-implemented method of claim 2, further comprising: training, by the one or more processors, a second task-specific model to generate at least a second probability threshold set, wherein the second task-specific model comprises at least one second pre-processing layer that learns a second particular probability threshold for each data parameter of the plurality of data parameters; and generating, by the one or more processors and based at least in part on the probability data set, the second probability threshold set corresponding to each data parameter represented in the probability data set.

8. The computer-implemented method of claim 1, wherein identifying the truth source data set comprises combining a first set of data and a second set of data based at least in part on identifiers shared between the first set of data and the second set of data.

9. The computer-implemented method of claim 1, further comprising training the trained model by at least: applying, by the one or more processors, at least a subset of the updated truth source data set corresponding to a particular identifier to a transformer model.

10. The computer-implemented method of claim 1, wherein the at least one attention layer comprises a set attention block comprising a plurality of layers, wherein at least a subset of the updated truth source data set is processed via the plurality of layers of the set attention block, and wherein attention block output from the set attention block is provided to a parallel linear block that generates a tensor corresponding to the attention block output.

11. The computer-implemented method of claim 10, wherein the tensor is applied to a sigmoid activation function that outputs the probability that the particular data parameter should be present in the updated truth source data set for each data parameter of the any number of data parameters.

12. The computer-implemented method of claim 1, further comprising: receiving, by the one or more processors, a second parameter data set; and generating, by the one or more processors, task-specific results by applying the second parameter data set to a task-specific model trained based at least in part on the probability threshold set.

13. A system comprising memory and one or more processors communicatively coupled to the memory, the one or more processors configured to: identify a truth source data set associated with a plurality of data parameters; generate an updated truth source data set by augmenting the truth source data set utilizing a stratified masking algorithm that masks at least one data parameter from the truth source data set; generate, using a trained model that comprises at least one attention layer, a probability data set that comprises a particular probability that a particular data parameter should be present in the updated truth source data set; and generate, based at least in part on the probability data set, a probability threshold set corresponding to each data parameter represented in the probability data set.

14. The system of claim 13, wherein to generate the probability threshold set the one or more processors are further configured to: train a task-specific model to generate at least the probability threshold set, wherein the task-specific model comprises at least one pre-processing layer that learns a particular probability threshold for each data parameter of the plurality of data parameters.

15. The system of claim 14, wherein the one or more processors are further configured to: determine that the particular probability corresponding to the particular data parameter satisfies the particular probability threshold corresponding to the particular data parameter; and in response to determining that the particular probability threshold is satisfied, skip updating of the particular probability data in an updated probability data set.

16. The system of claim 14, wherein the one or more processors are further configured to: determine that the particular probability corresponding to the particular data parameter does not satisfy the particular probability threshold corresponding to the particular data parameter; and generate updated probability data corresponding to the particular probability threshold by updating the probability data corresponding to the particular probability threshold to zero in response to determining that the particular probability will not satisfy.

17. The system of claim 16, wherein the one or more processors are further configured to: apply a second data set to the task-specific model, wherein the task-specific model is configured to ignore at least one non-imputed data parameter based at least in part on the
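
Claims 3, 4, and 11 together describe a simple gating step: a sigmoid turns model output into a per-parameter probability, and each probability is either kept (threshold satisfied) or set to zero (threshold not satisfied). The sketch below illustrates that step only. It assumes that "satisfies" means greater than or equal to the threshold, and it uses fixed, hypothetical threshold values and logits in place of the thresholds a trained pre-processing layer would learn and the tensor a trained attention model would produce.

```python
import math


def sigmoid(x):
    """Map a raw model output (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def apply_thresholds(probabilities, thresholds):
    """Keep a probability that meets its threshold; otherwise set it to zero.

    Assumption: a probability "satisfies" its threshold when it is >= it.
    """
    return {
        param: (p if p >= thresholds[param] else 0.0)
        for param, p in probabilities.items()
    }


# Hypothetical logits standing in for the output of the trained model.
logits = {"param_a": 2.0, "param_b": -1.0, "param_c": 0.0}
probabilities = {param: sigmoid(z) for param, z in logits.items()}

# Hypothetical per-parameter thresholds; in the claims these are learned.
thresholds = {"param_a": 0.5, "param_b": 0.5, "param_c": 0.6}

updated = apply_thresholds(probabilities, thresholds)
```

Here `param_a` keeps its probability, while `param_b` and `param_c` are zeroed, so a downstream task-specific model could treat them as not imputed.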

Assignees

Optum Inc

Inventors

Classifications

  • Backpropagation, e.g. using gradient descent · CPC title

  • Combinations of networks · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025086501A1 cover?
Embodiments of the present disclosure provide for improved data imputation and use of imputed data in processing of downstream models. Some embodiments specially train a model that performs improved data imputation utilizing a specially-configured attention mechanism. Some embodiments train a model utilizing stratified masking. Some embodiments train a particular pre-processing layer of a downs…
Who is the assignee on this patent?
Optum Inc
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 13, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).