Automated machine-learning dataset preparation
US2021398024A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2021398024-A1 |
| Application number | US-202016908051-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 22, 2020 |
| Priority date | Jun 22, 2020 |
| Publication date | Dec 23, 2021 |
| Grant date | — |
Official abstract text for this publication.
A device may receive a data input that is associated with an event. The device may parse the data input to identify an input value that is associated with the event. The device may determine a probability that the input value corresponds to a feature of the event based on a configuration of the input value. The device may classify the input value as being associated with an element of the event based on the probability. The device may determine a rule profile of the input value based on the feature and the element. The device may determine a profile score associated with the data input based on the rule profile. The device may ingest, based on the profile score, the data input into a data structure. The device may determine a validation score based on a random factorization analysis of the rule profile and the input value.
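The flow in the abstract (parse, estimate feature probabilities, classify by the highest probability, score the input, ingest if the score is high enough) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the publication: the regex patterns in `FEATURE_PATTERNS`, the 0.9/0.1 toy scoring, the use of a mean as the profile score, and the `threshold` default. The publication describes trained models for these steps; the sketch substitutes simple pattern matching. The validation step (the "random factorization analysis") is not specified in this excerpt and is left out.

```python
import re
from statistics import mean

# Hypothetical feature patterns (assumption): the excerpt does not say how
# the "configuration" of an input value maps to a feature.
FEATURE_PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "amount": re.compile(r"\d+\.\d{2}"),
    "account_id": re.compile(r"[A-Z]{2}\d{6}"),
}


def feature_probabilities(value):
    """Return a toy probability per candidate feature for one input value."""
    # A full pattern match scores 0.9, anything else 0.1; the scores are
    # then normalised so they sum to 1 across the candidate features.
    raw = {
        name: 0.9 if pattern.fullmatch(value) else 0.1
        for name, pattern in FEATURE_PATTERNS.items()
    }
    total = sum(raw.values())
    return {name: score / total for name, score in raw.items()}


def classify(value):
    """Pick the feature with the highest probability for this value."""
    probs = feature_probabilities(value)
    feature = max(probs, key=probs.get)
    return feature, probs[feature]


def profile_score(values):
    """Toy profile score (assumption): mean of the winning probabilities."""
    return mean(classify(v)[1] for v in values)


def ingest(data_input, store, threshold=0.5):
    """Append the parsed input to `store` only if its score meets the threshold."""
    values = [v.strip() for v in data_input.split(",")]
    score = profile_score(values)
    if score >= threshold:
        store.append({"values": values, "score": score})
        return True
    return False
```

With these toy patterns, `ingest("2020-06-22, 19.99, AB123456", store)` stores the input because every value matches one pattern strongly, while `ingest("hello, world", store)` is rejected because no feature stands out.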
Opening claim text (preview).
What is claimed is:

1. A method, comprising: receiving, by a device and from a source device, a data input that is associated with an event that is associated with the source device; parsing, by the device, the data input to identify an input value that is associated with the event; determining, by the device and using a first model, a probability that the input value corresponds to a feature of the event based on a configuration of the input value; classifying, by the device and using the first model, the input value as being associated with an element of the event based on the probability; determining, by the device and using the first model, a profiling rule to profile of the input value based on the feature and the element; profiling, by the device and using the first model, the data input based on the profiling rule; ingesting, by the device and based on a result of profiling the data input and an ingest rule determined by the first model, the data input into a data structure associated with an entity involved in the event; validating, by the device and based on ingesting the data input and a validation rule, the data input; and performing, by the device and based on validating the data input, an action associated with the data input and the entity.

2. The method of claim 1, wherein parsing the data input comprises: determining a data format of the data input based on a characteristic of the data input; and parsing the data input based on the data format.

3. The method of claim 1, further comprising, prior to determining the probability: selecting, based on the configuration, the feature from a plurality of features based on the configuration being mapped to the feature, wherein the probability is determined based on the input value and the configuration being mapped to the feature.

4. The method of claim 1, wherein determining the probability comprises: using a deep learning by semantic detection model, of the first model, to: analyze the configuration of the input value based on text types of the input value; determine that the probability corresponds to a highest probability that the feature corresponds to the input value relative to a plurality of other probabilities of other potential features associated with the event; and select the probability and the feature as being associated with the input value based on determining that the probability corresponds to the highest probability.

5. The method of claim 1, wherein the input value is one of a plurality of input values of the data input, and wherein profiling the data input comprises: determining corresponding probabilities that the plurality of input values correspond to certain features of the element; and determining a profile score of the data input based on the corresponding probabilities.

6. The method of claim 1, wherein performing the action comprises: determining a validation score associated with validating the data input; determining that the validation score satisfies a threshold validation score; and updating the first model based on the result of profiling the data input.

7. The method of claim 6, wherein performing the action comprises: processing, based on the validation score satisfying the threshold validation score, the data input based on the feature or the element; and providing, via a user interface, a data output corresponding to a result of processing the data input.

8. A non-transitory computer-readable medium storing instructions, the instructions comprising: one or more instructions that, when executed by one or more processors of a device, cause the device to: receive a data input that is associated with an event; parse the data input to identify an input value that is associated with the event; determine a probability that the input value corresponds to a feature of the event based on a configuration of the input value; classify the input value as being associated with an element of the event based on the probability; determine a rule profile of the input value based on the feature and the element; determine, using a data analysis model, a profile score associated with the data input based on the feature, the element, and the rule profile, wherein the data analysis model is trained to validate data inputs based on historical data inputs and historical information associated with features of input values of the historical data inputs; and perform, based on the profile score, an action associated with the input value and the data input.

9. The non-transitory computer-readable medium of claim 8, wherein the one or more instructions, that cause the one or more processors to parse the data input, cause the one or more processors to: determine, using a data resolution model, a data format of the data input, wherein the data resolution model is trained to identify various data formats of data inputs, associated with various data sources, based on historical data inputs associated with a variety of configurations of data the data inputs; and parse the data input based on the data format.

10. The non-transitory computer-readable medium of claim 8, wherein the one or more instructions, when executed by the one or more processors, further cause the one or more processors to, prior to causing the one or more processors to determine the probability: analyze the configuration of the input value based on text types of the input value; determine that the probability corresponds to a highest probability that the feature corresponds to the input value relative to a plurality of other probabilities of other potential features associated with the event; and select the probability and the feature as being associated with the input value based on determining that the probability corresponds to the highest probability.

11. The non-transitory computer-readable medium of claim 8, wherein the one or more instructions, that cause the one or more processors to perform the action, cause the one or more processors to: determine that the profile score satisfies a threshold associated with ingesting the data input; and store, based on the profile score satisfying the threshold, the data input in a data structure associated with an entity involved in the event.

12. The non-transitory computer-readable medium of claim 11, further comprising, wherein the action is a first action: determine, based on storing the data input, a validation score associated with storing the data input based on applying a random factorization analysis to the rule profile and the input value; and perform, based on the validation score, a second action associated with the data input.

13. The non-transitory computer-readable medium of claim 12, wherein the one or more instructions, that cause the one or more processors to perform the second action, cause the one or more processors to: process, according to the rule profile, the data input based on the feature or the element; and update the data analysis model or a feature analysis model based on a result of processing the data input, wherein the feature analysis model was used to determine the probability.

14. The non-transitory computer-readable medium of claim 12, wherein the one or more instructions, that cause the one or more processors to perform the second action, cause the one or more processors to: process, according to the rule profile, the data input based on the feature or the element; and provide, via a user interface, a data output corresponding to a result of processing the data input.
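Claims 2 and 9 describe determining a data format from a characteristic of the data input and then parsing according to that format. A rough sketch of that idea follows. It is only an illustration under stated assumptions: claim 9 recites a trained data resolution model, whereas this sketch swaps in a simple leading-character heuristic, handles only JSON and comma-separated text, and uses the hypothetical helper names `detect_format` and `parse_input`.

```python
import csv
import io
import json


def detect_format(data_input):
    """Guess the data format from one characteristic: the first non-space character."""
    # Assumption: anything starting with "{" or "[" is JSON; all other
    # input is treated as CSV with a header row.
    stripped = data_input.lstrip()
    if stripped.startswith(("{", "[")):
        return "json"
    return "csv"


def parse_input(data_input):
    """Parse the input into a list of records based on the detected format."""
    if detect_format(data_input) == "json":
        loaded = json.loads(data_input)
        # Wrap a single JSON object so both branches return a list of records.
        return loaded if isinstance(loaded, list) else [loaded]
    reader = csv.DictReader(io.StringIO(data_input))
    return [dict(row) for row in reader]
```

Both `parse_input('{"event": "payment", "amount": "19.99"}')` and `parse_input("event,amount\npayment,19.99\n")` yield one record with the same two fields, so later steps can work on a uniform structure regardless of the source format.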
Machine learning · CPC title
Parsing · CPC title
Semantic analysis · CPC title
Ensemble learning · CPC title
for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title