What technology area does this patent fall under?

Primary CPC classification: G06Q30/0201. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jun 27, 2017 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Systems, methods, and apparatuses for implementing data upload, processing, and predictive query API exposure

US9690815B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9690815-B2
Application number	US-201615181256-A
Country	US
Kind code	B2
Filing date	Jun 13, 2016
Priority date	Mar 13, 2013
Publication date	Jun 27, 2017
Grant date	Jun 27, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are systems and methods for implementing data upload, processing, and predictive query API exposure including means for receiving a dataset in a tabular form, the dataset having a plurality of rows and a plurality of columns; processing the dataset to generate indices representing probabilistic relationships between the rows and the columns of the dataset; storing the indices in a database; exposing an Application Programming Interface (API) to query the indices in the database; receiving a request for a predictive query or a latent structure query against the indices in the database; querying the database for a prediction based on the request via the API; and returning the prediction responsive to the request. Other related embodiments are further disclosed.
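The flow in the abstract (receive a table, build indices, expose a query function, return a prediction) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the co-occurrence counts stand in for the patent's probabilistic indices, an in-memory dictionary stands in for the database, and the names `build_indices` and `predict` are hypothetical. It is not the patented method.

```python
from collections import Counter, defaultdict

def build_indices(rows):
    """Count how often each (column, value) pair co-occurs with every other pair.

    Assumption: simple co-occurrence counts, used only as a stand-in for the
    probabilistic indices described in the abstract.
    """
    indices = defaultdict(Counter)
    for row in rows:
        items = list(row.items())
        for given in items:
            for other in items:
                if given[0] != other[0]:
                    indices[given][other] += 1
    return indices

def predict(indices, target_column, given):
    """Return (predicted value, confidence between 0 and 1) for target_column."""
    scores = Counter()
    for pair in given.items():
        for (column, value), count in indices.get(pair, {}).items():
            if column == target_column:
                scores[value] += count
    total = sum(scores.values())
    if total == 0:
        # Nothing in the indices relates to the request: lowest confidence.
        return None, 0.0
    value, count = scores.most_common(1)[0]
    return value, count / total

# Hypothetical tabular dataset: each dict is a row, each key a column.
rows = [
    {"plan": "pro", "region": "us", "churned": "no"},
    {"plan": "pro", "region": "eu", "churned": "no"},
    {"plan": "free", "region": "us", "churned": "yes"},
    {"plan": "free", "region": "eu", "churned": "no"},
    {"plan": "free", "region": "us", "churned": "yes"},
]

indices = build_indices(rows)
print(predict(indices, "churned", {"plan": "free"}))
```

The confidence value in this sketch is just the winning value's share of the matching counts, which keeps it between 0 and 1; the patent does not say its confidence indicator is computed this way.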

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: receiving a dataset in a tabular form, the dataset having a plurality of rows and a plurality of columns; processing the dataset to generate indices representing probabilistic relationships between the rows and the columns of the dataset; storing the indices in a database; exposing an Application Programming Interface (API) to query the indices in the database; receiving a request for a predictive query against the indices in the database; querying the database for a predictive result based on the request via the API; returning the predictive result responsive to the request, the predictive result being probabilistically related to rows or the columns of the dataset or both the rows and the columns of the dataset according to the generated indices representing the probabilistic relationships between the rows and the columns of the dataset; and returning a confidence indicator with the predictive result, wherein the confidence indicator ranges from a minimum of 0 indicating a lowest possible confidence in the accuracy of the predictive result returned to a maximum of 1 indicating a highest possible confidence in the accuracy of the predictive result returned.
2. The method of claim 1, wherein processing the dataset comprises learning a joint probability distribution over the dataset to identify and describe the probabilistic relationships between elements of the dataset.
3. The method of claim 2, wherein the processing is triggered automatically responsive to receiving the dataset, and wherein learning the joint probability distribution is controlled by a default set of configuration parameters.
4. The method of claim 2, wherein learning the joint probability distribution is controlled by specified configuration parameters, the specified configuration parameters including one or more of: a maximum period of time for processing the dataset; a maximum number of iterations for processing the dataset; a minimum number of iterations for processing the dataset; a maximum amount of customer resources to be consumed by processing the dataset; a maximum subscriber fee to be expended processing the dataset; a minimum threshold predictive quality level to be attained by the processing of the dataset; a minimum improvement to a confidence quality measure required for the processing to continue; and a minimum or maximum number of the indices to be generated by the processing.
5. The method of claim 1, wherein: processing the dataset to generate indices comprises iteratively learning joint probability distributions over the dataset to generate the indices; and wherein the method further comprises: periodically determining a confidence quality measure of the indices generated by the processing of the dataset; and terminating processing of the dataset when the confidence quality measure attains a specified threshold.
6. The method of claim 5, further comprising: receiving a predictive query or a latent structure query requesting a result from the indices generated by processing the dataset; and executing the query against the generated indices prior to terminating processing of the dataset.
7. The method of claim 6, further comprising: returning a predictive record set responsive to the predictive query or the latent structure query requesting the result; and returning a notification with the result indicating processing of the dataset has not yet completed or a notification with the result indicating the confidence quality measure is below the specified threshold, or both.
8. The method of claim 5, wherein the confidence quality measure is determined by comparing a known result corresponding to observed and present values within the dataset with a predictive result obtained by querying the indices generated by the processing of the dataset.
9. The method of claim 5, wherein the confidence quality measure is determined by comparing ground truth data from the data set with one or more predictive results obtained by querying the indices generated by the processing of the dataset.
10. The method of claim 1, wherein processing the dataset comprises at least one of: learning a Dirichlet Process Mixture Model (DPMM) of the dataset; learning a cross categorization of the dataset; learning an Indian buffet process model of the dataset; and learning a mixture model or a mixture of finite mixtures model of the dataset.
11. Non-transitory computer readable storage media having instructions stored thereupon that, when executed by a system having at least a processor and a memory therein, the instructions cause the system to perform operations including: receiving a dataset in a tabular form, the dataset having a plurality of rows and a plurality of columns; processing the dataset to generate indices representing probabilistic relationships between the rows and the columns of the dataset; storing the indices in a database; exposing an Application Programming Interface (API) to query the indices in the database; receiving a request for a predictive query against the indices in the database; querying the database for a predictive result based on the request via the API; returning the predictive result responsive to the request, the predictive result being probabilistically related to rows or the columns of the dataset or both the rows and the columns of the dataset according to the generated indices representing the probabilistic relationships between the rows and the columns of the dataset; and returning a confidence indicator with the predictive result, wherein the confidence indicator ranges from a minimum of 0 indicating a lowest possible confidence in the accuracy of the predictive result returned to a maximum of 1 indicating a highest possible confidence in the accuracy of the predictive result returned.
12. The non-transitory computer readable storage media of claim 11: wherein processing the dataset comprises learning a joint probability distribution over the dataset to identify and describe the probabilistic relationships between elements of the dataset.
13. The non-transitory computer readable storage media of claim 12, wherein the processing is triggered automatically responsive to receiving the dataset, and wherein learning the joint probability distribution is controlled by a default set of configuration parameters.
14. The non-transitory computer readable storage media of claim 12, wherein learning the joint probability distribution is controlled by specified configuration parameters, the specified configuration parameters including one or more of: a maximum period of time for processing the dataset; a maximum number of iterations for processing the dataset; a minimum number of iterations for processing the dataset; a maximum amount of customer resources to be consumed by processing the dataset; a maximum subscriber fee to be expended processing the dataset; a minimum threshold predictive quality level to be attained by the processing of the dataset; a minimum improvement to a confidence quality measure required for the processing to continue; and a minimum or maximum number of the indices to be generated by the processing.
15. The non-transitory computer readable storage media of claim 12, wherein: processing the dataset to generate indices comprises iteratively learning joint probability distributions over the dataset to generate the indices; and wherein the method further comprises: periodically determining a confidence quality measure of the indices generated by the processing of the dataset; terminating processing of the dataset when
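
Claims 4, 5, and 8 describe processing that runs iteratively, checks a confidence quality measure periodically, and stops on a threshold or a configured limit. A rough sketch of that control loop follows. The names `ProcessingConfig`, `quality_measure`, and `process` are hypothetical, the "learning" step is a toy that memorizes one row per iteration, and only a few of the listed configuration parameters are modeled; none of this is taken from the patent's actual implementation.

```python
import time
from dataclasses import dataclass

@dataclass
class ProcessingConfig:
    # Hypothetical subset of the configuration parameters listed in claim 4.
    max_seconds: float = 5.0
    max_iterations: int = 100
    min_iterations: int = 1
    quality_threshold: float = 0.9
    check_every: int = 1  # how often (in iterations) quality is measured

def quality_measure(predict, known_cells):
    """Fraction of known (inputs, expected) pairs that predict() reproduces.

    This mirrors the idea in claim 8 of comparing known values from the
    dataset with predicted results.
    """
    if not known_cells:
        return 0.0
    hits = sum(1 for inputs, expected in known_cells if predict(inputs) == expected)
    return hits / len(known_cells)

def process(step, predict, known_cells, config):
    """Run step() until the quality threshold or a configured limit is reached."""
    start = time.monotonic()
    quality = 0.0
    iterations = 0
    while iterations < config.max_iterations:
        if time.monotonic() - start > config.max_seconds:
            break
        step()
        iterations += 1
        if iterations % config.check_every == 0:
            quality = quality_measure(predict, known_cells)
            if iterations >= config.min_iterations and quality >= config.quality_threshold:
                break
    return iterations, quality

# Toy "model": each step memorizes one more (input, output) row.
rows = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
pending = iter(rows)
seen = {}

def step():
    item = next(pending, None)
    if item is not None:
        seen[item[0]] = item[1]

iterations, quality = process(step, seen.get, rows, ProcessingConfig(quality_threshold=0.75))
print(iterations, quality)
```

Because `predict` is passed in as a plain callable, queries can be answered from the partially built state at any point in the loop, which is the situation claims 6 and 7 address with their "processing has not yet completed" notification.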

Assignees

Salesforce Com Inc

Classifications

G06F16/244
Grouping and aggregation · CPC title
G06F16/316
Indexing structures · CPC title
G06F16/2458
Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries · CPC title
G06F17/18
for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title
G06F16/2445
Data retrieval commands; View definitions · CPC title

Patent family

Related publications grouped by family.

View patent family 51532089

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9690815B2 cover?: Disclosed herein are systems and methods for implementing data upload, processing, and predictive query API exposure including means for receiving a dataset in a tabular form, the dataset having a plurality of rows and a plurality of columns; processing the dataset to generate indices representing probabilistic relationships between the rows and the columns of the dataset; storing the indices i…
Who is the assignee on this patent?: Salesforce Com Inc