Privacy against inference attacks for large data
US-2015379275-A1 · Dec 31, 2015 · US
US10176245B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10176245-B2 |
| Application number | US-56688209-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 25, 2009 |
| Priority date | Sep 25, 2009 |
| Publication date | Jan 8, 2019 |
| Grant date | Jan 8, 2019 |
Official abstract text for this publication.
A computer-implemented method, system, and computer program product for producing a semantic query by example are provided. The method includes receiving examples of potential results from querying a database table with an associated ontology, and extracting features from the database table and the examples based on the associated ontology. The method further includes training a classifier based on the examples and the extracted features, and applying the classifier to the database table to obtain a semantic query result. The method also includes outputting the semantic query result to a user interface, and requesting user feedback of satisfaction with the semantic query result. The method additionally includes updating the classifier and the semantic query result iteratively in response to the user feedback.
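The loop described in the abstract (train on examples, apply to the table, ask for feedback, update) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the dictionary of precomputed feature vectors, and the `feedback` callback are all hypothetical, and a simple nearest-centroid rule stands in for the classifier.

```python
import random


def sample_negatives(table, positives, k, seed=0):
    """Randomly pick up to k rows that the user did not give as examples."""
    rng = random.Random(seed)
    pool = [row for row in table if row not in positives]
    return rng.sample(pool, min(k, len(pool)))


def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(features, positives, negatives):
    """Stand-in classifier (nearest centroid); an assumption for brevity."""
    pos_c = centroid([features[r] for r in positives])
    neg_c = centroid([features[r] for r in negatives])
    return lambda row: sq_dist(features[row], pos_c) < sq_dist(features[row], neg_c)


def query_by_example(table, features, positives, feedback, max_rounds=3):
    """Train, apply, ask for feedback, and retrain until the user is satisfied."""
    positives, negatives, result = list(positives), [], []
    for _ in range(max_rounds):
        if not negatives:  # (re)draw random negatives outside the positive set
            negatives = sample_negatives(table, positives, k=len(positives))
        predict = train(features, positives, negatives)
        result = [row for row in table if predict(row)]
        corrections = feedback(result)  # hypothetical: {row: True/False}, empty if satisfied
        if not corrections:
            break
        for row, is_positive in corrections.items():
            keep, drop = (positives, negatives) if is_positive else (negatives, positives)
            if row in drop:
                drop.remove(row)
            if row not in keep:
                keep.append(row)
    return result


# Toy usage: rows are keyed by name, features are made-up 2-D vectors.
table = ["aspirin", "ibuprofen", "hammer", "wrench"]
features = {"aspirin": [1.0, 5.0], "ibuprofen": [1.0, 4.0],
            "hammer": [5.0, 1.0], "wrench": [4.0, 1.0]}
truth = {"aspirin", "ibuprofen"}


def feedback(result):
    """Simulated user: flags wrong hits and missing rows."""
    fixes = {r: False for r in result if r not in truth}
    fixes.update({r: True for r in truth if r not in result})
    return fixes


answer = query_by_example(table, features, ["aspirin"], feedback)
```

The sketch assumes the table always contains at least one row outside the positive examples and that the user never rejects every positive example; a fuller version would guard both cases.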
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for producing a semantic query by example, comprising:
receiving a database table and an ontology associated with the database table;
receiving, from a user, examples of positive query results for an undefined database query, wherein the examples of positive query results are a user-generated list of examples of known related words;
extracting features from the database table and the examples of positive query results based on the associated ontology, wherein the extracted features of the database table define associations between records in tuples of the database table and concepts of the associated ontology;
training a classifier based at least in part on training data, the training data comprising the examples received from the user, the extracted features, and a negative example that is a randomly selected tuple of the database table that is not included in the examples received from the user;
refining an accuracy of the classifier by executing active learning logic to:
generate a support vector machine for a candidate pool of examples, wherein the support vector machine comprises a decision boundary and a negative margin;
present a first candidate example outside the negative margin to a user for labeling;
receive an indication that the first candidate example has been labeled as negative by the user;
present a second candidate example to the user for labeling, wherein the second candidate example is closer to the decision boundary than the first candidate example and within the negative margin;
receive an indication that the second candidate example has been labeled negative by the user;
in response to the user having labeled the second candidate example as negative and the second candidate example being closer to the decision boundary than the first candidate example, shift the negative margin closer to the decision boundary;
select the second candidate example as an optimized example; and
include the optimized example in the training data;
applying the classifier to the database table to obtain a semantic query result; and
outputting the semantic query result to a user interface.

2. The method of claim 1 further comprising: storing the extracted features from the database table as classifying feature vectors defined in a feature space; and creating training data in the feature space from the examples, wherein the training of the classifier is performed by machine learning in the feature space using the classifying feature vectors and the training data.

3. The method of claim 2 wherein the classifying feature vectors are stored during an off-line phase prior to receiving the examples, and the training data are created in an on-line phase in response to receiving the examples.

4. The method of claim 2 further comprising: generating negative examples as samples selected from the database table; and incorporating the negative examples into the training data.

5. The method of claim 2 wherein each tuple in the database table is associated with one or more concept nodes in the ontology based on mapping between values in the records and concept names in the ontology.

6. The method of claim 2 wherein unique concepts in the ontology are dimensions in the feature space, the value of each dimension is the shortest distance between a selected record associated with one of the unique concepts and other nodes in the ontology representing other unique concepts, and the number of dimensions is limited to a maximum value.

7. The method of claim 6 further comprising: finding concept nodes associated with the selected record by matching concept node labels to a value of the selected record; using one of a depth-first search and a breadth-first search to find concept nodes in a neighborhood, wherein the search is limited to the maximum value; and constructing the classifying feature vectors as a union of neighborhood concept nodes and associated distances.

8. A system for producing a semantic query by example, comprising: semantic query by example logic configured to execute on a processing unit and output to a user interface, wherein the semantic query by example logic is configured to perform a method comprising:
receiving a database table and an ontology associated with the database table;
receiving, from a user, examples of positive query results for an undefined database query, wherein the examples of positive query results are a user-generated list of examples of known related words;
extracting features from the database table and the examples of positive query results based on the associated ontology, wherein the extracted features of the database table define associations between records in tuples of the database table and concepts of the associated ontology;
training a classifier based at least in part on the examples received from the user, the extracted features, and a negative example that is a randomly selected tuple of the database table that is not included in the examples received from the user;
refining an accuracy of the classifier by executing active learning logic to:
generate a support vector machine for a candidate pool of examples, wherein the support vector machine comprises a decision boundary and a negative margin;
present a first candidate example outside the negative margin to a user for labeling;
receive an indication that the first candidate example has been labeled as negative by the user;
present a second candidate example to the user for labeling, wherein the second candidate example is closer to the decision boundary than the first candidate example and within the negative margin;
receive an indication that the second candidate example has been labeled negative by the user;
in response to the user having labeled the second candidate example as negative and the second candidate example being closer to the decision boundary than the first candidate example, shift the negative margin closer to the decision boundary;
select the second candidate example as an optimized example; and
include the optimized example in the training data;
applying the classifier to the database table to obtain a semantic query result; and
outputting the semantic query result to the user interface.

9. The system of claim 8 wherein the semantic query by example logic is further configured to perform: storing the extracted features from the database table as classifying feature vectors defined in a feature space; and creating training data in the feature space from the examples, wherein the training of the classifier is performed by machine learning in the feature space using the classifying feature vectors and the training data.

10. The system of claim 9 wherein the classifying feature vectors are stored during an off-line phase prior to receiving the examples, and the training data are created in an on-line phase in response to receiving the examples.

11. The system of claim 9 wherein unique concepts in the ontology are dimensions in the feature space, the value of each dimension is the shortest distance between a selected record associated with one of the unique concepts and other nodes in the ontology representing other unique concepts, and the number of dimensions is limited to a maximum value.

12. The system of claim 11 wherein the semantic query by example logic is further configured to perform: finding concept nodes associated with the selected record by matching concept node labels to a value of the selected record; using one of a depth-first search and a breadth-first search to find concept nodes in a neighborhood, wherein the search is limited to the maxim
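
The feature construction recited in claims 6 and 7 (match record values to concept labels, search the ontology neighborhood, record shortest distances) might look roughly like the sketch below. It is an illustration only: the ontology is assumed to be an undirected adjacency dictionary, the "maximum value" is interpreted here as a hop limit, and the names `concept_features`, `to_vector`, and `max_hops` are hypothetical.

```python
from collections import deque


def concept_features(ontology, record_values, max_hops=2):
    """Breadth-first search from the concepts matched by a record.

    Returns {concept: shortest hop distance} for every concept reachable
    within max_hops of any matched concept (the union of neighborhoods).
    """
    # Match concept node labels to the record's values (exact match assumed).
    starts = [value for value in record_values if value in ontology]
    dist = {concept: 0 for concept in starts}
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:  # bounded search: do not expand further
            continue
        for neighbor in ontology.get(node, ()):
            if neighbor not in dist:  # first visit in BFS is the shortest path
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def to_vector(dist, dimensions, missing):
    """Lay the sparse distances out on a fixed list of concept dimensions."""
    return [dist.get(concept, missing) for concept in dimensions]


# Toy ontology, invented for illustration.
ontology = {
    "aspirin": ["NSAID"],
    "ibuprofen": ["NSAID"],
    "NSAID": ["aspirin", "ibuprofen", "drug"],
    "drug": ["NSAID", "substance"],
    "substance": ["drug"],
}

sparse = concept_features(ontology, ["aspirin", "100mg"], max_hops=2)
dense = to_vector(sparse, ["aspirin", "ibuprofen", "NSAID", "drug", "substance"], missing=3)
```

Because the search starts from all matched concepts at once, each recorded distance is the shortest one from any of them. Concepts beyond the hop limit never enter the dictionary, which keeps the number of populated dimensions bounded; here they receive the placeholder value `missing` when a dense vector is needed.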
Physics · mapped topic
using natural language analysis · CPC title