Learning systems and methods
US-2015055855-A1 · Feb 26, 2015 · US
US9424530B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9424530-B2 |
| Application number | US-201514604765-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 26, 2015 |
| Priority date | Jan 26, 2015 |
| Publication date | Aug 23, 2016 |
| Grant date | Aug 23, 2016 |
Official abstract text for this publication.
Methods, computing systems and computer program products implement embodiments of the present invention that include selecting a training dataset including training instances having respective training features, and applying a classifier to the training dataset, thereby generating a training classification that assigns, to each of the training instances, one of a plurality of categories, the classifier having an expected classification. A classification bias is detected in the training classification relative to the expected classification, and in response to the classification bias, a calibration matrix is defined based on the training features, and the classification bias. A production dataset including production instances is selected, and the classifier and the calibration matrix are applied to the production dataset, thereby generating a production classification quantification that assigns, to each of the production instances, one of the plurality of categories.
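The flow described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented method: `toy_classifier`, the two categories, and all the data are hypothetical, and the calibration step swaps in the well-known "adjusted count" correction (a matrix of P(predicted category | true category) estimated on the annotated training set) rather than the feature-conditional matrix the claims recite. It also produces an aggregate category share for the production set, not a per-instance assignment.

```python
from collections import Counter

CATEGORIES = ("pos", "neg")  # hypothetical categories

def toy_classifier(features):
    # Hypothetical stand-in classifier: a single keyword rule.
    return "pos" if "good" in features else "neg"

def shares(labels):
    # Fraction of labels falling in each category (labels must be non-empty).
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in CATEGORIES}

def detect_bias(predicted, expected):
    # Per-category gap between the training classification and the expected one.
    p, e = shares(predicted), shares(expected)
    return {c: p[c] - e[c] for c in CATEGORIES}

def calibration_matrix(predicted, expected):
    # matrix[t][p] = estimated P(classifier outputs p | true category is t).
    # Assumes every category occurs at least once in the training labels.
    return {
        t: shares([p for p, e in zip(predicted, expected) if e == t])
        for t in CATEGORIES
    }

def calibrated_pos_share(observed_pos_share, matrix):
    # Adjusted count for two categories: observed = tpr*p + fpr*(1 - p).
    # Undefined when tpr == fpr (the classifier carries no signal).
    tpr, fpr = matrix["pos"]["pos"], matrix["neg"]["pos"]
    corrected = (observed_pos_share - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, corrected))

# Annotated training set: (features, expected category). Invented data.
training = [
    ({"good", "fast"}, "pos"), ({"good", "cheap"}, "pos"),
    ({"good"}, "pos"), ({"fine"}, "pos"),
    ({"not", "good"}, "neg"), ({"good", "riddance"}, "neg"),
    ({"no", "good"}, "neg"), ({"bad"}, "neg"),
    ({"slow"}, "neg"), ({"broken"}, "neg"),
]
expected = [label for _, label in training]
predicted = [toy_classifier(features) for features, _ in training]

bias = detect_bias(predicted, expected)
matrix = calibration_matrix(predicted, expected)

# Unseen production set: 12 instances with "good", 8 without. Invented data.
production = [{"good"}] * 12 + [{"bad"}] * 8
intermediate = [toy_classifier(features) for features in production]
observed = shares(intermediate)["pos"]
corrected = calibrated_pos_share(observed, matrix)
```

With this toy data the classifier over-predicts "pos" on the training set (a share of 0.6 against an expected 0.4), and the correction pulls the production estimate from the observed 0.6 down to roughly 0.4.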
Opening claim text (preview).
The invention claimed is:

1. A method, comprising: selecting a training dataset comprising training instances having respective training features; applying a classifier to the training dataset, thereby generating a training classification that assigns, to each of the training instances, one of a plurality of categories, the classifier having an expected classification; detecting a classification bias in the training classification relative to the expected classification; defining, in response to the classification bias, a calibration matrix based on conditional probabilities of features of a production dataset given occurrences of corresponding training features; selecting a production dataset comprising production instances; and applying the classifier and the calibration matrix to the production dataset, thereby generating a production classification quantification that assigns, to each of the production instances, one of the plurality of categories.

2. The method according to claim 1, wherein the training set comprises an annotated training set and the production dataset comprises an unseen dataset.

3. The method according to claim 1, wherein the calibration matrix comprises multiple matrix entries, each of the matrix entries comprising a given feature and an adjustment factor.

4. The method according to claim 3, wherein the classification bias comprises a respective individual bias for each of the features.

5. The method according to claim 4, wherein defining the calibration matrix comprises calculating the respective adjustment factor for each of the matrix entries based on the respective individual bias.

6. The method according to claim 1, wherein applying the classifier and the calibration matrix comprises applying the classifier to the production dataset, thereby generating an intermediate classification, and applying the calibration matrix to the intermediate classification, thereby generating the production classification quantification.

7. The method according to claim 6, wherein the classification bias comprises a training classification bias, and wherein the intermediate classification has an intermediate classification bias relative to the expected classification, and wherein the production classification quantification has a production classification bias that is less than the intermediate classification bias.

8. An apparatus, comprising: a memory configured to store a classifier, a training dataset comprising training instances having respective training features, and a production dataset comprising production instances; and a processor configured: to apply a classifier to the training dataset, thereby generating a training classification that assigns, to each of the training instances, one of a plurality of categories, the classifier having an expected classification, to detect a classification bias in the training classification relative to the expected classification, to define, in response to the classification bias, a calibration matrix based on conditional probabilities of features of the production dataset given occurrences of corresponding training features; to apply the classifier and the calibration matrix to the production dataset, thereby generating a production classification quantification that assigns, to each of the production instances, one of the plurality of categories.

9. The apparatus according to claim 8, wherein the training set comprises an annotated training set and the production dataset comprises an unseen dataset.

10. The apparatus according to claim 8, wherein the calibration matrix comprises multiple matrix entries, each of the matrix entries comprising a given feature and an adjustment factor.

11. The apparatus according to claim 10, wherein the classification bias comprises a respective individual bias for each of the features.

12. The apparatus according to claim 11, wherein the processor is configured to define the calibration matrix by calculating the respective adjustment factor for each of the matrix entries based on the respective individual bias.

13. The apparatus according to claim 8, wherein the processor is configured to apply the classifier and the calibration matrix by applying the classifier to the production dataset, thereby generating an intermediate classification, and applying the calibration matrix to the intermediate classification, thereby generating the production classification quantification.

14. The apparatus according to claim 13, wherein the classification bias comprises a training classification bias, and wherein the intermediate classification has an intermediate classification bias relative to the expected classification, and wherein the production classification quantification has a production classification bias that is less than the intermediate classification bias.

15. A computer program product, the computer program product comprising: a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising: computer readable program code configured to select a training dataset comprising training instances having respective training features; computer readable program code configured to apply a classifier to the training dataset, thereby generating a training classification that assigns, to each of the training instances, one of a plurality of categories, the classifier having an expected classification; computer readable program code configured to detect a classification bias in the training classification relative to the expected classification; computer readable program code configured to define, in response to the classification bias, a calibration matrix based on conditional probabilities of features of a production dataset given occurrences of corresponding the training features computer readable program code configured to select a production dataset comprising production instances; and computer readable program code configured to apply the classifier and the calibration matrix to the production dataset, thereby generating a production classification quantification that assigns, to each of the production instances, one of the plurality of categories.

16. The computer program product according to claim 15, wherein the training set comprises an annotated training set and the production dataset comprises an unseen dataset.

17. The computer program product according to claim 15, wherein the calibration matrix comprises multiple matrix entries, each of the matrix entries comprising a given feature and an adjustment factor.

18. The computer program product according to claim 17, wherein the classification bias comprises a respective individual bias for each of the features, and wherein the computer readable program code configured to define the calibration matrix by calculating the respective adjustment factor for each of the matrix entries based on the respective individual bias.

19. The computer program product according to claim 15, wherein the computer readable program code is configured to apply the classifier and the calibration matrix by applying the classifier to the production dataset, thereby generating an intermediate classification, and applying the calibration matrix to the intermediate classification, thereby generating the production classification quantification.

20. The computer program product according to claim 19, wherein the classification bias comprises a training classification bias, and wherein the inter
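
The dependent claims on matrix entries (a feature paired with an adjustment factor derived from that feature's individual bias) and on two-stage application (classifier first, calibration matrix second) can be illustrated with a small Python sketch. The claim preview gives no formula for the adjustment factor, so everything here is an assumption made for illustration: the additive factor (expected minus predicted positive rate among training instances having the feature), the averaging of factors per instance, `toy_classifier`, and the data are all hypothetical.

```python
from collections import defaultdict

def toy_classifier(features):
    # Hypothetical stand-in classifier: a single keyword rule.
    return "pos" if "good" in features else "neg"

def feature_adjustments(training):
    # Build {feature: adjustment factor}. The individual bias of a feature is
    # (predicted - expected) positive rate among training instances having it;
    # the adjustment factor assumed here is simply the negated bias.
    seen = defaultdict(lambda: [0, 0, 0])  # [count, predicted pos, expected pos]
    for features, expected in training:
        predicted = toy_classifier(features)
        for f in features:
            seen[f][0] += 1
            seen[f][1] += predicted == "pos"
            seen[f][2] += expected == "pos"
    return {f: (exp - pred) / n for f, (n, pred, exp) in seen.items()}

def calibrated_pos_share(production, adjustments):
    # Stage 1: intermediate classification from the classifier alone.
    # Stage 2: shift each instance's contribution by the mean adjustment
    # factor of its known features, then average over the production set.
    total = 0.0
    for features in production:
        intermediate = 1.0 if toy_classifier(features) == "pos" else 0.0
        known = [adjustments[f] for f in features if f in adjustments]
        correction = sum(known) / len(known) if known else 0.0
        total += min(1.0, max(0.0, intermediate + correction))
    return total / len(production)

# Invented annotated training data: "good" appears in 2 pos and 2 neg instances.
training = [
    ({"good"}, "pos"), ({"good"}, "pos"),
    ({"good"}, "neg"), ({"good"}, "neg"),
    ({"bad"}, "neg"), ({"bad"}, "neg"),
]
adjustments = feature_adjustments(training)

# Invented unseen production data; "new" has no matrix entry.
production = [{"good"}, {"good"}, {"bad"}, {"new"}]
uncalibrated = sum(toy_classifier(f) == "pos" for f in production) / len(production)
calibrated = calibrated_pos_share(production, adjustments)
```

In this toy run the feature "good" gets an adjustment factor of -0.5 because the classifier labels every "good" instance positive while only half are annotated positive, so the estimated positive share of the production set falls from 0.5 to 0.25.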
Probabilistic graphical models, e.g. probabilistic networks · CPC title
by evaluating different subsets according to an optimisation criterion, e.g. class separability, forward selection or backward elimination · CPC title
Physics · mapped topic
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
Machine learning · CPC title