Method and apparatus for managing recommendation models
US-9218605-B2 · Dec 22, 2015 · US
US9536208B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9536208-B1 |
| Application number | US-201615096552-A |
| Country | US |
| Kind code | B1 |
| Filing date | Apr 12, 2016 |
| Priority date | Feb 10, 2016 |
| Publication date | Jan 3, 2017 |
| Grant date | Jan 3, 2017 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A computer-readable medium is configured to determine a support vector data description (SVDD). For each of a plurality of values for a kernel parameter, an optimal value of an objective function defined for an SVDD model using a kernel function, a read plurality of data points, and a respective value for the kernel parameter is computed to define a plurality of sets of support vectors. A plurality of first derivative values are computed for the objective function as a difference between the computed optimal values associated with successive values for the kernel parameter. A plurality of second derivative values are computed for the objective function as a difference between the computed plurality of first derivative values associated with successive values for the kernel parameter. A kernel parameter value is selected where the computed plurality of second derivative values first exceeds zero.
Opening claim text (preview).
What is claimed is: 1. A non-transitory computer-readable medium having stored thereon computer-readable instructions that when executed by a computing device cause the computing device to: read a plurality of data points from a training dataset; select a plurality of values for a kernel parameter used by a kernel function; for each of the selected plurality of values for the kernel parameter, compute an optimal value of an objective function defined for a support vector data description (SVDD) model using the kernel function, the read plurality of data points, and a respective value for the kernel parameter to define a plurality of sets of support vectors, where each set of support vectors defines a boundary for the read plurality of data points in association with the respective value for the kernel parameter; for each of the selected plurality of values for the kernel parameter, store the computed optimal value and the defined set of support vectors in association with the respective value for the kernel parameter; compute a plurality of first derivative values for the objective function as a difference between the computed optimal values associated with successive values for the kernel parameter; compute a plurality of second derivative values for the objective function as a difference between the computed plurality of first derivative values associated with successive values for the kernel parameter; select a value for the kernel parameter where the computed plurality of second derivative values first crosses zero; and output the selected value for the kernel parameter for identifying an outlier in a scoring dataset. 2. The non-transitory computer-readable medium of claim 1 , wherein the objective function defined for the SVDD model is max(Σ i=1 n α i K(x i ,x i )−Σ i=1 n Σ j=1 n α i α j K(x i ,x j )), subject to Σ i=1 n α i =1 and 0≦α i ≦C, ∀i=1, . . . , n, where K(x i ,x j ) is the kernel function, n is a number of data points of the read plurality of data points, C=1/nf where f is an expected outlier fraction, x i are the read plurality of data points, α i are Lagrange constants, and ∀x k ε SV <C , where SV <C , is the set of support vectors that have α k <C. 3. The non-transitory computer-readable medium of claim 1 , wherein the kernel function is a Gaussian kernel function, and the kernel parameter is a Gaussian bandwidth parameter. 4. The non-transitory computer-readable medium of claim 1 , wherein the computer-readable instructions further cause the computing device to compute a threshold using the stored, defined set of support vectors associated with the selected value for the kernel parameter and the read plurality of data points. 5. The non-transitory computer-readable medium of claim 4 , wherein the threshold is computed using R 2 =K(x k ,x k )−2Σ i=1 n α i K (x i ,x k )+Σ i=1 n Σ j=1 n α i α j K (x i ,x k ), wherein the objective function defined for the SVDD model is max(Σ i=1 n α i K(x i ,x i )−Σ i=1 n Σ j=1 n α i α j K(x i ,x j )), subject to Σ i=1 n α i =1 and 0≦α i ≦C, ∀i=1, . . . , n, where K(x i ,x j ) is the kernel function, n is a number of data points of the read plurality of data points, C=1/nf where f is an expected outlier fraction, x i are the read plurality of data points, α i are Lagrange constants using any x k ε SV <C , where SV <C , is the set of support vectors that have 0<α k <C. 6. The non-transitory computer-readable medium of claim 4 , wherein after outputting the selected value for the kernel parameter, the computer-readable instructions further cause the computing device to: read a second plurality of data points from a scoring dataset; select a first data point from the read second plurality of data points; compute a distance value using the stored, defined set of support vectors associated with the selected value for the kernel parameter and the selected first data point; and when the computed distance value is greater than the computed threshold, identify the selected first data point as an outlier. 7. The non-transitory computer-readable medium of claim 6 , wherein the threshold is computed using R 2 =K(x k ,x k )−2 Σ i=1 n α i K(x i ,x k )+Σ i=1 n Σ j=1 n α i α j K(x i ,x k ), wherein the objective function defined for the SVDD model is max(Σ i=1 n α i K(x i ,x i )−Σ i=1 n Σ j=1 n α i α j K(x i ,x j )), subject to Σ i=1 n α i =1 and 0≦α i ≦C, ∀i=1, . . . , n, where K(x i ,x j ) is the kernel function, n is a number of data points of the read plurality of data points, C=1/nf where f is an expected outlier fraction, x i are the read plurality of data points, α i are Lagrange constants using any x k ε SV <C , where SV <C , is the set of support vectors that have 0<α k <C. 8. The non-transitory computer-readable medium of claim 7 , wherein the distance value is computed using dist 2 (z)=K (z,z)−2 Σ i=1 n α i K(x i ,z)+Σ i=1 n Σ j=1 n α i α j K(x i ,x 1 ), where z is the selected first data point. 9. The non-transitory computer-readable medium of claim 6 , wherein when the computed distance value is not greater than the computed threshold, the selected first data point is not identified as an outlier. 10. The non-transitory computer-readable medium of claim 9 , wherein the computer-readable instructions further cause the computing device to: (a) select a next data point from the read second plurality of data points; (b) compute a second distance value using the stored, defined set of support vectors associated with the selected value for the kernel parameter and the selected next data point; and (c) when the computed second distance value is greater than the computed threshold, identify the selected next data point as an outlier; and repeat (a) to (c) until each data point of the read second plurality of data points is selected as the next data point. 11. The non-transitory computer-readable medium of claim 1 , wherein after outputting the selected value for the kernel parameter, the computer-readable instructions further cause the computing device to: read a second plurality of data points from a scoring dataset; select a first data point from the read second plurality of data points; compute a distance value using the defined second set of support vectors and the selected first data point; and when the computed distance value is greater than a threshold computed based on the kernel function, identify the selected first data point as an outlier. 12. The non-transitory computer-readable medium of claim 1 , wherein after computing the plurality of second derivative values, the computer-readable instructions further cause the computing device to fit a curve to the computed plurality of second derivative values, wherein when the computed second derivative crosses zero is determined using the fit curve. 13. The non-transitory computer-readable medium of claim 12 , wherein the curve is fit to the computed plurality of second derivative values using a penalized B-spline curve fit function. 14. The non-transitory computer-readable medium of claim 12 , wherein when the computed second derivative crosses zero is determined when a curve fit value first crosses zero within a confidence limit. 15. The non-transitory computer-readable medium of claim 14 , wherein when the computed second derivative crosses zero is determined for a range of kernel parameter values while the curve fit value remains zero within the confidence limit. 16. The non-transitory computer-readable medium of claim 15 , wherein the value selected for the kernel parameter is an average of the range of kernel parameter values.
Physics · mapped topic
using kernel methods, e.g. support vector machines [SVM] · CPC title
Machine learning · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.