Patching Auto-Stop
US-2015378710-A1 · Dec 31, 2015 · US
US9804838B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9804838-B2 |
| Application number | US-201514985060-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 30, 2015 |
| Priority date | Sep 29, 2011 |
| Publication date | Oct 31, 2017 |
| Grant date | Oct 31, 2017 |
Official abstract text for this publication.
A system, method, and computer-readable medium are described that find similarities among programming applications based on semantic anchors found within the source code of such applications. The semantic anchors may be API calls, such as Java's package and class calls of the JDK. Latent Semantic Indexing may be used to process the application and semantic anchor data and automatically develop a similarity matrix that contains numbers representing the similarity of one program to another.
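As a rough illustration of the pipeline the abstract describes, the sketch below builds a small term-document matrix of API package counts, reduces it with a truncated singular value decomposition, and derives an application-by-application cosine similarity matrix. The application names, the counts, the rank `k = 2`, and the helper `lsi_similarity` are all invented for illustration, and NumPy is assumed to be available; this is one plausible reading, not the patent's implementation.

```python
import numpy as np

# Hypothetical applications and JDK packages (illustrative data only).
apps = ["editor", "ide", "ftp_client", "web_crawler"]
packages = ["javax.swing", "java.awt", "java.net", "java.io"]

# Term-document matrix: rows are packages, columns are applications.
# Each cell is an assumed count of calls into that package.
tdm = np.array([
    [12, 20,  0,  0],   # javax.swing
    [ 8, 15,  0,  1],   # java.awt
    [ 0,  1, 14, 18],   # java.net
    [ 3,  4,  6,  9],   # java.io
], dtype=float)


def lsi_similarity(matrix, k):
    """Return an app-by-app cosine similarity matrix in a rank-k latent space."""
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Each application becomes a k-dimensional vector: a column of diag(s_k) @ Vt_k.
    doc_vecs = (np.diag(s[:k]) @ vt[:k, :]).T
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    unit = doc_vecs / np.where(norms == 0, 1.0, norms)  # avoid division by zero
    return unit @ unit.T


sim = lsi_similarity(tdm, k=2)
```

In this toy data, `sim[0, 1]` (editor versus IDE) comes out higher than `sim[0, 2]` (editor versus FTP client), because the first two applications lean on the same GUI packages. The choice of `k` is a tuning decision that the abstract does not specify.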
Opening claim text (preview).
What is claimed is:
1. A device comprising: a processor, at least partially implemented in hardware, to: generate a similarity matrix defining a similarity between a plurality of computer applications according to a categorization of application programming interface (API) calls, the similarity matrix being generated from a term document matrix using singular value decomposition, the term document matrix including a first dimension of first entries corresponding to the plurality of computer applications and a second dimension of second entries corresponding to categories of the categorization, elements of the term document matrix having values based on a quantity of API calls in a computer application corresponding to a first entry of the first dimension, and in a category, of the categories, corresponding to a second entry of the second dimension, and at least one of the API calls corresponding to one of the categories, the similarity being based on weights for the API calls contained in the plurality of computer applications, a respective weight for a respective API call in a respective computer application being based on a quantity of API calls in the respective computer application and a quantity of computer applications, of the plurality of computer applications, that contain the respective API call; receive a selection of a first computer application of the plurality of computer applications; and provide an indication of at least one second computer application, of the plurality of computer applications, using the similarity matrix and based on the selection of the first computer application.
2. The device of claim 1, where the similarity matrix defines the similarity between the plurality of computer applications as numerical values based on the API calls in source code of the plurality of computer applications.
3. The device of claim 1, where the processor, when generating the similarity matrix, is to: generate the similarity matrix from a plurality of vectors corresponding to the plurality of computer applications using a vector space model, the plurality of vectors including elements corresponding to the categories of the categorization, the elements including values based on a number of the API calls in source code and documentation for a computer application corresponding to a vector and in the category corresponding to one of the elements, at least one of the API calls corresponding to one of the categories.
4. The device of claim 1, where the processor is further to: receive the plurality of computer applications from a computer application archive via a network.
5. The device of claim 1, where the processor, when generating the similarity matrix, is to: utilize an inverse document frequency calculation to find common API calls; and filter out the common API calls from the API calls prior to the categorization of the API calls.
6. The device of claim 1, where the first dimension of first entries are columns and the second dimension of second entries are rows.
7. The device of claim 1, where different API calls have different weights.
8. A non-transitory computer-readable medium for storing instructions, the instructions comprising: a plurality of instructions which, when executed by one or more processors, cause the one or more processors to: generate a similarity matrix defining a similarity between a plurality of computer applications according to a categorization of application programming interface (API) calls, the similarity matrix being generated from a term document matrix using singular value decomposition, the term document matrix including a first dimension of first entries corresponding to the plurality of computer applications and a second dimension of second entries corresponding to categories of the categorization, elements of the term document matrix having values based on a quantity of API calls in a computer application corresponding to a first entry of the first dimension, and in a category, of the categories, corresponding to a second entry of the second dimension, and at least one of the API calls corresponding to one of the categories, the similarity being based on weights for the API calls contained in the plurality of computer applications, a respective weight for a respective API call in a respective computer application being based on a quantity of API calls in the respective computer application and a quantity of computer applications, of the plurality of computer applications, that contain the respective API call; receive a selection of a first computer application of the plurality of computer applications; and provide an indication of at least one second computer application, of the plurality of computer applications, using the similarity matrix and based on the selection of the first computer application.
9. The non-transitory computer-readable medium of claim 8, where the similarity matrix defines the similarity between the plurality of computer applications as numerical values based on the API calls in source code of the plurality of computer applications.
10. The non-transitory computer-readable medium of claim 8, where the plurality of instructions, when executed by the one or more processors to generate the similarity matrix, cause the one or more processors to: generate the similarity matrix from a plurality of vectors corresponding to the plurality of computer applications using a vector space model, the plurality of vectors including elements corresponding to the categories of the categorization, the elements including values based on a number of the API calls in source code and documentation for a computer application corresponding to a vector and in the category corresponding to one of the elements, at least one of the API calls corresponding to one of the categories.
11. The non-transitory computer-readable medium of claim 8, where the plurality of instructions, when executed by the one or more processors to generate the similarity matrix, further cause the one or more processors to: extract the API calls from source code of the plurality of computer applications.
12. The non-transitory computer-readable medium of claim 8, where the plurality of instructions, when executed by the one or more processors to generate the similarity matrix, cause the one or more processors to: utilize an inverse document frequency calculation to find common API calls; and filter out the common API calls from the API calls prior to the categorization of the API calls.
13. The non-transitory computer-readable medium of claim 8, where the first dimension of first entries are columns and the second dimension of second entries are rows.
14. The non-transitory computer-readable medium of claim 8, where different API calls have different weights.
15. A method comprising: generating, by a device, a similarity matrix defining a similarity between a plurality of computer applications according to a categorization of application programming interface (API) calls, the similarity matrix being generated from a term document matrix using singular value decomposition, the term document matrix including a first dimension of first entries corresponding to the plurality of computer applications and a second dimension of second entries corresponding to categories of the categorization, elements of the term document matrix having values based on a quantity of API calls in a computer application corresponding to a first entry of the first dimension, and in a category, of the categories, corresponding to a second entry of the second dimens
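The weighting and filtering language in the claims (a weight based on how often a call appears in an application and on how many applications contain it, plus an inverse document frequency step that removes common calls) resembles TF-IDF. The sketch below is a minimal reading of that idea using only the Python standard library. The sample data, the exact formulas (relative frequency times `log(N / df)`), the `min_idf` cutoff, and every function name are assumptions made for illustration; the claims say only that the weights are "based on" these quantities.

```python
import math
from collections import Counter

# Hypothetical API calls extracted per application (illustrative data only).
app_calls = {
    "editor": ["javax.swing", "javax.swing", "java.awt", "java.lang"],
    "ide": ["javax.swing", "java.awt", "java.awt", "java.lang"],
    "ftp_client": ["java.net", "java.net", "java.io", "java.lang"],
}


def idf(calls_by_app):
    """Inverse document frequency: log(N / number of apps containing the call)."""
    n_apps = len(calls_by_app)
    doc_freq = Counter(c for calls in calls_by_app.values() for c in set(calls))
    return {call: math.log(n_apps / df) for call, df in doc_freq.items()}


def tfidf_vectors(calls_by_app, min_idf=1e-9):
    """Weight each call by term frequency times IDF, dropping ubiquitous calls."""
    idf_scores = idf(calls_by_app)
    vectors = {}
    for app, calls in calls_by_app.items():
        counts = Counter(calls)
        vectors[app] = {
            call: (n / len(calls)) * idf_scores[call]
            for call, n in counts.items()
            if idf_scores[call] > min_idf  # assumed cutoff: skip calls found in every app
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * b.get(call, 0.0) for call, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def most_similar(selected, vectors):
    """Rank the other applications by similarity to the selected one."""
    others = [app for app in vectors if app != selected]
    return sorted(others, key=lambda app: cosine(vectors[selected], vectors[app]),
                  reverse=True)


vectors = tfidf_vectors(app_calls)
ranking = most_similar("editor", vectors)
```

With this data, `java.lang` appears in every application, so its IDF is zero and it is dropped, and selecting `"editor"` ranks `"ide"` ahead of `"ftp_client"`. Note that the sketch filters during weighting for brevity, whereas the dependent claims describe filtering before the calls are categorized.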
using natural language analysis · CPC title
Version control (security arrangements therefor G06F21/57); Configuration management · CPC title
Software maintenance or management · CPC title
Calculation of difference between files · CPC title
Clustering or classification · CPC title