US9785430B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9785430-B2 |
| Application number | US-201414198878-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 6, 2014 |
| Priority date | Mar 21, 2013 |
| Publication date | Oct 10, 2017 |
| Grant date | Oct 10, 2017 |
Official abstract text for this publication.
The present invention provides a method and system for detecting a partial commit of software. A dependency information of the software is extracted from a version history and a bug database. A dimensional matrix containing a set of commit, and relationship information with a set of files with each commit is created from the dependency information. A centrality matrix is computed by performing a first set of matrix transformations on the dimensional matrix. A set of missing files of a partial commit, is identified by performing a second set of matrix transformations on the centrality matrix and a file vector, the file vector including a file dependency information of the partial commit.
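As an illustration of the pipeline in the abstract, here is a minimal sketch, not the patented implementation. It assumes a binary commit-by-file matrix as the "dimensional matrix", takes the top-k right singular vectors as the "centrality matrix" (claims 3, 9 and 15 name a singular vector decomposition), and assumes the second transformation is a projection of the new commit's file vector onto those vectors. Power iteration on AᵀA stands in for a library SVD so the example needs only the standard library. All function names, the sample history, and the choice k=2 are hypothetical.

```python
import math

def build_matrix(commits):
    """Return (files, A), where A[i][j] is 1.0 if commit i touched file j."""
    files = sorted({f for changed in commits for f in changed})
    col = {f: j for j, f in enumerate(files)}
    A = [[0.0] * len(files) for _ in commits]
    for i, changed in enumerate(commits):
        for f in changed:
            A[i][col[f]] = 1.0
    return files, A

def right_singular_vectors(A, k, iters=200):
    """Approximate the top-k right singular vectors of A.

    They are the leading eigenvectors of G = A^T A, found here by
    power iteration with deflation (a simple stand-in for a real SVD).
    """
    n = len(A[0])
    G = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    vectors = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n          # deterministic start vector
        for _ in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:                  # nothing left to extract
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * G[i][j] * v[j] for i in range(n) for j in range(n))
        vectors.append(v)
        for i in range(n):                    # deflate: remove the found component
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return vectors

def score_missing(files, vectors, committed):
    """Project the commit's file vector onto the singular vectors and
    return the files NOT in the commit, highest score first."""
    q = [1.0 if f in committed else 0.0 for f in files]
    scores = [0.0] * len(files)
    for v in vectors:
        coeff = sum(qi * vi for qi, vi in zip(q, v))
        for j, vj in enumerate(v):
            scores[j] += coeff * vj
    candidates = [(f, s) for f, s in zip(files, scores) if f not in committed]
    return sorted(candidates, key=lambda fs: fs[1], reverse=True)

# Hypothetical version history: each set is the files touched by one commit.
history = [
    {"api.py", "api_test.py", "schema.sql"},
    {"api.py", "api_test.py", "schema.sql"},
    {"api.py", "schema.sql"},
    {"README.md", "CHANGELOG.md"},
    {"README.md", "CHANGELOG.md"},
]
files, A = build_matrix(history)
V = right_singular_vectors(A, k=2)

# A new commit that touches api.py and api_test.py but omits schema.sql.
ranked = score_missing(files, V, {"api.py", "api_test.py"})
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy history `schema.sql` always changes alongside `api.py`, so it receives the highest score, while the documentation files, which never co-occur with the committed files, score close to zero. A production version would more likely call an optimized SVD routine and fold bug-database links into the matrix, as the abstract describes.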
Opening claim text (preview).
What is claimed:
1. A software version analysis device comprising a memory having programmed instructions stored thereon and at least one processor coupled to the memory and configured to execute the stored programmed instructions to: extract historical dependency information from a version repository and a bug database for a software application; generate a dimensional matrix based on the historical dependency information, wherein the dimensional matrix comprises at least an indication of a set of commits and an indication of a set of files included in each of the commits; generate a centrality matrix from the dimensional matrix based on one or more matrix transformations performed on the dimensional matrix; obtain file dependency information of a received another commit associated with the software application, the another commit identifying a plurality of files to be committed; determine when the another commit is a partial commit having one or more missing files, which were not included in the plurality of files to be committed, based on one or more matrix transformations performed using the file dependency information and the centrality matrix; and determine a set of plausible files for the another commit, generate a ranking of the set of plausible files according to a probability of occurrence in the another commit, determine the missing files based on the ranking or a predetermined threshold, and update a knowledge database or the predetermined threshold, when the determining indicates that the another commit is a partial commit.
2. The software version analysis device of claim 1, wherein the dependency information comprises one or more of an indication of one or more transactions performed on the software application, a set of bugs in the software application, or fix information available for the set of bugs.
3. The software version analysis device of claim 1, wherein the centrality matrix is a right singular matrix and the one or more matrix transformations comprises a singular vector decomposition of the dimensional matrix.
4. The software version analysis device of claim 1, wherein the file dependency information comprises a vector comprising one or more of an author of the commit, a set of committed files, or a type of modification.
5. The software version analysis device of claim 1, wherein the missing files are represented as a weighted vector, wherein a weight of each of the missing files in the weighted vector indicates a probability that the missing file should have been included in the plurality of files to be committed of the another commit.
6. The software version analysis device of claim 1, wherein the processor is further configured to execute the stored programmed instructions to: receive feedback on an inconsistency on the missing files; and update the knowledge database or the predetermined threshold based on the feedback.
7. A method of detecting software application partial commits implemented by one or more software version analysis devices, the method comprising: extracting, historical dependency information from a version repository and a bug database for a software application; generating a dimensional matrix based on the historical dependency information, wherein the dimensional matrix comprises at least an indication of a set of commits and an indication of a set of files included in each of the commits; generating a centrality matrix from the dimensional matrix based on one or more matrix transformations performed on the dimensional matrix; obtaining file dependency information of a received another commit associated with the software application, the another commit identifying a plurality of files to be committed; determining when the another commit is a partial commit having one or more missing files, which were not included in the plurality of files to be committed, based on one or more matrix transformations performed using the file dependency information and the centrality matrix; and determining a set of plausible files for the another commit, generating a ranking of the set of plausible files according to a probability of occurrence in the another commit, determining the missing files based on the ranking or a predetermined threshold, and updating a knowledge database or the predetermined threshold, when the determining indicates that the another commit is a partial commit.
8. The method of claim 7, wherein the dependency information comprises one or more of an indication of one or more transactions performed on the software application, a set of bugs in the software application, or fix information available for the set of bugs.
9. The method of claim 7, wherein the centrality matrix is a right singular matrix and the one or more matrix transformations comprises a singular vector decomposition of the dimensional matrix.
10. The method of claim 7, wherein the file dependency information comprises a vector comprising one or more of an author of the commit, a set of committed files, or a type of modification.
11. The method of claim 7, wherein the missing files are represented as a weighted vector, wherein a weight of each of the missing files in the weighted vector indicates a probability that the missing file should have been included in the plurality of files to be committed of the another commit.
12. The method of claim 7, further comprising: receiving feedback on an inconsistency on the missing files; and updating the knowledge database or the predetermined threshold based on the feedback.
13. A non-transitory computer readable medium having stored thereon instructions for detecting software application partial commits comprising machine executable code which when executed by at least one processor, causes the processor to: extract historical dependency information from a version repository and a bug database for a software application; generate a dimensional matrix based on the historical dependency information, wherein the dimensional matrix comprises at least an indication of a set of commits and an indication of a set of files included in each of the commits; generate a centrality matrix from the dimensional matrix based on one or more matrix transformations performed on the dimensional matrix; obtain file dependency information of a received another commit associated with the software application, the another commit identifying a plurality of files to be committed; determine when the another commit is a partial commit having one or more missing files, which were not included in the plurality of files to be committed, based on one or more matrix transformations performed using the file dependency information and the centrality matrix; and determine a set of plausible files for the another commit, generate a ranking of the set of plausible files according to a probability of occurrence in the another commit, determine the missing files based on the ranking or a predetermined threshold, and update a knowledge database or the predetermined threshold, when the determining indicates that the another commit is a partial commit.
14. The non-transitory computer readable medium of claim 13, wherein the dependency information comprises one or more of an indication of one or more transactions performed on the software application, a set of bugs in the software application, or fix information available for the set of bugs.
15. The non-transitory computer readable medium of claim 13, wherein the centrality matrix is a right singular matrix and the one or more matrix transformations comprises a singular vector decomposition of the dimensional matrix.
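The final limitation of the independent claims (rank plausible files, pick missing files by ranking or threshold, then update the threshold or a knowledge database) together with the feedback step of claims 6 and 12 can be sketched as follows. This is an illustrative assumption, not the claimed implementation: the class name, the fixed adjustment step, the rule of raising the threshold after a false alarm, and the dictionary used as a "knowledge database" are all hypothetical, and the weights are assumed to be already scaled to the range 0 to 1.

```python
from dataclasses import dataclass, field

@dataclass
class PartialCommitDetector:
    threshold: float = 0.5          # hypothetical predetermined threshold
    step: float = 0.05              # hypothetical adjustment per feedback event
    # Hypothetical knowledge database: file -> [confirmed, rejected] counts.
    knowledge: dict = field(default_factory=dict)

    def rank(self, weights):
        """Rank plausible files by weight, highest first."""
        return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)

    def missing_files(self, weights):
        """Files whose weight reaches the threshold are reported as missing."""
        return [f for f, w in self.rank(weights) if w >= self.threshold]

    def is_partial(self, weights):
        """A commit is treated as partial if any file is reported missing."""
        return bool(self.missing_files(weights))

    def feedback(self, file, really_missing):
        """Record reviewer feedback and nudge the threshold.

        A rejected suggestion (false alarm) raises the threshold;
        a confirmed one lowers it. The value is clamped to [0, 1].
        """
        counts = self.knowledge.setdefault(file, [0, 0])
        counts[0 if really_missing else 1] += 1
        delta = -self.step if really_missing else self.step
        self.threshold = min(1.0, max(0.0, self.threshold + delta))

# Hypothetical weighted vector for a new commit (compare claim 5).
weights = {"schema.sql": 0.68, "migrations.py": 0.52, "README.md": 0.10}

detector = PartialCommitDetector()
before = detector.missing_files(weights)

# The reviewer says migrations.py was not actually needed.
detector.feedback("migrations.py", really_missing=False)
after = detector.missing_files(weights)

print(before, after, round(detector.threshold, 2))
```

After the false-alarm feedback the threshold rises, so the borderline file is no longer flagged while the strongly weighted file still is. Other update rules, such as per-file thresholds derived from the recorded counts, would fit the claim language equally well.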