US2016019056A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016019056-A1 |
| Application number | US-201414332165-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jul 15, 2014 |
| Priority date | Jul 15, 2014 |
| Publication date | Jan 21, 2016 |
| Grant date | — |
Official abstract text for this publication.
Techniques for automatically identifying input files used to generate output files in a software build process are provided. In one embodiment, a computer system can execute one or more build commands for generating output files for a software product, where the software product is associated with a build tree comprising various input files. The computer system can further intercept system calls invoked during the execution of the one or more build commands and can collect information pertaining to at least a portion of the intercepted system calls. The computer system can then create a dependency graph based on the collected information, where the dependency graph identifies a subset of input files in the build tree that are actually used by the one or more build commands to generate the output files.
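The flow the abstract describes can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the record format `(pid, syscall, path, mode)`, the sample `TRACE`, and the function names `build_dependency_graph` and `used_inputs` are all assumptions made for the example, since the abstract does not specify how intercepted system calls are stored.

```python
from collections import defaultdict

# Hypothetical trace of intercepted system calls: (pid, syscall, path, mode).
# The record layout is an assumption made for this sketch.
TRACE = [
    (100, "execve", "/usr/bin/gcc", None),
    (100, "open", "src/main.c", "r"),
    (100, "open", "include/util.h", "r"),
    (100, "open", "build/main.o", "w"),
    (101, "execve", "/usr/bin/ld", None),
    (101, "open", "build/main.o", "r"),
    (101, "open", "build/app", "w"),
]


def build_dependency_graph(trace):
    """Map each file a process wrote to the files that same process read."""
    reads = defaultdict(set)
    writes = defaultdict(set)
    for pid, syscall, path, mode in trace:
        if syscall != "open":
            continue  # only file opens contribute edges in this sketch
        if mode == "r":
            reads[pid].add(path)
        else:
            writes[pid].add(path)
    graph = defaultdict(set)
    for pid, outputs in writes.items():
        for out in outputs:
            graph[out] |= reads[pid]
    return dict(graph)


def used_inputs(graph, build_tree):
    """Return the build-tree files that some output actually depends on."""
    used = set()
    for deps in graph.values():
        used |= deps
    return used & set(build_tree)


if __name__ == "__main__":
    graph = build_dependency_graph(TRACE)
    tree = ["src/main.c", "src/unused.c", "include/util.h"]
    print(sorted(used_inputs(graph, tree)))
```

In this sketch, `src/unused.c` sits in the build tree but is never opened by a build command, so it is excluded from the result. Attributing reads to writes per process is a simplification; a real tracer would also need to follow parent and child process relationships.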
Opening claim text (preview).
1. A method comprising: executing, by a computer system, one or more build commands for generating output files for a software product, the software product being associated with a build tree comprising input files; intercepting, by the computer system while the one or more build commands are executing, system calls invoked by the one or more build commands; collecting, by the computer system, information pertaining to at least a portion of the intercepted system calls; and creating, by the computer system, a dependency graph based on the collected information, the dependency graph identifying a subset of input files in the build tree that are actually used by the one or more build commands to generate the output files.
2. The method of claim 1 wherein the one or more build commands are invoked using a wrapper command that enables the intercepting of the system calls.
3. The method of claim 1 wherein collecting information pertaining to at least a portion of the intercepted system calls comprises: identifying a portion of the intercepted system calls that are relevant to process creation, file access, or file creation; and collecting information pertaining to the identified portion.
4. The method of claim 1 wherein the dependency graph includes, for each output file, a path from the output file to one or more of the subset of input files.
5. The method of claim 1 wherein the dependency graph further identifies how the subset of input files are used by the one or more build commands to generate the output files.
6. The method of claim 1 further comprising: receiving a list of input files in the build tree that meet one or more predefined criteria; comparing the list to the subset of input files identified by the dependency graph as being used to generate the output files; and returning those input files in the subset that match the list.
7. The method of claim 6 wherein the one or more predefined criteria include a criterion identifying input files that are subject to an open source license.
8. A non-transitory computer readable storage medium having stored thereon program code executable by one or more computer systems, the program code embodying a method that comprises: executing one or more build commands for generating output files for a software product, the software product being associated with a build tree comprising input files; intercepting, while the one or more build commands are executing, system calls invoked by the one or more build commands; collecting information pertaining to at least a portion of the intercepted system calls; and creating a dependency graph based on the collected information, the dependency graph identifying a subset of input files in the build tree that are actually used by the one or more build commands to generate the output files.
9. The non-transitory computer readable storage medium of claim 8 wherein the one or more build commands are invoked using a wrapper command that enables the intercepting of the system calls.
10. The non-transitory computer readable storage medium of claim 8 wherein collecting information pertaining to at least a portion of the intercepted system calls comprises: identifying a portion of the intercepted system calls that are relevant to process creation, file access, or file creation; and collecting information pertaining to the identified portion.
11. The non-transitory computer readable storage medium of claim 8 wherein the dependency graph includes, for each output file, a path from the output file to one or more of the subset of input files.
12. The non-transitory computer readable storage medium of claim 8 wherein the dependency graph further identifies how the subset of input files are used by the one or more build commands to generate the output files.
13. The non-transitory computer readable storage medium of claim 8 wherein the method further comprises: receiving a list of input files in the build tree that meet one or more predefined criteria; comparing the list to the subset of input files identified by the dependency graph as being used to generate the output files; and returning those input files in the subset that match the list.
14. The non-transitory computer readable storage medium of claim 13 wherein the one or more predefined criteria include a criterion identifying input files that are subject to an open source license.
15. A computer system comprising: a processor; and a non-transitory computer readable medium having stored thereon program code that, when executed, causes the processor to: execute one or more build commands for generating output files for a software product, the software product being associated with a build tree comprising input files; intercept, while the one or more build commands are executing, system calls invoked by the one or more build commands; collect information pertaining to at least a portion of the intercepted system calls; and create a dependency graph based on the collected information, the dependency graph identifying a subset of input files in the build tree that are actually used by the one or more build commands to generate the output files.
16. The computer system of claim 15 wherein the one or more build commands are invoked using a wrapper command that enables the intercepting of the system calls.
17. The computer system of claim 15 wherein the program code that causes the processor to collect information pertaining to at least a portion of the intercepted system calls comprises program code that causes the processor to: identify a portion of the intercepted system calls that are relevant to process creation, file access, or file creation; and collect information pertaining to the identified portion.
18. The computer system of claim 15 wherein the dependency graph includes, for each output file, a path from the output file to one or more of the subset of input files.
19. The computer system of claim 15 wherein the dependency graph further identifies how the subset of input files are used by the one or more build commands to generate the output files.
20. The computer system of claim 15 wherein the program code further causes the processor to: receive a list of input files in the build tree that meet one or more predefined criteria; compare the list to the subset of input files identified by the dependency graph as being used to generate the output files; and return those input files in the subset that match the list.
21. The computer system of claim 15 wherein the one or more predefined criteria include a criterion identifying input files that are subject to an open source license.
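The dependent claims add three steps that are easy to picture in code: keeping only system calls relevant to process creation, file access, or file creation; following a path in the graph from an output file to its input files; and comparing the used inputs against a list of files meeting some criterion, such as being under an open source license. The following is a rough sketch under stated assumptions. The set `RELEVANT`, the sample `GRAPH` and `FLAGGED` data, and all function names are hypothetical and chosen for illustration; the claims do not name specific system calls or data structures.

```python
# Linux system call names assumed, for this sketch, to cover process creation,
# file access, and file creation. The claims do not enumerate specific calls.
RELEVANT = {"fork", "clone", "execve", "open", "openat", "creat"}

# Hypothetical dependency graph: each file maps to the files it was built from.
GRAPH = {
    "build/app": {"build/main.o", "build/zlib.o"},
    "build/main.o": {"src/main.c"},
    "build/zlib.o": {"third_party/zlib/inflate.c"},
}

# Hypothetical list of build-tree files flagged as open source licensed.
FLAGGED = ["third_party/zlib/inflate.c", "third_party/unused/gpl.c"]


def filter_relevant(calls):
    """Keep only intercepted calls whose name is in RELEVANT."""
    return [call for call in calls if call[0] in RELEVANT]


def inputs_for_output(graph, output):
    """Walk from an output file down to the leaf input files it depends on."""
    seen, stack, leaves = set(), [output], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        deps = graph.get(node, set())
        if not deps and node != output:
            leaves.add(node)  # no further dependencies: an original input
        stack.extend(deps)
    return leaves


def match_criteria(graph, outputs, flagged):
    """Return flagged files that are actually used to produce the outputs."""
    used = set()
    for out in outputs:
        used |= inputs_for_output(graph, out)
    return sorted(used & set(flagged))


if __name__ == "__main__":
    print(match_criteria(GRAPH, ["build/app"], FLAGGED))
```

Here `third_party/unused/gpl.c` is flagged but never reached from `build/app`, so only `third_party/zlib/inflate.c` is returned. That difference is the practical point of comparing a flagged list against the graph rather than against the whole build tree.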