Techniques for automatically identifying input files used to generate output files in a software build process

US9442717B2 · US · B2

Patent metadata

Publication number: US-9442717-B2
Application number: US-201414332165-A
Country: US
Kind code: B2
Filing date: Jul 15, 2014
Priority date: Jul 15, 2014
Publication date: Sep 13, 2016
Grant date: Sep 13, 2016

Abstract

Official abstract text for this publication.

Techniques for automatically identifying input files used to generate output files in a software build process are provided. In one embodiment, a computer system can execute one or more build commands for generating output files for a software product, where the software product is associated with a build tree comprising various input files. The computer system can further intercept system calls invoked during the execution of the one or more build commands and can collect information pertaining to at least a portion of the intercepted system calls. The computer system can then create a dependency graph based on the collected information, where the dependency graph identifies a subset of input files in the build tree that are actually used by the one or more build commands to generate the output files.
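To make the flow in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. It assumes the intercepted calls have already been captured as simple records; the `SyscallEvent` type, the `RELEVANT` set of call names, and the sample paths are all hypothetical, since the patent names only the three categories of interest (process creation, file access, file creation). For simplicity it also assumes that every file a process reads is an input to every file that process writes.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical call names standing in for the three relevant categories.
PROCESS_CALLS = {"execve", "fork", "clone"}
FILE_CALLS = {"open", "openat", "creat"}
RELEVANT = PROCESS_CALLS | FILE_CALLS


@dataclass(frozen=True)
class SyscallEvent:
    """One intercepted call (an assumed record layout, not from the patent)."""
    pid: int
    name: str             # system call name, e.g. "openat"
    path: str             # file path argument, "" if the call has none
    writes: bool = False  # True when the call creates or writes the file


def collect(events):
    """Keep only calls relevant to process creation, file access, or file creation."""
    return [e for e in events if e.name in RELEVANT]


def build_dependency_graph(log):
    """Map each output file to the set of files read by the process that wrote it."""
    reads = defaultdict(set)
    writes = defaultdict(set)
    for e in log:
        if e.name in PROCESS_CALLS or not e.path:
            continue  # process events carry no input/output file in this sketch
        (writes if e.writes else reads)[e.pid].add(e.path)

    graph = {}
    for pid, outputs in writes.items():
        for out in outputs:
            graph.setdefault(out, set()).update(reads[pid] - {out})
    return graph


events = [
    SyscallEvent(100, "execve", "/usr/bin/cc"),
    SyscallEvent(100, "getpid", ""),                        # irrelevant, dropped
    SyscallEvent(100, "openat", "src/main.c"),
    SyscallEvent(100, "openat", "include/util.h"),
    SyscallEvent(100, "openat", "obj/main.o", writes=True),
    SyscallEvent(101, "openat", "obj/main.o"),
    SyscallEvent(101, "creat", "bin/app", writes=True),
]

graph = build_dependency_graph(collect(events))
```

Any file that sits in the build tree but is never opened by a build command (say, a hypothetical `src/unused.c`) simply never appears in `graph`, which is how the graph narrows the tree down to the inputs that are actually used.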

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: executing, by a computer system, one or more build commands for generating output files for a software product, the software product being associated with a build tree comprising input files; concurrently with the executing of the one or more build commands: intercepting, by the computer system, operating system calls invoked by the one or more build commands; and for each intercepted operating system call: determining whether the intercepted operating system call is relevant to process creation, file access, or file creation; and if the intercepted operating system call is relevant to process creation, file access, or file creation, collecting, by the computer system, information pertaining to the intercepted operating system call, wherein collecting information pertaining to the intercepted operating system call comprises: logging information regarding one or more input files that are accessed or one or more output files that are created by the one or more build commands using the intercepted operating system call; and creating, by the computer system, a dependency graph based on the collected information, the dependency graph identifying a subset of input files in the build tree that are actually used by the one or more build commands to generate the output files.

2. The method of claim 1 wherein the one or more build commands are invoked using a wrapper command that enables the intercepting of the operating system calls.

3. The method of claim 1 wherein the dependency graph includes, for each output file, a path from the output file to one or more of the subset of input files.

4. The method of claim 1 wherein the dependency graph further identifies how the subset of input files are used by the one or more build commands to generate the output files.

5. The method of claim 1 further comprising: receiving a list of input files in the build tree that meet one or more predefined criteria; comparing the list to the subset of input files identified by the dependency graph as being used to generate the output files; and returning those input files in the subset that match the list.

6. The method of claim 5 wherein the one or more predefined criteria include a criterion identifying input files that are subject to an open source license.

7. A non-transitory computer readable storage medium having stored thereon program code executable by one or more computer systems, the program code embodying a method that comprises: executing one or more build commands for generating output files for a software product, the software product being associated with a build tree comprising input files; concurrently with the executing of the one or more build commands: intercepting operating system calls invoked by the one or more build commands; and for each intercepted operating system call: determining whether the intercepted operating system call is relevant to process creation, file access, or file creation; and if the intercepted operating system call is relevant to process creation, file access, or file creation, collecting information pertaining to the intercepted operating system call, wherein collecting information pertaining to the intercepted operating system call comprises: logging information regarding one or more input files that are accessed or one or more output files that are created by the one or more build commands using the intercepted operating system call; and creating a dependency graph based on the collected information, the dependency graph identifying a subset of input files in the build tree that are actually used by the one or more build commands to generate the output files.

8. The non-transitory computer readable storage medium of claim 7 wherein the one or more build commands are invoked using a wrapper command that enables the intercepting of the operating system calls.

9. The non-transitory computer readable storage medium of claim 7 wherein the dependency graph includes, for each output file, a path from the output file to one or more of the subset of input files.

10. The non-transitory computer readable storage medium of claim 7 wherein the dependency graph further identifies how the subset of input files are used by the one or more build commands to generate the output files.

11. The non-transitory computer readable storage medium of claim 7 wherein the method further comprises: receiving a list of input files in the build tree that meet one or more predefined criteria; comparing the list to the subset of input files identified by the dependency graph as being used to generate the output files; and returning those input files in the subset that match the list.

12. The non-transitory computer readable storage medium of claim 11 wherein the one or more predefined criteria include a criterion identifying input files that are subject to an open source license.

13. A computer system comprising: a processor; and a non-transitory computer readable medium having stored thereon program code that, when executed, causes the processor to: execute one or more build commands for generating output files for a software product, the software product being associated with a build tree comprising input files; concurrently with the executing of the one or more build commands: intercept operating system calls invoked by the one or more build commands; and for each intercepted operating system call: determine whether the intercepted operating system call is relevant to process creation, file access, or file creation; and if the intercepted operating system call is relevant to process creation, file access, or file creation, collect information pertaining to the intercepted operating system call, wherein collecting information pertaining to the intercepted operating system call comprises: logging information regarding one or more input files that are accessed or one or more output files that are created by the one or more build commands using the intercepted operating system call; and create a dependency graph based on the collected information, the dependency graph identifying a subset of input files in the build tree that are actually used by the one or more build commands to generate the output files.

14. The computer system of claim 13 wherein the one or more build commands are invoked using a wrapper command that enables the intercepting of the operating system calls.

15. The computer system of claim 13 wherein the dependency graph includes, for each output file, a path from the output file to one or more of the subset of input files.

16. The computer system of claim 13 wherein the dependency graph further identifies how the subset of input files are used by the one or more build commands to generate the output files.

17. The computer system of claim 13 wherein the program code further causes the processor to: receive a list of input files in the build tree that meet one or more predefined criteria; compare the list to the subset of input files identified by the dependency graph as being used to generate the output files; and return those input files in the subset that match the list.

18. The computer system of claim 17 wherein the one or more predefined criteria include a criterion identifying input files that are subject to an open source license.
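The dependent claims on graph paths (claims 3, 9, 15) and on matching a predefined list (claims 5 and 6 and their counterparts) can be sketched in a few lines of Python. This is a hedged illustration only: the graph is assumed to be a plain dictionary mapping each output file to the files it was generated from, and the function names and file paths are hypothetical, not taken from the patent.

```python
def inputs_used_by(graph, output):
    """Follow dependency edges from one output file and return every file reached."""
    seen = set()
    stack = [output]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def flagged_inputs_in_use(graph, outputs, flagged):
    """Return the flagged files that some output actually depends on."""
    used = set()
    for out in outputs:
        used |= inputs_used_by(graph, out)
    return sorted(used & set(flagged))


# Assumed dependency graph: output file -> files it was generated from.
graph = {
    "bin/app": {"obj/main.o", "obj/foo.o"},
    "obj/main.o": {"src/main.c"},
    "obj/foo.o": {"third_party/libfoo/foo.c"},
}

# Hypothetical list of build-tree files subject to an open source license.
open_source_files = [
    "third_party/libfoo/foo.c",
    "third_party/libbar/bar.c",  # present in the tree but never built
]

matches = flagged_inputs_in_use(graph, ["bin/app"], open_source_files)
```

Here only `third_party/libfoo/foo.c` is returned, because `third_party/libbar/bar.c` is in the list but is not reachable from any output file in the graph.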

Assignees

VMware Inc

Inventors

Classifications

  • G06F8/71 (Primary)

    Version control (security arrangements therefor G06F21/57); Configuration management · CPC title

  • G06F8/70 (Primary)

    Software maintenance or management · CPC title

  • Performance evaluation by tracing or monitoring · CPC title

  • Compilation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9442717B2 cover?
Techniques for automatically identifying input files used to generate output files in a software build process are provided. In one embodiment, a computer system can execute one or more build commands for generating output files for a software product, where the software product is associated with a build tree comprising various input files. The computer system can further intercept system call…
Who is the assignee on this patent?
VMware Inc
What technology area does this patent fall under?
Primary CPC classification G06F8/71. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 13, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).