Machine learning based ranking of test cases for software development

US10474562B2 · US · B2

Patent metadata

Field                Value
Publication number   US-10474562-B2
Application number   US-201715710127-A
Country              US
Kind code            B2
Filing date          Sep 20, 2017
Priority date        Sep 20, 2017
Publication date     Nov 12, 2019
Grant date           Nov 12, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An online system ranks test cases run in connection with check-in of sets of software files in a software repository. The online system ranks the test cases higher if they are more likely to fail as a result of defects in the set of files being checked in. Accordingly, the online system informs software developers of potential defects in the files being checked in early without having to run the complete suite of test cases. The online system determines a vector representation of the files and test cases based on a neural network. The online system determines an aggregate vector representation of the set of files. The online system determines a measure of similarity between the test cases and the aggregate vector representation of the set of files. The online system ranks the test cases based on the measures of similarity of the test cases.
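
The ranking flow in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not say how the file vectors are aggregated, so the element-wise mean used here is an assumption, and cosine similarity is borrowed from the dependent claims as one possible similarity measure. The function names, the two-dimensional vectors, and the test names are all hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (an assumed aggregation choice)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_test_cases(file_vectors, test_vectors):
    """Return (test name, score) pairs, most similar to the check-in first."""
    aggregate = mean_vector(file_vectors)
    scored = [(name, cosine_similarity(vector, aggregate))
              for name, vector in test_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical embeddings for two files checked in together and three tests.
changed_files = [[1.0, 0.0], [1.0, 0.0]]
tests = {
    "test_login": [1.0, 0.0],
    "test_report": [0.0, 1.0],
    "test_billing": [1.0, 1.0],
}
ranking = rank_test_cases(changed_files, tests)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these made-up vectors, the test whose vector points the same way as the aggregate is ranked first, so it would be run earliest.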

First claim

Opening claim text (preview).

We claim:
1. A non-transitory computer readable storage medium comprising computer executable code that when executed by one or more processors causes the one or more processors to perform operations comprising: storing in a software repository, a plurality of files and a plurality of test cases; training a neural network using historical data comprising sets of files checked in together into the software repository and the test cases that failed as a result of modifications to each set of files, the neural network comprising an input layer, a hidden layer, and an output layer; receiving a set of files identified for checking in together into the software repository; for each file from the set of files, extracting a vector representation of the file from the hidden layer of the neural network; aggregating the vector representations of the set of files to determine an aggregate vector representation of the set of files; for each test case from the plurality of test cases, extracting a vector representation of the test case from the hidden layer of the neural network; for each test case, determining a measure of similarity between the vector representation of the test case and the aggregate vector representation of the set of files; ranking the plurality of test cases based on the measure of similarity between the vector representation of the test case and the aggregate vector representation of the set of files; and executing at least a subset of the test cases from the plurality of test cases in an order determined based on the rank.
2. The computer readable storage medium of claim 1, wherein the neural network is configured to receive an encoding of an input file and generate an output identifying a file that is likely to be checked in to the software repository together with the input file.
3. The computer readable storage medium of claim 1, wherein the neural network is configured to receive an input vector representation of the file, wherein the extracted vector representation of the file has fewer dimensions than the input vector representation of the file.
4. The computer readable storage medium of claim 3, wherein the input vector representation has an element for each file from the plurality of files, wherein the input vector representation of a particular file has a first binary value for the bit corresponding to the particular file and a second binary value for the remaining files.
5. The computer readable storage medium of claim 3, wherein the input vector representation has an element for each test case from the plurality of test cases, wherein the input vector representation of a particular test case has a first binary value for the bit corresponding to the particular test case and a second binary value for the remaining test cases.
6. The computer readable storage medium of claim 3, wherein the measure of similarity between the vector representation of the test case and the aggregate vector representation comprises a cosine similarity between the vector representation of the test case and the aggregate vector representation.
7. A computer implemented method for ordering test cases executed for testing modifications to files checked into a software repository, the method comprising: storing in a software repository, a plurality of files and a plurality of test cases; receiving a set of files identified for checking in together into the software repository; for each file from the set of files, extracting a vector representation of the file from a hidden layer of a neural network; aggregating the vector representations of the set of files to determine an aggregate vector representation of the set of files; for each test case from the plurality of test cases, extracting a vector representation of the test case from the hidden layer of the neural network; for each test case, determining a measure of similarity between the vector representation of the test case and the aggregate vector representation of the set of files; ranking the plurality of test cases based on the measure of similarity between the vector representation of the test case and the aggregate vector representation of the set of files; and executing at least a subset of the test cases from the plurality of test cases in an order determined based on the rank.
8. The computer implemented method of claim 7, wherein the neural network is trained using historical data comprising sets of files checked in together into the software repository and the test cases that failed as a result of modifications to each set of files.
9. The computer implemented method of claim 7, wherein the neural network is configured to receive an encoding of an input file and generate an output identifying a file that is likely to be checked in to the software repository together with the input file.
10. The computer implemented method of claim 7, wherein the neural network is configured to receive an encoding of an input file and identify a test case that is likely to fail as a result of modifications to a particular set of files including the input file.
11. The computer implemented method of claim 7, wherein the vector representation of the test case has the same number of dimensions as the vector representation of a file.
12. The computer implemented method of claim 7, wherein the neural network is configured to receive an input vector representation of the file, wherein the extracted vector representation of the file has fewer dimensions than the input vector representation of the file.
13. The computer implemented method of claim 12, wherein the input vector representation has an element for each file from the plurality of files, wherein the input vector representation of a particular file has a first binary value for the bit corresponding to the particular file and a second binary value for the remaining files.
14. The computer implemented method of claim 12, wherein the input vector representation has an element for each test case from the plurality of test cases, wherein the input vector representation of a particular test case has a first binary value for the bit corresponding to the particular test case and a second binary value for the remaining test cases.
15. The computer implemented method of claim 7, wherein a file comprises at least one of: a program, a resource for use by a program, or configuration data.
16. The computer implemented method of claim 7, wherein the measure of similarity between the vector representation of the test case and the aggregate vector representation comprises a cosine similarity between the vector representation of the test case and the aggregate vector representation.
17. A computer system comprising: one or more computer processors; and a non-transitory computer readable storage medium comprising computer executable code that when executed by the one or more processors causes the one or more processors to perform operations comprising: storing in a software repository, a plurality of files and a plurality of test cases; receiving a set of files identified for checking in together into the software repository; for each file from the set of files, extracting a vector representation of the file from a hidden layer of a neural network; aggregating the vector representations of the set of files to determine an aggregate vector representation of the set of files; for each test case from the plurality of test cases, extracting a vector representation of the test case from the hidden layer of the neural network; for each test case, determining a measure of similarit
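
The claims describe an input vector with one element per file or test case (one binary value at the item's own position, another everywhere else) and an extracted hidden-layer vector with fewer dimensions. A minimal sketch of that idea follows. It makes several assumptions that the claim text does not state: files and test cases share a single vocabulary, the hidden layer is purely linear with no bias or activation, and the weight values shown are made up rather than trained. All names are hypothetical.

```python
def one_hot(index, size):
    """Input vector: 1.0 at the item's own position, 0.0 for all other items."""
    return [1.0 if position == index else 0.0 for position in range(size)]

def hidden_layer(input_vector, weights):
    """Linear hidden layer: input of length N times an N x D weight matrix."""
    width = len(weights[0])
    return [sum(x * row[j] for x, row in zip(input_vector, weights))
            for j in range(width)]

# Hypothetical vocabulary mixing files and a test case (an assumption).
vocabulary = ["a.py", "b.py", "config.yaml", "test_a"]
index_of = {name: position for position, name in enumerate(vocabulary)}

# Made-up weights standing in for a trained network: 4 inputs -> 2 hidden units.
weights = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8],
]

def embed(name):
    """Extract the hidden-layer vector for one file or test case."""
    return hidden_layer(one_hot(index_of[name], len(vocabulary)), weights)

embedding = embed("config.yaml")
print(embedding)
```

Under these assumptions, feeding a one-hot input through a linear layer simply selects one row of the weight matrix, so the extracted vector has as many dimensions as the hidden layer has units, fewer than the input.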

Assignees

Salesforce Com Inc, Salesforce Com

Inventors

Classifications

  • Combinations of networks

  • File access structures, e.g. distributed indices (arrangements of input from, or output to, record carriers G06F3/06)

  • for test execution, e.g. scheduling of test suites

  • Version control (security arrangements therefor G06F21/57); Configuration management

  • for test design, e.g. generating new test cases

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10474562B2 cover?
An online system ranks test cases run in connection with check-in of sets of software files in a software repository. The online system ranks the test cases higher if they are more likely to fail as a result of defects in the set of files being checked in. Accordingly, the online system informs software developers of potential defects in the files being checked in early without having to run th…
Who is the assignee on this patent?
Salesforce Com Inc, Salesforce Com
What technology area does this patent fall under?
Primary CPC classification G06F11/3684. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 12, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).