Software version fingerprint generation and identification

US10474456B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10474456-B2
Application number: US-201916415192-A
Country: US
Kind code: B2
Filing date: May 17, 2019
Priority date: Dec 7, 2016
Publication date: Nov 12, 2019
Grant date: Nov 12, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are provided for accessing a source code repository comprising a plurality of versions of code, analyzing the plurality of versions of code of the component to compute metrics to identify each version of code, analyzing the metrics to determine a subset of the metrics to use as a fingerprint definition to identify each version of the code, generating a fingerprint for each version of code using the fingerprint definition, generating a fingerprint matrix with the fingerprint for each version of code for the software component and storing the fingerprint definition and the fingerprint matrix.
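
To make the abstract's pipeline concrete, here is a minimal Python sketch of generating fingerprints and a fingerprint matrix from per-version metrics, then ranking known versions against a package by Euclidean distance (the matching measure named in claim 4). The metric names, version labels, values, and all function names are hypothetical and chosen only for illustration; the patent text shown here does not prescribe any particular data layout.

```python
import math

# Hypothetical metrics for three versions of one component (invented values).
METRICS = {
    "1.0": {"num_methods": 10, "num_classes": 3, "wmc": 21},
    "1.1": {"num_methods": 12, "num_classes": 3, "wmc": 21},
    "2.0": {"num_methods": 12, "num_classes": 4, "wmc": 30},
}


def make_fingerprint(metrics, definition):
    """Project one version's metrics onto the fingerprint definition."""
    return tuple(metrics[name] for name in definition)


def build_fingerprint_matrix(all_metrics, definition):
    """Map each version label to its fingerprint (one row per version)."""
    return {v: make_fingerprint(m, definition) for v, m in all_metrics.items()}


def rank_versions(package_metrics, matrix, definition):
    """List every known version with its Euclidean distance to the package,
    closest match first."""
    fp = make_fingerprint(package_metrics, definition)
    scored = [(version, math.dist(fp, known)) for version, known in matrix.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Assumed fingerprint definition: a subset of the metrics that separates all versions.
definition = ["num_methods", "num_classes"]
matrix = build_fingerprint_matrix(METRICS, definition)

# A package of unknown version, described by the same (hypothetical) metrics.
package = {"num_methods": 12, "num_classes": 3, "wmc": 21}
ranking = rank_versions(package, matrix, definition)
print(ranking)
```

In this toy data the package matches version 1.1 exactly, so that version comes first with distance 0. A distance-based ranking also yields a usable "closest version" when the package has been modified and no stored fingerprint matches exactly.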

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: accessing, by a computing device, a metrics matrix comprising candidate metrics to identify each version of code of a plurality of versions of code for a software component; generating, by the computing device, a fingerprint definition to identify each version of the code for the software component by performing operations comprising: (1) determining a best candidate metric among the candidate metrics to identify the most versions of the plurality of versions of code; (2) adding the best candidate metric to a set of optimal metrics; (3) determining which versions of code of the plurality of versions of code can be identified with the best candidate metric; (4) removing the versions of code of the plurality of versions of code that can be identified with the best candidate metric and the best candidate metric from the metrics matrix to generate a reduced metrics matrix; and (5) repeating (1)-(4) on the reduced metrics matrix until all versions of the plurality of versions of code are uniquely identified by the set of optimal metrics or until there is one or more versions of code that cannot be uniquely identified using the candidate metrics; and based on determining that all versions of the plurality of version of code are uniquely identified by the set of optimal metrics, generating a fingerprint for each version of code for the software component, using the fingerprint definition comprising the set of optimal metrics.

2. The method of claim 1, further comprising: generating a fingerprint matrix comprising the fingerprint for each version of code for the software component; receiving a request for version analysis, the request comprising a package associated with the software component; generating a fingerprint for the package using the fingerprint definition; accessing the fingerprint matrix to determine the version of the package using the fingerprint for the package; and providing the version of the package for the component.

3. The method of claim 2, wherein providing the version of the package for the component comprises providing a list with each version of the plurality of versions of the source code repository and a level of matching with the package.

4. The method of claim 3, wherein the level of matching is evaluated by computing the Euclidean distance between the package fingerprint and the fingerprints of each version of the plurality of versions.

5. The method of claim 1, further comprising: receiving a new version of code for the software component; generating a fingerprint for the new version of code for the software component using the fingerprint definition; determining that the fingerprint for the new version of code is not unique from other fingerprints in a fingerprint matrix comprising the fingerprint for each version of code for the software component; generating an updated fingerprint definition to identify each version of the code for the software component by performing operations (1)-(5); and based on determining that all versions of the plurality of version of code are uniquely identified by the set of optimal metrics, generating an updated fingerprint for each version of code for the software component, using the updated fingerprint definition comprising the set of optimal metrics.

6. The method of claim 5, further comprising: generating an updated fingerprint matrix comprising the updated fingerprint for each version of code for the software component.

7. The method of claim 1, further comprising: receiving a new version of code for the software component; generating a fingerprint for the new version of code for the software component using the fingerprint definition; determining that the fingerprint for the new version of code is unique from other fingerprints in a fingerprint matrix comprising the fingerprint for each version of code for the software component; and storing the new fingerprint for the new version of code in the fingerprint matrix.

8. The method of claim 1, further comprising: accessing a second metrics matrix comprising candidate metrics to identify each version of code of a plurality of versions of code for a second software component; generating, by the computing device a second fingerprint definition to identify each version of the code for the second software component by performing operations (1)-(5); and based on determining that there is one or more version of code that cannot be uniquely identified using the candidate metrics, analyzing the plurality of versions of code of the second software component to compute additional candidate metrics to identify each version for the second software component.

9. The method of claim 1, wherein the candidate metrics include at least one from a group comprising: name and size of classes, name and size of methods, number of methods, name and type of method parameters, name and type of local variables, conditional instruction branching conditions, cyclomatic complexity by method, (WMC) weighted methods per class, (DIT) depth of inheritance tree, (NOC) number of children, (CBO) coupling between object classes, (RFC) response for a class, (LCOM) lack of cohesion in methods, (Ca) afferent couplings, (NPM) number of public methods, and Chidamber and Kemerer metrics.

10. The method of claim 1, wherein determining a best candidate metric among the candidate metrics to identify the most versions of the plurality of versions of code comprises choosing a candidate metric with the largest Shannon entropy contribution as the best candidate metric.

11. A computing device comprising: a memory that stores instructions; and one or more processors configured to perform operations comprising: accessing a metrics matrix comprising candidate metrics to identify each version of code of a plurality of versions of code for a software component; generating a fingerprint definition to identify each version of the code for the software component by performing operations comprising: (1) determining a best candidate metric among the candidate metrics to identify the most versions of the plurality of versions of code; (2) adding the best candidate metric to a set of optimal metrics; (3) determining which versions of code of the plurality of versions of code can be identified with the best candidate metric; (4) removing the versions of code of the plurality of versions of code that can be identified with the best candidate metric and the best candidate metric from the metrics matrix to generate a reduced metrics matrix; and (5) repeating (1)-(4) on the reduced metrics matrix until all versions of the plurality of versions of code are uniquely identified by the set of optimal metrics or until there is one or more versions of code that cannot be uniquely identified using the candidate metrics; and based on determining that all versions of the plurality of version of code are uniquely identified by the set of optimal metrics, generating a fingerprint for each version of code for the software component, using the fingerprint definition comprising the set of optimal metrics.

12. The computing device of claim 11, the operations further comprising: generating a fingerprint matrix comprising the fingerprint for each version of code for the software component; receiving a request for version analysis, the request comprising a package associated with the software component; generating a fingerprint for the package using the fingerprint definition; accessing the fingerprint matrix to determine the version of the package using the fingerprint for the package; and providing the version of the packa…
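
The greedy selection in steps (1)-(5) of claim 1 can be sketched as follows. This is one possible reading, not the patented implementation: it assumes a version is "identified" by a metric when its value for that metric is unique among the versions still in the matrix, and it uses Shannon entropy (mentioned in claim 10) only to break ties between metrics. The data layout, the function names, and the sample values are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the distribution of metric values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def identified_versions(matrix, metric):
    """Versions whose value for `metric` is unique among the remaining versions."""
    counts = Counter(row[metric] for row in matrix.values())
    return {v for v, row in matrix.items() if counts[row[metric]] == 1}


def select_optimal_metrics(metrics_matrix):
    """Greedy steps (1)-(5). Returns (optimal_metrics, unidentified_versions)."""
    remaining = {v: dict(row) for v, row in metrics_matrix.items()}
    candidates = sorted(next(iter(remaining.values()))) if remaining else []
    optimal = []
    while remaining and candidates:
        # (1) Best metric: identifies the most versions; entropy breaks ties.
        best = max(
            candidates,
            key=lambda m: (
                len(identified_versions(remaining, m)),
                shannon_entropy([row[m] for row in remaining.values()]),
            ),
        )
        # (3) Which versions does the best metric identify?
        found = identified_versions(remaining, best)
        if not found:
            break  # under this reading, no remaining metric separates any version
        # (2) Add the best metric to the set of optimal metrics.
        optimal.append(best)
        # (4) Reduce the matrix: drop identified versions and the used metric.
        remaining = {v: row for v, row in remaining.items() if v not in found}
        candidates.remove(best)
        # (5) The loop repeats on the reduced matrix.
    return optimal, set(remaining)


# Hypothetical metrics matrix (invented values): version -> metric -> value.
MATRIX = {
    "1.0": {"num_methods": 10, "num_classes": 3, "wmc": 21},
    "1.1": {"num_methods": 12, "num_classes": 3, "wmc": 21},
    "2.0": {"num_methods": 12, "num_classes": 4, "wmc": 30},
}
optimal, unidentified = select_optimal_metrics(MATRIX)
print(optimal, unidentified)
```

A non-empty second return value corresponds to the stopping condition in step (5) where one or more versions cannot be uniquely identified; claim 8 describes computing additional candidate metrics in that case. Note that this simplified reading gives up when no single metric has a unique value among the remaining versions, even if a combination of metrics would still separate them.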

Assignees

SAP SE

Inventors

Classifications

  • G06F8/71 (Primary)

    Version control (security arrangements therefor G06F21/57); Configuration management · CPC title

  • Software metrics · CPC title

  • Software reuse · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10474456B2 cover?
Systems and methods are provided for accessing a source code repository comprising a plurality of versions of code, analyzing the plurality of versions of code of the component to compute metrics to identify each version of code, analyzing the metrics to determine a subset of the metrics to use as a fingerprint definition to identify each version of the code, generating a fingerprint for eac…
Who is the assignee on this patent?
SAP SE
What technology area does this patent fall under?
Primary CPC classification G06F8/71. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 12, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).