Automatically mapping binary executable files to source code by a software modernization system

US11537400B1 · US · B1

Patent metadata
Publication number: US-11537400-B1
Application number: US-202017074315-A
Country: US
Kind code: B1
Filing date: Oct 19, 2020
Priority date: Oct 19, 2020
Publication date: Dec 27, 2022
Grant date: Dec 27, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are described for enabling a software modernization system to automatically map binary executable files and other runtime artifacts (e.g., application binaries, Java ARchive (JAR) files, .NET Dynamic Link Library (DLL) files, process identifiers, etc.) to source code associated with the binary executable files, e.g., as part of modernization processes aimed at migrating users' applications to a cloud service provider's infrastructure. A software modernization service of a cloud provider network provides discovery agents and other tools that are capable of creating an inventory of users' software applications and collecting profile data about the software applications. Various techniques are described for automatically identifying the source code associated with software applications identified by a discovery agent in a user's computing environment, thereby improving the efficiency of various software modernization analyses and other modernization processes.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: obtaining, by a software agent running in a user's computing environment, application profile data identifying a plurality of software applications located in the user's computing environment, wherein the application profile data includes, for a first software application of the plurality of software applications, an identifier of a first binary executable file associated with the first software application; sending a request to an automation server to obtain a plurality of workflow configurations stored by the automation server, wherein a workflow configuration of the plurality of workflow configurations defines a build and deployment pipeline for the first software application; determining that the workflow configuration includes an identifier of the first binary executable file, wherein the workflow configuration further includes an identifier of a first source code repository storing source code used to implement the first software application; storing data indicating a mapping between the first software application and the first source code repository; determining that another workflow configuration matching a second binary executable file associated with a second software application does not exist in the plurality of workflow configurations; decomposing the second binary executable file to obtain decomposed application data; using a hash function to generate a signature based on the decomposed application data; comparing the signature against other signatures generated based on respective ones of a plurality of other source code repositories managed by a version control system to identify a match with a second source code repository from among the plurality of other source code repositories; and storing data indicating a second mapping between the second software application and the second source code repository.

2. The computer-implemented method of claim 1, wherein the decomposed application data includes identifiers of at least one of: a class associated with the second binary executable file, a method associated with the second binary executable file, a package associated with the second binary executable file, or a string literal.

3. The computer-implemented method of claim 1, further comprising: obtaining source code associated with the first software application from the first source code repository; performing a static analysis of the source code; and generating a modernization recommendation based at least in part on the static analysis of the source code.

4. A computer-implemented method comprising: obtaining application profile data identifying a plurality of software applications located in a user's computing environment, wherein the application profile data includes, for a first software application of the plurality of software applications, an identifier of a first binary executable file associated with the first software application; obtaining workflow configuration data from an automation server defining a build and deployment pipeline for the first software application; determining that the workflow configuration data is associated with an identifier of the first binary executable file, wherein the workflow configuration data further includes an identifier of a first source code repository storing source code used to implement the first software application; and storing data indicating a mapping between the first software application and the first source code repository; determining that a workflow configuration associated with a second binary executable file associated with a second software application does not exist in a plurality of workflow configurations obtained from the automation server; decomposing the second binary executable file to obtain decomposed application data; using a hash function to generate a hash map based on the decomposed application data; comparing the hash map against hash maps generated based on a plurality of other source code repositories managed by a version control system to identify a matching second source code repository of the plurality of other source code repositories; and storing data indicating a second mapping between the second software application and the second source code repository.

5. The computer-implemented method of claim 4, wherein the decomposed application data includes identifiers of at least one of: a class associated with the second binary executable file, a method associated with the second binary executable file, or a package associated with the second binary executable file.

6. The computer-implemented method of claim 4, further comprising: obtaining source code associated with the first software application from the first source code repository; performing a static analysis of the source code; and generating a modernization recommendation based at least in part on the static analysis of the source code.

7. The computer-implemented method of claim 4, further comprising providing a software agent for installation in the user's computing environment, and wherein the software agent collects the application profile data further including at least one of: an indication of a programming language used to implement the first software application, an indication of a dependency with a software package, or an indication of a software framework dependency.

8. The computer-implemented method of claim 4, wherein obtaining the workflow configuration data includes sending an application programming interface (API) request to the automation server requesting the workflow configuration data, wherein the workflow configuration data defines a series of automated steps of the build and deployment pipeline, and wherein the workflow configuration data includes a first parameter identifying the first binary executable file and a second parameter identifying the first source code repository.

9. The computer-implemented method of claim 4, wherein a modernization service of a cloud provider network obtains the application profile data from a software agent installed in the user's computing environment, wherein a modernization service obtains the workflow configuration data from the automation server by sending an application programming interface (API) request to the automation server requesting the workflow configuration data, and wherein the method further comprises: obtaining, by the modernization service, the source code used to implement the first software application from the first source code repository using a web address identifying the first source code repository; and generating a modernization recommendation based at least in part on a static analysis of the source code.

10. The computer-implemented method of claim 4, further comprising: determining that a workflow configuration associated with a third binary executable file associated with a third software application does not exist in the plurality of workflow configurations; obtaining a symbol table associated with the third binary executable file, wherein the symbol table includes symbol table data including indications of at least one of: a class associated with the third binary executable file, a method associated with the third binary executable file, a package associated with the third binary executable file, or a string literal; comparing the symbol table data to source code contained in the plurality of other source code repositories to identify a match between the symbol table data and a third source code repository of the plurality of other source code repositories; and storing data indicating a third mapping between the third software application and the third source code reposi
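The two-stage lookup recited in claim 1 (match a binary against build pipeline configurations first, then fall back to comparing hash signatures of decomposed identifiers) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `WorkflowConfig` fields, the function names, the example URLs, the use of SHA-256, and the assumption that a "match" means exact signature equality over a sorted set of identifiers. A real system would also need to decompose binaries and repositories into identifiers, which this sketch takes as given inputs.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowConfig:
    """Hypothetical stand-in for a pipeline definition held by an automation server."""
    artifact_name: str  # identifier of the binary the pipeline produces
    repo_url: str       # identifier of the repository the pipeline builds from


def signature(identifiers):
    """Hash a collection of identifiers (classes, methods, packages, string literals)."""
    digest = hashlib.sha256()
    for name in sorted(set(identifiers)):  # sort and de-duplicate so input order is irrelevant
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")               # separator prevents ambiguous concatenation
    return digest.hexdigest()


def map_binary_to_repo(binary_name, binary_identifiers, workflow_configs, repo_identifiers):
    """Return the repository mapped to a binary, or None if neither stage finds one."""
    # Stage 1: a workflow configuration that names this binary also names its repository.
    for config in workflow_configs:
        if config.artifact_name == binary_name:
            return config.repo_url
    # Stage 2: no configuration matched, so compare signatures (exact equality assumed).
    target = signature(binary_identifiers)
    for repo_url, identifiers in repo_identifiers.items():
        if signature(identifiers) == target:
            return repo_url
    return None


# Illustrative data only; the names and URLs are made up.
configs = [WorkflowConfig("orders.jar", "https://git.example.com/orders")]
repos = {
    "https://git.example.com/billing": ["com.example.billing", "Invoice", "Invoice.total"],
}

first = map_binary_to_repo("orders.jar", [], configs, repos)
second = map_binary_to_repo(
    "billing.dll", ["Invoice.total", "Invoice", "com.example.billing"], configs, repos
)
print(first)   # found through the workflow configuration
print(second)  # found through the signature fallback
```

Exact equality is the simplest reading of "identify a match"; a build that strips or renames symbols would defeat it, so a practical variant might hash identifiers individually and score the overlap instead.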

Classifications

  • via adapters, e.g. between incompatible applications

  • G06F8/71 (Primary): Version control (security arrangements therefor G06F21/57); Configuration management

  • Decompilation; Disassembly

  • Programming languages or programming paradigms

  • G06F9/3017 (Primary): Runtime instruction translation, e.g. macros

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11537400B1 cover?
Techniques are described for enabling a software modernization system to automatically map binary executable files and other runtime artifacts (e.g., application binaries, Java ARchive (JAR) files, .NET Dynamic Link Library (DLL) files, process identifiers, etc.) to source code associated with the binary executable files, e.g., as part of modernization processes aimed at migrating users' applic…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F8/71. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 27, 2022 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).