Method and system for enabling conversational reverse engineering and understanding of a software application
US-2024160500-A1 · May 16, 2024 · US
US9836301B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9836301-B2 |
| Application number | US-201615076207-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 21, 2016 |
| Priority date | Apr 9, 2012 |
| Publication date | Dec 5, 2017 |
| Grant date | Dec 5, 2017 |
Official abstract text for this publication.
A method for component discovery from source code may include receiving source code, and determining business classes by excluding packages and classes in the source code identified as belonging to a presentation layer, as belonging to a data access layer, as models and/or as utilities. The method may further include extracting multi-dimensional features from the business classes, estimating similarity for business class pairs based on the extracted multi-dimensional features, clustering the business classes based on the similarity and mapping functional concepts to the clusters. The clusters generated by the clustering may represent components of the source code. The method may also include determining interfaces for the components based on the clustering.
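The first stages named in the abstract (filtering out non-business classes, extracting packaging features, and scoring class pairs) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the marker set `EXCLUDED_MARKERS`, the function names, the camel-case splitting rule, and the equal weighting of word overlap and shared package prefix are not taken from the patent, which does not specify a formula in the text shown on this page.

```python
import re

# Hypothetical package-segment markers for the excluded layers
# (presentation, data access, models, utilities). Not from the patent.
EXCLUDED_MARKERS = {"ui", "web", "view", "dao", "repository", "model", "util"}


def is_business_class(qualified_name):
    """Keep a class unless one of its package segments is an excluded marker."""
    *packages, _simple = qualified_name.split(".")
    return not any(p.lower() in EXCLUDED_MARKERS for p in packages)


def concept_words(qualified_name):
    """Split the simple class name on camel-case boundaries into lowercase words."""
    simple = qualified_name.rsplit(".", 1)[-1]
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", simple)
    return {p.lower() for p in parts}


def package_of(qualified_name):
    """Return the packaging hierarchy as a string (empty if there is none)."""
    return qualified_name.rsplit(".", 1)[0] if "." in qualified_name else ""


def packaging_similarity(a, b):
    """Assumed score: mean of concept-word Jaccard overlap and shared package prefix ratio."""
    wa, wb = concept_words(a), concept_words(b)
    jaccard = len(wa & wb) / len(wa | wb) if (wa | wb) else 0.0
    pa, pb = package_of(a).split("."), package_of(b).split(".")
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    prefix = shared / max(len(pa), len(pb))
    return (jaccard + prefix) / 2


classes = [
    "com.shop.order.OrderService",
    "com.shop.order.OrderValidator",
    "com.shop.web.OrderController",
    "com.shop.dao.OrderDao",
    "com.shop.billing.InvoiceService",
]
business = [c for c in classes if is_business_class(c)]
same_package = packaging_similarity(business[0], business[1])
other_package = packaging_similarity(business[0], business[2])
```

In this toy input the controller and DAO classes are dropped, and the two classes in `com.shop.order` score higher against each other than against the billing class. A real implementation would presumably combine several feature dimensions, as the abstract's "multi-dimensional features" suggests.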
Opening claim text (preview).
What is claimed is:
1. A method for component discovery from source code, the method performed by a processor and comprising: receiving source code; determining business classes by determining a component identification boundary in the source code; extracting features from the business classes by extracting packaging information for each of the business classes, wherein extracting packaging information for each of the business classes includes extracting concept words embedded in business class names, extracting a packaging hierarchy as a string, and extracting a substring that describes the packaging hierarchy; estimating similarity for business class pairs based on the extracted features; clustering the business classes based on the similarity, wherein clusters generated by the clustering represent components of the source code; and determining interfaces for the components based on the clustering.
2. The method of claim 1, wherein extracting features from the business classes further comprises: extracting inheritance and interface realization relationships for each of the business classes.
3. The method of claim 1, wherein clustering the business classes based on the similarity further comprises: generating partitions for the clusters by determining, for each node in a cluster, whether the node belongs to the cluster or to a different cluster, wherein the node represents a business class; and moving, based on the determination that the node belongs to the different cluster, the node to the different cluster.
4. The method of claim 1, wherein estimating similarity for business class pairs based on the extracted features further comprises: determining, based on the extracted packaging information, packaging based similarity for the business class pairs.
5. The method of claim 1, wherein clustering the business classes based on the similarity further comprises: generating seed populations by sorting a list of edges between business class pairs; and generating, based on the seed populations, a set of seed clusters.
6. The method of claim 1, further comprising: mapping functional entities to the components by separating each of the functional entities into distinct words; and determining, based on the separation of each of the functional entities into distinct words, a similarity between each of the functional entities and the components.
7. A component discovery system comprising: a processor; and a memory storing machine readable instructions that when executed by the processor cause the processor to: determine business classes by excluding packages and classes in source code; extract textual features from the business classes by extracting packaging information for each of the business classes, wherein extracting packaging information for each of the business classes includes extracting concept words embedded in business class names, extracting a packaging hierarchy as a string, and extracting a substring that describes the packaging hierarchy; estimate similarity for business class pairs based on the extracted features; cluster the business classes based on the similarity by generating seed populations by sorting a list of edges between business class pairs, and generating, based on the seed populations, a set of seed clusters, wherein clusters generated by the clustering represent components of the source code; and determine interfaces for the components based on the clustering.
8. The component discovery system according to claim 7, wherein the machine readable instructions to extract the textual features from the business classes further comprise machine readable instructions that when executed by the processor further cause the processor to: extract inheritance and interface realization relationships for each of the business classes.
9. The component discovery system according to claim 7, wherein the machine readable instructions to cluster the business classes based on the similarity further comprise machine readable instructions that when executed by the processor further cause the processor to: generate partitions for the clusters by determining, for each node in a cluster, whether the node belongs to the cluster or to a different cluster, wherein the node represents a business class; and move, based on the determination that the node belongs to the different cluster, the node to the different cluster.
10. The component discovery system according to claim 7, wherein the machine readable instructions to estimate similarity for business class pairs based on the extracted features further comprise machine readable instructions that when executed by the processor further cause the processor to: determine, based on the extracted packaging information, packaging based similarity for the business class pairs.
11. The component discovery system according to claim 7, further comprising machine readable instructions that when executed by the processor further cause the processor to: map functional entities to the components by separating each of the functional entities into distinct words; and determine, based on the separation of each of the functional entities into distinct words, a similarity between each of the functional entities and the components.
12. A non-transitory computer readable medium having stored thereon machine readable instructions for component discovery, the machine readable instructions, when executed, cause a processor to: determine business classes by excluding packages and classes in source code; extract code features from the business classes by extracting packaging information for each of the business classes, wherein extracting packaging information for each of the business classes includes extracting concept words embedded in business class names, extracting a packaging hierarchy as a string, and extracting a substring that describes the packaging hierarchy; estimate similarity for business class pairs based on the extracted features; cluster the business classes based on the similarity, wherein clusters generated by the clustering represent components of the source code; and determine interfaces for the components based on the clustering by identifying public methods of the business classes in a cluster of the generated clusters that are called by the business classes of other clusters from the generated clusters.
13. The non-transitory computer readable medium according to claim 12, wherein the machine readable instructions to cluster the business classes based on the similarity further comprise machine readable instructions that when executed by the processor further cause the processor to: generate partitions for the clusters by determining, for each node in a cluster, whether the node belongs to the cluster or to a different cluster, wherein the node represents a business class; and move, based on the determination that the node belongs to the different cluster, the node to the different cluster.
14. The non-transitory computer readable medium according to claim 12, wherein the machine readable instructions to estimate similarity for business class pairs based on the extracted features further comprise machine readable instructions that when executed by the processor further cause the processor to: determine, based on the extracted packaging information, packaging based similarity for the business class pairs.
15. The non-transitory computer readable medium according to claim 12, wherein the machine readable instructions to cluster the business classes based on the similarity further comprise machi
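Two of the claimed steps lend themselves to a short illustration: forming seed clusters from a sorted edge list (claims 5 and 7) and identifying component interfaces as public methods called from other clusters (claim 12). The sketch below is one possible reading, not the patented procedure. The greedy merge rule, the `threshold` parameter, the function names, and the sample class and method names are all assumptions introduced here.

```python
def seed_clusters(edges, threshold=0.5):
    """Greedily merge class pairs, strongest edge first (assumed rule).

    edges: list of (class_a, class_b, similarity) tuples.
    Returns a list of sets, one per seed cluster.
    """
    cluster_of = {}
    for a, b, weight in sorted(edges, key=lambda e: e[2], reverse=True):
        if weight < threshold:
            break  # remaining edges are weaker still
        ca = cluster_of.setdefault(a, {a})
        cb = cluster_of.setdefault(b, {b})
        if ca is not cb:
            ca |= cb
            for member in cb:
                cluster_of[member] = ca
    # Classes that only appear on weak edges become singleton clusters.
    for a, b, _ in edges:
        cluster_of.setdefault(a, {a})
        cluster_of.setdefault(b, {b})
    unique = {id(c): c for c in cluster_of.values()}
    return list(unique.values())


def component_interfaces(clusters, calls):
    """Collect public methods of each cluster that are called from another cluster.

    calls: list of (caller_class, callee_class, public_method_name) tuples.
    Returns {cluster_index: set of "Class.method" strings}.
    """
    index = {cls: i for i, members in enumerate(clusters) for cls in members}
    interfaces = {i: set() for i in range(len(clusters))}
    for caller, callee, method in calls:
        if caller in index and callee in index and index[caller] != index[callee]:
            interfaces[index[callee]].add(f"{callee}.{method}")
    return interfaces


# Hypothetical similarity edges and call records for four classes.
edges = [
    ("OrderService", "OrderValidator", 0.67),
    ("OrderService", "InvoiceService", 0.30),
    ("InvoiceService", "TaxCalculator", 0.80),
]
calls = [
    ("OrderService", "InvoiceService", "createInvoice"),
    ("OrderService", "OrderValidator", "validate"),
    ("InvoiceService", "TaxCalculator", "taxFor"),
]
clusters = seed_clusters(edges)
interfaces = component_interfaces(clusters, calls)
exposed = set().union(*interfaces.values())
```

With this input, the two strong edges produce two seed clusters, and only `InvoiceService.createInvoice` crosses a cluster boundary, so it is the only method reported as an interface. The node-moving refinement of claims 3, 9, and 13 would operate on the output of `seed_clusters` and is not shown.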