US9323520B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9323520-B2 |
| Application number | US-201414504194-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 1, 2014 |
| Priority date | Apr 9, 2012 |
| Publication date | Apr 26, 2016 |
| Grant date | Apr 26, 2016 |
Official abstract text for this publication.
A method for component discovery from source code may include receiving source code, and determining business classes by excluding packages and classes in the source code identified as belonging to a presentation layer, as belonging to a data access layer, as models and/or as utilities. The method may further include extracting multi-dimensional features from the business classes, estimating similarity for business class pairs based on the extracted multi-dimensional features, clustering the business classes based on the similarity and mapping functional concepts to the clusters. The clusters generated by the clustering may represent components of the source code. The method may also include determining interfaces for the components based on the clustering.
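The first step in the abstract, keeping only business classes, can be illustrated with a minimal sketch. The abstract names the excluded categories (presentation layer, data access layer, models, utilities) but does not say how they are recognised, so the glob patterns, the `EXCLUDED_PATTERNS` table, and both function names below are hypothetical, chosen only for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion patterns (an assumption, not taken from the patent):
# each category named in the abstract maps to globs over qualified class names.
EXCLUDED_PATTERNS = {
    "presentation": ["*.ui.*", "*.web.*", "*Controller", "*View"],
    "data_access": ["*.dao.*", "*.repository.*", "*Dao", "*Repository"],
    "model": ["*.model.*", "*.dto.*"],
    "utility": ["*.util.*", "*Utils", "*Helper"],
}


def excluded_layer(qualified_name):
    """Return the first excluded category the name matches, or None."""
    for layer, patterns in EXCLUDED_PATTERNS.items():
        if any(fnmatchcase(qualified_name, p) for p in patterns):
            return layer
    return None


def business_classes(qualified_names):
    """Keep only the classes that match no excluded category."""
    return [n for n in qualified_names if excluded_layer(n) is None]


names = [
    "com.shop.web.CartController",
    "com.shop.dao.OrderDao",
    "com.shop.model.Order",
    "com.shop.util.DateUtils",
    "com.shop.billing.InvoiceService",
    "com.shop.orders.OrderProcessor",
]
remaining = business_classes(names)
```

With the sample names above, only the `billing` and `orders` classes remain as candidates for feature extraction and clustering.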
Opening claim text (preview).
What is claimed is:

1. A method for component discovery from source code, the method performed by a processor and comprising: receiving source code; determining business classes by excluding packages and classes in the source code; extracting features from the business classes; estimating similarity for business class pairs based on the extracted features by determining textual similarity by using a co-occurrence matrix that is defined as a sequence of the business classes in the source code and a sequence of unique intermediate representation (IR) tokens occurring across the business classes, and evaluating, for the co-occurrence matrix, a frequency of occurrence of an IR token from the IR tokens occurring in a particular business class of the business classes; clustering the business classes based on the similarity, wherein clusters generated by the clustering represent components of the source code; and determining interfaces for the components based on the clustering.

2. The method of claim 1, wherein clustering the business classes based on the similarity further comprises: using k-means clustering to generate initial clusters that are used to cluster the business classes.

3. The method of claim 1, further comprising: clustering a plurality of application portfolios that each includes a plurality of applications that use different types of source code including the source code.

4. The method of claim 1, further comprising: determining similarity between different pairs of the clusters based on a normalized summation of similarity scores between the business class pairs across the clusters.

5. The method of claim 1, wherein estimating similarity for business class pairs based on the extracted features further comprises: including a class name in an inheritance and interface realization list for a current business class; including names of other business classes in the inheritance and interface realization list that have the class name of the current business class in inheritance and interface realization lists of the other business classes; and determining inheritance and interface realization based similarity for the business class pairs based on evaluation of the inheritance and interface realization list for the current business class and an inheritance and interface realization list for the other business classes.

6. The method of claim 1, wherein clustering the business classes based on the similarity further comprises: generating a set of seed clusters by using top weighted edges between business class pairs, wherein the edges represent the similarity for the business class pairs.

7. The method of claim 1, wherein clustering the business classes based on the similarity further comprises: generating a set of seed clusters by using edges between business class pairs with non-zero inheritance and interface realization similarity, wherein the edges represent the similarity for the business class pairs.

8. The method of claim 1, wherein clustering the business classes based on the similarity further comprises: generating a set of seed clusters based on a clique strength of nodes of edges between business class pairs, wherein the edges represent the similarity for the business class pairs and the nodes representing the business classes.

9. The method of claim 1, wherein clustering the business classes based on the similarity further comprises: generating a set of seed clusters based on a characteristic of edges or nodes of the business class pairs, wherein the edges represent the similarity for the business class pairs and the nodes representing the business classes; and evaluating a modularisation quality (MQ) of the set of seed clusters.

10. The method of claim 1, wherein clustering the business classes based on the similarity further comprises: maximizing modularisation quality (MQ) of clusters based on movement of nodes between the clusters, wherein the nodes represent the business classes.

11. The method of claim 1, wherein determining interfaces for the components further comprises: identifying public methods of the business classes in a cluster that are called by the business classes of other clusters.

12. The method of claim 1, further comprising: determining component interactions based on public methods of a cluster that are called by the business classes of another cluster.

13. The method of claim 1, further comprising: identifying borderline classes by identifying the business classes in a first cluster having a high similarity to the business classes in another cluster.

14. A method for component discovery from source code, the method performed by a processor and comprising: receiving source code; determining business classes by excluding packages and classes in the source code; extracting features from the business classes; estimating similarity for business class pairs based on the extracted features by determining structural similarity by collapsing edges with a same method name for a dependency graph that includes nodes representing the business classes, wherein an edge of the edges represents a function call in the source code for a business class of the business classes where a function of another business class of the business classes is called; clustering the business classes based on the similarity, wherein clusters generated by the clustering represent components of the source code; and determining interfaces for the components based on the clustering.

15. A method for component discovery from source code, the method performed by a processor and comprising: receiving source code; determining business classes by excluding packages and classes in the source code; extracting features from the business classes; estimating similarity for business class pairs based on the extracted features by determining a combined similarity for the business class pairs based on evaluation of similarity measures that include textual, class name, method name, packaging, inheritance and interface realization, and structural based similarities, and using a relative significance factor for each of the similarity measures such that the sum of the similarity measures is equal to a predetermined value; clustering the business classes based on the similarity, wherein clusters generated by the clustering represent components of the source code; and determining interfaces for the components based on the clustering.

16. A component discovery system comprising: a processor; and a memory storing machine readable instructions that when executed by the processor cause the processor to: determine business classes by excluding packages and classes in source code; extract at least one of textual, code, and structural dependency based features from the business classes; estimate similarity for business class pairs based on the extracted features by populating a class name matrix that accounts for a frequency of occurrence of word concepts in a business class name, applying term frequency-inverse document frequency (tf-idf) based weighting to the class name matrix, and determining class name similarity for the business class pairs by evaluating class name matrices corresponding to the business class pairs; cluster the business classes based on the similarity, wherein clusters generated by the clustering represent components of the source code; and determine interfaces for the components based on the clustering.

17. The component discovery system according to claim 16, wherein the machine readabl
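Three of the claimed steps can be sketched in a few lines of Python: the class-by-token co-occurrence matrix of claim 1, a weighted combination in the spirit of claim 15, and the interface rule of claim 11. This is a rough illustration under stated assumptions, not the patented implementation. The claims do not name a vector similarity measure, so cosine similarity is swapped in here as an assumption. Reading the "relative significance factors" as weights that sum to a predetermined value (1.0 here) is one interpretation of the claim language. All function names and sample data are hypothetical.

```python
import math
from collections import Counter


def cooccurrence_matrix(class_tokens):
    """Rows follow the sequence of classes, columns the sequence of unique
    tokens; each cell is how often that token occurs in that class."""
    classes = list(class_tokens)
    vocab = sorted({t for toks in class_tokens.values() for t in toks})
    counts = {c: Counter(class_tokens[c]) for c in classes}
    matrix = [[counts[c][t] for t in vocab] for c in classes]
    return classes, vocab, matrix


def cosine(u, v):
    """Cosine similarity of two rows (an assumed measure, not from the claims)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_similarity(scores, weights, total=1.0):
    """Weighted sum of per-measure scores; weights must add up to `total`."""
    if not math.isclose(sum(weights.values()), total):
        raise ValueError("weights must sum to the predetermined value")
    return sum(weights[name] * scores[name] for name in weights)


def component_interfaces(cluster_of, calls):
    """Collect, per cluster, the (class, method) pairs called from a class
    in a different cluster. `calls` holds (caller, callee, public_method)."""
    interfaces = {}
    for caller, callee, method in calls:
        if cluster_of[caller] != cluster_of[callee]:
            interfaces.setdefault(cluster_of[callee], set()).add((callee, method))
    return interfaces


# Hypothetical sample data for illustration only.
class_tokens = {
    "Invoice": ["tax", "total", "tax"],
    "Billing": ["tax", "total"],
    "Shipping": ["address"],
}
classes, vocab, matrix = cooccurrence_matrix(class_tokens)
textual = cosine(matrix[0], matrix[1])  # Invoice vs Billing

cluster_of = {"Invoice": 0, "Billing": 0, "Shipping": 1}
calls = [("Shipping", "Invoice", "getTotal"), ("Billing", "Invoice", "getTax")]
interfaces = component_interfaces(cluster_of, calls)
```

In the sample data, `getTax` is called only from inside cluster 0, so it is not reported as part of that component's interface, while `getTotal` is called from cluster 1 and is.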
Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling · CPC title
Structural analysis for program understanding · CPC title
Clustering or classification · CPC title
Software maintenance or management · CPC title
Object-oriented languages · CPC title