System for generating readable and meaningful descriptions of stream processing source code

US9043758B2 · US · B2

Patent metadata
Publication number: US-9043758-B2
Application number: US-201313839289-A
Country: US
Kind code: B2
Filing date: Mar 15, 2013
Priority date: Mar 15, 2013
Publication date: May 26, 2015
Grant date: May 26, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An information processing system, computer readable storage medium, and method for automatically generating human readable and meaningful documentation for one or more source code files. A processor of the information processing system receives one or more source code files containing source code artifacts (SCA) and infers semantics therefrom based on predefined rules. The processor, based on the inferred semantics, extracts documentation from another source code file. The extracted documentation and the inferred semantics are used to generate new human readable and meaningful documentation for the SCA, such new documentation being previously missing from the SCA. The generated new documentation is included with the SCA in one or more source code files.
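
The flow in the abstract (infer semantics by rule, pull documentation from a related source file, generate the missing description) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen for illustration only: the `Artifact` fields, the two rules, and the "outputs feed inputs" matching heuristic are assumptions, not the patented implementation.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    """A hypothetical source code artifact (SCA), e.g. a stream operator."""
    name: str
    kind: str
    inputs: tuple
    outputs: tuple
    doc: str = ""  # empty string means documentation is missing


# Hypothetical "predefined rules": each returns a phrase or None.
def kind_rule(a):
    return f"{a.kind} '{a.name}'"


def io_rule(a):
    if a.inputs or a.outputs:
        ins = ", ".join(a.inputs) or "nothing"
        outs = ", ".join(a.outputs) or "nothing"
        return f"consumes {ins} and produces {outs}"
    return None


RULES = [kind_rule, io_rule]


def infer_semantics(a):
    """Apply every rule and keep the phrases that matched."""
    phrases = [rule(a) for rule in RULES]
    return [p for p in phrases if p]


def find_related_doc(a, others):
    """Assumed heuristic: borrow the documentation of an artifact
    whose outputs feed this artifact's inputs."""
    for other in others:
        if other.doc and set(other.outputs) & set(a.inputs):
            return other.doc
    return None


def generate_doc(a, others):
    """Return existing documentation, or generate it when missing."""
    if a.doc:
        return a.doc
    text = "The " + " ".join(infer_semantics(a)) + "."
    related = find_related_doc(a, others)
    if related:
        text += " Upstream context: " + related
    return text


source = Artifact("TradeSource", "operator", (), ("trades",),
                  "Reads trade tuples from a feed.")
flt = Artifact("Filter", "operator", ("trades",), ("bigTrades",))
print(generate_doc(flt, [source]))
```

In this sketch the undocumented `Filter` operator receives a description built from its own structure plus the documentation of the operator that feeds it.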

First claim

Opening claim text (preview).

What is claimed is:

1. A method, with a processor of an information processing system, for automatically generating human readable and meaningful documentation for one or more source code files, comprising: receiving one or more source code files containing at least source code artifacts (first SCA); inferring semantics, with a processor of an information processing system, from at least the first SCA based on predefined rules, the inferred semantics at least indicating description of at least one of structure, function, and features, of the first SCA; automatically generating, with the processor and based at least on the inferred semantics, new human readable and meaningful documentation as new SCA documentation that describes at least one of structure, function, and features of the first SCA, and that such new SCA documentation is previously missing from the first SCA and any associated SCA documentation; and storing the generated new human readable and meaningful documentation associated with the first SCA as SCA documentation for the first SCA, and wherein the inferring semantics further comprises: compiling the received one or more source code files into one or more source code models, the compiled one or more source code models capturing at least key language features of the first SCA and any associated SCA documentation.

2. The method of claim 1, further comprising: at least one of combining and merging the generated new human readable and meaningful documentation associated with the first SCA as SCA documentation included with the first SCA in one or more source code files.

3. The method of claim 1, further comprising: identifying, based at least on the inferred semantics, at least one source code model stored in a source code model repository, the at least one source code model including SCA of at least one other source code file related to the first SCA and being associated with second SCA documentation; and acquiring, based at least on the inferred semantics, at least a portion of the second SCA documentation that is relevant to a description of at least one of structure, function, and features, of the first SCA; and wherein the generating comprises: generating, with the processor, new human readable and meaningful documentation as SCA documentation that describes the first SCA, based at least on the acquired relevant at least a portion of the second SCA documentation.

4. The method of claim 3, wherein the acquiring comprises one or more of harvesting, deriving, and extracting, the at least a portion of the second SCA documentation that is relevant to a description of at least one of structure, function, and features, of the first SCA.

5. The method of claim 3, wherein the acquiring comprises extracting at least a portion of the second SCA documentation related to the first SCA, the source code model repository storing a plurality of source code models for a plurality of other source code files.

6. The method of claim 5, wherein the extracting comprises: caching the extracted at least a portion of the second SCA documentation as a tuple that includes at least one language feature that's a subject of the documentation and the documentation content related to the first SCA.

7. The method of claim 1, wherein the compiling comprises automatically, with the processor and without human intervention, compiling the received one or more source code files into the one or more source code models.

8. The method of claim 1, wherein the inferring semantics further comprises: analyzing the first SCA using the predefined rules to evaluate at least the language features specified in the one or more source code models to produce description of at least one of structure, function, and features, of the first SCA.

9. The method of claim 8, wherein the predefined rules include one or more of the following: a stream language rule; an operator properties rule; an application pattern specification rules; a matching compatible input and output rule; a function invocation rule; a function definition rule; an object reference rule; an object definition rule; a class reference rule; a class definition rule; and a tokens names rule.

10. The method of claim 1, further comprising: evaluating, based on a set of predefined rules, the one or more source code models and identifying at least one other source code file related to the first SCA, the at least one other source code file including second SCA and associated second SCA documentation that are separate and related to the first SCA, the set of predefined rules including at least matching on compatible inputs and outputs, function invocations and function definitions, object references and object definitions, class references and class definitions, and token names; extracting SCA documentation from at least one source code model of the at least one other source code file related to the first SCA, the at least one source code model being stored in a source code model repository storing a plurality of source code models for a plurality of other source code files; caching the extracted SCA documentation as one or more tuples that include at least one language feature that's a subject of the documentation and the documentation content related to the first SCA; scoring and ranking the cached SCA documentation tuples with respect to properties that include, but are not limited to, completeness, correctness, and currentness; and wherein the generating comprises: generating, with the processor, new human readable and meaningful documentation as SCA documentation that describes the first SCA, based at least on the scoring and ranking of the cached SCA documentation tuples.

11. The method of claim 10, further comprising: selecting at least one highest scored and ranked cached SCA documentation tuple; and wherein the generating comprises: generating, with the processor, new human readable and meaningful documentation as SCA documentation that describes the first SCA from extracted SCA documentation, based at least on the selected cached SCA documentation tuple.

12. The method of claim 11, further comprising: at least one of combining and merging the generated new human readable and meaningful documentation of the first SCA as SCA documentation included with the first SCA in one or more source code files.

13. An information processing system comprising: memory; a rules repository for storing predefined rules relating to at least one of structure, function, and features, of a set of source code artifacts; and a processor communicatively coupled to the memory and the rules repository, wherein the processor, responsive to executing computer instructions, performs operations comprising: receiving one or more source code files containing at least source code artifacts (first SCA); inferring semantics, with the processor, from at least the first SCA based on the predefined rules, the inferred semantics at least indicating description of at least one of structure, function, and features, of the first SCA; compiling the received one or more source code files into one or more source code models, the compiled one or more source code models capturing at least key language features of the first SCA and any associated SCA documentation; automatically generating, with the processor and based at least on the inferred semantics, new human readable and meaningful documentation as new SCA documentation that describes at least one of structure, function, and features of the first SCA, and that such new SCA documentation is previously missing from the first SCA and any associated SCA docum
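
The caching, scoring, ranking, and selection steps recited in claims 6, 10, and 11 can be sketched as follows. The claims name the properties (completeness, correctness, currentness) but this preview does not say how they are measured or combined, so the numeric fields, the `WEIGHTS` table, and the weighted-sum formula below are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocTuple:
    """Cached documentation tuple: the language feature that is the
    subject of the documentation, plus the documentation content.
    The three numeric properties are hypothetical scores in [0, 1]."""
    feature: str
    content: str
    completeness: float
    correctness: float
    currentness: float


# Assumed weights; the claims do not specify how properties are combined.
WEIGHTS = {"completeness": 0.4, "correctness": 0.4, "currentness": 0.2}


def score(t):
    """Weighted sum of the three properties (an assumed formula)."""
    return sum(w * getattr(t, name) for name, w in WEIGHTS.items())


def rank(tuples):
    """Order cached tuples from highest to lowest score."""
    return sorted(tuples, key=score, reverse=True)


def select_best(tuples):
    """Pick the highest scored and ranked tuple, or None if the cache is empty."""
    ranked = rank(tuples)
    return ranked[0] if ranked else None


cache = [
    DocTuple("Filter", "Filters trades.", 0.5, 0.9, 0.2),
    DocTuple("Filter", "Keeps trades above a volume threshold.", 0.9, 0.9, 0.8),
]
best = select_best(cache)
print(best.content)
```

A weighted sum is only one possible design; a real system might instead rank lexicographically (for example, correctness first) or discard tuples below a correctness threshold before ranking.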

Assignees

Inventors

Classifications

  • G06F8/73 (Primary)

    Program documentation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9043758B2 cover?
An information processing system, computer readable storage medium, and method for automatically generating human readable and meaningful documentation for one or more source code files. A processor of the information processing system receives one or more source code files containing source code artifacts (SCA) and infers semantics therefrom based on predefined rules. The processor, based on t…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F8/73. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 26, 2015 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).