Searching code based on learned programming construct patterns and NLP similarity

US9946786B2 · US · B2

Patent metadata
Publication number: US-9946786-B2
Application number: US-201615067997-A
Country: US
Kind code: B2
Filing date: Mar 11, 2016
Priority date: Mar 23, 2015
Publication date: Apr 17, 2018
Grant date: Apr 17, 2018

Abstract

Official abstract text for this publication.

An approach is provided to ingest software source code files into a question/answering (QA) system. During ingestion, source code blocks are classified to identify one or more constructs in the blocks as being domain-specific. Relationships between the blocks are then mapped. Software compliance regulations are ingested into the QA system. Using the QA system, a source code file is analyzed for compliance to the software compliance regulations. The analysis identifies code sections within the source code file as being domain-specific and subject to the ingested set of software compliance regulations.
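The ingestion steps in the abstract (classifying blocks as domain-specific, then mapping relationships between blocks) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the keyword table `DOMAIN_TERMS`, the helper names `split_blocks`, `classify_blocks`, and `map_relationships`, and the rule of splitting blocks on blank lines are all hypothetical. The abstract does not say how classification is performed, so simple keyword matching is swapped in here.

```python
import re

# Hypothetical domain vocabulary (assumption); a real system would derive this during ingestion.
DOMAIN_TERMS = {"healthcare": {"patient", "diagnosis", "prescription"}}

def split_blocks(source):
    """Split source text into blocks separated by blank lines (an assumed block boundary)."""
    return [b.strip() for b in re.split(r"\n\s*\n", source) if b.strip()]

def classify_blocks(blocks, domain):
    """Map block index -> domain terms found among the words of that block."""
    terms = DOMAIN_TERMS[domain]
    result = {}
    for i, block in enumerate(blocks):
        # Break snake_case and camelCase identifiers into lowercase words.
        words = {w.lower() for w in re.findall(r"[A-Za-z][a-z]*", block)}
        hits = words & terms
        if hits:
            result[i] = hits
    return result

def map_relationships(blocks):
    """Return sorted (caller, callee) index pairs: one block calls a function defined in another."""
    defined = {}
    for i, block in enumerate(blocks):
        for name in re.findall(r"^def\s+(\w+)", block, flags=re.MULTILINE):
            defined[name] = i
    edges = set()
    for i, block in enumerate(blocks):
        for name, j in defined.items():
            if i != j and re.search(rf"\b{re.escape(name)}\s*\(", block):
                edges.add((i, j))
    return sorted(edges)

SAMPLE = '''def load_patient(record_id):
    return {"id": record_id}

def format_date(value):
    return str(value)

def report(record_id):
    patient = load_patient(record_id)
    return format_date(patient["id"])
'''

blocks = split_blocks(SAMPLE)
domain_blocks = classify_blocks(blocks, "healthcare")
relationships = map_relationships(blocks)
print(domain_blocks)
print(relationships)
```

In this toy input, the first and third blocks mention `patient` and are flagged as domain-specific, while the call edges show that the third block depends on the other two.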

First claim

Opening claim text (preview).

The invention claimed is:

1. A method implemented by an information handling system that includes a memory having executable instructions stored thereon and a processor, where the processor executes the instructions to perform the method steps, the method steps comprising:

  building a domain-specific knowledge base of a domain by ingesting a plurality of source code files into the domain-specific knowledge base utilized by a question/answering (QA) system, wherein the ingesting further comprises:

    ingesting a set of software compliance regulations corresponding to the domain into the domain-specific knowledge base;

    selecting one of the plurality of source code files that includes a plurality of source code sections;

    classifying one of the plurality of source code sections as a domain-specific source code section; and

    classifying a construct in the domain-specific source code section as a domain-specific construct;

  retrieving a new source code file that comprises a plurality of new source code sections, wherein a selected one of the plurality of new source code sections includes a new construct;

  generating a natural language question based on the new construct; and

  processing, by the QA system, the natural language question using the domain-specific knowledge base, wherein the processing further comprises:

    matching the new construct to the domain-specific construct;

    determining, in response to the matching, that the selected source code section is subject to the ingested set of software compliance regulations; and

    generating an answer, in response the determining, that indicates the new source code section is subject to the set of software compliance regulations.

2. The method of claim 1 further comprising: identifying one or more domain-specific relationships between the plurality of new code sections of the new source code file; and using the QA system, analyzing compliance of the new source code file by further determining whether the identified one or more domain-specific relationships are subject to the ingested set of software compliance regulations.

3. The method of claim 1 further comprising: formulating a different natural language question pertaining to the new source code file being domain specific and subject to the ingested set of software compliance regulations; submitting the different natural language question to the QA system; and receiving a response from the QA system that indicates whether the new source code file is subject to the ingested set of software compliance regulations.

4. The method of claim 1 further comprising: receiving, from the QA system, a confidence value corresponding to the answer; and in response to the confidence value exceeding a threshold, generating a report that indicates the new source code section is subject to the ingested set of software compliance regulations.

5. The method of claim 4 further comprising: including the set of software compliance regulations in the report.
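The flow in claims 1, 4, and 5 (generate a question from a new construct, match it against known domain-specific constructs, and issue a report only when the confidence exceeds a threshold) might be sketched as follows. This is a rough sketch, not the claimed QA system: `KNOWN_CONSTRUCTS`, `REGULATIONS`, `make_question`, `answer_question`, and `build_report` are hypothetical names, the regulation text is a placeholder, and string similarity from Python's standard `difflib` stands in for whatever scoring a real QA system would apply.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical knowledge-base contents (assumptions for illustration only).
KNOWN_CONSTRUCTS = ["patient_record", "prescription_order"]
REGULATIONS = ["Placeholder regulation text"]

@dataclass
class Answer:
    matched_construct: Optional[str]
    confidence: float

def make_question(construct):
    """Generate a natural language question based on a new construct."""
    return f"Is the construct '{construct}' subject to the ingested compliance regulations?"

def answer_question(construct):
    """Stand-in for the QA system: best string similarity against known constructs."""
    best, score = None, 0.0
    for known in KNOWN_CONSTRUCTS:
        ratio = SequenceMatcher(None, construct.lower(), known).ratio()
        if ratio > score:
            best, score = known, ratio
    return Answer(matched_construct=best, confidence=score)

def build_report(section_name, construct, threshold=0.8):
    """Return a report only when the confidence value exceeds the threshold, else None."""
    answer = answer_question(construct)
    if answer.confidence > threshold:
        return {
            "section": section_name,
            "question": make_question(construct),
            "matched_construct": answer.matched_construct,
            "confidence": answer.confidence,
            "regulations": list(REGULATIONS),  # claim 5: regulations included in the report
        }
    return None

flagged = build_report("billing.py: section 2", "patient_record")
ignored = build_report("utils.py: section 1", "format_date")
print(flagged)
print(ignored)
```

The threshold value of 0.8 is arbitrary; the claims require only that some threshold gates the report.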

Assignees

IBM

Classifications

  • for test execution, e.g. scheduling of test suites

  • Version control (security arrangements therefor G06F21/57); Configuration management

  • G06F8/36 (Primary): Software reuse

  • using natural language analysis

  • Dependency analysis; Data or control flow analysis

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9946786B2 cover?
An approach is provided to ingest software source code files into a question/answering (QA) system. During ingestion, source code blocks are classified to identify one or more constructs in the blocks as being domain-specific. Relationships between the blocks are then mapped. Software compliance regulations are ingested into the QA system. Using the QA system, a source code file is analyzed for…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F8/36. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 17, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.