Searching Code Based on Learned Programming Construct Patterns and NLP Similarity

US2016283360A1 · US · A1

Patent metadata
Publication number: US-2016283360-A1
Application number: US-201514665089-A
Country: US
Kind code: A1
Filing date: Mar 23, 2015
Priority date: Mar 23, 2015
Publication date: Sep 29, 2016
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An approach is provided to ingest software source code files into a question/answering (QA) system. During ingestion, source code blocks are classified to identify one or more constructs in the blocks as being domain-specific. Relationships between the blocks are then mapped. Software compliance regulations are ingested into the QA system. Using the QA system, a source code file is analyzed for compliance to the software compliance regulations. The analysis identifies code sections within the source code file as being domain-specific and subject to the ingested set of software compliance regulations.
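The ingestion steps in the abstract (classify blocks, flag domain-specific constructs, map relationships) can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how blocks are delimited, how constructs are recognized as domain-specific, or what counts as a relationship. The sketch treats each top-level `def` as a block, uses a hypothetical `DOMAIN_TERMS` vocabulary, and treats a call from one block to another as a relationship.

```python
import re
from dataclasses import dataclass, field

# Hypothetical domain vocabulary (an assumption for illustration only).
DOMAIN_TERMS = {"patient", "diagnosis", "claim"}


@dataclass
class Block:
    name: str
    text: str
    domain_terms: set = field(default_factory=set)


def split_blocks(source: str) -> list:
    """Treat each top-level `def` as one source code block (assumed granularity)."""
    pattern = re.compile(r"^def (\w+)\(.*?(?=^def |\Z)", re.M | re.S)
    return [Block(m.group(1), m.group(0)) for m in pattern.finditer(source)]


def tag_domain_terms(blocks: list) -> None:
    """Record which vocabulary words occur in each block's text."""
    for block in blocks:
        words = set(re.findall(r"[a-z]+", block.text.lower()))
        block.domain_terms = words & DOMAIN_TERMS


def map_relationships(blocks: list) -> list:
    """Return (caller, callee) pairs where one block calls another by name."""
    pairs = []
    for caller in blocks:
        for callee in blocks:
            if caller is not callee and re.search(
                r"\b" + re.escape(callee.name) + r"\(", caller.text
            ):
                pairs.append((caller.name, callee.name))
    return pairs


SAMPLE = '''def load_patient(patient_id):
    return {"id": patient_id}

def fmt(value):
    return str(value)

def report(patient_id):
    return fmt(load_patient(patient_id))
'''

blocks = split_blocks(SAMPLE)
tag_domain_terms(blocks)
relationships = map_relationships(blocks)
```

In this toy input, `load_patient` and `report` are flagged because they mention a vocabulary word, and `report` is related to both other blocks because it calls them. A real system as described would presumably rely on learned patterns rather than a fixed word list.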

First claim

Opening claim text (preview).

1. (canceled) 2. (canceled) 3. (canceled) 4. (canceled) 5. (canceled) 6. (canceled) 7. (canceled)

8. An information handling system comprising: one or more processors; one or more data stores accessible by at least one of the processors; a memory coupled to at least one of the processors; and a set of computer program instructions stored in the memory and executed by at least one of the processors in order to perform actions of: ingesting a plurality of software source code files into a question/answering (QA) system; during ingestion, classifying one or more source code blocks pertaining to each of the software source code files, identifying one or more constructs in one or more of the blocks as being domain-specific, and mapping one or more relationships between the blocks; ingesting a set of software compliance regulations into the QA system; and using the QA system, analyzing compliance of a source code file to the software compliance regulation, wherein the analyzing identifies one or more code sections within the source code file as being domain-specific and subject to the ingested set of software compliance regulations.

9. The information handling system of claim 8 wherein the actions further comprise: identifying one or more domain-specific relationships between the code sections of the source code file; and using the QA system, analyzing compliance of the source code file by further determining whether the identified domain-specific relationships are subject to the ingested set of software compliance regulations.

10. The information handling system of claim 8 wherein the analyzing further comprises: formulating a natural language question pertaining to the source code file being domain specific and subject to the ingested set of software compliance regulations; submitting the natural language question to the QA system; and receiving a response from the QA system that indicates whether the source code file is subject to the ingested set of software compliance regulations.

11. The information handling system of claim 8 wherein the analyzing further comprises: identifying a classification pertaining to a selected block retrieved from the source code file, wherein the classification is selected from the group consisting of a conditional assignment, a manipulation, a transformer, a routing, and another type of block; formulating a natural language question pertaining to the selected block and the classification as being domain specific and subject to the ingested set of software compliance regulations; and submitting the natural language question to the QA system; and receiving a response from the QA system that indicates whether the selected block is subject to the ingested set of software compliance regulations.

12. The information handling system of claim 11 wherein the actions further comprise: identifying an answer and a confidence value included in the response from the QA system; and in response to the answer indicating that the selected block is subject to the ingested set of software compliance regulations and the confidence value exceeding a threshold, indicating that the selected block is subject to the ingested set of software compliance regulations.

13. The information handling system of claim 12 wherein the actions further comprise: identifying that the selected block is subject to a selected software compliance regulation from the ingested set of software compliance regulations, wherein the selected software compliance regulation is included in the response from the QA system.

14. The information handling system of claim 8 wherein the analyzing further comprises: identifying one or more domain-specific relationships between the code sections of the source code file; formulating a natural language question pertaining to the identified domain specific relationships and the ingested set of software compliance regulations; submitting the natural language question to the QA system; and receiving a response from the QA system that indicates whether one or more of the domain-specific relationships are subject to the ingested set of software compliance regulations.

15. A computer program product stored in a computer readable storage medium, comprising computer program code that, when executed by an information handling system, causes the information handling system to perform actions comprising: ingesting a plurality of software source code files into a question/answering (QA) system; during ingestion, classifying one or more source code blocks pertaining to each of the software source code files, identifying one or more constructs in one or more of the blocks as being domain-specific, and mapping one or more relationships between the blocks; ingesting a set of software compliance regulations into the QA system; and using the QA system, analyzing compliance of a source code file to the software compliance regulation, wherein the analyzing identifies one or more code sections within the source code file as being domain-specific and subject to the ingested set of software compliance regulations.

16. The computer program product of claim 15 wherein the actions further comprise: identifying one or more domain-specific relationships between the code sections of the source code file; and using the QA system, analyzing compliance of the source code file by further determining whether the identified domain-specific relationships are subject to the ingested set of software compliance regulations.

17. The computer program product of claim 15 wherein the analyzing further comprises: formulating a natural language question pertaining to the source code file being domain specific and subject to the ingested set of software compliance regulations; submitting the natural language question to the QA system; and receiving a response from the QA system that indicates whether the source code file is subject to the ingested set of software compliance regulations.

18. The computer program product of claim 15 wherein the analyzing further comprises: identifying a classification pertaining to a selected block retrieved from the source code file, wherein the classification is selected from the group consisting of a conditional assignment, a manipulation, a transformer, a routing, and another type of block; formulating a natural language question pertaining to the selected block and the classification as being domain specific and subject to the ingested set of software compliance regulations; submitting the natural language question to the QA system; and receiving a response from the QA system that indicates whether the selected block is subject to the ingested set of software compliance regulations.

19. The computer program product of claim 18 wherein the actions further comprise: identifying an answer and a confidence value included in the response from the QA system; in response to the answer indicating that the selected block is subject to the ingested set of software compliance regulations and the confidence value exceeding a threshold, indicating that the selected block is subject to the ingested set of software compliance regulations; and identifying that the selected block is subject to a selected software compliance regulation from the ingested set of software compliance regulations, wherein the selected software compliance regulation is included in the response from the QA system.

20. The computer program produc
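The per-block check described in the claims (pick a classification, formulate a natural language question, submit it, then accept the answer only if its confidence exceeds a threshold) can be sketched as below. This is an illustrative reading, not the claimed implementation: the `QAResponse` shape, the question wording, the `0.8` default threshold, the `stub_qa` stand-in, and the regulation label `EXAMPLE-REG-1` are all hypothetical. Only the five classification names and the "answer plus confidence above a threshold" rule come from the claim text.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# The classification group named in the claims.
CLASSIFICATIONS = (
    "conditional assignment",
    "manipulation",
    "transformer",
    "routing",
    "other",
)


@dataclass
class QAResponse:
    """Assumed response shape: an answer, a confidence, an optional regulation."""
    answer: bool
    confidence: float
    regulation: Optional[str] = None


def formulate_question(block_name: str, classification: str) -> str:
    """Build a natural language question about one classified block."""
    if classification not in CLASSIFICATIONS:
        raise ValueError(f"unknown classification: {classification}")
    return (
        f"Is the {classification} block '{block_name}' domain-specific and "
        "subject to the ingested software compliance regulations?"
    )


def check_block(
    block_name: str,
    classification: str,
    ask: Callable[[str], QAResponse],
    threshold: float = 0.8,  # hypothetical default; the claims give no value
) -> Tuple[bool, Optional[str]]:
    """Return (subject, regulation); subject needs a yes answer above threshold."""
    response = ask(formulate_question(block_name, classification))
    subject = response.answer and response.confidence > threshold
    return subject, (response.regulation if subject else None)


def stub_qa(question: str) -> QAResponse:
    """Stand-in for a real QA system: says yes when 'patient' is mentioned."""
    if "patient" in question:
        return QAResponse(True, 0.92, "EXAMPLE-REG-1")
    return QAResponse(False, 0.40)


flagged = check_block("route_patient_record", "routing", stub_qa)
cleared = check_block("fmt", "transformer", stub_qa)
strict = check_block("route_patient_record", "routing", stub_qa, threshold=0.95)
```

Raising the threshold to `0.95` turns the first result from flagged to not flagged, which shows how the confidence gate, rather than the answer alone, decides the outcome.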

Assignees

IBM

Inventors

Classifications

  • Natural language query formulation · CPC title

  • G06F8/36 · Primary

    Software reuse · CPC title

  • Clustering or classification · CPC title

  • using natural language analysis · CPC title

  • for test execution, e.g. scheduling of test suites · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016283360A1 cover?
An approach is provided to ingest software source code files into a question/answering (QA) system. During ingestion, source code blocks are classified to identify one or more constructs in the blocks as being domain-specific. Relationships between the blocks are then mapped. Software compliance regulations are ingested into the QA system. Using the QA system, a source code file is analyzed for…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/3329. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 29, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).