Recording, replaying and modifying an unstructured information management architecture (UIMA) pipeline

US9734046B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9734046-B2
Application number: US-201414242488-A
Country: US
Kind code: B2
Filing date: Apr 1, 2014
Priority date: Apr 1, 2014
Publication date: Aug 15, 2017
Grant date: Aug 15, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The technique herein substantially improves productivity of Annotator developers by providing methods and systems to develop and test Annotators without having to run a full pipeline every time changes are made to a particular Annotator. To this end, preferably a running pipeline is instrumented to enable automated recording of static configuration and dynamically-generated event data as the pipeline is executed. Based on these data, a reusable data model is generated that captures code and other dependencies in the pipeline (e.g., configuration parameters, intermediary CASes, program flow, annotations, and the like). The data model is then used to facilitate testing of Annotators without using the full pipeline (or even major sub-pipelines therein).
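The record-then-replay idea in the abstract can be illustrated with a toy pipeline. This is a minimal sketch under stated assumptions, not the patented implementation and not the Apache UIMA API: a plain dict stands in for a CAS, ordinary functions stand in for Annotators, and every name below (`run_and_record`, `replay_one`, `tokenizer`, `capitalized_finder`) is hypothetical.

```python
import copy

# Assumption: a "CAS" is modeled as a dict holding the text and a list of
# (kind, value) annotations. Real UIMA CAS objects are far richer.

def tokenizer(cas):
    """Hypothetical upstream annotator: adds one 'token' annotation per word."""
    out = copy.deepcopy(cas)
    out["annotations"] += [("token", w) for w in out["text"].split()]
    return out

def capitalized_finder(cas):
    """Hypothetical downstream annotator: flags tokens starting with a capital."""
    out = copy.deepcopy(cas)
    caps = [("capitalized", v) for kind, v in out["annotations"]
            if kind == "token" and v[:1].isupper()]
    out["annotations"] += caps
    return out

def run_and_record(pipeline, cas):
    """Run the full pipeline once, saving the intermediary CAS each stage received."""
    recorded_inputs = {}
    for name, annotator in pipeline:
        recorded_inputs[name] = copy.deepcopy(cas)
        cas = annotator(cas)
    return cas, recorded_inputs

def replay_one(annotator, name, recorded_inputs):
    """Re-run a single annotator on its recorded input, skipping upstream stages."""
    return annotator(copy.deepcopy(recorded_inputs[name]))

pipeline = [("tokenizer", tokenizer), ("capitalized_finder", capitalized_finder)]
initial = {"text": "Watson reads UIMA docs", "annotations": []}

# One full end-to-end run produces the recorded model.
final, recorded = run_and_record(pipeline, initial)

# Later, after editing capitalized_finder, only that stage is re-executed.
replayed = replay_one(capitalized_finder, "capitalized_finder", recorded)
```

In this sketch the recorded dictionary plays the role of the reusable data model: once it exists, a developer changing `capitalized_finder` can re-test it without running `tokenizer` again.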

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of testing elements of a pipeline software system comprising a set of distinct elements organized end-to-end from an initial element to a final element, comprising: as an end-to-end run of a pipeline represented by the pipeline software system executes, receiving information about a state of the pipeline software system, wherein the pipeline software system comprises an Unstructured Information Management Architecture (UIMA) pipeline and the information includes one or more intermediary Common Analysis Structure (CAS) datasets; using the information received to generate a reusable data model representative of conditions existing along the set of distinct elements of the pipeline software system; and in association with a development action on a particular element positioned along the pipeline between the initial and final elements, and based on the reusable data model, identifying the conditions upstream of the particular element, providing the identified conditions as input to the particular element, and replaying execution of the pipeline software system without requiring initialization and re-execution of the end-to-end run of the pipeline from the initial element to the final element; wherein the method is carried out in software executing in a hardware element.

2. The method as described in claim 1 wherein the information also includes event data generated by at least one UIMA Annotator in the UIMA pipeline.

3. The method as described in claim 2 further including instrumenting the UIMA Annotator to generate the event data.

4. The method as described in claim 1 wherein the development action is one of: executing or debugging an element in a standalone manner, running a test to determine how deletion of an element affects the pipeline, running a test to determine how adding an element affects the pipeline, running a test to determine how a conditional execution of an element affects the pipeline, and building a code dependency tree.

5. The method as described in claim 1 further including receiving additional information about the state of the pipeline software system from one or more additional runs of the pipeline software system.

6. An apparatus to test elements of a pipeline software system comprising a set of distinct elements organized end-to-end from an initial element to a final element, comprising: a processor; and computer memory holding computer program instructions executed by the processor, the computer program instructions comprising: code, configured as an end-to-end run of a pipeline represented by the pipeline software system executes, to receive information about a state of the pipeline software system, wherein the pipeline software system comprises an Unstructured Information Management Architecture (UIMA) pipeline and the information includes one or more intermediary Common Analysis Structure (CAS) datasets; code, configured to use the information received to generate a reusable data model representative of conditions existing along the set of distinct elements of the pipeline software system; and code, configured in association with a development action on a particular element positioned along the pipeline between the initial and final elements, and based on the reusable data model, to identify the conditions upstream of the particular element, to provide the identified conditions as input to the particular element, and to replay execution of the pipeline software system without requiring initialization and re-execution of the end-to-end run of the pipeline from the initial element to the final element.

7. The apparatus as described in claim 6 wherein the information also includes event data generated by at least one UIMA Annotator in the UIMA pipeline.

8. The apparatus as described in claim 7 further including code to instrument the UIMA Annotator to generate the event data.

9. The apparatus as described in claim 6 further including a user interface through the development action is performed, the development action being one of: executing or debugging an element in a standalone manner, running a test to determine how deletion of an element affects the pipeline, running a test to determine how adding an element affects the pipeline, running a test to determine how a conditional execution of an element affects the pipeline, and building a code dependency tree.

10. The apparatus as described in claim 6 wherein the code configured to receive the information receives additional information about the state of the pipeline software system from one or more additional runs of the pipeline software system.

11. A computer program product in a non-transitory computer readable storage medium for use in a computing entity, the computer program product holding computer program instructions which, when executed, test elements of a pipeline software system comprising a set of distinct elements organized end-to-end from an initial element to a final element, the computer program instructions comprising: code, configured as an end-to-end run of a pipeline represented by the pipeline software system executes, to receive information about a state of the pipeline software system, wherein the pipeline software system comprises an Unstructured Information Management Architecture (UIMA) pipeline and the information includes one or more intermediary Common Analysis Structure (CAS) datasets; code, configured to use the information received to generate a reusable data model representative of conditions existing along the set of distinct elements of the pipeline software system; and code, configured in association with a development action on a particular element positioned along the pipeline between the initial and final elements, and based on the reusable data model, to identify the conditions upstream of the particular element, to provide the identified conditions as input to the particular element, and to replay execution of the pipeline software system without requiring initialization and re-execution of the end-to-end run of the pipeline from the initial element to the final element.

12. The computer program product as described in claim 11 wherein the information also includes event data generated by at least one UIMA Annotator in the UIMA pipeline.

13. The computer program product as described in claim 11 further including code to instrument the UIMA Annotator to generate the event data.

14. The computer program product as described in claim 11 further including a user interface through the development action is performed, the development action being one of: executing or debugging an element in a standalone manner, running a test to determine how deletion of an element affects the pipeline, running a test to determine how adding an element affects the pipeline, running a test to determine how a conditional execution of an element affects the pipeline, and building a code dependency tree.

15. The apparatus as described in claim 11 wherein the code configured to receive the information receives additional information about the state of the pipeline software system from one or more additional runs of the pipeline software system.
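Two ideas from the dependent claims, instrumenting an element so it emits event data and testing how deleting an element affects the pipeline, can be sketched together. The code is illustrative only and makes loud assumptions: dicts stand in for CAS datasets, plain functions stand in for UIMA Annotators, and all names (`instrument`, `run`, `replay_without`, and the three toy elements) are hypothetical rather than taken from the patent or from Apache UIMA.

```python
import copy

def instrument(name, annotator, events):
    """Wrap an element so every call appends an event record to `events`."""
    def wrapped(cas):
        before = len(cas["annotations"])
        result = annotator(cas)
        events.append({"element": name,
                       "added": len(result["annotations"]) - before})
        return result
    return wrapped

# Three hypothetical elements; each returns a modified copy of its input.
def sentence_marker(cas):
    out = copy.deepcopy(cas)
    out["annotations"].append(("sentence", out["text"]))
    return out

def token_marker(cas):
    out = copy.deepcopy(cas)
    out["annotations"].extend(("token", w) for w in out["text"].split())
    return out

def token_counter(cas):
    out = copy.deepcopy(cas)
    n = sum(1 for kind, _ in out["annotations"] if kind == "token")
    out["annotations"].append(("token_count", n))
    return out

def run(elements, cas, snapshots=None):
    """Run elements in order; optionally record the input each one received."""
    for name, fn in elements:
        if snapshots is not None:
            snapshots[name] = copy.deepcopy(cas)
        cas = fn(cas)
    return cas

def replay_without(elements, removed, snapshots):
    """Resume from the recorded input of `removed`, skipping that element."""
    names = [n for n, _ in elements]
    start = names.index(removed)
    return run(elements[start + 1:], copy.deepcopy(snapshots[removed]))

events = []
raw = [("sentence_marker", sentence_marker),
       ("token_marker", token_marker),
       ("token_counter", token_counter)]
elements = [(n, instrument(n, f, events)) for n, f in raw]

snapshots = {}
baseline = run(elements, {"text": "replay beats rerun", "annotations": []}, snapshots)

# Deletion test: upstream elements are not re-executed, only downstream ones.
without_tokens = replay_without(elements, "token_marker", snapshots)
lost = [a for a in baseline["annotations"]
        if a not in without_tokens["annotations"]]
```

Comparing `baseline` with `without_tokens` shows which annotations depend on the removed element. In this toy setup `sentence_marker` never runs a second time, because the replay starts from the snapshot recorded during the single end-to-end run.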

Assignees

Inventors

Classifications

  • Administration; Management · CPC title

  • Physics · mapped topic

  • for test execution, e.g. scheduling of test suites · CPC title

  • Environments for analysis, debugging or testing of software · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9734046B2 cover?
The technique herein substantially improves productivity of Annotator developers by providing methods and systems to develop and test Annotators without having to run a full pipeline every time changes are made to a particular Annotator. To this end, preferably a running pipeline is instrumented to enable automated recording of static configuration and dynamically-generated event data as the pi…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F11/3688. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 15, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).