Automated content inference system for unstructured text data
US-9378200-B1 · Jun 28, 2016 · US
US10628520B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10628520-B2 |
| Application number | US-201715591645-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 10, 2017 |
| Priority date | May 10, 2017 |
| Publication date | Apr 21, 2020 |
| Grant date | Apr 21, 2020 |
Official abstract text for this publication.
According to an embodiment of the present invention, a system dynamically processes a document including unstructured text and comprises a computer system including at least one processor. Initially, the system configures a plurality of dictionaries with terms supplied by a user and associated with a desired category. The processor in the system applies a set of rules to the unstructured text of the document to detect patterns indicating a presence of the desired category, wherein the set of rules is re-usable across dictionaries configured for different categories and pertains to arrangements of dictionary terms within sentences. The system produces annotations associated with the desired category for the document based on the detected patterns. Embodiments of the present invention further include a method and computer program product for dynamically processing a document including unstructured text in substantially the same manner as is described above.
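The separation the abstract describes, with category-specific dictionaries on one side and category-independent rules on the other, can be illustrated with a minimal Python sketch. This is not the patented implementation: the dictionary names (`function`, `action`, `implement`), the sample terms, the `RULES` list, and the naive sentence splitter are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Annotation:
    category: str
    sentence: str
    terms: tuple


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_term(sentence, terms):
    # Return the first dictionary term found as a whole word, else None.
    lowered = sentence.lower()
    for term in terms:
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered):
            return term
    return None


# Hypothetical rules: each rule lists dictionary NAMES whose terms must
# co-occur in one sentence. The rules never mention a concrete category.
RULES = [("function",), ("action", "implement")]


def annotate(text, category, dictionaries, rules=RULES):
    annotations = []
    for sentence in split_sentences(text):
        for rule in rules:
            hits = [find_term(sentence, dictionaries.get(name, [])) for name in rule]
            if all(hits):
                annotations.append(Annotation(category, sentence, tuple(hits)))
                break  # one annotation per sentence is enough for this sketch
    return annotations


# User-supplied terms for one category (illustrative values only).
eating = {
    "function": ["eating", "feeding"],
    "action": ["use", "hold"],
    "implement": ["fork", "spoon"],
}
note = "Patient needs help with feeding. She cannot use a fork. Family visited today."
eating_annotations = annotate(note, "eating", eating)

# Changing the category only swaps the dictionaries; RULES is untouched.
bathing = {
    "function": ["bathing", "showering"],
    "action": ["use", "hold"],
    "implement": ["washcloth", "sponge"],
}
bathing_annotations = annotate("He cannot hold a washcloth.", "bathing", bathing)
```

In this sketch the first note yields two annotations (one from the single-dictionary rule, one from the two-dictionary rule), and the bathing example reuses the same rules with different dictionaries.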
Opening claim text (preview).
What is claimed is:
1. A method of processing a document including unstructured text comprising: configuring, via a processor, a plurality of dictionaries with user-supplied terms associated with a desired category, wherein the desired category includes an entity requiring assistance to perform a physical activity; applying, via the processor, a set of rules to generate patterns for comparison with the unstructured text of the document to determine a presence of the desired category, wherein the set of rules is re-usable across dictionaries configured for different categories and pertains to arrangements of dictionary terms within sentences, wherein at least one rule references two or more of the plurality of dictionaries for the dictionary terms to generate the patterns, wherein a first dictionary term from the two or more dictionaries is associated with a plurality of physical activities and requires confirmation to be associated with the physical activity of the desired category, and wherein one or more additional dictionary terms from the two or more dictionaries provide the confirmation and associate the first dictionary term with the physical activity of the desired category; detecting, via the processor, the patterns in the unstructured text of the document indicating the presence of the desired category; generating, via the processor, annotations associated with the desired category for the document based on the detected patterns via a series of annotators that are re-usable across dictionaries configured for different categories; and changing, via the processor, the desired category to the entity requiring assistance to perform a different physical activity by providing the plurality of dictionaries with new dictionaries including user-supplied terms associated with the different physical activity, wherein the set of rules generates new patterns for the different physical activity based on the new dictionaries.
2. The method of claim 1, wherein the plurality of dictionaries includes: a first dictionary comprising terms associated with functions related to the physical activity; a second dictionary comprising terms associated with actions requiring implements; a third dictionary comprising terms associated with the implements for the actions; and a fourth dictionary comprising terms associated with devices used to assist the entity in performing the physical activity.
3. The method of claim 1, wherein generating the annotations further comprises: applying dependence rules to the unstructured text and determining a dependence of the entity with respect to performing the physical activity; and modifying a generated annotation to indicate the determined dependence of the entity, wherein the modified annotation indicates that the entity is one of independent, semi-dependent, and fully dependent relative to performing the physical activity.
4. The method of claim 1, wherein the entity requiring assistance is a medical patient.
5. The method of claim 4, wherein the physical activity is selected from a group consisting of standing, walking, eating, dressing, bathing, using a toilet, speaking, and administering a health substance.
6. A system for processing a document including unstructured text comprising: a computer system including at least one processor configured to: configure a plurality of dictionaries with user-supplied terms associated with a desired category, wherein the desired category includes an entity requiring assistance to perform a physical activity; apply a set of rules to generate patterns for comparison with the unstructured text of the document to determine a presence of the desired category, wherein the set of rules is re-usable across dictionaries configured for different categories and pertains to arrangements of dictionary terms within sentences, wherein at least one rule references two or more of the plurality of dictionaries for the dictionary terms to generate the patterns, wherein a first dictionary term from the two or more dictionaries is associated with a plurality of physical activities and requires confirmation to be associated with the physical activity of the desired category, and wherein one or more additional dictionary terms from the two or more dictionaries provide the confirmation and associate the first dictionary term with the physical activity of the desired category; detect the patterns in the unstructured text of the document indicating the presence of the desired category; generate annotations associated with the desired category for the document based on the detected patterns via a series of annotators that are re-usable across dictionaries configured for different categories; and change the desired category to the entity requiring assistance to perform a different physical activity by providing the plurality of dictionaries with new dictionaries including user-supplied terms associated with the different physical activity, wherein the set of rules generates new patterns for the different physical activity based on the new dictionaries.
7. The system of claim 6, wherein the plurality of dictionaries includes: a first dictionary comprising terms associated with functions related to the physical activity; a second dictionary comprising terms associated with actions requiring implements; a third dictionary comprising terms associated with the implements for the actions; and a fourth dictionary comprising terms associated with devices used to assist the entity in performing the physical activity.
8. The system of claim 6, wherein generating the annotations further comprises: applying dependence rules to the unstructured text and determining a dependence of the entity with respect to performing the physical activity; and modifying a generated annotation to indicate the determined dependence of the entity, wherein the modified annotation indicates that the entity is one of independent, semi-dependent, and fully dependent relative to performing the physical activity.
9. The system of claim 6, wherein the entity requiring assistance is a medical patient.
10. The system of claim 9, wherein the physical activity is selected from group consisting of standing, walking, eating, dressing, bathing, using a toilet, speaking, and administering a health substance.
11. A computer program product for processing a document including unstructured text comprising: a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: configure a plurality of dictionaries with user-supplied terms associated with a desired category, wherein the desired category includes an entity requiring assistance to perform a physical activity; apply a set of rules to generate patterns for comparison with the unstructured text of the document to determine a presence of the desired category, wherein the set of rules is re-usable across dictionaries configured for different categories and pertains to arrangements of dictionary terms within sentences, wherein at least one rule references two or more of the plurality of dictionaries for the dictionary terms to generate the patterns, wherein a first dictionary term from the two or more dictionaries is associated with a plurality of physical activities and requires confirmation to be associated with the physical activity of the desired category, and wherein one or more additional dictionary terms from the two or more dictionaries provide the confirmation and associate the first dictionary term with the physical activity of the desired category; detect the patterns in the unstructured text of the document indicating the pr
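Two mechanisms recited in the claims lend themselves to a short illustration: the "confirmation" step of claim 1, where a term tied to several physical activities only counts once a term from another dictionary co-occurs with it, and the dependence levels of claim 3. The Python sketch below is one possible reading, not the claimed implementation; the term sets, the cue phrases, and the function names are all hypothetical.

```python
import re

# Hypothetical dictionaries. An action such as "hold" relates to many
# activities, so on its own it does not indicate the eating category.
AMBIGUOUS_ACTIONS = {"cut", "hold", "lift"}
EATING_IMPLEMENTS = {"fork", "knife", "spoon"}

# Hypothetical dependence rules: cue phrases mapped to the three levels
# named in claim 3, checked in order from most to least dependent.
DEPENDENCE_CUES = [
    ("fully dependent", ("unable to", "cannot", "total assistance")),
    ("semi-dependent", ("needs help", "with assistance", "some assistance")),
    ("independent", ("independently", "without assistance")),
]


def words(sentence):
    # Lowercased set of alphabetic tokens in the sentence.
    return set(re.findall(r"[a-z]+", sentence.lower()))


def confirmed_activity(sentence):
    # The ambiguous action is accepted only if an implement term confirms it.
    tokens = words(sentence)
    actions = tokens & AMBIGUOUS_ACTIONS
    implements = tokens & EATING_IMPLEMENTS
    if actions and implements:
        return (sorted(actions)[0], sorted(implements)[0])
    return None


def dependence_level(sentence):
    # Return the first level whose cue phrase appears, else None.
    lowered = sentence.lower()
    for level, cues in DEPENDENCE_CUES:
        if any(cue in lowered for cue in cues):
            return level
    return None


def annotate_sentence(sentence):
    # Produce an annotation, then modify it with the determined dependence.
    match = confirmed_activity(sentence)
    if match is None:
        return None
    annotation = {"category": "eating", "terms": match}
    annotation["dependence"] = dependence_level(sentence)
    return annotation
```

Under these assumptions, "Patient cannot hold a spoon." is annotated as eating with a fully dependent level, while "Patient can hold the railing." produces no annotation because nothing confirms the ambiguous action.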
Social work or social welfare, e.g. community support activities or counselling services · CPC title
Dictionaries · CPC title
Annotation, e.g. comment data or footnotes · CPC title
Physics · mapped topic