Apparatus and method for automatically mapping verbatim narratives to terms in a terminology dictionary

US11023679B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11023679-B2
Application number: US-201715443828-A
Country: US
Kind code: B2
Filing date: Feb 27, 2017
Priority date: Feb 27, 2017
Publication date: Jun 1, 2021
Grant date: Jun 1, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus for automatically mapping a verbatim narrative to a term in a medical terminology dictionary includes a natural language processor and a comparator. The natural language processor processes terms from the medical terminology dictionary and from a medical coding decision database to generate a processed database that also includes the original terms from the medical terminology dictionary and the medical coding decision database. The natural language processor also processes the verbatim narrative. The comparator compares the processed verbatim narrative to the terms in the processed database and determines whether the processed verbatim narrative is an exact match to a term in the processed database. The verbatim narrative is mapped to the term in the medical terminology dictionary that corresponds to the term in the processed database that is an exact match. The verbatim narratives may include adverse event narratives, concomitant medication narratives, or other types of narratives. A method for automatically mapping a verbatim narrative to a term in a medical terminology dictionary is also described and claimed.
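
The exact-match stage described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `toy_stem` suffix stripper, the function names, and the sample terms are all assumptions made for the example, and a production system would use a real stemmer and a real terminology dictionary.

```python
import re


def toy_stem(word: str) -> str:
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def process(text: str) -> str:
    """Lowercase, drop punctuation, and stem every word."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(toy_stem(w) for w in words)


def build_processed_db(dictionary_terms, coding_decisions):
    """Build a lookup holding both original and processed forms.

    dictionary_terms: iterable of dictionary terms.
    coding_decisions: dict of previously coded verbatim -> dictionary term.
    Every key maps back to the dictionary term it corresponds to.
    """
    db = {}
    for term in dictionary_terms:
        db.setdefault(term, term)            # original term
        db.setdefault(process(term), term)   # processed term
    for verbatim, term in coding_decisions.items():
        db.setdefault(verbatim, term)
        db.setdefault(process(verbatim), term)
    return db


def map_verbatim(narrative, db):
    """Return the dictionary term for an exact match, else None.

    The narrative is compared as a whole, first as written and then
    after processing.
    """
    return db.get(narrative) or db.get(process(narrative))


# Hypothetical sample data for illustration only.
DB = build_processed_db(
    dictionary_terms=["Headache", "Vomiting"],
    coding_decisions={"throwing up": "Vomiting"},
)

print(map_verbatim("Headaches!", DB))
print(map_verbatim("Throws up.", DB))
print(map_verbatim("dizzy", DB))
```

Keeping the original terms in the same lookup as the processed ones means a narrative that already matches a dictionary entry verbatim is resolved without relying on the stemmer.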

First claim

Opening claim text (preview).

The invention claimed is:

1. An apparatus for automatically mapping a verbatim narrative to a term in a medical terminology dictionary, comprising: a natural language processor configured to: process terms from the medical terminology dictionary and from a medical coding decision database to generate a processed database, the terms being processed by stemming, wherein the processed database also comprises the original terms from the medical terminology dictionary and the medical coding decision database; and process the verbatim narrative by stemming one or more words in the verbatim narrative; a comparator configured to compare the processed verbatim narrative as a whole to the terms in the processed database and determine whether the processed verbatim narrative as a whole is an exact match to a term in the processed database, wherein the verbatim narrative is mapped to the term in the medical terminology dictionary that corresponds to the term in the processed database that is an exact match; and a naïve Bayes classifier that, if there is no exact match, is configured to statistically analyze the verbatim narrative and map the verbatim narrative to the term in the medical terminology dictionary that is the closest match to the verbatim narrative, wherein the naïve Bayes classifier is a letters-based model if the probability of an assigned term is less than a pre-determined value and a words-based model if the probability exceeds the pre-determined value.

2. The apparatus of claim 1, wherein the medical terminology dictionary is a drug terminology dictionary and wherein the terms in the drug terminology dictionary comprise active ingredients of drugs.

3. The apparatus of claim 1, wherein processing the verbatim narrative comprises substituting for words in the verbatim narrative synonyms derived from the medical coding decision database or deleting words from the verbatim narrative that are considered inconsequential based on the medical coding decision database, or both substituting and deleting.

4. The apparatus of claim 3, wherein: before the natural language processor processes terms to generate the processed database, the comparator compares the verbatim narrative to the terms from the medical terminology dictionary, the term that is an exact match is selected, and the verbatim narrative is mapped to the term that is an exact match; and if there is no match, the comparator compares the verbatim narrative to terms from the medical coding decision database, the term that is an exact match is selected, and the verbatim narrative is mapped to the term in the medical terminology dictionary that corresponds to the term that is an exact match.

5. The apparatus of claim 4, wherein a second comparator performs at least one of the comparisons.

6. The apparatus of claim 1, wherein the natural language processor: cleans the verbatim narrative; and sorts the words in the verbatim narrative, wherein after each of these operations the comparator compares the processed verbatim narrative to the terms in the processed database and it is then determined whether the processed verbatim narrative is an exact match to a term in the processed database.

7. The apparatus of claim 1, wherein the medical coding decision database comprises auto-mappings to exact matches in the medical terminology dictionary.

8. The apparatus of claim 1, wherein the medical coding decision database comprises human-coded mappings to the medical terminology dictionary.

9. The apparatus of claim 1, wherein the medical coding decision database comprises auto-mappings to exact matches in the medical terminology dictionary and human-coded mappings to the medical terminology dictionary.

10. A method for automatically mapping a verbatim narrative to a term in a medical terminology dictionary, comprising: generating a processed database by processing through a natural language processor terms from the medical terminology dictionary and from a medical coding decision database, the term processing including stemming the terms, wherein the processed database also comprises the original terms from the medical terminology dictionary and the medical coding decision database; processing the verbatim narrative through the natural language processor by stemming one or more words in the verbatim narrative; comparing the processed verbatim narrative as a whole to the terms in the processed database; determining whether the processed verbatim narrative as a whole is an exact match to a term in the processed database; if there is a match, mapping the verbatim narrative to the term in the medical terminology dictionary that corresponds to the term in the processed database that is an exact match; and if there is no exact match, statistically analyzing the verbatim narrative using a naïve Bayes classifier that is a letters-based model if the probability of an assigned term is less than a pre-determined value and a words-based model if the probability exceeds the pre-determined value; and mapping the verbatim narrative to the term in the medical terminology dictionary that is the closest match to the verbatim narrative.

11. The method of claim 10, wherein the medical terminology dictionary is a drug terminology dictionary and wherein the terms in the drug terminology dictionary are active ingredients of drugs.

12. The method of claim 10, wherein processing the verbatim narrative through the natural language processor comprises substituting for words in the verbatim narrative synonyms derived from the medical coding decision database or deleting words from the verbatim narrative that are considered inconsequential based on the medical coding decision database, or both substituting and deleting.

13. The method of claim 10, further comprising: before generating the processed database, comparing the verbatim narrative to the terms from the medical terminology dictionary, selecting the term that is an exact match, and mapping the verbatim narrative to the term that is an exact match; and if there is no match, comparing the verbatim narrative to terms from the medical coding decision database, selecting the term that is an exact match, and then mapping the verbatim narrative to the term in the medical terminology dictionary that corresponds to the term that is an exact match.

14. The method of claim 10, wherein processing the verbatim narrative through the natural language processor further comprises: cleaning the verbatim narrative; and sorting the words in the verbatim narrative, wherein after each of these operations: the processed verbatim narrative is compared to the terms in the processed database; and the processed verbatim narrative is determined whether it is an exact match to a term in the processed database.

15. An apparatus for automatically mapping a concomitant medication (con-med) narrative to an active ingredient in a drug terminology dictionary, comprising: a processed database comprising: original active ingredients from the drug terminology dictionary; original terms from a medical coding decision database that correspond to active ingredients in the drug terminology dictionary; and processed active ingredients from the drug terminology dictionary and processed terms from the medical coding decision database that correspond to active ingredients in the drug terminology dictionary; a comparator configured to compare a processed con-med narrative to the active ingredients and terms in the processed database and to determine whether the processed con-med narrative is an exact match to an active ingredient or term in the pr
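
The naïve Bayes fallback recited in claims 1 and 10 can be sketched as below. This is one possible reading, not the patent's implementation: it assumes the words-based model is tried first and the letters-based model takes over when the words-based probability falls below the threshold. The character-trigram features, Laplace smoothing, the `0.7` threshold, and the training examples are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def word_features(text):
    """Words-based features: whitespace-separated tokens."""
    return text.lower().split()


def letter_features(text, n=3):
    """Letters-based features: character n-grams (assumed n=3)."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self, featurize):
        self.featurize = featurize
        self.term_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        for text, term in examples:
            self.term_counts[term] += 1
            for feat in self.featurize(text):
                self.feature_counts[term][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, text):
        """Return (most probable term, its posterior probability)."""
        # Features never seen in training carry no information; skip them.
        feats = [f for f in self.featurize(text) if f in self.vocab]
        total = sum(self.term_counts.values())
        log_scores = {}
        for term, n_examples in self.term_counts.items():
            counts = self.feature_counts[term]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_examples / total)
            for feat in feats:
                score += math.log((counts[feat] + 1) / denom)
            log_scores[term] = score
        # Normalise log scores into posterior probabilities.
        top = max(log_scores.values())
        weights = {t: math.exp(s - top) for t, s in log_scores.items()}
        best = max(weights, key=weights.get)
        return best, weights[best] / sum(weights.values())


def classify(text, words_model, letters_model, threshold=0.7):
    """Use the words-based result if confident, else the letters-based one."""
    term, prob = words_model.predict(text)
    if prob >= threshold:
        return term, "words"
    term, _ = letters_model.predict(text)
    return term, "letters"


# Hypothetical previously coded narratives, for illustration only.
EXAMPLES = [
    ("headache", "Headache"),
    ("severe headache", "Headache"),
    ("head pain", "Headache"),
    ("nausea", "Nausea"),
    ("feeling nauseous", "Nausea"),
    ("nausea and queasy", "Nausea"),
]
WORDS = NaiveBayes(word_features).fit(EXAMPLES)
LETTERS = NaiveBayes(letter_features).fit(EXAMPLES)

print(classify("bad headache", WORDS, LETTERS))
print(classify("hedache", WORDS, LETTERS))
```

Under these assumptions, a misspelled narrative such as "hedache" shares no whole word with the training data, so the words-based model is uninformative, while its character trigrams still overlap heavily with "headache".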

Assignees

Medidata Solutions Inc

Inventors

Classifications

  • G06F40/242 (Primary)

    Dictionaries · CPC title

  • ICT specially adapted for medical reports, e.g. generation or transmission thereof · CPC title

  • G06F40/279 (Primary)

    Recognition of textual entities · CPC title

  • for electronic clinical trials or questionnaires · CPC title

  • Thesauruses; Synonyms · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11023679B2 cover?
An apparatus for automatically mapping a verbatim narrative to a term in a medical terminology dictionary includes a natural language processor and a comparator. The natural language processor processes terms from the medical terminology dictionary and from a medical coding decision database to generate a processed database that also includes the original terms from the medical terminology dict…
Who is the assignee on this patent?
Medidata Solutions Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/242. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 1, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).