Methods and apparatus for formatting text for clinical fact extraction

US9905229B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9905229-B2
Application number  US-201213489266-A
Country             US
Kind code           B2
Filing date         Jun 5, 2012
Priority date       Feb 18, 2011
Publication date    Feb 27, 2018
Grant date          Feb 27, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An original text that is a representation of a narration of a patient encounter provided by a clinician may be received and re-formatted to produce a formatted text. One or more clinical facts may be extracted from the formatted text. A first fact of the clinical facts may be extracted from a first portion of the formatted text, and the first portion of the formatted text may be a formatted version of a first portion of the original text. A linkage may be maintained between the first fact and the first portion of the original text.
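The linkage described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Token` class, the `reformat` and `extract_facts` helpers, the toy `LEXICON`, and the sample narrative are all hypothetical names and data introduced here. The idea shown is only that each token of the formatted text remembers its character offsets in the original text, so a fact found in the formatted text can be traced back to the original passage.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    text: str        # token as it appears in the formatted text
    orig_start: int  # start offset in the original text
    orig_end: int    # end offset (exclusive) in the original text


def reformat(original, boundary_after):
    """Toy re-formatter (assumption): append '.' after the token indices in
    boundary_after and capitalize each sentence-initial token, while
    recording where every token came from in the original text."""
    tokens = []
    start_of_sentence = True
    for i, match in enumerate(re.finditer(r"\S+", original)):
        text = match.group()
        if start_of_sentence:
            text = text[0].upper() + text[1:]
        start_of_sentence = i in boundary_after
        if start_of_sentence:
            text += "."
        tokens.append(Token(text, match.start(), match.end()))
    return tokens


# Hypothetical lexicon mapping a phrase to an abstract concept label.
LEXICON = {("chest", "pain"): "SYMPTOM/chest_pain"}


def extract_facts(tokens):
    """Find lexicon phrases in the formatted tokens and return each fact
    together with the span of the ORIGINAL text that produced it."""
    normalized = [t.text.lower().strip(".,") for t in tokens]
    facts = []
    for phrase, concept in LEXICON.items():
        n = len(phrase)
        for i in range(len(normalized) - n + 1):
            if tuple(normalized[i:i + n]) == phrase:
                span = (tokens[i].orig_start, tokens[i + n - 1].orig_end)
                facts.append((concept, span))
    return facts


original = "patient reports chest pain denies fever"
tokens = reformat(original, boundary_after={3, 5})
formatted = " ".join(t.text for t in tokens)
facts = extract_facts(tokens)
concept, (start, end) = facts[0]
linked_passage = original[start:end]  # portion of the original to highlight
```

Here the fact is extracted from the formatted text, but the stored span indexes the original narrative, which is what allows a user-facing indicator to highlight the source passage.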

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving an original free-form text narrative regarding a patient encounter provided by a clinician; re-formatting the original free-form text narrative, using at least one processor, at least in part by adding, removing, and/or correcting sentence boundaries and/or section boundaries with respect to the original free-form text narrative to produce a formatted text including the added and/or corrected sentence boundaries and/or section boundaries, the re-formatting comprising applying at least one statistical model to the original free-form text narrative to generate, for a word or a sequence of words in the original free-form text narrative, a probability that the word or the sequence of words would be followed by a sentence boundary and/or a section boundary, wherein the at least one statistical model is trained at least in part with other free-form text narratives having correct sentence boundaries and/or section boundaries, and in response to determining that the probability satisfies one or more criteria, adding, removing, and/or correcting a sentence boundary and/or a section boundary following the word or the sequence of words, with respect to the original free-form text narrative; extracting one or more clinical facts from the formatted text, wherein a first fact of the one or more clinical facts is extracted from a first portion of the formatted text, wherein the first portion of the formatted text is a formatted version of a first portion of the original free-form text narrative, the extracting comprising analyzing the formatted text to identify a set of one or more features of at least the first portion of the formatted text, correlating the set of features to one or more abstract semantic concepts, and generating computer-readable data that expresses the one or more abstract semantic concepts as the one or more clinical facts extracted from the formatted text; and providing to a user an indicator that distinguishes the first portion of the original free-form text narrative that resulted in extraction of the first fact, from other portions of the original free-form text narrative that did not result in the extraction of the first fact.

2. The method of claim 1, wherein the re-formatting comprises adding to, removing from or correcting at least one section heading in the original free-form text narrative, to produce the formatted text.

3. The method of claim 1, wherein the re-formatting comprises normalizing at least one section heading according to a standard for an institution associated with the patient encounter.

4. The method of claim 1, wherein receiving the original free-form text narrative comprises performing automatic speech recognition on a spoken free-form narration provided by the clinician.

5. The method of claim 4, wherein performing the automatic speech recognition comprises accessing a lexicon of terms linked to a clinical ontology.

6. The method of claim 5, wherein the clinical ontology is a language understanding ontology.

7. The method of claim 1, wherein neither the original free-form text narrative nor the formatted text is edited by a human other than the clinician before the one or more clinical facts are extracted.

8. The method of claim 1, wherein the re-formatting and the extracting are performed automatically.

9. An apparatus comprising: at least one processor; and a memory storing processor-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform a method comprising: receiving an original free-form text narrative regarding a patient encounter provided by a clinician; re-formatting the original free-form text narrative at least in part by adding, removing, and/or correcting sentence boundaries and/or section boundaries with respect to the original free-form text narrative to produce a formatted text including the added and/or corrected sentence boundaries and/or section boundaries, the re-formatting comprising applying at least one statistical model to the original free-form text narrative to generate, for a word or a sequence of words in the original free-form text narrative, a probability that the word or the sequence of words would be followed by a sentence boundary and/or a section boundary, wherein the at least one statistical model is trained at least in part with other free-form text narratives having correct sentence boundaries and/or section boundaries, and in response to determining that the probability satisfies one or more criteria, adding, removing, and/or correcting a sentence boundary and/or a section boundary following the word or the sequence of words, with respect to the original free-form text narrative; extracting one or more clinical facts from the formatted text, wherein a first fact of the one or more clinical facts is extracted from a first portion of the formatted text, wherein the first portion of the formatted text is a formatted version of a first portion of the original free-form text narrative, the extracting comprising analyzing the formatted text to identify a set of one or more features of at least the first portion of the formatted text, correlating the set of features to one or more abstract semantic concepts, and generating computer-readable data that expresses the one or more abstract semantic concepts as the one or more clinical facts extracted from the formatted text; and providing to a user an indicator that distinguishes the first portion of the original free-form text narrative that resulted in extraction of the first fact, from other portions of the original free-form text narrative that did not result in the extraction of the first fact.

10. The apparatus of claim 9, wherein the re-formatting comprises adding to, removing from or correcting at least one section heading in the original free-form text narrative, to produce the formatted text.

11. The apparatus of claim 9, wherein the re-formatting comprises normalizing at least one section heading according to a standard for an institution associated with the patient encounter.

12. The apparatus of claim 9, wherein receiving the original free-form text narrative comprises performing automatic speech recognition on a spoken free-form narration provided by the clinician.

13. The apparatus of claim 12, wherein performing the automatic speech recognition comprises accessing a lexicon of terms linked to a clinical ontology.

14. The apparatus of claim 13, wherein the clinical ontology is a language understanding ontology.

15. The apparatus of claim 9, wherein the method further comprises prompting a user to approve the formatted text.

16. The apparatus of claim 9, wherein neither the original free-form text narrative nor the formatted text is edited by a human other than the clinician before the one or more clinical facts are extracted.

17. At least one non-transitory computer-readable storage medium encoded with a plurality of computer-executable instructions that, when executed by at least one processor, cause the at least one processor to perform a method comprising: receiving an original free-form text narrative regarding a patient encounter provided by a clinician; re-formatting the original free-form text narrative at least in part by adding, removing, and/or correcting sentence boundaries and/or section boundaries with respect to the original free-form text narrative to produce a formatted text including the added and/or corrected sentence boundaries and/or section boundaries, the re-formatting comprising app
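The boundary-correction step recited in the independent claims (a statistical model, trained on correctly punctuated narratives, that yields a probability of a boundary following a word, compared against a criterion) can be sketched roughly as follows. This is an illustrative assumption, not the model the patent discloses: it uses a simple relative-frequency estimate over single words, a fixed threshold of 0.5 as the criterion, and a made-up three-narrative training corpus. The function names `train` and `correct_boundaries` are hypothetical.

```python
from collections import Counter


def train(corpus):
    """Estimate P(sentence boundary follows word) by relative frequency,
    using narratives whose sentence boundaries are assumed correct."""
    seen, ended = Counter(), Counter()
    for narrative in corpus:
        for raw in narrative.split():
            word = raw.rstrip(".").lower()
            seen[word] += 1
            if raw.endswith("."):
                ended[word] += 1
    return {word: ended[word] / seen[word] for word in seen}


def correct_boundaries(text, model, threshold=0.5):
    """Add a boundary after each word whose estimated probability meets the
    threshold and drop existing boundaries after words that do not.
    Unseen words are treated as probability 0.0 (an assumption)."""
    out = []
    for raw in text.split():
        word = raw.rstrip(".")
        probability = model.get(word.lower(), 0.0)
        out.append(word + "." if probability >= threshold else word)
    return " ".join(out)


# Toy training data, invented for illustration only.
corpus = [
    "Patient reports chest pain. Denies fever.",
    "Patient denies cough. Reports mild fever.",
    "No chest pain. No fever.",
]
model = train(corpus)

# The misplaced boundary after "reports" is removed; boundaries after
# "pain" and "fever" are added.
result = correct_boundaries("patient reports. chest pain denies fever", model)
```

A production system would more plausibly condition on word sequences and surrounding context, as the claim language "a word or a sequence of words" suggests; the single-word estimate is used here only to keep the sketch short.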

Assignees

Inventors

Classifications

  • of application context · CPC title

  • G10L15/26 · Primary

    Speech to text systems (G10L15/08 takes precedence) · CPC title

  • ICT specially adapted for medical reports, e.g. generation or transmission thereof · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9905229B2 cover?
An original text that is a representation of a narration of a patient encounter provided by a clinician may be received and re-formatted to produce a formatted text. One or more clinical facts may be extracted from the formatted text. A first fact of the clinical facts may be extracted from a first portion of the formatted text, and the first portion of the formatted text may be a formatted ver…
Who is the assignee on this patent?
Montyne Frank, Decraene David, Van Der Vloet Joeri, and 12 more
What technology area does this patent fall under?
Primary CPC classification G10L15/26. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 27, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).