Who is the assignee on this patent?

Mcnair Douglas S, Kailasam Kanakasabha K, Murrish John Christopher, and 1 more

What technology area does this patent fall under?

Primary CPC classification G06F16/288. Mapped technology areas include Physics.

When was this patent published?

Publication date: Feb 5, 2019 (kind code B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Synonym discovery

US10198499B1 · US · B1

Patent metadata
Field	Value
Publication number	US-10198499-B1
Application number	US-201213569781-A
Country	US
Kind code	B1
Filing date	Aug 8, 2012
Priority date	Aug 8, 2011
Publication date	Feb 5, 2019
Grant date	Feb 5, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and computer-readable media are provided for facilitating mapping of semantically similar terms between and among two or more information systems. In particular, to facilitate automatic discovery, establishment, and/or statistical validation of linkages between a plurality of different nomenclatures employed by a plurality of information systems, such as multiple electronic health record systems. In embodiments, the imputation of latent synonymy in corpora comprised of samples of historical records from each system enables automated terminology mapping between disparate systems' records, thereby establishing reliable linkages that may subsequently be utilized for realtime decision support, data mining-based research, or other valuable purposes.

First claim

Opening claim text (preview).

What is claimed is:

1. One or more non-transitory computer-readable media having computer-usable instructions embodied thereon that, when executed, enable a processor to perform a method of discovering latent relationships in data, said method comprising:
obtaining a first set of records from a first health-records system, the first health-records system having a first structure corresponding to the organization of the first set of records and comprising a set of unified codesets, nomenclature rubric, or unified ontology;
obtaining a second set of records from a second health-records system, the second health-records system having a second structure corresponding to the organization of the second set of records and comprising a set of unified codesets, nomenclature rubric, or unified ontology, based on one or more terms, the first structure being incompatible with the second structure;
identifying at least one data item associated with episodes of care within said first set of records;
selecting first raw data from the first set of records comprising a plurality of instances associated with the at least one data item, the first raw data comprising a set of values associated with the at least one data item;
discarding extreme values of said first raw data;
selecting second raw data from the second set of records, the second raw data comprising a set of values;
determining a subset of the second raw data records as closely matching the at least one data item, the identifying for each of the at least one data item comprising:
generating one or more clusters by applying a clustering method to the subset of the second raw data;
calculating at least one measure quantifying similarity between the at least one cluster and the first raw data; and
determining the subset of the second raw data as closely matching the data item in response to the at least one measure quantifying similarity being less than a predetermined threshold; and
in response to determining that the subset of the second raw data records closely match the at least one data item, creating a provisional binding of at least one of the one or more terms associated with the at least one cluster in the second health record system to the data item in the first health-records system, thereby generating, at least partially, a mapping of the first structure to the second structure.

2. The non-transitory computer-readable media of claim 1, wherein the selecting first raw data from the first set of records comprises selecting records containing demographic attributes associated with episodes of care that are associated with the data item.

3. The non-transitory computer-readable media of claim 2, wherein the selecting second raw data from the second set of records comprises matching one or more demographic attributes associated with the first set of records to demographic attributes from the second set of records.

4. The non-transitory computer-readable media of claim 1, further comprising: using a term identified by mapped clusters as a basis for cross-mapping the term to a third health-records system.

5. The non-transitory computer-readable media of claim 1, wherein the calculating at least one measure quantifying similarity comprises using a two-sample Kolmogorov-Smirnov D test.

6. The non-transitory computer-readable media of claim 1, wherein the calculating at least one measure quantifying similarity comprises using a non-parametric metric.

7. The non-transitory computer-readable media of claim 6, wherein the calculating at least one measure quantifying similarity comprises using a Cramer V test.

8. The non-transitory computer-readable media of claim 1, wherein the applying a cluster method involves reducing the dimensionality.

9. The non-transitory computer-readable media of claim 1, wherein the applying a cluster method comprises generating a decision-tree classifier.

10. The non-transitory computer-readable media of claim 1, further comprising displaying a supervisory screen presenting the provisional binding, thereby permitting a user to modify the mapping by including or excluding terms from the provisional mapping.

11. The non-transitory computer-readable media of claim 1, wherein the identifying a subset of said second raw data records as closely matching the data item comprises matching a plurality of data values within the first raw data and the second raw data.

12. The non-transitory computer-readable media of claim 1, further comprising reducing said subset of said second raw data records by cleaning the subset of extreme values.

13. The non-transitory computer-readable media of claim 12, further comprising transforming some values of the subset of the second raw data.

14. A method for discovering latent relationships in data, the method comprising:
obtaining a first set of records from a first health-records system, the first health-records system having a first structure corresponding to the organization of the first set of records and comprising a set of unified codesets, nomenclature rubric, or unified ontology;
obtaining a second set of records from a second health-records system, the second health-records system having a second structure corresponding to the organization of the second set of records and comprising a set of unified codesets, nomenclature rubric, or unified ontology, based on one or more terms, the first structure being incompatible with the second structure;
identifying at least one data item associated with episodes of care within said first set of records;
selecting first raw data from the first set of records comprising a plurality of instances associated with the at least one data item, the first raw data comprising a set of values associated with the at least one data item;
discarding extreme values of said first raw data;
selecting second raw data from the second set of records, the second raw data comprising a set of values;
determining a subset of the second raw data records as closely matching the at least one data item, the identifying for each of the at least one data item comprising:
generating one or more clusters by applying a clustering method to the subset of the second raw data;
calculating at least one measure quantifying similarity between the at least one cluster and the first raw data; and
determining the subset of the second raw data as closely matching the data item in response to comparing the measure quantifying similarity being less than a predetermined threshold; and
in response to determining that the subset of the second raw data records closely match the at least one data item, creating a provisional binding of at least one of the one or more terms associated with the at least one cluster in the second health record system to the data item in the first health-records system thereby generating, at least partially, a mapping of the first structure to the second structure.

15. The method of claim 14, further comprising displaying a supervisory screen presenting the provisional binding, thereby permitting a user to modify the mapping by including or excluding clusters from the provisional mapping.

16. The method of claim 15, wherein said measure is a non-parametric measure.

17. The method of claim 16, wherein said non-parametric measure uses a two-sample Kolmogorov-Smirnov D test.

18. The method of claim 16, wherein said non-parametric measure uses a Cramer V test.

19. The method of claim 16, wherein the selecting first raw data from the first set of records comprises selecting records containi
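
Illustrative sketch (not part of the patent text). The matching step recited in claims 1, 5, and 17 can be pictured roughly as follows: trim extreme values, compute a two-sample Kolmogorov-Smirnov D statistic between the first system's values and each candidate group from the second system, and create a provisional binding when D falls below a threshold. This is a minimal sketch under several assumptions: the clustering step is skipped and each candidate term's values are treated as an already-formed cluster; the function names, the tail fraction, the 0.2 threshold, and the sample terms and values are all hypothetical and do not come from the patent.

```python
from bisect import bisect_right


def trim_extremes(values, tail_fraction=0.01):
    """Return the sorted values with a fraction dropped from each tail.

    The tail fraction is an assumed cutoff; the claims only say that
    extreme values are discarded.
    """
    ordered = sorted(values)
    k = int(len(ordered) * tail_fraction)
    return ordered[k:len(ordered) - k]


def ks_d(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of a at or below x
        cdf_b = bisect_right(b, x) / len(b)  # share of b at or below x
        d = max(d, abs(cdf_a - cdf_b))
    return d


def provisional_bindings(reference, candidates, threshold=0.2):
    """Map each candidate term to its D value when D is below the threshold.

    `reference` holds values for one data item in the first system;
    `candidates` maps hypothetical second-system terms to their values.
    """
    ref = trim_extremes(reference)
    bound = {}
    for term, values in candidates.items():
        d = ks_d(ref, trim_extremes(values))
        if d < threshold:
            bound[term] = d
    return bound


if __name__ == "__main__":
    # Hypothetical lab values: one data item in system A, two terms in system B.
    reference = [88, 92, 95, 99, 101, 104, 110, 115]
    candidates = {
        "GLUC-S": [90, 93, 96, 100, 102, 105, 109, 117],
        "NA-S": [136, 138, 139, 140, 141, 142, 143, 145],
    }
    print(provisional_bindings(reference, candidates))
```

In this sketch a smaller D means the two value distributions look more alike, which is why the binding is created when the measure is below the threshold. A production system would presumably cluster the second system's records first and present the resulting bindings for human review, as the claims describe.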

Classifications

G06F16/288 (Primary)
Entity relationship models · CPC title
G16H15/00
ICT specially adapted for medical reports, e.g. generation or transmission thereof · CPC title
G16H50/70
for mining of medical data, e.g. analysing previous cases of other patients · CPC title
G06F17/30604 (Primary)
Physics · mapped topic
G16H10/60 (Primary)
for patient-specific data, e.g. for electronic patient records · CPC title

Patent family

Related publications grouped by family.

View patent family 65200324
