Normalization of medical terms with multi-lingual resources

US11308289B2 · US · B2

Patent metadata
Field | Value
Publication number | US-11308289-B2
Application number | US-201916569874-A
Country | US
Kind code | B2
Filing date | Sep 13, 2019
Priority date | Sep 13, 2019
Publication date | Apr 19, 2022
Grant date | Apr 19, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Method and apparatus are presented for receiving a medical or medical condition related input term or phrase in a source language, and translating the term or phrase from the source language into at least one target language to obtain a set of translated terms of the input term. For each translated term in the set of translations, the method and apparatus further translate the set of translations back into the source language to obtain an output list of standard versions of the input term, scoring each entry of the output list as to probability of being the most standard version of the input term, and providing the entry of the output list that has the highest score to a user.
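
The round trip described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the two dictionaries, their entries, the function name `normalize`, and the vote-share scoring are all hypothetical stand-ins chosen for the example.

```python
from collections import Counter

# Hypothetical general-purpose dictionary: source term -> {target language: translations}.
# All entries are illustrative assumptions, not data from the patent.
GENERAL_DICT = {
    "heart attack": {
        "de": ["Herzinfarkt"],
        "fr": ["infarctus du myocarde", "crise cardiaque"],
    },
}

# Hypothetical medical-domain dictionary used for the back-translation step:
# (target language, target term) -> source-language medical terms.
MEDICAL_DICT = {
    ("de", "Herzinfarkt"): ["myocardial infarction", "cardiac infarction"],
    ("fr", "infarctus du myocarde"): ["myocardial infarction"],
}


def normalize(term):
    """Return (candidate, score) pairs, best first, for a source-language term.

    The score here is simply each candidate's share of all back-translations,
    an assumed stand-in for the probability of being the most standard version.
    """
    votes = Counter()
    for lang, translations in GENERAL_DICT.get(term, {}).items():
        for translated in translations:
            # Terms missing from the medical dictionary contribute nothing.
            for back in MEDICAL_DICT.get((lang, translated), []):
                votes[back] += 1
    total = sum(votes.values())
    if total == 0:
        return []
    scored = [(candidate, count / total) for candidate, count in votes.items()]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


ranked = normalize("heart attack")
best = ranked[0][0] if ranked else None
```

In this toy data the colloquial French rendering has no entry in the medical dictionary, so only domain-specific back-translations reach the output list, and `best` is the candidate that the most back-translations agree on.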

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving data including a medically-related input term in a source language; translating, using a first dictionary, the input term from the source language into a plurality of target languages to obtain a set of translated terms of the input term; for each respective translated term in the set of translated terms, translating, using a second dictionary, from one or more pre-defined domain specific dictionaries, that is specific to a medical domain, the respective translated term back into the source language to obtain an output list of normalized versions of the input term, wherein each normalized version in the output list are specific to the medical domain; assigning a respective score to each respective entry of the output list of normalized versions based on a respective probability of the respective entry being the most standard version of the input term; identifying an entry of the output list having a highest score; and providing the entry of the output list having the highest score as normalized input to a medical data processing application that processes the data.

2. The method of claim 1, wherein the input term is a common version of a medical concept, condition, object, process, or drug.

3. The method of claim 1, wherein each element of the set of translations is translated back into the source language using the one or more pre-defined domain specific dictionaries.

4. The method of claim 3, wherein the one or more pre-defined domain specific dictionaries are professional, scientific, or academic.

5. The method of claim 1, wherein the input term is used in spoken Chinese.

6. The method of claim 1, wherein the plurality of target languages includes two separate target languages.

7. The method of claim 6, wherein the two separate target languages are each European languages.

8. The method of claim 6, wherein, for each of the two separate target languages a body of published medical literature exists.

9. The method of claim 1, further comprising outputting each entry of the output list, and its score, to the medical data processing application.

10. The method of claim 1, wherein the score of each entry of the output list is determined by one or more of: relative frequency of the entry in the source language, relative frequency of a term in the plurality of target languages that was translated into the entry, or relative authority or prestige of the dictionary that was used to obtain the entry.

11. A computer program product comprising a computer-readable storage medium having computer-readable program code embodied therewith, the computer-readable program code executable by one or more computer processors to perform an operation comprising: receiving data including a medically-related input term in a source language; translating, using a first dictionary, the input term from the source language into a plurality of target languages to obtain a set of translated terms of the input term; for each respective translated term in the set of translated terms, translating, using a second dictionary, from one or more pre-defined domain specific dictionaries, that is specific to a medical domain, the respective translated term back into the source language to obtain an output list of normalized versions of the input term, wherein each normalized version in the output list are specific to the medical domain; assigning a respective score to each respective entry of the output list of normalized versions based on a respective probability of the respective entry being the most standard version of the input term; identifying an entry of the output list having a highest score; and provide the entry of the output list having the highest score as normalized input to a medical data processing application that processes the data.

12. The computer program product of claim 11, wherein the computer-readable program code is further executable to: output each entry of the output list, and its score, to the medical data processing application.

13. The computer program product of claim 11, wherein each element of the set of translations is translated back into the source language using the one or more pre-defined domain specific dictionaries.

14. The computer program product of claim 13, wherein the one or more pre-defined domain specific dictionaries are professional, scientific, or academic.

15. The computer program product of claim 11, wherein the score of each entry of the output list is determined by one or more of: relative frequency of the entry in the source language, relative frequency of a term in the plurality of target languages that was translated into the entry, or relative authority or prestige of the dictionary that was used to obtain the entry.

16. A system, comprising: one or more computer processors; and a memory containing a program which when executed by the one or more computer processors performs an operation, the operation comprising: receiving data including a medically-related input term in a source language; translating, using a first dictionary, the input term from the source language into a plurality of target languages to obtain a set of translated terms of the input term; for each respective translated term in the set of translated terms, translating, using a second dictionary, from one or more pre-defined domain specific dictionaries, that is specific to a medical domain, the respective translated term back into the source language to obtain an output list of normalized versions of the input term, wherein each normalized version in the output list are specific to the medical domain; assigning a respective score to each respective entry of the output list of normalized versions based on a respective probability of the respective entry being the most standard version of the input term; identifying an entry of the output list having a highest score; and providing the entry of the output list having the highest score as normalized input to a medical data processing application that processes the data.

17. The system of claim 16, the operation further comprising: outputting each entry of the output list, and its score, to the medical data processing application.

18. The system of claim 16, wherein each element of the set of translations is translated back into the source language using the one or more pre-defined domain specific dictionaries.

19. The system of claim 18, wherein the one or more pre-defined domain specific dictionaries are professional, scientific, or academic.

20. The system of claim 16, wherein the score of each entry of the output list is determined by one or more of: relative frequency of the entry in the source language, relative frequency of a term in the plurality of target languages that was translated into the entry, or relative authority or prestige of the dictionary that was used to obtain the entry.
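
Claims 10, 15, and 20 name three factors that may determine an entry's score: the entry's relative frequency in the source language, the relative frequency of the target-language term it was translated from, and the authority of the dictionary used. The claims do not say how these factors are combined, so the weighted sum below is purely an assumption for illustration; the `Candidate` type, the field names, the weights, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    entry: str                   # back-translated term in the source language
    source_frequency: float      # relative frequency of the entry in the source language, 0..1
    target_frequency: float      # relative frequency of the target term it came from, 0..1
    dictionary_authority: float  # assumed 0..1 rating of the dictionary that produced it


def score(candidate, weights=(0.4, 0.3, 0.3)):
    """Combine the three factors with an assumed linear weighting."""
    w_source, w_target, w_authority = weights
    return (
        w_source * candidate.source_frequency
        + w_target * candidate.target_frequency
        + w_authority * candidate.dictionary_authority
    )


def rank(candidates):
    """Return candidates ordered from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)


# Illustrative values only.
candidates = [
    Candidate("heart attack", 0.9, 0.3, 0.2),
    Candidate("myocardial infarction", 0.6, 0.8, 0.9),
]
ranked = rank(candidates)
top_entry = ranked[0].entry
# Every entry can also be paired with its score, in the spirit of claims 9, 12, and 17.
scored_entries = [(c.entry, score(c)) for c in ranked]
```

Because the claims say "one or more of", any single factor could also be used alone; setting two of the weights to zero reproduces that case in this sketch.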

Assignees

Inventors

Classifications

  • G06F40/58 · Primary

    Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation · CPC title

  • Version control (for software G06F8/71) · CPC title

  • Translation evaluation · CPC title

  • Statistical methods, e.g. probability models · CPC title

  • Lexical tools · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11308289B2 cover?
Method and apparatus are presented for receiving a medical or medical condition related input term or phrase in a source language, and translating the term or phrase from the source language into at least one target language to obtain a set of translated terms of the input term. For each translated term in the set of translations, the method and apparatus further translate the set of translatio…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/58. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 19, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).