Systems and methods for computing with private healthcare data
US-2020402625-A1 · Dec 24, 2020 · US
US11720757B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11720757-B2 |
| Application number | US-201916543794-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 19, 2019 |
| Priority date | Aug 19, 2019 |
| Publication date | Aug 8, 2023 |
| Grant date | Aug 8, 2023 |
Official abstract text for this publication.
Methods, systems, apparatuses, and computer program products are provided for extracting an entity value from a sentence. An embedding set that may include one or more sentence embeddings is generated for at least part of a first sentence that is tagged to associate a first named entity in the sentence with an entity type. A plurality of candidate embeddings is also generated for at least part of a second sentence. The one or more sentence embeddings in the embedding set may be compared with each of the plurality of candidate embeddings, and a match score may be assigned to each comparison to generate a match score set. A particular match score of the match score set may be identified that exceeds a similarity threshold, and an entity value of the entity type may be extracted from the second sentence associated with the identified match score.
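The flow in the abstract (mask the tagged entity, embed the remaining context, compare against candidate embeddings, keep a score above a threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the `[MASK]` token, the default threshold, and especially the toy "embedding" (position-tagged context words compared by cosine similarity) are all hypothetical stand-ins for whatever embedding model a real system would apply. Sentences are assumed to be whitespace-tokenized with no punctuation.

```python
import math
from collections import Counter

MASK = "[MASK]"  # hypothetical mask token, not taken from the patent


def mask_entity(sentence, entity):
    """Replace the first occurrence of the entity span with the mask token."""
    return sentence.replace(entity, MASK, 1)


def embed(masked_sentence):
    """Toy stand-in for an embedding model: a sparse vector of context words
    keyed by side and distance relative to the mask."""
    tokens = masked_sentence.lower().split()
    i = tokens.index(MASK.lower())
    feats = Counter()
    # Terms preceding the mask, read in reverse order (nearest first).
    for dist, tok in enumerate(reversed(tokens[:i]), start=1):
        feats[("before", dist, tok)] += 1
    # Terms following the mask, read in forward order.
    for dist, tok in enumerate(tokens[i + 1:], start=1):
        feats[("after", dist, tok)] += 1
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def extract_entity_value(tagged, second_sentence, candidates, threshold=0.5):
    """tagged: (sentence, entity) pairs tagged for a single entity type.
    candidates: candidate spans found in second_sentence.
    Returns (value, score) for the best match above threshold, else (None, None)."""
    embedding_set = [embed(mask_entity(s, e)) for s, e in tagged]
    best_value, best_score = None, None
    for cand in candidates:
        cand_emb = embed(mask_entity(second_sentence, cand))
        for sent_emb in embedding_set:
            score = cosine(sent_emb, cand_emb)  # one entry of the match score set
            if score > threshold and (best_score is None or score > best_score):
                best_value, best_score = cand, score
    return best_value, best_score


tagged = [("book a flight to Paris", "Paris")]
value, score = extract_entity_value(
    tagged, "please book a flight to Tokyo tomorrow", ["Tokyo", "tomorrow"]
)
print(value)
```

Because only the context around the mask is embedded, a value that never appeared in the tagged examples can still be matched, and no model is retrained when a new tagged sentence is added to the embedding set.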
Opening claim text (preview).
What is claimed is:
1. A system, comprising: a processor; and a memory device that stores program code configured to be executed by the processor, the program code comprising: an embedding generator configured to: receive, via a user interface, a first sentence, an identification of a first named entity in the first sentence, and an entity type associated with the first named entity, mask the first named entity identified via the user interface in the first sentence to generate a masked first sentence, generate an embedding set that comprises a plurality of sentence embeddings generated from a plurality of tagged sentences for the entity type, the plurality of sentence embeddings including one or more sentence embeddings for at least part of the masked first sentence, extract a candidate entity value from a second sentence received by a virtual agent, mask the candidate entity value in the second sentence to generate a masked second sentence, and generate a plurality of candidate embeddings for at least part of the masked second sentence, the plurality of candidate embeddings comprising: a first candidate embedding for a first subset of terms of the masked second sentence that follow the masked candidate entity value in a forward order, and a second candidate embedding for a second subset of terms of the masked second sentence that precede the masked candidate entity value in a reverse order; an embedding comparer configured to: compare each of the plurality of sentence embeddings in the embedding set with each of the plurality of candidate embeddings, and assign a match score to each comparison to generate a match score set; and an entity value extractor configured to: identify a match score of the match score set that exceeds a similarity threshold, and extract an entity value of the entity type from the second sentence associated with the identified match score.
2. The system of claim 1, wherein an embedding of the embedding set for at least part of the masked first sentence comprises a first vector representation generated by applying an embedding model to the at least part of the masked first sentence, and each of the plurality of candidate embeddings comprises a candidate vector representation generated by applying the embedding model to the at least part of the masked second sentence.
3. The system of claim 2, wherein the entity value extractor is configured to extract the entity value of the entity type without retraining the embedding model.
4. The system of claim 1, wherein the plurality of candidate embeddings are generated by masking different portions of the second sentence.
5. The system of claim 1, wherein the plurality of candidate embeddings include at least one of a forward state embedding or a backward state embedding for the candidate entity value in the second sentence.
6. The system of claim 1, further comprising a slot filler configured to: identify a plurality of entity values of the entity type in the second sentence, and extract a slot value in the second sentence from among the plurality of entity values.
7. The system of claim 1, further comprising: an authoring assistor configured to: apply a language model to at least part of the first sentence to recommend at least one word span associated with the entity type; and present the at least one word span in a user interface.
8. A method, comprising: receiving, via a user interface, a first sentence, an identification of a first named entity in the first sentence, and an entity type associated with the first named entity; masking the first named entity identified via the user interface in the first sentence to generate a masked first sentence; generating an embedding set that comprises a plurality of sentence embeddings generated from a plurality of tagged sentences for the entity type, the plurality of sentence embeddings including one or more sentence embeddings for at least part of the masked first sentence; extracting a candidate entity value from a second sentence received by a virtual agent; masking the candidate entity value in the second sentence to generate a masked second sentence; generating a plurality of candidate embeddings for at least part of the masked second sentence, the plurality of candidate embeddings comprising: a first candidate embedding for a first subset of terms of the masked second sentence that follow the masked candidate entity value in a forward order, and a second candidate embedding for a second subset of terms of the masked second sentence that precede the masked candidate entity value in a reverse order; comparing each of the plurality of sentence embeddings in the embedding set with each of the plurality of candidate embeddings; assigning a match score to each comparison to generate a match score set; identifying a match score of the match score set that exceeds a similarity threshold; and extracting an entity value of the entity type from the second sentence associated with the identified match score.
9. The method of claim 8, wherein an embedding of the embedding set for at least part of the masked first sentence comprises a first vector representation generated by applying an embedding model to the at least part of the masked first sentence, and each of the plurality of candidate embeddings comprises a candidate vector representation generated by applying the embedding model to the at least part of the masked second sentence.
10. The method of claim 9, wherein said extracting the entity value comprises extracting the entity value of the entity type without retraining the embedding model.
11. The method of claim 8, wherein the plurality of candidate embeddings are generated by masking different portions of the second sentence.
12. The method of claim 8, wherein the plurality of candidate embeddings include at least one of a forward state embedding or a backward state embedding for the candidate entity value in the second sentence.
13. The method of claim 8, further comprising: identifying a plurality of entity values of the entity type in the second sentence; and extracting a slot value in the second sentence from among the plurality of entity values.
14. The method of claim 8, further comprising: applying a language model to at least part of the first sentence to recommend at least one word span associated with the entity type; and presenting the at least one word span in a user interface.
15. A computer-readable storage medium having computer program code recorded thereon that when executed by at least one processor causes the at least one processor to perform a method comprising: receiving, via a user interface, a first sentence, an identification of a first named entity in the first sentence, and an entity type associated with the first named entity; masking the first named entity identified via the user interface in the first sentence to generate a masked first sentence; generating an embedding set that comprises a plurality of sentence embeddings generated from a plurality of tagged sentences for the entity type, the plurality of sentence embeddings including one or more sentence embeddings for at least part of the masked first sentence; extracting a candidate entity value from a second sentence received by a virtual agent; masking the candidate entity value in the second sentence to generate a masked second sentence; generating a plurality of candidate embeddings for at least part of the masked second sentence, the plurality of candidate embeddings comprising: a first candidate embedding for a first subset of terms of
Feedforward networks · CPC title
Discourse or dialogue representation · CPC title
Tagging; Marking up (details of markup languages G06F40/143); Designating a block; Setting of attributes (style sheets, e.g. eXtensible Stylesheet Language Transformation [XSLT], G06F40/154) · CPC title
Named entity recognition · CPC title
Learning methods · CPC title