Natural language understanding using vocabularies with compressed serialized tries

US10445429B2 · US · B2

Patent metadata
Publication number: US-10445429-B2
Application number: US-201815867480-A
Country: US
Kind code: B2
Filing date: Jan 10, 2018
Priority date: Sep 21, 2017
Publication date: Oct 15, 2019
Grant date: Oct 15, 2019

Abstract

Official abstract text for this publication.

Systems and processes for natural language processing using vocabularies with compressed serialized tries are described in the present disclosure. In one example process, natural language input is received. The natural language input is parsed, using a vocabulary, to determine a corresponding user intent. The parsing includes using a data structure of the vocabulary to map a first word of the natural language input to first semantic information and a second word of the natural language input to second semantic information. The data structure includes pointers that map to a same semantic data object of the vocabulary. The first semantic information and the second semantic information are determined using the same semantic data object. The user intent is determined based on the first semantic information and the second semantic information. Performance of a task corresponding to the determined user intent is initiated.
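The shared-pointer idea in the abstract can be illustrated with a small sketch. This is not the patented implementation: it is an uncompressed, in-memory character trie, and the names `SemanticObject`, `State`, `Vocabulary`, and the example words and domains are all hypothetical. It only shows how the terminal states of two different words can point to one semantic data object that is stored once, apart from the trie.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class SemanticObject:
    """Hypothetical semantic record: an ontology domain plus one property."""
    domain: str
    prop: str


@dataclass
class State:
    """One trie state; `pointers` is non-empty only for terminal states."""
    children: Dict[str, "State"] = field(default_factory=dict)
    pointers: List[int] = field(default_factory=list)


class Vocabulary:
    def __init__(self) -> None:
        self.root = State()
        # Semantic objects live outside the trie; states refer to them by index.
        self.objects: List[SemanticObject] = []
        self._index: Dict[SemanticObject, int] = {}

    def _intern(self, obj: SemanticObject) -> int:
        """Store each distinct semantic object once and return its index."""
        if obj not in self._index:
            self._index[obj] = len(self.objects)
            self.objects.append(obj)
        return self._index[obj]

    def add(self, word: str, obj: SemanticObject) -> None:
        """Walk or create one state per character, then attach a pointer."""
        state = self.root
        for ch in word:
            state = state.children.setdefault(ch, State())
        ptr = self._intern(obj)
        if ptr not in state.pointers:
            state.pointers.append(ptr)

    def lookup(self, word: str) -> List[SemanticObject]:
        """Follow the word's states and dereference the final state's pointers."""
        state = self.root
        for ch in word:
            state = state.children.get(ch)
            if state is None:
                return []
        return [self.objects[p] for p in state.pointers]


# Assumed example data: two words that carry the same semantic information.
vocab = Vocabulary()
vocab.add("call", SemanticObject("phone", "action"))
vocab.add("dial", SemanticObject("phone", "action"))
```

In this sketch, `vocab.lookup("call")` and `vocab.lookup("dial")` dereference the same stored object, so the shared semantic information occupies memory only once regardless of how many words map to it.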

First claim

Opening claim text (preview).

What is claimed is:

1. An electronic device, comprising: one or more processors; memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for: receiving natural language input; determining, using a vocabulary, first semantic information for a first word of the natural language input; determining, using the vocabulary, second semantic information for a second word of the natural language input, wherein a portion of the first semantic information is identical to a portion of the second semantic information, and wherein: the vocabulary includes a data structure comprising a first sequence of states representing the first word and a second sequence of states representing the second word; a state of the first sequence of states includes a pointer to a semantic data object of the vocabulary; a state of the second sequence of states includes a pointer to the semantic data object; and the portion of the first semantic information and the portion of the second semantic information are determined from the semantic data object; determining, using the first semantic information and the second semantic information, a user intent corresponding to the natural language input; and initiating performance of a task corresponding to the determined user intent.

2. The device of claim 1, wherein the semantic data object is a unique data object in the vocabulary.

3. The device of claim 1, wherein the semantic data object is stored separately from the data structure.

4. The device of any of claim 1, wherein the data structure is a compressed serialized trie.

5. The device of claim 1, wherein the state of the first sequence of states includes a plurality of data pointers that each point to a respective data value of a plurality of data values in a binary data object, and wherein each data value of the plurality of data values contains a respective portion of the first semantic information.

6. The device of claim 5, wherein each data pointer of the plurality of data points includes a start address and an end address that defines a location of the respective data value in the binary data object.

7. The device of claim 5, wherein each respective data value of the plurality of data values is allocated a memory amount in the binary data object that is equal to the minimum amount of memory required to store the respective data value.

8. The device of claim 1, wherein the data structure is generated by an external device separate from the electronic device.

9. The device of claim 8, wherein the data structure is generated from one or more unstructured vocabulary flat files.

10. The device of claim 8, further comprising: receiving the data structure from the external device; and initializing a natural language processing service of the electronic device by loading the data structure.

11. The device of claim 10, further comprising: receiving an updated data structure generated by the external device; and after receiving the updated data structure, re-initializing a natural language processing service of the electronic device by loading the updated data structure in lieu of the data structure.

12. The device of claim 1, wherein the data structure is a character trie.

13. The device of claim 1, wherein the state of the first sequence of states and the state of the second sequence of states are each terminal states of the data structure.

14. The device of claim 1, wherein the state of the first sequence of states maps to a plurality of semantic interpretations of the first word, and wherein each semantic interpretation of the plurality of semantic interpretations includes a respective domain of an ontology and one or more properties of the respective domain.

15. The device of claim 14, wherein each semantic interpretation of the plurality of semantic interpretations further includes a respective saliency value of the first word.

16. The device of claim 14, wherein each semantic interpretation of the plurality of semantic interpretations further includes a respective confidence value for the respective domain.

17. The device of claim 14, wherein each semantic interpretation of the plurality of semantic interpretations further includes one or more interpretation rules.

18. The device of claim 1, wherein the state of the first sequence of states includes one or more first data pointers that each point to a respective semantic data object of a plurality of semantic data objects, wherein the state of the second sequence of states includes one or more second data pointers that each point to a respective semantic data object of the plurality of semantic data objects, and wherein each semantic data object of the plurality of semantic data objects is a unique in the vocabulary.

19. A method for performing natural language processing, the method comprising: at an electronic device having a processor and memory: receiving natural language input; determining, using a vocabulary, first semantic information for a first word of the natural language input; determining, using the vocabulary, second semantic information for a second word of the natural language input, wherein a portion of the first semantic information is identical to a portion of the second semantic information, and wherein: the vocabulary includes a data structure comprising a first sequence of states representing the first word and a second sequence of states representing the second word; a state of the first sequence of states includes a pointer to a semantic data object of the vocabulary; a state of the second sequence of states includes a pointer to the semantic data object; and the portion of the first semantic information and the portion of the second semantic information are determined from the semantic data object; determining, using the first semantic information and the second semantic information, a user intent corresponding to the natural language input; and initiating performance of a task corresponding to the determined user intent.

20. A non-transitory computer-readable storage medium storing one or more programs configured to be executed by one or more processors of an electronic device, the one or more programs including instructions for: receiving natural language input; determining, using a vocabulary, first semantic information for a first word of the natural language input; determining, using the vocabulary, second semantic information for a second word of the natural language input, wherein a portion of the first semantic information is identical to a portion of the second semantic information, and wherein: the vocabulary includes a data structure comprising a first sequence of states representing the first word and a second sequence of states representing the second word; a state of the first sequence of states includes a pointer to a semantic data object of the vocabulary; a state of the second sequence of states includes a pointer to the semantic data object; and the portion of the first semantic information and the portion of the second semantic information are determined from the semantic data object; determining, using the first semantic information and the second semantic information, a user intent corresponding to the natural language input; and initiating performance of a task corresponding to the determined user intent.
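Claims 5 to 7 describe data pointers that carry a start address and an end address into a binary data object, with each value taking only the memory it needs. A minimal sketch of that layout follows. The helper names `pack` and `read`, the sample values, and the choice of an exclusive end offset are assumptions made for illustration; the claims do not specify an encoding.

```python
from typing import List, Tuple

# Hypothetical pointer layout: (start, end) byte offsets, end exclusive.
Pointer = Tuple[int, int]


def pack(values: List[bytes]) -> Tuple[bytes, List[Pointer]]:
    """Concatenate values into one binary object and record their extents."""
    blob = bytearray()
    pointers: List[Pointer] = []
    for value in values:
        start = len(blob)
        blob.extend(value)  # no padding: each value uses exactly len(value) bytes
        pointers.append((start, len(blob)))
    return bytes(blob), pointers


def read(blob: bytes, pointer: Pointer) -> bytes:
    """Recover one value from the binary object using its start and end."""
    start, end = pointer
    return blob[start:end]


# Assumed sample values standing in for portions of semantic information.
blob, ptrs = pack([b"phone", b"action", b"0.9"])
```

Because every pointer stores both ends of its value, no length prefix or terminator is needed inside the binary object, and the total size equals the sum of the value sizes.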

Assignees

Apple Inc

Inventors

Classifications

  • G06F40/30 · Primary

    Semantic analysis · CPC title

  • Parsing · CPC title

  • Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title

  • Speech to text systems (G10L15/08 takes precedence) · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10445429B2 cover?
Systems and processes for natural language processing using vocabularies with compressed serialized tries are described in the present disclosure. In one example process, natural language input is received. The natural language input is parsed, using a vocabulary, to determine a corresponding user intent. The parsing includes using a data structure of the vocabulary to map a first word of the n…
Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 15, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).