Early exit for natural language processing models

US2019266236A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2019266236-A1
Application number  US-201916411763-A
Country             US
Kind code           A1
Filing date         May 14, 2019
Priority date       May 14, 2019
Publication date    Aug 29, 2019
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure provides a natural language processing (NLP) model arranged to operate on two lexicons, where one lexicon is a sub-set of the other lexicon. The NLP model can be arranged to generate output based on the sub-set lexicon and exit processing of the NLP model, to potentially save computation cycles.
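
The control flow the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function name `early_exit_infer`, the `exit_index` parameter, the toy encoders and classifiers, and the choice of the top softmax probability as the "confidence" are all assumptions made for the example. The threshold comparison follows the wording of the first claim.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Encoder = Callable[[Vector], Vector]
Classifier = Callable[[Vector], List[float]]  # maps a hidden state to logits


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_infer(
    x: Vector,
    encoders: Sequence[Encoder],
    exit_index: int,
    sub_classifier: Classifier,
    sub_lexicon: Sequence[str],
    full_classifier: Classifier,
    full_lexicon: Sequence[str],
    threshold: float,
) -> Tuple[str, int]:
    """Return (token, number of encoders actually run)."""
    hidden = x
    # The first portion of the encoders feeds the sub-set-lexicon classifier.
    for encoder in encoders[:exit_index]:
        hidden = encoder(hidden)
    probs = softmax(sub_classifier(hidden))
    confidence = max(probs)  # assumption: confidence = top softmax probability
    if confidence >= threshold:
        # Early exit: the remaining encoders are never evaluated.
        return sub_lexicon[probs.index(confidence)], exit_index
    # Otherwise run the remaining encoders and the full-lexicon classifier.
    for encoder in encoders[exit_index:]:
        hidden = encoder(hidden)
    probs = softmax(full_classifier(hidden))
    return full_lexicon[probs.index(max(probs))], len(encoders)


# Toy stand-ins (hypothetical): each "encoder" doubles the hidden state and
# each "classifier" reads its logits straight from the hidden state.
full_lexicon = ["the", "cat", "sat", "zyzzyva"]
sub_lexicon = full_lexicon[:2]  # a sub-set of the full lexicon
encoders = [lambda h: [2.0 * v for v in h]] * 4


def run(x: Vector) -> Tuple[str, int]:
    return early_exit_infer(
        x, encoders, 2,
        lambda h: h[:2], sub_lexicon,
        lambda h: h[:4], full_lexicon,
        threshold=0.9,
    )


easy = run([2.0, 0.0, 0.0, 0.0])  # confident after the first two encoders
hard = run([0.1, 0.0, 0.0, 0.5])  # falls through to the full model
```

In this toy setup the `easy` input clears the threshold after two of the four encoders, while the `hard` input runs all four and is answered from the full lexicon. Where the exit point sits and how the threshold is chosen are tuning decisions the abstract does not specify.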

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus, comprising: a processor; and memory storing instructions and a natural language processing (NLP) inference model, the instructions when executed by the processor cause the processor to: generate, via the NLP inference model, an intermediate result and a confidence associated with the intermediate result; compare the confidence to a threshold; and based on the comparison, either: generate an output based on the intermediate result and cease computation via the NLP inference mode; or generate the output via the NLP inference model.

2. The apparatus of claim 1, the memory storing instructions, which when executed by the processor cause the processor to: determine whether the confidence is greater than or equal to the threshold; and generate the output based on the intermediate result and cease computation via the NLP inference mode based on a determination that the confidence is greater than or equal to the threshold.

3. The apparatus of claim 2, the memory storing instructions, which when executed by the processor cause the processor to generate the output via the NLP inference model based on a determination that the confidence is not greater than or equal to the threshold.

4. The apparatus of claim 1, the NLP model comprising a plurality of encoders, a first classifier associated with a first lexicon and a second classifier associated with a second lexicon, where the first lexicon is a sub-set of the second lexicon, the memory storing instructions, which when executed by the processor cause the processor to derive the intermediate result and the confidence based on a first portion of the plurality of encoders and the first classifier.

5. The apparatus of claim 4, the memory storing instructions, which when executed by the processor cause the processor to derive the output based the plurality of the encoders and the second classifier.

6. The apparatus of claim 4, the memory storing instructions, which when executed by the processor cause the processor to not process a second portion of the plurality of encoders to cease computation via the NLP inference model, where the second portion of the plurality of encoders is mutually exclusive of the first portion of the plurality of encoders.

7. The apparatus of claim 4, wherein the second lexicon comprises a vocabulary of a plurality of tokens and the first lexicon comprises a vocabulary including a sub-set of the plurality of tokens.

8. The apparatus of claim 7, wherein the vocabulary of the first lexicon is selected based in part on a statistical measurement of usage of the tokens of the vocabulary of the second lexicon.

9. The apparatus of claim 1, wherein the processor is an artificial intelligence (AI) accelerator.

10. A non-transitory computer-readable storage medium, comprising instructions that when executed by a processor, cause the processor to: generate, via an NLP inference model, an intermediate result and a confidence associated with the intermediate result; compare the confidence to a threshold; and based on the comparison, either: generate an output based on the intermediate result and cease computation via the NLP inference mode; or generate the output via the NLP inference model.

11. The non-transitory computer-readable storage medium of claim 10, comprising instructions that when executed by the processor, cause the processor to: determine whether the confidence is greater than or equal to the threshold; and generate the output based on the intermediate result and cease computation via the NLP inference mode based on a determination that the confidence is greater than or equal to the threshold.

12. The non-transitory computer-readable storage medium of claim 11, comprising instructions that when executed by the processor, cause the processor to generate the output via the NLP inference model based on a determination that the confidence is not greater than or equal to the threshold.

13. The non-transitory computer-readable storage medium of claim 10, the NLP model comprising a plurality of encoders, a first classifier associated with a first lexicon and a second classifier associated with a second lexicon, where the first lexicon is a sub-set of the second lexicon, the instructions when executed by the processor cause the processor to derive the intermediate result and the confidence based on a first portion of the plurality of encoders and the first classifier.

14. The non-transitory computer-readable storage medium of claim 13, comprising instructions that when executed by the processor, cause the processor to derive the output based the plurality of the encoders and the second classifier.

15. The non-transitory computer-readable storage medium of claim 13, comprising instructions that when executed by the processor, cause the processor to not process a second portion of the plurality of encoders to cease computation via the NLP inference model, where the second portion of the plurality of encoders is mutually exclusive of the first portion of the plurality of encoders.

16. The non-transitory computer-readable storage medium of claim 13, wherein the second lexicon comprises a vocabulary of a plurality of tokens and the first lexicon comprises a vocabulary including a sub-set of the plurality of tokens and the vocabulary of the first lexicon is selected based in part on a statistical measurement of usage of the tokens of the vocabulary of the second lexicon.

17. A computer-implemented method, comprising: generating, via an NLP inference model, an intermediate result and a confidence associated with the intermediate result; comparing the confidence to a threshold; and based on the comparison, either: generating an output based on the intermediate result and cease computation via the NLP inference mode; or generating the output via the NLP inference model.

18. The computer-implemented method of claim 17, comprising: determining whether the confidence is greater than or equal to the threshold; and generating the output based on the intermediate result and cease computation via the NLP inference mode based on a determination that the confidence is greater than or equal to the threshold.

19. The computer-implemented method of claim 18, comprising generating the output via the NLP inference model based on a determination that the confidence is not greater than or equal to the threshold.

20. The computer-implemented method of claim 17, the NLP model comprising a plurality of encoders, a first classifier associated with a first lexicon and a second classifier associated with a second lexicon, where the first lexicon is a sub-set of the second lexicon, the method comprising: deriving the intermediate result and the confidence based on a first portion of the plurality of encoders and the first classifier; and deriving the output based the plurality of the encoders and the second classifier.

21. The computer-implemented method of claim 19, comprising not processing a second portion of the plurality of encoders to cease computation via the NLP inference model, where the second portion of the plurality of encoders is mutually exclusive of the first portion of the plurality of encoders.

22. The computer-implemented method of claim 19, wherein the second lexicon comprises a vocabulary of a plurality of tokens and the first lexicon comprises a vocabulary including a sub-set of the
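
Claims 7, 8 and 16 describe the first lexicon as a sub-set of the second lexicon's tokens, chosen "based in part on a statistical measurement of usage". One plausible reading, shown here only as a sketch, is to keep the most frequently used tokens. The function name `select_sub_lexicon`, the `size` parameter, and the use of raw frequency counts over a sample corpus are assumptions for illustration; the claims do not name a specific statistic.

```python
from collections import Counter
from typing import Iterable, List


def select_sub_lexicon(
    full_lexicon: List[str], corpus_tokens: Iterable[str], size: int
) -> List[str]:
    """Pick the `size` most frequently used tokens of the full lexicon."""
    allowed = set(full_lexicon)
    # Count only tokens that belong to the full lexicon, so the result
    # is guaranteed to be a sub-set of it.
    counts = Counter(tok for tok in corpus_tokens if tok in allowed)
    return [tok for tok, _ in counts.most_common(size)]


# Hypothetical toy data.
full_lexicon = ["the", "cat", "sat", "on", "mat", "dog"]
corpus = "the cat sat on the mat the cat".split()
sub_lexicon = select_sub_lexicon(full_lexicon, corpus, size=2)
```

With this toy corpus the two most frequent tokens, "the" and "cat", form the sub-set lexicon. A smaller sub-set makes the early classifier cheaper but means fewer inputs can be answered without running the full model.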

Assignees

Intel Corp

Inventors

Classifications

  • Combinations of networks · CPC title

  • G06F40/216 · Primary

    using statistical methods · CPC title

  • Semantic analysis · CPC title

  • G06F40/284 · Primary

    Lexical analysis, e.g. tokenisation or collocates · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019266236A1 cover?
The disclosure provides a natural language processing (NLP) model arranged to operate on two lexicons, where one lexicon is a sub-set of the other lexicon. The NLP model can be arranged to generate output based on the sub-set lexicon and exit processing of the NLP model, to potentially save computation cycles.
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F40/216. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 29, 2019 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).