Joint algorithm for sampling and optimization and natural language processing applications of same

US9753893B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9753893-B2
Application numberUS-201213525513-A
CountryUS
Kind codeB2
Filing dateJun 18, 2012
Priority dateJun 18, 2012
Publication dateSep 5, 2017
Grant dateSep 5, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In rejection sampling of a function or distribution p over a space X, a proposal distribution q (n) is refined responsive to rejection of a sample x*εX to generate a refined proposal distribution q (n+1) selected to satisfy the criteria p(x)≦q (n+1) (x)≦q (n) (x) and q (n+1) (x*)<q (n) (x*). In a sampling mode, the sample x* is obtained by random sampling of the space X, the rejection sampling accepts or rejects x* based on comparison of a ratio p(x*)/q(x*) with a random draw, and the refined proposal distribution q (n+1) is selected to minimize a norm ∥q (n+1) ∥ α where α<∞. In an optimization mode, the sample x* is obtained such that q*=q (n) (x*) maximizes q (n) over the space X, the rejection sampling accepts or rejects x* based on a difference between or ratio of q* and p(x*), and the refined proposal distribution q (n+1) is selected to minimize a norm ∥q (n+1) ∥ ∞ =max{q (n+1) (x)}.

First claim

Opening claim text (preview).

The invention claimed is: 1. A non-transitory storage medium storing instructions executable by an electronic data processing device to perform a method comprising: performing rejection sampling to acquire at least one accepted sample of a function or distribution p over a space X; during the rejection sampling, refining a proposal distribution q (n) used in the rejection sampling responsive to rejection of a sample x*εX obtained in a current iteration of the rejection sampling to generate a refined proposal distribution q (n+1) for use in a next iteration of the rejection sampling wherein the refined proposal distribution q (n+1) is selected to satisfy the criteria: p(x)≦q (n+1) (x)≦q (n) (x) for all xεX; and q (n+1) (x*)<q (n) (x*) wherein the method performed by executing the stored instructions further includes: initializing the proposal distribution q (0) =G for the first iteration of the rejection sampling where G denotes a probabilistic context-free grammar and the space X is a space of natural language derivations; and generating the function or distribution p as an intersection of G and a weighted finite state automaton (WFSA) representing a language model so that the performed rejection sampling acquires at least one accepted sample of a weighted context-free grammar comprising the intersection of the probabilistic context-free grammar G and the WFSA representing the language model. 2. The non-transitory storage medium of claim 1 , wherein: the rejection sampling obtains the sample x* by random sampling of the space X; the rejection sampling accepts or rejects x* based on comparison of a ratio p(x*)/q(x*) with a random draw; and the refined proposal distribution q (n+1) is selected to satisfy the criteria: p(x)≦q (n+1) (x)≦q (n) (x) for all xεX; q (n+1) (x*)<q (n) (x*); and a norm ∥q (n+1) ∥ α is minimized where α<∞. 3. The non-transitory storage medium of claim 2 , wherein the rejection sampling acquires a plurality of accepted samples and terminates when a cumulative acceptance rate of the rejection sampling exceeds a threshold. 4. The non-transitory storage medium of claim 2 , where α=1. 5. The non-transitory storage medium of claim 1 , wherein: the rejection sampling obtains the sample x* such that q*=q (n) (x*) maximizes q (n) over the space X; the rejection sampling accepts or rejects x* based on a difference between or ratio of q* and p(x*); and the refined proposal distribution q (n+1) is selected to satisfy the criteria: p(x)≦q (n+1) (x)≦q (n) (x) for all xεX; q (n+1) (x*)<q (n) (x*); and a norm ∥q (n+1) ∥ ∞ =max{q (n+1) (x)} is minimized. 6. The non-transitory storage medium of claim 5 , wherein the rejection sampling acquires a single accepted sample. 7. The non-transitory storage medium of claim 6 , wherein the rejection sampling outputs at least one of: the single accepted sample x accepted and a maximum value p max for the distribution p over the space X where p(x accepted )≦p max ≦q(x accepted ). 8. The non-transitory storage medium of claim 6 , wherein the rejection sampling outputs: a maximum value p max for the distribution p over the space X where p(x accepted )≦P max ≦q(x accepted ) where x accepted is the single accepted sample; and an error metric for the maximum value p max computed based on p(x accepted ) and q(x accepted ). 9. The non-transitory storage medium of claim 1 , wherein the non-transitory storage medium stores instructions executable by an electronic data processing device to perform said rejection sampling in one of two selectable modes including: (1) a sampling mode in which the rejection sampling obtains the sample x* by random sampling of the space X and accepts or rejects x* based on comparison of a ratio p(x*)/q(x*) with a random draw; and (2) an optimization mode in which the rejection sampling obtains the sample x* such that q*=q (n) (x*) maximizes q (n) over the space X and the rejection sampling accepts or rejects x* based on a difference between or ratio of q* and p(x*). 10. The non-transitory storage medium of claim 9 , wherein: (1) in the sampling mode the refined proposal distribution q (n+1) is selected to minimize a norm ∥q (n+1) ∥ α where α<∞; and (2) in the optimization mode the refined proposal distribution q (n+1) is selected to minimize the norm ∥q (n+1) ∥ ∞ . 11. The non-transitory storage medium of claim 10 , where α=1. 12. The non-transitory storage medium of claim 9 , wherein: in the sampling mode the rejection sampling acquires a plurality of accepted samples and terminates when a cumulative acceptance rate of the rejection sampling exceeds a threshold; and in the optimization mode the rejection sampling acquires a single accepted sample. 13. The non-transitory storage medium of claim 1 , wherein X is a multidimensional space. 14. The non-transitory storage medium of claim 1 , wherein the refined proposal distribution q (n+1) is selected to satisfy the criteria: p(x)≦q (n+1) (x)≦q (n) (x) for all xεX; q (n+1) (x*)<q (n) (x*); and a norm ∥q (n+1) ∥ α is minimized. 15. An apparatus comprising: a non-transitory storage medium as set forth in claim 1 ; and an electronic data processing device configured to execute instructions stored on the non-transitory storage medium. 16. A method comprising: using an electronic data processing device, performing a current iteration (n) of rejection sampling including: obtaining a sample x* from a space X of natural language derivations; accepting or rejecting the sample x* based on comparison of q (n) (x*) and p(x*) where q (n) is a proposal distribution over the space X for current iteration (n) of the rejection sampling and p is a function or distribution over the space X which is an intersection of a probabilistic context-free grammar G and a weighted finite state automaton (WFSA) representing a language model; and refining the proposal distribution q (n) to generate a refined proposal distribution q (n+1) over the space X satisfying the criteria: p(x)≦q (n+1) (x)≦q (n) (x) for all xεX; q (n+1) (x*)<q (n) (x*); and a norm ∥q (n+1) ∥ α is minimized; wherein the method further comprises initializing the proposal distribution as q (0) =G for the first iteration of the rejection sampling.

Assignees

Inventors

Classifications

  • Phrasal analysis, e.g. finite state techniques or chunking · CPC title

  • G06F17/18Primary

    for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title

  • using statistical methods · CPC title

  • Physics · mapped topic

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9753893B2 cover?
In rejection sampling of a function or distribution p over a space X, a proposal distribution q (n) is refined responsive to rejection of a sample x*εX to generate a refined proposal distribution q (n+1) selected to satisfy the criteria p(x)≦q (n+1) (x)≦q (n) (x) and q (n+1) (x*)<q (n) (x*). In a sampling mode, the sample x* is obtained by random sampling of the space X, the rejection samplin…
Who is the assignee on this patent?
Dymetman Marc, Bouchard Guillaume, Xerox Corp
What technology area does this patent fall under?
Primary CPC classification G06F17/18. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Sep 05 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).