System and method for crowd-sourced data labeling

US9536517B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9536517-B2
Application number: US-201113300087-A
Country: US
Kind code: B2
Filing date: Nov 18, 2011
Priority date: Nov 18, 2011
Publication date: Jan 3, 2017
Grant date: Jan 3, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and computer-readable storage devices for crowd-sourced data labeling. The system requests a respective response from each of a set of entities. The set of entities includes crowd workers. Next, the system incrementally receives a number of responses from the set of entities until one of an accuracy threshold is reached and m responses are received, wherein the accuracy threshold is based on characteristics of the number of responses. Finally, the system generates an output response based on the number of responses.
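
The stopping rule in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the patented implementation: here the accuracy threshold is simplified to "n identical responses", and the names `collect_until_agreement`, `normalize`, and `LabelResult` are hypothetical. The whitespace and case normalization and the plurality fallback used when the maximum is reached without agreement are also assumptions that do not come from the patent text.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class LabelResult(NamedTuple):
    output: Optional[str]  # chosen label, or None if nothing was received
    received: int          # number of responses consumed before stopping
    agreed: bool           # True if n_matching identical responses were seen


def normalize(text: str) -> str:
    """Hypothetical normalization: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def collect_until_agreement(
    responses: Iterable[str], n_matching: int, max_responses: int
) -> LabelResult:
    """Consume responses one at a time until n_matching of them agree
    or max_responses have been received, whichever happens first."""
    if not 1 <= n_matching <= max_responses:
        raise ValueError("need 1 <= n_matching <= max_responses")

    counts: Counter = Counter()
    received = 0
    for response in responses:
        received += 1
        label = normalize(response)
        counts[label] += 1
        if counts[label] >= n_matching:
            # Simplified accuracy threshold reached: stop requesting more.
            return LabelResult(label, received, True)
        if received >= max_responses:
            break

    if not counts:
        return LabelResult(None, 0, False)
    # Assumed fallback (not from the patent): return the most frequent label.
    return LabelResult(counts.most_common(1)[0][0], received, False)
```

Because responses are consumed lazily, passing an iterator means no further responses are pulled once agreement is reached. For example, with the responses "call home", "call Rome", "Call  home", "call home", n_matching=2 and max_responses=5, the function stops after the third response and returns "call home".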

First claim

Opening claim text (preview).

We claim:

1. A method comprising: requesting, via a processor, a respective transcription response associated with transcribing input speech, the respective transcription response being from each of a plurality of networked entities, wherein at least one entity of the plurality of networked entities comprises an automatic labeler that transcribes the input speech as its respective transcription response and at least one entity of the plurality of networked entities comprises a human crowd worker who transcribes the input speech as its respective transcription response; receiving, from each of the plurality of networked entities, the respective transcription response and, for a transcription response received from the human crowd worker, a number of times the respective human crowd worker listened to the input speech to provide the respective transcription response; determining a maximum number of transcription responses to receive from the plurality of networked entities; calculating, via the processor and using a regression model, an accuracy threshold for the transcription responses, wherein the accuracy threshold: (1) requires a number of matching responses among the transcription responses; (2) is based on a time of day when each of the transcription responses is received; and (3) is based on a number of times the respective human crowd worker listened to the input speech; incrementally receiving the transcription responses from the plurality of networked entities until one of the accuracy threshold is reached and the maximum number of transcription responses is received; and generating, via the processor, an output response to the input speech from the number of matching transcription responses, wherein the output response is a recognition candidate or a transcription of the input speech.

2. The method of claim 1, wherein two of the transcription responses are automatic speech recognition output.

3. The method of claim 1, further comprising training an automatic speech recognition engine using the output response.

4. The method of claim 1, wherein the accuracy threshold is further based on one of a content, a size, a label, a duration, a location of a plurality of workers associated with the plurality of networked entities, an identity of the plurality of workers, an attribute, a confidence score, a difficulty, and a diversity.

5. The method of claim 1, wherein the accuracy threshold is further based on a probability of correctness.

6. The method of claim 3, wherein the accuracy threshold comprises n matching responses, and wherein n is one of less than the maximum number of transcription responses and equal to the maximum number of transcription responses.

7. A system comprising: a processor configured to perform automatic speech recognition; and a computer-readable storage device having instructions stored which, when executed by the processor, cause the processor to perform operations comprising: requesting a respective transcription response associated with transcribing input speech, the respective transcription response being from each of a plurality of networked entities, wherein at least one entity of the plurality of networked entities comprises an automatic labeler that transcribes the input speech as its respective transcription response and at least one entity of the plurality of networked entities comprises a human crowd worker who transcribes the input speech as its respective transcription response; receiving, from each of the plurality of networked entities, the respective transcription response, and, for a transcription response received from the human crowd worker, a number of times the respective human crowd worker listened to the input speech to provide the respective transcription response; determining a maximum number of transcription responses to receive from the plurality of networked entities; calculating, via the processor and using a regression model, an accuracy threshold for the transcription responses, wherein the accuracy threshold: (1) requires a number of matching responses among the transcription responses; (2) is based on a time of day when each of the transcription responses is received; and (3) is based on a number of times the respective human crowd worker listened to the input speech; incrementally receiving the transcription responses from the plurality of networked entities until one of the accuracy threshold is reached and the maximum number of transcription responses is received, and generating, via the processor, an output response to the input speech from the number of matching transcription responses, wherein the output response is a recognition candidate or a transcription of the input speech.

8. The system of claim 7, wherein the output response is used to train a video analysis algorithm.

9. The system of claim 7, wherein the maximum number of transcription responses is based on a difficulty associated with the transcription of the utterance.

10. A computer-readable storage device having instructions stored which, when executed by a computing device configured to perform automatic speech recognition, cause the computing device to perform operations comprising: requesting, a respective transcription response associated with transcribing input speech, the respective transcription response being from each of a plurality of networked entities, wherein at least one entity of the plurality of networked entities comprises an automatic labeler that transcribes the input speech as its respective transcription response and at least one entity of the plurality of networked entities comprises a human crowd worker who transcribes the input speech as its respective transcription response; receiving, from each of the plurality of networked entities, the respective transcription response, and, for a transcription response received from the human crowd worker, a number of times the respective human crowd worker listened to the input speech to provide the respective transcription response; determining a maximum number of transcription responses to receive from the plurality of networked entities; calculating, via the processor and using a regression model, an accuracy threshold for the transcription responses, wherein the accuracy threshold: (1) requires a number of matching responses among the transcription responses; (2) is based on a time of day when each of the transcription responses is received; and (3) is based on a number of times the respective human crowd worker listened to the input speech; incrementally receiving the transcription responses from the plurality of networked entities until one of the accuracy threshold is reached and the maximum number of transcription responses is received, and generating, via the processor, an output response to the input speech from the number of matching transcription responses, wherein the output response is a recognition candidate or a transcription of the input speech.

11. The computer-readable storage device of claim 10, wherein the output response is used to train a machine translation system.

12. The computer-readable storage device of claim 10, wherein the accuracy threshold comprises n matching responses, and wherein n is one of less than the maximum number of transcription responses and equal to the maximum number of transcription responses.
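
The claims describe an accuracy threshold calculated with a regression model from the number of matching responses, the time of day, and the number of times a human worker listened to the audio. One possible reading is sketched below as a logistic model that estimates a probability of correctness (compare claim 5) and declares the threshold reached once that estimate passes a target. Everything specific here is an assumption made for illustration: the logistic form, the coefficient values and their signs, the "night" feature, the 0.9 target, and the names `ResponseFeatures`, `probability_correct`, and `threshold_reached`. None of these come from the patent, and a real system would fit the coefficients on labeled data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseFeatures:
    matching: int     # responses agreeing with the leading transcription
    hour_of_day: int  # 0-23, hour at which the latest response arrived
    listens: int      # times the human worker played the input speech


# Hypothetical coefficients chosen only to make the example concrete.
WEIGHTS = {"bias": -2.0, "matching": 1.5, "night": -0.5, "extra_listen": -0.3}


def probability_correct(f: ResponseFeatures) -> float:
    """Logistic estimate that the leading transcription is correct."""
    # Assumed feature: responses late at night are treated as less reliable.
    night = 1.0 if f.hour_of_day < 6 or f.hour_of_day >= 22 else 0.0
    # Assumed feature: each replay beyond the first suggests harder audio.
    extra_listens = max(f.listens - 1, 0)
    z = (
        WEIGHTS["bias"]
        + WEIGHTS["matching"] * f.matching
        + WEIGHTS["night"] * night
        + WEIGHTS["extra_listen"] * extra_listens
    )
    return 1.0 / (1.0 + math.exp(-z))


def threshold_reached(f: ResponseFeatures, target: float = 0.9) -> bool:
    """True when the estimated probability of correctness meets the target."""
    return probability_correct(f) >= target
```

Under these made-up coefficients, three matching responses received mid-afternoon after a single listen clear the 0.9 target, while the same three matches received at 23:00 after three listens do not, so the system would keep requesting responses until the maximum number is received.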

Classifications

  • using metadata automatically derived from the content · CPC title

  • Interactive procedures · CPC title

  • Speech to text systems (G10L15/08 takes precedence) · CPC title

  • G10L15/063 (Primary)

    Training · CPC title

  • G06N20/00 (Primary)

    Machine learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9536517B2 cover?
Systems, methods, and computer-readable storage devices for crowd-sourced data labeling. The system requests a respective response from each of a set of entities. The set of entities includes crowd workers. Next, the system incrementally receives a number of responses from the set of entities until one of an accuracy threshold is reached and m responses are received, wherein the accuracy thresh…
Who is the assignee on this patent?
Williams Jason, Alonso Tirso, Hollister Barbara B, and 2 more
What technology area does this patent fall under?
Primary CPC classification G10L15/063. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 3, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).