Annotation task instruction generation

US11132500B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11132500-B2
Application number  US-201916528293-A
Country             US
Kind code           B2
Filing date         Jul 31, 2019
Priority date       Jul 31, 2019
Publication date    Sep 28, 2021
Grant date          Sep 28, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment provides a method, including: receiving, from a client, (i) a task of annotating information, (ii) a set of instructions for performing the task, and (iii) client annotations for a subset of the information within the task; assigning the subset to a plurality of annotators; obtaining (i) annotator annotations for the subset and (ii) a response time for providing the annotator annotation for each piece of information within the subset; identifying improvements to the set of instructions by (i) comparing the annotator annotations to the client annotations and (ii) identifying discrepancies made by the annotators in view of the response time; and generating a new set of instructions, wherein the generating comprises (i) identifying at least one feature of the information that distinguishes correctly annotated information from incorrectly annotated information and (ii) generating an instruction from the at least one feature.
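The discrepancy step in the abstract can be illustrated with a minimal Python sketch. It assumes the rule stated in claim 1, where an annotation that exceeds a threshold reaction time is treated as a discrepancy, alongside annotations that differ from the client annotation. The `Response` record, the `find_discrepancies` function, the 30-second threshold, and the sample labels are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One annotator's answer for one item (hypothetical record layout)."""
    annotator: str
    item_id: str
    label: str
    seconds: float  # time the annotator took to provide the annotation


def find_discrepancies(responses, client_labels, threshold_seconds):
    """Flag responses that differ from the client label or exceed the threshold."""
    flagged = []
    for r in responses:
        mismatch = r.label != client_labels[r.item_id]
        too_slow = r.seconds > threshold_seconds
        if mismatch or too_slow:
            # Report a mismatch first; otherwise the response was only slow.
            flagged.append((r, "mismatch" if mismatch else "slow"))
    return flagged


# Illustrative data only: client annotations for the subset, plus responses.
client_labels = {"doc1": "PERSON", "doc2": "ORG"}
responses = [
    Response("a1", "doc1", "PERSON", 4.0),
    Response("a2", "doc1", "ORG", 5.0),
    Response("a1", "doc2", "ORG", 31.0),
]

flagged = find_discrepancies(responses, client_labels, threshold_seconds=30.0)
for r, reason in flagged:
    print(r.annotator, r.item_id, reason)
```

In this sketch a slow but correct answer is still flagged, on the assumption that hesitation may point to an unclear instruction even when the final label matches.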

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving, from a client, (i) a task of annotating information, (ii) a set of instructions for performing the task, and (iii) client annotations for a subset of the information within the task, wherein the client annotations identify correct annotations for the subset of information; assigning the subset to a plurality of annotators; obtaining, from each of the plurality of annotators, (i) annotator annotations for the subset and (ii) a response time for providing the annotator annotation for each piece of information within the subset; identifying improvements to the set of instructions by (i) comparing the annotator annotations to the client annotations and (ii) identifying, from the comparing, discrepancies between the annotations made by the annotators and the client annotations for a piece of information within the subset, wherein the identifying discrepancies is performed in view of the response time as compared to a threshold reaction time, wherein an annotation made by an annotator exceeding the threshold reaction time is identified as a discrepancy; and generating, based on the improvements, a new set of instructions for the task, wherein the generating comprises (i) identifying at least one feature of the information that distinguishes correctly annotated information from incorrectly annotated information and (ii) generating an instruction from the at least one feature, wherein the identifying at least one feature comprises utilizing a feature extraction technique, based upon an entity type of the information within the task, on both the correctly annotated information and the incorrectly annotated information.

2. The method of claim 1, wherein the identifying discrepancies comprises classifying the annotator annotations into one of (i) positive samples comprising annotator annotations matching the client annotations and (ii) negative samples comprising annotator annotations not matching the client annotations.

3. The method of claim 2, wherein the identifying at least one feature comprises performing feature extraction on (i) entities included in the positive samples and (ii) entities included in the negative samples, thereby identifying features correlating with the positive samples.

4. The method of claim 3, wherein the identifying at least one feature comprises ranking a feature having a greater correlation to the positive samples higher than a feature having less correlation to the positive samples.

5. The method of claim 1, wherein the identifying at least one feature comprises using a feature extraction technique on the discrepancies.

6. The method of claim 5, wherein the feature extraction technique utilized is based upon a domain of the information.

7. The method of claim 1, wherein the generating comprises ranking the at least one feature; and wherein the generating an instruction comprises generating an instruction for features having a ranking above a predetermined ranking.

8. The method of claim 1, wherein the new set of instructions comprises an example of an incorrect annotation.

9. The method of claim 1, comprising providing the new set of instructions with the task to a plurality of annotators.

10. An apparatus, comprising: at least one processor; and a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising: computer readable program code configured to receive, from a client, (i) a task of annotating information, (ii) a set of instructions for performing the task, and (iii) client annotations for a subset of the information within the task, wherein the client annotations identify correct annotations for the subset of information; computer readable program code configured to assign the subset to a plurality of annotators; computer readable program code configured to obtain, from each of the plurality of annotators, (i) annotator annotations for the subset and (ii) a response time for providing the annotator annotation for each piece of information within the subset; computer readable program code configured to identify improvements to the set of instructions by (i) comparing the annotator annotations to the client annotations and (ii) identifying, from the comparing, discrepancies between the annotations made by the annotators and the client annotations for a piece of information within the subset, wherein the identifying discrepancies is performed in view of the response time as compared to a threshold reaction time, wherein an annotation made by an annotator exceeding the threshold reaction time is identified as a discrepancy; and computer readable program code configured to generate, based on the improvements, a new set of instructions for the task, wherein the generating comprises (i) identifying at least one feature of the information that distinguishes correctly annotated information from incorrectly annotated information and (ii) generating an instruction from the at least one feature, wherein the identifying at least one feature comprises utilizing a feature extraction technique, based upon an entity type of the information within the task, on both the correctly annotated information and the incorrectly annotated information.

11. A computer program product, comprising: a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code executable by a processor and comprising: computer readable program code configured to receive, from a client, (i) a task of annotating information, (ii) a set of instructions for performing the task, and (iii) client annotations for a subset of the information within the task, wherein the client annotations identify correct annotations for the subset of information; computer readable program code configured to assign the subset to a plurality of annotators; computer readable program code configured to obtain, from each of the plurality of annotators, (i) annotator annotations for the subset and (ii) a response time for providing the annotator annotation for each piece of information within the subset; computer readable program code configured to identify improvements to the set of instructions by (i) comparing the annotator annotations to the client annotations and (ii) identifying, from the comparing, discrepancies between the annotations made by the annotators and the client annotations for a piece of information within the subset, wherein the identifying discrepancies is performed in view of the response time as compared to a threshold reaction time, wherein an annotation made by an annotator exceeding the threshold reaction time is identified as a discrepancy; and computer readable program code configured to generate, based on the improvements, a new set of instructions for the task, wherein the generating comprises (i) identifying at least one feature of the information that distinguishes correctly annotated information from incorrectly annotated information and (ii) generating an instruction from the at least one feature, wherein the identifying at least one feature comprises utilizing a feature extraction technique, based upon an entity type of the information within the task, on both the correctly annotated information and the incorrectly annotated information.

12. The computer program product of claim 11, wherein the identifying discrepancies comprises classifying the annotator annotations into one of (i) positive samples comprising annotator annotations matching the client annotations and (ii) negative samples comprising annotator annotations
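
The positive/negative split, feature ranking, and instruction generation described in claims 2 to 4 and claim 7 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the claims do not say how correlation is measured, so the sketch substitutes a simple difference in occurrence rates between positive and negative samples. The function names, the feature extractors, the instruction wording, and the sample texts are all hypothetical.

```python
def rank_features(samples, extractors):
    """Rank features by how much more often they occur in positive samples.

    samples: list of (text, is_positive) pairs, where is_positive means the
        annotator annotation matched the client annotation.
    extractors: dict mapping a feature name to a function text -> bool.
    The rate difference is an assumed stand-in for "correlation".
    """
    positives = [text for text, ok in samples if ok]
    negatives = [text for text, ok in samples if not ok]
    scores = {}
    for name, has_feature in extractors.items():
        pos_rate = sum(has_feature(t) for t in positives) / max(len(positives), 1)
        neg_rate = sum(has_feature(t) for t in negatives) / max(len(negatives), 1)
        scores[name] = pos_rate - neg_rate
    # Highest score first: features most associated with correct annotations.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def generate_instructions(ranked, top_k):
    """Turn the top_k ranked features into instruction sentences."""
    return [f"Pay attention to whether the item {name}." for name, _ in ranked[:top_k]]


# Hypothetical feature extractors for short text entities.
extractors = {
    "starts with a capital letter": lambda t: t[:1].isupper(),
    "contains a digit": lambda t: any(c.isdigit() for c in t),
}
# Illustrative samples: (entity text, annotator matched the client annotation).
samples = [("Alice", True), ("Bob", True), ("acme corp", False), ("Unit 7", False)]

ranked = rank_features(samples, extractors)
instructions = generate_instructions(ranked, top_k=1)
print(instructions)
```

Here `top_k` plays the role of the "predetermined ranking" in claim 7: only features ranked above that cutoff produce an instruction.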

Assignees

Inventors

Classifications

  • Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns · CPC title

  • Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title

  • of extracted features · CPC title

  • Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title

  • of extracted features · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11132500B2 cover?
One embodiment provides a method, including: receiving, from a client, (i) a task of annotating information, (ii) a set of instructions for performing the task, and (iii) client annotations for a subset of the information within the task; assigning the subset to a plurality of annotators; obtaining (i) annotator annotations for the subset and (ii) a response time for providing the annotator ann…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/169. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 28, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).