Annotation probability distribution based on a factor graph

US9715486B2 · US · B2

Patent metadata
Publication number: US-9715486-B2
Application number: US-201414502319-A
Country: US
Kind code: B2
Filing date: Sep 30, 2014
Priority date: Aug 5, 2014
Publication date: Jul 25, 2017
Grant date: Jul 25, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In order to address annotation bias in batch annotations, obtained via crowdsourcing, on a set of comments on user posts in a social network, a system determines an annotation probability distribution based on a factor-graph model of the batch annotations. In particular, during operation the system computes the factor-graph model that represents relationships between feature vectors that represent the comments and the annotations for the comments. Note that, for a given batch of k comments, the factor-graph model may include a statistically dependent combination of statistically independent models of the interrelationships between the feature vectors and the annotations for the k comments. Then, the system calculates the annotation probability distribution based on model parameters associated with the factor-graph model, a mapping function that maps from the feature vectors to the annotations, and an indicator function that represents the annotations for the comments in the batches.
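The abstract can be made concrete with a minimal sketch. The page does not give the patent's formulas, so everything specific here is an assumption made for illustration: binary annotations, a per-comment correlation factor of the form exp(y_i * beta·x_i) (a logistic-regression-style term), and a batch-level factor exp(alpha[j]) that is switched on by an indicator of how many comments in the batch received a positive annotation. The names `batch_annotation_distribution`, `beta`, and `alpha` are hypothetical.

```python
import itertools
import math


def batch_annotation_distribution(features, beta, alpha):
    """Return {labels: probability} over all 2**k labelings of one batch.

    features: list of k feature vectors, one per comment.
    beta:     weights of the per-comment (statistically independent) models.
    alpha:    k + 1 weights of the batch factor; alpha[j] applies when
              exactly j comments in the batch are annotated 1 (assumed form).
    """
    k = len(features)
    if len(alpha) != k + 1:
        raise ValueError("alpha needs one entry per possible positive count")

    scores = {}
    for labels in itertools.product((0, 1), repeat=k):
        # Per-comment correlation factors: annotation times (beta . x).
        score = sum(
            y * sum(b * x for b, x in zip(beta, xs))
            for y, xs in zip(labels, features)
        )
        # Batch factor: the indicator selects the weight for this count,
        # which couples the otherwise independent per-comment models.
        score += alpha[sum(labels)]
        scores[labels] = score

    # Normalize with the max subtracted for numerical stability.
    top = max(scores.values())
    z = sum(math.exp(s - top) for s in scores.values())
    return {labels: math.exp(s - top) / z for labels, s in scores.items()}


# Example batch of k = 3 comments with 2-dimensional feature vectors.
example = batch_annotation_distribution(
    features=[[1.0, 0.2], [0.3, 1.5], [0.8, 0.8]],
    beta=[0.4, -0.6],
    alpha=[0.0, 0.1, -0.2, -0.5],
)
```

With every entry of `alpha` set to zero the batch factor drops out and the distribution factorizes into independent logistic models, one per comment. Non-zero `alpha` values are what let the sketch express batch-level annotation bias. Enumerating all 2**k labelings is only practical for small batches.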

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for calculating an annotation probability distribution of annotations for a set of comments, the method comprising: accessing, at a memory location, the annotations for the set of comments, wherein the comments are associated with a group of documents; using a computer processor that is coupled to the memory location and programmed to calculate the annotation probability distribution: computing a factor-graph model that represents relationships between feature vectors that represent the comments and the annotations for the comments, wherein, for a given batch of k comments, the factor-graph model includes a statistically dependent combination of statistically independent models of the interrelationships between the feature vectors and the annotations for the k comments; and calculating the annotation probability distribution based on model parameters associated with the factor-graph model, a mapping function that maps from the feature vectors to the annotations, and an indicator function that represents the annotations for the comments in the batches; and using the calculated annotation probability distribution to select one or more subset of the set of comments, wherein the one or more subset of the set of comments are further selected based on an objective function, and wherein the objective function is optimized by maximizing an expression comprising the annotation probability distribution and the objective function.

2. The method of claim 1, wherein: the statistically dependent combination includes a factor function; and the factor function includes the indicator function and a first model parameter in the model parameters.

3. The method of claim 2, wherein: a given statistically independent model includes a correlation factor function; and the correlation factor function includes the mapping function and a second model parameter in the model parameters.

4. The method of claim 3, wherein the mapping function includes a product of a representation of the annotations and the feature vectors.

5. The method of claim 4, wherein the computing involves determining the first model parameter and the second model parameter by optimizing a likelihood function that indicates how well the factor-graph model represents the annotations for the set of comments.

6. The method of claim 1, wherein the statistically independent models include logistic regression models.

7. The method of claim 1, wherein, prior to computing the factor-graph model, the method further comprises determining the feature vectors that represent the set of comments.

8. The method of claim 1, wherein, prior to computing the factor-graph model, the method further comprises selecting the given batch based on how informative expected annotations for the comments are for a classifier and a probability of occurrence of the expected annotations based on the calculated annotation probability distribution; and wherein the classifier predicts how likely the expected annotations are accurate for the comments in the given batch.

9. The method of claim 1, wherein prior to accessing the annotations, the method further comprises obtaining the annotations by: providing the set of comments to reviewers; and receiving the annotations from the reviewers.

10. An apparatus, comprising: one or more processors; memory; and a program module, wherein the program module is stored in the memory and, during operation of the apparatus, is executed by the one or more processors to calculate an annotation probability distribution of annotations for a set of comments, the program module including: instructions for accessing, at a memory location in the memory, the annotations for the set of comments, wherein the comments are associated with a group of documents; instructions for computing a factor-graph model that represents relationships between feature vectors that represent the comments and the annotations for the comments, wherein, for a given batch of k comments, the factor-graph model includes a statistically dependent combination of statistically independent models of the interrelationships between the feature vectors and the annotations for the k comments; instructions for calculating the annotation probability distribution based on model parameters associated with the factor-graph model, a mapping function that maps from the feature vectors to the annotations, and an indicator function that represents the annotations for the comments in the batches; and instructions for using the calculated annotation probability distribution to select one or more subset of the set of comments, wherein the one or more subset of the set of comments are further selected based on an objective function, and wherein the objective function is optimized by maximizing an expression comprising the annotation probability distribution and the objective function.

11. The apparatus of claim 10, wherein: the statistically dependent combination includes a factor function; and the factor function includes the indicator function and a first model parameter in the model parameters.

12. The apparatus of claim 11, wherein: a given statistically independent model includes a correlation factor function; and the correlation factor function includes the mapping function and a second model parameter in the model parameters.

13. The apparatus of claim 12, wherein the mapping function includes a product of a representation of the annotations and the feature vectors.

14. The apparatus of claim 13, wherein the computing involves determining the first model parameter and the second model parameter by optimizing a likelihood function that indicates how well the factor-graph model represents the annotations for the set of comments.

15. The apparatus of claim 10, wherein the statistically independent models include logistic regression models.

16. The apparatus of claim 10, wherein the program module further includes instructions for determining the feature vectors that represent the set of comments prior to computing the factor-graph model.

17. The apparatus of claim 10, wherein the program module further includes instructions for selecting the batches prior to computing the factor-graph model.

18. The apparatus of claim 10, wherein: the program module further includes instructions for obtaining the annotations prior to accessing the annotations, by: providing the set of comments to reviewers; and receiving the annotations from the reviewers.

19. A system, comprising: a processing module comprising a non-transitory computer readable medium storing instructions that, when executed, cause the system to: access, at a memory location, annotations for a set of comments, wherein the comments are associated with a group of documents; compute a factor-graph model that represents relationships between feature vectors that represent the comments and the annotations for the comments, wherein, for a given batch of k comments, the factor-graph model includes a statistically dependent combination of statistically independent models of the interrelationships between the feature vectors and the annotations for the k comments; calculate an annotation probability distribution based on model parameters associated with the factor-graph model, a mapping function that maps from the feature vectors to the annotations, and an indicator function that represents the annotations for the comments in the batches; and use the calculated annotation probability distri
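The selection step recited in claim 1 (choosing a subset of comments by maximizing an expression that combines the annotation probability distribution with an objective function) might look roughly like the sketch below. It assumes, for illustration only, that the expression is the expected value of the objective under the distribution and that candidate batches are enumerated exhaustively. The names `select_batch`, `distribution_fn`, and `objective_fn` are hypothetical and do not come from the patent.

```python
from itertools import combinations


def select_batch(comment_ids, k, distribution_fn, objective_fn):
    """Pick the size-k batch with the highest expected objective.

    distribution_fn(batch) -> {labels: probability} for that batch
    objective_fn(batch, labels) -> float score for one possible labeling
    Both callables are supplied by the caller (assumed interfaces).
    """
    best_batch, best_value = None, float("-inf")
    for batch in combinations(comment_ids, k):
        dist = distribution_fn(batch)
        # Expected objective: weight each labeling's score by its probability.
        value = sum(p * objective_fn(batch, labels) for labels, p in dist.items())
        if value > best_value:
            best_batch, best_value = batch, value
    return best_batch, best_value
```

Exhaustive enumeration grows combinatorially with the number of comments, so a practical system would likely need a greedy or approximate search. The sketch only shows how the distribution and the objective combine into one quantity to maximize.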

Assignees

Linkedin Corp

Inventors

Classifications

  • G06F40/169 · Primary

    Annotation, e.g. comment data or footnotes · CPC title

  • G06Q10/40 · Primary

    Business processes related to social networking or social networking services · CPC title

  • based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate · CPC title

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Indexing; Web crawling techniques · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9715486B2 cover?
In order to address annotation bias in batch annotations, obtained via crowdsourcing, on a set of comments on user posts in a social network, a system determines an annotation probability distribution based on a factor-graph model of the batch annotations. In particular, during operation the system computes the factor-graph model that represents relationships between feature vectors that repres…
Who is the assignee on this patent?
Linkedin Corp
What technology area does this patent fall under?
Primary CPC classification G06F40/169. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 25, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).