Filtering content in an online system based on text and image signals extracted from the content

US9824313B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9824313-B2
Application number	US-201514702363-A
Country	US
Kind code	B2
Filing date	May 1, 2015
Priority date	May 1, 2015
Publication date	Nov 21, 2017
Grant date	Nov 21, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure relates (a) a method and computer program product for training a content classifier and (b) a method and computer program product for using the trained content classifier to determine compliance of content items with a content policy of an online system. A content classifier is trained using two training sets, one containing NSFW content items and the other containing SFW content items. Content signals are extracted from each content item and used by the classifier to output a decision, which is compared against its known classification. Parameters used in the classifier are adjusted iteratively to improve accuracy of classification. The trained classifier is then used to classify content items with unknown classifications. Appropriate action is taken for each content item responsive to its classification. In alternative embodiments, multiple classifiers are implemented as part of a two-tier classification system, with text and image content classified separately.
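The training step described in the abstract can be illustrated with a small sketch. It assumes a multinomial Naive Bayes model with add-one smoothing over bag-of-words counts; the claims name Naive-Bayes as one classifier option, but the abstract does not fix a model, and the iterative parameter adjustment it mentions is not modeled here. The function names (`bag_of_words`, `train_naive_bayes`, `classify`) and the sample texts are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens (order dropped, multiplicity kept)."""
    return Counter(text.lower().split())


def train_naive_bayes(nsfw_texts, sfw_texts):
    """Collect per-class word counts and priors from the two labeled training sets."""
    model = {"counts": {}, "totals": {}, "priors": {}, "vocab": set()}
    n_docs = len(nsfw_texts) + len(sfw_texts)
    for label, texts in (("NSFW", nsfw_texts), ("SFW", sfw_texts)):
        counts = Counter()
        for text in texts:
            counts.update(bag_of_words(text))
        model["counts"][label] = counts
        model["totals"][label] = sum(counts.values())
        model["priors"][label] = len(texts) / n_docs
        model["vocab"].update(counts)
    return model


def classify(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    vocab_size = len(model["vocab"])
    scores = {}
    for label, counts in model["counts"].items():
        score = math.log(model["priors"][label])
        for word, n in bag_of_words(text).items():
            p = (counts[word] + 1) / (model["totals"][label] + vocab_size)
            score += n * math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)


# Tiny made-up training sets, for illustration only.
nsfw_examples = ["explicit adult content", "adult explicit photos"]
sfw_examples = ["family recipe blog", "travel photos blog"]
model = train_naive_bayes(nsfw_examples, sfw_examples)
```

With these toy sets, `classify(model, "explicit adult")` picks the NSFW label and `classify(model, "family blog")` picks the SFW label. A production system would need far larger training sets and a held-out evaluation set.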

First claim

Opening claim text (preview).

What is claimed is:

1. A method for determining compliance of content with a content policy of the online system, the method comprising: receiving a content item comprising text and one or more images; extracting a plurality of text signals from the text; extracting a plurality of image signals from the one or more images; inputting the plurality of text signals and the plurality of image signals into a two-tier classifier system by inputting the plurality of text signals into a text classifier model of a first tier of the two-tier classifier system, inputting the plurality of image signals into an image classifier model of the first tier of the two tier-classifier system, and inputting output classifications of the text classifier model and of the image classifier model into a second-tier classifier model, the second-tier classifier model outputting a confidence value expressing a likelihood of compliance with a content policy of an online system; determining, based on the confidence value, a compliance classification of the content item.

2. The method of claim 1, wherein the text signals are extracted according to a “bag of words” methodology, discarding order and grammar but retaining word multiplicity.

3. The method of claim 1, wherein extracting the plurality of text signals comprises: detecting commonly appearing word pairs; and treating each word pair as a single text signal.

4. The method of claim 1, wherein the plurality of image signals comprises a quantitative measure of a portion of the one or more images having a color within a predefined range that is consistent with skin tones.

5. The method of claim 1, wherein the plurality of image signals comprises an indication that the one or more images contains a face.

6. The method of claim 1, wherein the plurality of image signals comprises text contained within the image.

7. The method of claim 1, wherein the classifier implemented is a Naïve-Bayes classifier.

8. The method of claim 1, further comprising: responsive to the compliance classification determined for the content item indicating a likelihood of violating the content policy, performing at least one remedial action of: blocking the content item from being transmitted or displayed to a user, passing the content item to a human controller for manual review, and marking the content item with a tag.

9. A computer program product for determining compliance of content with a content policy, the computer program product comprising a non-transitory computer-readable storage medium containing computer program code for: receiving a content item comprising text and one or more images; extracting a plurality of text signals from the text; extracting a plurality of image signals from the one or more images; inputting the plurality of text signals and the plurality of image signals into a two-tier classifier system by inputting the plurality of text signals into a text classifier model of a first tier of the two-tier classifier system, inputting the plurality of image signals into an image classifier model of the first tier of the two tier-classifier system, and inputting output classifications of the text classifier model and of the image classifier model into a second-tier classifier model, the second-tier classifier outputting a confidence value expressing likelihood of compliance with a content policy of an online system; comparing the confidence value against a pre-defined threshold value; and based on the comparison, assigning a compliance classification to the content item.

10. The computer program product of claim 9, further comprising: extracting the plurality of text signals according to a “bag of words” methodology, discarding word order and grammar but retaining word multiplicity.

11. The computer program product of claim 9, further comprising: detecting commonly appearing word pairs; and treating each word pair as a single text signal.

12. The computer program product of claim 9, wherein the plurality of image signals comprises a quantitative measure of a portion of the one or more images having a color within a predefined range that is consistent with skin tones.

13. The computer program product of claim 9, wherein the plurality of image signals comprises an indication that the one or more images contains a face.

14. The computer program product of claim 9, wherein the plurality of image signals comprises text contained within the image.

15. The computer program product of claim 9, wherein the classifier implemented is a Naïve-Bayes classifier.

16. The computer program product of claim 9, further comprising: responsive to the compliance classification determined for the content item indicating a likelihood of violating the content policy, performing at least one remedial action of: blocking the content item from being transmitted or displayed to a user, passing the content item to a human controller for manual review, and marking the content item with a tag.
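The flow recited in the independent claims (text and image signals, two first-tier models, a second-tier model that outputs a confidence value, then a threshold comparison) might be sketched as follows. This is only an illustration under stated assumptions: the RGB skin-tone range, the pre-detected set of common word pairs, the default threshold of 0.5, and the three toy models are all hypothetical and are not taken from the patent.

```python
from collections import Counter
from typing import Dict, Sequence, Set, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0-255


def text_signals(text: str, common_pairs: Set[Tuple[str, str]]) -> Counter:
    """Bag of words in which each known word pair counts as a single signal.

    The set of common pairs is assumed to have been detected beforehand.
    """
    words = text.lower().split()
    signals = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in common_pairs:
            signals.append(words[i] + "_" + words[i + 1])
            i += 2
        else:
            signals.append(words[i])
            i += 1
    return Counter(signals)


def image_signals(pixels: Sequence[Pixel]) -> Dict[str, float]:
    """Share of pixels inside an assumed, purely illustrative skin-tone RGB range."""
    if not pixels:
        return {"skin_fraction": 0.0}
    hits = sum(1 for r, g, b in pixels
               if r > 95 and g > 40 and b > 20 and r > g > b)
    return {"skin_fraction": hits / len(pixels)}


def classify_item(text, pixels, text_model, image_model, second_tier,
                  common_pairs, threshold=0.5):
    """Run both first-tier models, combine their outputs, apply the threshold."""
    text_out = text_model(text_signals(text, common_pairs))
    image_out = image_model(image_signals(pixels))
    confidence = second_tier(text_out, image_out)  # likelihood of compliance
    label = "compliant" if confidence >= threshold else "non-compliant"
    return label, confidence


# Toy stand-ins for trained models (hypothetical, for illustration only).
def toy_text_model(signals):
    return 0.1 if signals["adult_content"] else 0.9


def toy_image_model(signals):
    return 1.0 - signals["skin_fraction"]


def toy_second_tier(text_out, image_out):
    return (text_out + image_out) / 2


pairs = {("adult", "content")}
label, confidence = classify_item(
    "Adult content here", [(200, 150, 120)] * 4,
    toy_text_model, toy_image_model, toy_second_tier, pairs)
```

With the toy models above, this item is labeled "non-compliant", at which point a caller could apply one of the remedial actions listed in claims 8 and 16 (block, route to manual review, or tag). Passing the first-tier and second-tier models in as plain callables keeps the sketch independent of any particular classifier type.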

Assignees

Flipboard Inc

Inventors

Griesmeyer Robert

Classifications

G06N7/01 · Primary
Probabilistic graphical models, e.g. probabilistic networks · CPC title
G06N99/005
Physics · mapped topic
G06F17/30
Physics · mapped topic
G06N7/005 · Primary
Physics · mapped topic
G06F16/45
Clustering; Classification · CPC title

Patent family

Related publications grouped by family.

View patent family 57205577

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9824313B2 cover?: The disclosure relates (a) a method and computer program product for training a content classifier and (b) a method and computer program product for using the trained content classifier to determine compliance of content items with a content policy of an online system. A content classifier is trained using two training sets, one containing NSFW content items and the other containing SFW content…
Who is the assignee on this patent?: Flipboard Inc
What technology area does this patent fall under?: Primary CPC classification G06N7/01. Mapped technology areas include Physics.
When was this patent published?: Publication date Nov 21, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Systems and methods to identify objectionable content

Systems and methods for inferring gender by fusion of multimodal content

Assessing legibility of images

Methods and apparatus to generate a tag for media content

Filtering hidden data embedded in media files

Harmless frame filter, harmful image blocking apparatus having the same, and method for filtering harmless frames

Apparatus and method for extracting skin area to block harmful content image
