What technology area does this patent fall under?

Primary CPC classification G06F40/284. Mapped technology areas include Physics.

When was this patent published?

Publication date: Apr 4, 2019 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Predicting future trending topics

US2019102374A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2019102374-A1
Application number	US-201715723095-A
Country	US
Kind code	A1
Filing date	Oct 2, 2017
Priority date	Oct 2, 2017
Publication date	Apr 4, 2019
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A prediction system can predict future trending topics. The prediction system can classify social media posts by region and vertical, extract text from the posts, tokenize the extracted text, and organize the tokens into n-grams. The prediction system can store the n-grams from the posts in a cumulative set of n-grams, with each n-gram tagged with the originating post's identified region, vertical, and a time value. The prediction system can compute, for each n-gram, a frequency within each category defined by a region/vertical pair. The prediction system can fit occurrence data for n-grams to a polynomial and identify the slope of the polynomial at the point for the current time. The slope can be used as a prediction of growth or decline for the n-gram. The prediction system can identify n-grams with a comparatively large slope within that region/vertical as likely to be trending in the future.
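
The slope step in the abstract can be illustrated with a small sketch. The abstract does not say what polynomial degree, time unit, or fitting method is used, so the degree-2 least-squares fit, the function names `fit_polynomial` and `slope_at`, and the daily counts below are all assumptions made for illustration, not the patent's implementation.

```python
from typing import List, Sequence


def fit_polynomial(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares fit. Returns c where y ~ c[0] + c[1]*x + ... + c[degree]*x**degree.

    Assumes at least degree + 1 distinct x values; otherwise the system is
    singular and the division below fails.
    """
    n = degree + 1
    # Normal equations (V^T V) c = V^T y for the Vandermonde matrix V.
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def slope_at(coeffs: Sequence[float], x: float) -> float:
    """Derivative of the fitted polynomial at x."""
    return sum(i * c * x ** (i - 1) for i, c in enumerate(coeffs) if i > 0)


# Hypothetical daily occurrence counts for one n-gram in one region/vertical.
days = [0, 1, 2, 3, 4]
counts = [1, 2, 5, 10, 17]

coeffs = fit_polynomial(days, counts, degree=2)
growth = slope_at(coeffs, days[-1])  # slope at the "current time" (latest day)
```

A positive `growth` would be read as predicted growth and a negative value as predicted decline. Ranking n-grams by this value within one region/vertical pair is one way to pick the "comparatively large" slopes the abstract mentions.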

First claim

Opening claim text (preview).

1. A method for identifying future trending n-grams, comprising: for each particular content item of multiple content items: extracting text from the particular content item; identifying one or more classifications for the particular content item; organizing the extracted text into one or more n-grams; adding the one or more n-grams to a cumulative set of n-grams, wherein each n-gram in the cumulative set is associated with a time-based value for the particular content item; sorting the n-grams in the cumulative set into groups by the one or more classifications of the content item that the n-gram originated from; computing a frequency value, within each group, for each unique n-gram in that group; selecting unique n-grams, for at least one of the groups, that have a frequency value above a frequency threshold; computing a predicted change in frequency value for the selected unique n-grams, wherein the predicted change in frequency value is based on the time-based values for the n-grams that have the same sequence of words as the unique n-gram and that are in the same group as the unique n-gram; and selecting, as the future trending n-grams, one or more n-grams with a predicted change in frequency value above a predicted change threshold.
2. The method of claim 1, wherein identifying the one or more classifications for the particular content item comprises identifying a geographical region for the content item.
3. The method of claim 2, wherein the geographical region for the content item is identified based on region data for a user who provided the content item or for a device the content item originated from.
4. The method of claim 1, wherein identifying the one or more classifications for the particular content item comprises identifying a vertical for the content item based on the extracted text from the particular content item.
5. The method of claim 1, wherein extracting text from the particular content item comprises one or more of: converting audio associated with the particular content item to text; performing text recognition on an image associated with the particular content item; performing text recognition on video associated with the particular content item.
6. The method of claim 1, wherein organizing the extracted text into one or more n-grams comprises: normalizing the extracted text; tokenizing the normalized text; and grouping the tokenized text into groups of sequential tokens, the groups having a fixed number of tokens.
7. The method of claim 6, wherein the fixed number of tokens is two tokens.
8. The method of claim 6, wherein at least two of the groups of sequential tokens are overlapping in the normalized text.
9. The method of claim 1 further comprising: identifying at least one invalid n-gram, wherein each particular invalid n-gram is identified as invalid based on an amount of words, of the particular invalid n-gram that match words on a pre-defined stop word list, being above a stop-word threshold; and removing from the cumulative set the identified invalid n-grams.
10. A computer-readable storage medium storing instructions that, when executed by a computing system, cause the computing system to perform operations for identifying one or more future trending n-grams, the operations comprising: for each particular content item of multiple content items: identifying one or more classifications for the particular content item; organizing text associated with the particular content item into one or more n-grams; adding, from the one or more n-grams into to a cumulative set of n-grams, at least one n-gram; computing a frequency value for each unique n-gram in the cumulative set of n-grams, the frequency value computed for a frequency within the group of n-grams in the cumulative set of n-grams that have the same one or more classifications; computing a predicted change in frequency value for at least some of the unique n-grams, wherein the predicted change in frequency value is based on time-based values associated the n-grams in the cumulative set that have the same sequence of words as the unique n-gram and that have the same one or more classifications as the unique n-gram; and selecting, as the future trending n-grams, one or more n-grams with a predicted change in frequency value above a predicted change threshold.
11. The computer-readable storage medium of claim 10, wherein computing the predicted change in frequency value for the at least some of the unique n-grams is performed, for each particular unique n-gram, by: fitting a polynomial to the time-based values for the n-grams that have the same sequence of words as the particular unique n-gram and have the same one or more classifications as the particular unique n-gram; and computing the predicted change in frequency value as a slope of the polynomial at a point corresponding to a current time.
12. The computer-readable storage medium of claim 10, wherein at least one of the one or more classifications for each particular content item is identified by performing natural language topic recognition on the text associated with the particular content item.
13. The computer-readable storage medium of claim 10, wherein the operations further comprise: receiving an indication of user input choosing a selected region and a selected vertical; and in response to the indication of user input, providing a subset of the selected future trending n-grams whose one or more classifications include both a region classification matching the selected region and a vertical classification matching the selected vertical.
14. The computer-readable storage medium of claim 13, wherein at least one chosen n-gram of the provided subset of future trending n-grams is used to generate marketing materials prior to the chosen n-gram reaching a peak in trending among users of a social media system.
15. The computer-readable storage medium of claim 10, wherein the operations further comprise selecting the at least some of the unique n-grams to be used in predicting a change frequency by: selecting unique n-grams, for at least one of the groups, that have a frequency value above a frequency threshold.
16-20. (canceled)
21. The computer-readable storage medium of claim 10, wherein identifying the one or more classifications for the particular content item comprises identifying a geographical region for the content item.
22. The computer-readable storage medium of claim 10, wherein identifying the one or more classifications for the particular content item comprises identifying a vertical for the content item based on text associated with the particular content item.
23. The computer-readable storage medium of claim 10, wherein the operations further comprise extracting text from each particular content item, extracting the text from each particular content item including one or more of: converting audio associated with the particular content item to text; performing text recognition on an image associated with the particular content item; or performing text recognition on video associated with the particular content item.
24. The computer-readable storage medium of claim 23, wherein organizing the text into one or more n-grams comprises: normalizing the text; tokenizing the normalized text; and grouping the tokenized text into groups of sequential tokens, the groups having a fixed number of tokens.
25. The computer-readable storage med…
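
The n-gram steps in the claims (normalizing, tokenizing, grouping into overlapping fixed-size token groups, stop-word filtering, and per-group frequency selection) can be sketched as follows. This is only an illustration under stated assumptions: the stop-word list, the regular-expression tokenizer, the use of raw counts as the frequency value, the thresholds, and the names `to_ngrams`, `is_valid`, `group_frequencies`, and `select_frequent` are hypothetical and do not come from the patent. Time-based values are left out to keep the example short.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

NGram = Tuple[str, ...]
Group = Tuple[str, str]  # (region, vertical)

# Hypothetical stop-word list; the claims only say it is pre-defined.
STOP_WORDS = {"the", "a", "of", "is", "to", "and"}


def to_ngrams(text: str, n: int = 2) -> List[NGram]:
    """Normalize (lowercase), tokenize, and build overlapping n-token groups."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def is_valid(ngram: NGram, stop_word_threshold: int = 1) -> bool:
    """An n-gram is invalid when its stop-word count is above the threshold."""
    return sum(token in STOP_WORDS for token in ngram) <= stop_word_threshold


def group_frequencies(posts: List[dict], n: int = 2) -> Dict[Group, Counter]:
    """Count valid n-grams separately for each region/vertical pair."""
    counts: Dict[Group, Counter] = {}
    for post in posts:
        group = (post["region"], post["vertical"])
        kept = [g for g in to_ngrams(post["text"], n) if is_valid(g)]
        counts.setdefault(group, Counter()).update(kept)
    return counts


def select_frequent(counts: Dict[Group, Counter], threshold: int) -> Dict[Group, List[NGram]]:
    """Keep n-grams whose count within their group is above the threshold."""
    return {
        group: [g for g, c in counter.items() if c > threshold]
        for group, counter in counts.items()
    }


# Made-up posts, already classified by region and vertical.
posts = [
    {"region": "US", "vertical": "sports", "text": "The playoff game is tonight"},
    {"region": "US", "vertical": "sports", "text": "Playoff game tickets sold out"},
    {"region": "UK", "vertical": "music", "text": "New album out of the blue"},
]

freqs = group_frequencies(posts)
selected = select_frequent(freqs, threshold=1)
```

With these sample posts, the bigram ("playoff", "game") is the only one that appears more than once in the US/sports group, and ("of", "the") is dropped from UK/music because both of its words are on the stop-word list.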

Assignees

Facebook Inc

Inventors

Tiwari Parth

Classifications

G06F40/284 · Primary
Lexical analysis, e.g. tokenisation or collocates · CPC title
G06Q10/40 · Primary
Business processes related to social networking or social networking services · CPC title
G06F18/24
Classification techniques · CPC title
G06F40/30
Semantic analysis · CPC title
G06F17/277 · Primary
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 65898018

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019102374A1 cover? A prediction system can predict future trending topics. The prediction system can classify social media posts by region and vertical, extract text from the posts, tokenize the extracted text, and organize the tokens into n-grams. The prediction system can store the n-grams from the posts in a cumulative set of n-grams, with each n-gram tagged with the originating post's identified region, ver…
Who is the assignee on this patent? Facebook Inc