Filtering automated selection of hashtags for computer modeling

US9959503B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9959503-B2
Application number  US-201414587605-A
Country             US
Kind code           B2
Filing date         Dec 31, 2014
Priority date       Dec 31, 2014
Publication date    May 1, 2018
Grant date          May 1, 2018

Abstract

Official abstract text for this publication.

A social networking system receives messages from users that include hashtags. The social networking system may use a natural language model to identify terms in the hashtag corresponding to words or phrases of the hashtag. The words or phrases may be used to modify a string of the hashtag. The social networking system may also generate computer models to determine likely membership of a message with various hashtags. Prior to generating the computer models, the social networking system may filter certain hashtags from eligibility for computer modeling, particularly hashtags that are not frequently used or that more typically appear as normal text in a message instead of as a hashtag. The social networking system may also calibrate the computer model outputs by comparing a test message output with outputs of a calibration group that includes positive and negative examples with respect to the computer model output.
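
The abstract's first step, identifying the words inside a hashtag with a language model, can be illustrated with a small sketch. The patent text shown here does not specify the model, so the unigram model, the dynamic-programming search, the function name `segment_hashtag`, and the word counts in `COUNTS` are all assumptions made for illustration.

```python
import math

# Hypothetical word counts standing in for a trained language model.
COUNTS = {"throw": 50, "back": 80, "thursday": 40, "throwback": 30}


def segment_hashtag(tag, unigram_counts):
    """Split a hashtag into its most probable word sequence.

    Assumes a unigram model: the score of a segmentation is the sum of
    the log-probabilities of its words. Returns None if the hashtag
    cannot be covered by known words.
    """
    text = tag.lstrip("#").lower()
    total = sum(unigram_counts.values())
    max_len = max(len(word) for word in unigram_counts)
    # best[i] holds (log-probability, words) for the best split of text[:i].
    best = [(0.0, [])] + [(-math.inf, None)] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            prev_score, prev_words = best[start]
            if prev_words is None or word not in unigram_counts:
                continue
            score = prev_score + math.log(unigram_counts[word] / total)
            if score > best[end][0]:
                best[end] = (score, prev_words + [word])
    return best[-1][1]


print(segment_hashtag("#ThrowbackThursday", COUNTS))
```

With these illustrative counts the single word "throwback" scores higher than "throw" followed by "back", so the hashtag splits into two words rather than three. A production system would more likely use a higher-order n-gram model trained on message text.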

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: identifying a plurality of hashtags associated with a set of messages in an online system, the set of messages including a message having a textual string including a first hashtag and a second hashtag of the plurality of hashtags; selecting a set of hashtags for which to train a plurality of hashtag classifiers by using a filter, wherein the filter excludes at least the first hashtag and includes at least the second hashtag; training the plurality of hashtag classifiers for the selected set of hashtags, each hashtag classifier corresponding to a hashtag in the selected set of hashtags and trained with messages in the set of messages that include the corresponding hashtag, each hashtag classifier providing a classifier output indicating whether a message should be associated with the hashtag corresponding to the hashtag classifier; identifying one or more hashtags for a subject message of a user of the online system based on the classifier outputs of the plurality of hashtag classifiers applied to the subject message, wherein the one or more hashtags indicate a topic of the subject message based on text included in the subject message; selecting a candidate story likely to be of interest to the user using the topic indicated by the one or more hashtags; and sending the candidate story from the online system to a client device for presentation to the user.

2. The method of claim 1, wherein the filter excludes the first hashtag based on a trendiness value of the first hashtag determined based on a recent frequency of occurrences of the first hashtag relative to previous occurrences of the first hashtag.

3. The method of claim 2, wherein the filter excludes the first hashtag in response to determining that the trendiness value is below a threshold value.

4. The method of claim 1, wherein the text included in the subject message does not include the one or more hashtags.

5. The method of claim 1, wherein the filter excludes the first hashtag based on a hashtag frequency value of the first hashtag that indicates a frequency of occurrences of the first hashtag in the set of messages relative to terms associated with the first hashtag in the set of messages.

6. The method of claim 5, wherein the filter excludes the first hashtag in response to determining that the frequency value is below a threshold value based on occurrences of terms in the set of messages.

7. The method of claim 5, further comprising: determining the terms associated with the first hashtag based on an n-gram language model, wherein the terms include a plurality of words included in the first hashtag.

8. The method of claim 1, further comprising suggesting to the user that the user include the one or more identified hashtags in the subject message.

9. A non-transitory computer-readable medium comprising instructions executable by a processor that cause the processor to perform steps of: identifying a plurality of hashtags associated with a set of messages in an online system, the set of messages including a message having a textual string including a first hashtag and a second hashtag of the plurality of hashtags; selecting a set of hashtags for which to train a plurality of hashtag classifiers by using a filter, wherein the filter excludes at least the first hashtag and includes at least the second hashtag; training the plurality of hashtag classifiers for the selected set of hashtags, each hashtag classifier corresponding to a hashtag in the selected set of hashtags and trained with messages in the set of messages that include the corresponding hashtag, each hashtag classifier providing a classifier output indicating whether a message should be associated with the hashtag corresponding to the hashtag classifier; identifying one or more hashtags for a subject message of a user of the online system based on the classifier outputs of the plurality of hashtag classifiers applied to the subject message, wherein the one or more hashtags indicate a topic of the subject message based on text included in the subject message; selecting a candidate story likely to be of interest to the user using the topic indicated by the one or more hashtags; and sending the candidate story from the online system to a client device for presentation to the user.

10. The non-transitory computer-readable medium of claim 9, wherein the filter excludes the first hashtag based on a trendiness value of the first hashtag determined based on a recent frequency of occurrences of the first hashtag relative to previous occurrences of the first hashtag.

11. The non-transitory computer-readable medium of claim 10, wherein the filter excludes the first hashtag in response to determining that the trendiness value is below a threshold value.

12. The non-transitory computer-readable medium of claim 9, wherein the text included in the subject message does not include the one or more hashtags.

13. The non-transitory computer-readable medium of claim 9, wherein the filter excludes the first hashtag based on a hashtag frequency value of the first hashtag that indicates a frequency of occurrences of the first hashtag in the set of messages relative to terms associated with the first hashtag in the set of messages.

14. The non-transitory computer-readable medium of claim 13, wherein the filter excludes the first hashtag in response to determining that the frequency value is below a threshold value based on occurrences of terms in the set of messages.

15. The non-transitory computer-readable medium of claim 13, wherein the steps further comprise: determining the terms associated with the first hashtag based on an n-gram language model, wherein the terms include a plurality of words included in the first hashtag.

16. The non-transitory computer-readable medium of claim 9, wherein the steps further comprise suggesting to the user that the user include the one or more identified hashtags in the subject message.

17. A method comprising: identifying a set of messages, each message of the set of messages including at least one hashtag; identifying a plurality of candidate hashtags each corresponding to a unique hashtag included in at least one message of the set of messages; selecting a set of hashtags for training hashtag classifiers by filtering the plurality of candidate hashtags; for each hashtag in the selected set of hashtags, training a hashtag classifier for the hashtag based on messages of the set of messages that include the hashtag, the hashtag classifier providing a classifier output indicating whether a message should be associated with the hashtag; identifying one or more hashtags for a subject message of a user of an online system by applying the subject message to the hashtag classifiers for the selected set of hashtags and based on the classifier outputs of the hashtag classifiers for the subject message, wherein the one or more hashtags indicate a topic of the subject message based on text included in the subject message; selecting a candidate story likely to be of interest to the user using the topic indicated by the one or more hashtags; and sending the candidate story from the online system to a client device for presentation to the user.

18. The method of claim 17, wherein filtering the plurality of candidate hashtags comprises excluding at least one hashtag from the plurality of candidate hashtags based on a recent frequency of occurrences of the at least one hashtag relative to previous occurrences of the at least one hashtag.

19.
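
The two filter criteria in the dependent claims (a trendiness value in claims 2 and 3, and a hashtag frequency value in claims 5 and 6) can be sketched as follows. The claims do not give formulas or thresholds, so the smoothed ratio used for trendiness, the share-of-occurrences definition of the frequency value, the threshold defaults, and every name and number below are assumptions for illustration only.

```python
# Hypothetical per-hashtag statistics gathered from a set of messages:
#   recent / previous  - occurrences in a recent and an earlier time window
#   as_hashtag         - occurrences written as a hashtag
#   as_text            - occurrences of the same terms as ordinary text
STATS = {
    "#worldcup": {"recent": 900, "previous": 300, "as_hashtag": 1200, "as_text": 800},
    "#love": {"recent": 500, "previous": 520, "as_hashtag": 1020, "as_text": 90000},
    "#oldmeme": {"recent": 5, "previous": 400, "as_hashtag": 405, "as_text": 10},
}


def trendiness(recent_count, previous_count):
    """Recent occurrences relative to previous ones.

    Add-one smoothing (an assumption) avoids division by zero.
    """
    return (recent_count + 1) / (previous_count + 1)


def hashtag_frequency_value(hashtag_count, text_count):
    """Share of occurrences in which the terms appear as a hashtag."""
    total = hashtag_count + text_count
    return hashtag_count / total if total else 0.0


def select_hashtags(stats, min_trendiness=0.5, min_frequency_value=0.1):
    """Return the hashtags that pass both filters and stay eligible for training."""
    selected = []
    for tag, s in stats.items():
        if trendiness(s["recent"], s["previous"]) < min_trendiness:
            continue  # usage has dropped off: exclude
        if hashtag_frequency_value(s["as_hashtag"], s["as_text"]) < min_frequency_value:
            continue  # mostly appears as normal text: exclude
        selected.append(tag)
    return selected


print(select_hashtags(STATS))
```

In this example "#oldmeme" fails the trendiness check and "#love" fails the frequency check because its terms appear far more often as plain text, so only "#worldcup" would go on to have a classifier trained for it.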

Assignees

Facebook Inc

Inventors

Classifications

  • Business processes related to social networking or social networking services · CPC title

  • Lexical analysis, e.g. tokenisation or collocates · CPC title

  • G06N5/04 · Primary

    Inference or reasoning models · CPC title

  • Market modelling; Market analysis; Collecting market data · CPC title

  • Electricity · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9959503B2 cover?
A social networking system receives messages from users that include hashtags. The social networking system may use a natural language model to identify terms in the hashtag corresponding to words or phrases of the hashtag. The words or phrases may be used to modify a string of the hashtag. The social networking system may also generate computer models to determine likely membership of a messag…
Who is the assignee on this patent?
Facebook Inc
What technology area does this patent fall under?
Primary CPC classification G06N5/04. Mapped technology areas include Physics.
When was this patent published?
Publication date May 1, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).