Multi-feature balancing for natural language processors
US-2024419910-A1 · Dec 19, 2024 · US
US2016188567A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016188567-A1 |
| Application number | US-201414587651-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 31, 2014 |
| Priority date | Dec 31, 2014 |
| Publication date | Jun 30, 2016 |
| Grant date | — |
Official abstract text for this publication.
A social networking system receives messages from users that include hashtags. The social networking system may use a natural language model to identify terms in the hashtag corresponding to words or phrases of the hashtag. The words or phrases may be used to modify a string of the hashtag. The social networking system may also generate computer models to determine likely membership of a message with various hashtags. Prior to generating the computer models, the social networking system may filter certain hashtags from eligibility for computer modeling, particularly hashtags that are not frequently used or that more typically appear as normal text in a message instead of as a hashtag. The social networking system may also calibrate the computer model outputs by comparing a test message output with outputs of a calibration group that includes positive and negative examples with respect to the computer model output.
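The filtering and calibration steps in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names `eligible_hashtags` and `calibrate`, the thresholds `min_uses` and `min_hashtag_ratio`, and the rank-based comparison against the calibration group. The abstract does not say how the comparison is computed, so this is one plausible reading and not the filing's method.

```python
from collections import Counter


def eligible_hashtags(messages, min_uses=2, min_hashtag_ratio=0.5):
    """Return hashtags that pass two illustrative filters.

    A tag is kept only if it is used as a hashtag at least `min_uses`
    times AND appears as a hashtag at least `min_hashtag_ratio` of the
    time (as opposed to appearing as ordinary text). Both thresholds
    are hypothetical values, not taken from the publication.
    """
    tag_counts = Counter()    # occurrences written as "#word"
    plain_counts = Counter()  # occurrences written as plain "word"
    for message in messages:
        for token in message.lower().split():
            word = token.strip(".,!?")
            if word.startswith("#") and len(word) > 1:
                tag_counts[word[1:]] += 1
            elif word:
                plain_counts[word] += 1

    eligible = set()
    for tag, uses in tag_counts.items():
        ratio = uses / (uses + plain_counts[tag])
        if uses >= min_uses and ratio >= min_hashtag_ratio:
            eligible.add(tag)
    return eligible


def calibrate(raw_score, calibration_scores):
    """Map a raw model output to its rank within a calibration group.

    Assumption: `calibration_scores` holds model outputs for a mix of
    positive and negative examples; the result is the fraction of those
    outputs that are at or below `raw_score`.
    """
    at_or_below = sum(1 for s in calibration_scores if s <= raw_score)
    return at_or_below / len(calibration_scores)


messages = [
    "Great game tonight #nba",
    "#nba finals start soon",
    "I love my dog",
    "love this song #love",
    "so much love for love",
    "one-off #rarehashtag",
    "#love you all",
]
kept = eligible_hashtags(messages)
rank = calibrate(0.7, [0.1, 0.2, 0.6, 0.9])
```

In this toy data, `love` is dropped because it appears more often as ordinary text than as a hashtag, and `rarehashtag` is dropped because it is used only once.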
Opening claim text (preview).
1. A method comprising: receiving a message in a social networking system, the message including a character string with a hashtag; identifying, a set of candidate phrases including one or more words or phrases that match one or more characters in the character string; scoring each of the candidate phrases based on a natural language model that applies a frequency-based table of words or phrases; selecting a hashtag phrase from the set of candidate phrases based on the scoring of the candidate phrases; and predicting a topic of the message based at least in part on the identified hashtag phrase.
2. The method of claim 1, wherein the natural language model is n-gram language model.
3. The method of claim 1, wherein the natural language model is trained on a corpus of messages in a social networking system.
4. The method of claim 1, further comprising: generating a feature vector for the message including the hashtag phrase; and training a computer model to predict an association of the hashtag with a test message, the training using the feature vector for the message that includes the hashtag phrase.
5. The method of claim 4, wherein the feature vector for the message includes the character string with the hashtag replaced with the hashtag phrase.
6. A non-transitory computer-readable medium comprising instructions executable by a processor that cause the processor to perform steps of: receiving a message in a social networking system, the message including a character string with a hashtag; identifying, a set of candidate phrases including one or more words or phrases that match one or more characters in the character string; scoring each of the candidate phrases based on a natural language model that applies a frequency-based table of words or phrases; selecting a hashtag phrase from the set of candidate phrases based on the scoring of the candidate phrases; and predicting a topic of the message based at least in part on the identified hashtag phrase.
7. The non-transitory computer-readable medium of claim 6, wherein the natural language model is n-gram language model.
8. The non-transitory computer-readable medium of claim 6, wherein the natural language model is trained on a corpus of messages in a social networking system.
9. The non-transitory computer-readable medium of claim 6, the steps further comprising: generating a feature vector for the message including the hashtag phrase; and training a computer model to predict an association of the hashtag with a test message, the training using the feature vector for the message that includes the hashtag phrase.
10. The non-transitory computer-readable medium of claim 9, wherein the feature vector for the message includes the character string with the hashtag replaced with the hashtag phrase.
11. A system comprising: a processor configured to execute instructions; a non-transitory computer-readable medium containing instructions for execution on the processor, the instructions causing the processor to perform steps of: receiving a message in a social networking system, the message including a character string with a hashtag; identifying, a set of candidate phrases including one or more words or phrases that match one or more characters in the character string; scoring each of the candidate phrases based on a natural language model that applies a frequency-based table of words or phrases; selecting a hashtag phrase from the set of candidate phrases based on the scoring of the candidate phrases; and predicting a topic of the message based at least in part on the identified hashtag phrase.
12. The system of claim 11, wherein the natural language model is n-gram language model.
13. The system of claim 11, wherein the natural language model is trained on a corpus of messages in a social networking system.
14. The system of claim 11, wherein the instructions further cause the processor to perform steps including: generating a feature vector for the message including the hashtag phrase; and training a computer model to predict an association of the hashtag with a test message, the training using the feature vector for the message that includes the hashtag phrase.
15. The non-transitory computer-readable medium of claim 9, wherein the feature vector for the message includes the character string with the hashtag replaced with the hashtag phrase.
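The identify, score, and select steps recited in the independent claims can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the frequency table `WORD_COUNTS` contains invented counts, the language model is a unigram model (the simplest n-gram case), and the names `candidate_phrases`, `score`, `segment_hashtag`, and `expand_hashtags` are hypothetical.

```python
import math
import re

# Hypothetical frequency table; the counts are invented for illustration.
WORD_COUNTS = {
    "throw": 50, "back": 200, "throwback": 30, "thursday": 80,
    "thurs": 5, "day": 300, "a": 1000, "th": 2,
}
TOTAL = sum(WORD_COUNTS.values())


def candidate_phrases(text):
    """Enumerate every split of `text` into words found in the table."""
    if not text:
        return [[]]
    results = []
    for i in range(1, len(text) + 1):
        head = text[:i]
        if head in WORD_COUNTS:
            for rest in candidate_phrases(text[i:]):
                results.append([head] + rest)
    return results


def score(phrase):
    """Unigram log-probability of a candidate phrase (an assumed model)."""
    return sum(math.log(WORD_COUNTS[w] / TOTAL) for w in phrase)


def segment_hashtag(hashtag):
    """Select the highest-scoring candidate phrase for a hashtag."""
    text = hashtag.lstrip("#").lower()
    candidates = candidate_phrases(text)
    if not candidates:
        return [text]  # no known words: keep the string unsplit
    return max(candidates, key=score)


def expand_hashtags(message):
    """Replace each hashtag in a message with its selected phrase."""
    return re.sub(
        r"#(\w+)",
        lambda m: " ".join(segment_hashtag(m.group(0))),
        message,
    )


phrase = segment_hashtag("#ThrowbackThursday")
expanded = expand_hashtags("Old photo #ThrowbackThursday")
```

The exhaustive enumeration is exponential in the worst case and is used here only to make the candidate set explicit. The expanded message string is the kind of input from which a feature vector, as in claims 4 and 5, could then be built.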
Business processes related to social networking or social networking services · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Semantic analysis · CPC title
Phrasal analysis, e.g. finite state techniques or chunking · CPC title
for social networking applications · CPC title