Identifying spammer profiles

US11089048B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11089048-B2
Application number: US-201816145032-A
Country: US
Kind code: B2
Filing date: Sep 27, 2018
Priority date: Sep 27, 2018
Publication date: Aug 10, 2021
Grant date: Aug 10, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A spammer profile detector uses a multi-stage machine learning approach, where a content-based machine learning model, a connection graph machine learning model, and a behavior-based machine learning model are used sequentially, each model generating a score indicating the likelihood that a profile is a spammer profile. The content-based machine learning model examines and evaluates information stored in a member profile. The connection graph machine learning model examines and evaluates a member's connections. The behavior-based machine learning model examines and evaluates activities of a member represented by a member profile. The score produced by the spammer profile detector can be used to determine whether the profile should be flagged as a spammer profile, whether the profile should be omitted when determining a count of the total number of active member profiles within the system, whether the profile should be restricted or removed from the system, etc.
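The sequential use of the three models described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the `run_cascade` function, the `CascadeResult` type, the default thresholds of 0.5, and the rule that a later stage runs only when the previous score meets its threshold. The abstract says the models run sequentially and each produces a score; the gating direction and the numeric values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical types: a profile is any dict, a model is any callable
# that maps a profile to a spam-likelihood score.
Profile = dict
Model = Callable[[Profile], float]


@dataclass
class CascadeResult:
    content_score: float
    graph_score: Optional[float] = None      # None if stage 2 never ran
    behavior_score: Optional[float] = None   # None if stage 3 never ran
    flagged: bool = False


def run_cascade(
    profile: Profile,
    content_model: Model,
    graph_model: Model,
    behavior_model: Model,
    t1: float = 0.5,  # assumed threshold for the content score
    t2: float = 0.5,  # assumed threshold for the connection graph score
    t3: float = 0.5,  # assumed threshold for the behavior score
) -> CascadeResult:
    """Run the three models in order, stopping early below a threshold."""
    result = CascadeResult(content_score=content_model(profile))
    if result.content_score < t1:
        return result  # not suspicious enough to run the graph model

    result.graph_score = graph_model(profile)
    if result.graph_score < t2:
        return result  # not suspicious enough to run the behavior model

    result.behavior_score = behavior_model(profile)
    result.flagged = result.behavior_score >= t3
    return result


# Usage with stub models standing in for trained ones.
suspicious = run_cascade({}, lambda p: 0.9, lambda p: 0.8, lambda p: 0.7)
benign = run_cascade({}, lambda p: 0.1, lambda p: 0.8, lambda p: 0.7)
```

In this sketch a profile that looks benign at the content stage never reaches the later models, which is one plausible reason to order the stages this way; the source does not state the motivation.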

First claim

Opening claim text (preview).

The invention claimed is:
1. A computer implemented method comprising: accessing a member profile representing a user in an on-line connection network system; calculating, using a first machine learning model that takes information stored in the member profile as input, a first score indicating consistency of the information stored in the member profile; comparing the first score to a threshold value and, based on a result of the comparing, executing a second machine learning model that takes information about connections of the member profile in the on-line connection network system as input and produces a second score; comparing the second score to a further threshold value and, based on a result of comparing the second score to the further threshold value, executing a third machine learning model that takes as input information representing events originated with the member profile in the on-line connection network system and produces a third score; and using at least one processor, based on the third score, associating the member profile with a flag indicating that the member profile is a spammer member profile; wherein the first machine learning model is different than the second machine learning model and the third machine learning model; wherein the third machine learning model is different than the second machine learning model; wherein the first score is calculated before the second score and the second score is calculated before the third score.
2. The method of claim 1, wherein the calculating of the first score comprises increasing the first score in response to determining that similarity between an email address associated with the member profile and a name of a member represented by the member profile is below a predetermined threshold.
3. The method of claim 1, wherein the calculating of the first score comprises increasing the first score in response to determining that geographic location indicated in the member profile is inconsistent with geographic location associated with an Internet Protocol (IP) address associated with a login session for the member profile in the on-line connection network system.
4. The method of claim 1, wherein the second machine learning model produces the second score based on respective ranks assigned to connections of the member profile, the respective ranks generated based on affinity between the member profile and a respective connection and a weight that reflects interaction types and intensity between the member profile and a respective connection.
5. The method of claim 1, wherein the executing of the second machine learning model comprises increasing the second score in response to determining that a certain percentage of connections of the member profile are from a certain demographic group.
6. The method of claim 1, wherein the executing of the second machine learning model comprises increasing the second score in response to determining that a certain percentage of connections of the member profile originated from invitations issued from the member profile.
7. The method of claim 1, wherein the third machine learning model is a text classification model configured to recognize whether a message content is indicative of spam.
8. The method of claim 1, wherein the executing of the third machine learning model comprises increasing the third score in response to determining that a number of follow actions initiated from the member profile within a certain time period is greater than a particular threshold value.
9. The method of claim 1, wherein the executing of the third machine learning model comprises increasing the third score in response to determining that the member profile includes a link to a web site previously identified as a spammer web site.
10. The method of claim 1, comprising excluding the member profile from a total count of active member profiles in the on-line connection network system based on the presence of the flag indicating that the member profile is a spammer member profile.
11. A system comprising: one or more processors; and a non-transitory computer readable storage medium comprising instructions that when executed by the one or more processors cause the one or more processors to perform operations comprising: accessing a member profile representing a user in an on-line connection network system; calculating, using a first machine learning model that takes information stored in the member profile as input, a first score indicating consistency of the information stored in the member profile; comparing the first score to a threshold value and, based on a result of the comparing, executing a second machine learning model that takes information about connections of the member profile in the on-line connection network system as input and produces a second score; comparing the second score to a further threshold value and, based on a result of comparing the second score to the further threshold value, executing a third machine learning model that takes as input information representing events originated with the member profile in the on-line connection network system and produces a third score; and based on the third score, associating the member profile with a flag indicating that the member profile is a spammer member profile; wherein the first machine learning model is different than the second machine learning model and the third machine learning model; wherein the third machine learning model is different than the second machine learning model; wherein the first score is calculated before the second score and the second score is calculated before the third score.
12. The system of claim 11, wherein the calculating of the first score comprises increasing the first score in response to determining that similarity between an email address associated with the member profile and a name of a member represented by the member profile is below a predetermined threshold.
13. The system of claim 11, wherein the calculating of the first score comprises increasing the first score in response to determining that geographic location indicated in the member profile is inconsistent with geographic location associated with an Internet Protocol (IP) address associated with a login session for the member profile in the on-line connection network system.
14. The system of claim 11, wherein the second machine learning model produces the second score based on respective ranks assigned to connections of the member profile, the respective ranks generated based on affinity between the member profile and a respective connection and a weight that reflects interaction types and intensity between the member profile and a respective connection.
15. The system of claim 11, wherein the executing of the second machine learning model comprises increasing the second score in response to determining that a certain percentage of connections of the member profile are from a certain demographic group.
16. The system of claim 11, wherein the executing of the second machine learning model comprises increasing the second score in response to determining that a certain percentage of connections of the member profile originated from invitations issued from the member profile.
17. The system of claim 11, wherein the third machine learning model is a text classification model configured to recognize whether a message content is indicative of spam.
18. The system of claim 11, wherein the executing of the third machine learning model comprises increasing the third score in response to determining that a n
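Two of the signals named in the dependent claims lend themselves to a short illustration: the email-to-name similarity check of claim 2 and the follow-action count of claim 8. The sketch below is a hedged example only. The helper names, the use of Python's `difflib.SequenceMatcher` as the similarity measure, the 0.3 similarity cutoff, and the 0.2 score increment are all assumptions; the claims do not specify a similarity metric, a threshold value, or an increment size.

```python
from difflib import SequenceMatcher
from typing import Iterable


def email_name_similarity(email: str, full_name: str) -> float:
    """Similarity in [0, 1] between the email local part and the member name.

    Assumption: only letters are compared, case-insensitively.
    """
    local_part = email.split("@", 1)[0].lower()
    local_letters = "".join(ch for ch in local_part if ch.isalpha())
    name_letters = "".join(ch for ch in full_name.lower() if ch.isalpha())
    if not local_letters or not name_letters:
        return 0.0
    return SequenceMatcher(None, local_letters, name_letters).ratio()


def content_score_bump(
    email: str,
    full_name: str,
    min_similarity: float = 0.3,  # hypothetical "predetermined threshold"
    bump: float = 0.2,            # hypothetical increment to the first score
) -> float:
    """Return the amount to add to the first score (claim 2 style check)."""
    if email_name_similarity(email, full_name) < min_similarity:
        return bump
    return 0.0


def count_recent_follows(
    follow_times: Iterable[float], now: float, window_seconds: float
) -> int:
    """Count follow actions inside the trailing time window (claim 8 style)."""
    return sum(1 for t in follow_times if 0 <= now - t <= window_seconds)


# Usage: a name-like address scores high, a random-looking one scores low.
matching = email_name_similarity("jane.doe@example.com", "Jane Doe")
mismatched_bump = content_score_bump("xq7zzk99@example.com", "Jane Doe")
recent = count_recent_follows([0, 50, 90, 100], now=100, window_seconds=20)
```

A production system would presumably feed such values into the trained models as features rather than adding fixed increments; the fixed bump here only mirrors the claim wording about "increasing the first score".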

Assignees

Inventors

Classifications

  • G06N20/00 · Primary

    Machine learning · CPC title

  • using filtering or selective blocking · CPC title

  • wherein the security policies are location-dependent, e.g. entities privileges depend on current location or allowing specific operations only from locally connected terminals · CPC title

  • Traffic logging, e.g. anomaly detection · CPC title

  • Countermeasures against malicious traffic (countermeasures against attacks on cryptographic mechanisms H04L9/002) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11089048B2 cover?
A spammer profile detector uses a multi-stage machine learning approach, where a content-based machine learning model, a connection graph machine learning model, and a behavior-based machine learning model are used sequentially, each model generating a score indicating the likelihood that a profile is a spammer profile. The content-based machine learning model examines and evaluates information s…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 10, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).