Methods and arrangements to distribute a fraud detection model
US-2020065816-A1 · Feb 27, 2020 · US
US11330009B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11330009-B2 |
| Application number | US-202117180592-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 19, 2021 |
| Priority date | Mar 4, 2020 |
| Publication date | May 10, 2022 |
| Grant date | May 10, 2022 |
Official abstract text for this publication.
A machine learning-based system and method for content clustering and content threat assessment includes generating embedding values for each piece of content of corpora of content data; implementing unsupervised machine learning models that: receive model input comprising the embeddings values of each piece of content of the corpora of content data; and predict distinct clusters of content data based on the embeddings values of the corpora of content data; assessing the distinct clusters of content data; associating metadata with each piece of content defining a member in each of the distinct clusters of content data based on the assessment, wherein the associating the metadata includes attributing to each piece of content within the clusters of content data a classification label of one of digital abuse/digital fraud and not digital abuse/digital fraud; and identifying members or content clusters having digital fraud/digital abuse based on querying the distinct clusters of content data.
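The pipeline in the abstract (embed each piece of content, cluster the embeddings without supervision, then label whole clusters) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the publication does not specify the embedding or the clustering algorithm, so character-trigram vectors and a greedy threshold clustering stand in for them, and the names `embed`, `cosine`, `cluster_corpus`, and `label_in_bulk` are hypothetical.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Hypothetical embedding: L2-normalized character n-gram counts."""
    padded = f" {text.lower()} "
    grams = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    norm = math.sqrt(sum(c * c for c in grams.values()))
    if norm == 0:
        return {}
    return {g: c / norm for g, c in grams.items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors that are already normalized."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(g, 0.0) for g, v in a.items())


def cluster_corpus(texts, threshold=0.5):
    """Greedy one-pass clustering (an assumed stand-in for the unsupervised model).

    Each text joins the first cluster whose seed vector is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for index, text in enumerate(texts):
        vec = embed(text)
        for c in clusters:
            if cosine(vec, c["seed"]) >= threshold:
                c["members"].append(index)
                break
        else:
            clusters.append({"seed": vec, "members": [index]})
    return clusters


def label_in_bulk(cluster, label):
    """Attribute one classification label to every member of a cluster."""
    return {index: label for index in cluster["members"]}


corpus = [
    "win a free iphone now click here",
    "win a free iph0ne now click here",
    "w1n a free iphone now click here",
    "meeting moved to 3pm tomorrow",
    "meeting moved to 4pm tomorrow",
]
clusters = cluster_corpus(corpus, threshold=0.5)
labels = label_in_bulk(clusters[0], "digital fraud")
```

With this toy corpus the three giveaway messages fall into one cluster and the two meeting notes into another, so a single assessment of the first cluster labels all three of its members at once.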
Opening claim text (preview).
What is claimed:

1. A machine learning-based method for content clustering and content threat assessment in a machine learning task-oriented threat mitigation platform, the method comprising:
generating embedding values for each piece of content of one or more corpora of content data;
implementing one or more unsupervised machine learning models that: (i) receive model input comprising the embeddings values of each piece of content of the one or more corpora of content data; and (ii) predict a plurality of distinct clusters of content data based on the embeddings values of the one or more corpora of content data;
assessing the plurality of distinct clusters of content data;
associating metadata with each of the plurality of distinct clusters of content data based on the assessment, wherein the associating the metadata includes attributing to each piece of content within the plurality of distinct clusters of content data a classification label of one of (a) an adverse label indicating digital abuse or digital fraud and (b) not digital abuse or not digital fraud;
at a machine-learning threat mitigation service: receiving from a subscriber of a threat mitigation service, via an application programming interface (API), a text content query comprising a target piece of online text data associated with one or more online services of the subscriber;
querying the plurality of distinct clusters of content data that have the adverse label with an embedded query representation of the text content query based on a similarity threshold, wherein the similarity threshold is set to identify clusters of content data including both: (1) embedded representations identical to the text content query, and (2) embedded representations that include character substitutions of the text content query; and
identifying the target piece of online text data with the adverse label indicating digital abuse or digital fraud if one or more of the plurality of distinct clusters of content data that have the adverse label is returned in response to the text content query; and
displaying, on a user interface, a content-to-user network map for at least one of the one or more of the plurality of distinct clusters of content data that have the adverse label if the one or more of the plurality of distinct clusters of content data that have the adverse label is returned in response to the text content query, wherein the content-to-user network map includes: (a) a textual summary of the at least one of the one or more of the plurality of distinct clusters of content data that have the adverse label; (b) a plurality of representations of user accounts associated with pieces of content within the at least one of the one or more of the plurality of distinct clusters of content data that have the adverse label; and (c) a plurality of graphical edges, wherein each graphical edge of the plurality of graphical edges visually connects a distinct representation of a user account of the plurality of representations of user accounts to the textual summary of the at least one of the one or more of the plurality of distinct clusters of content data that have the adverse label; and
mitigating, via a bulk mitigation action, a network of user accounts associated with the plurality of representations of user accounts that prevents the network of user accounts from publishing future content on one or more online resources of the subscriber.

2. The method according to claim 1, wherein: the application programming interface (API) is searchably connected to each of the plurality of distinct clusters of content data.

3. The method according to claim 1, wherein: the text content query comprises text content observed from an online post or an electronic communication, the text content is converted to the embedded query representation, and the identifying includes identifying one or more of the plurality of distinct clusters of content data that include pieces of content having the embedded query representation.

4. The method according to claim 1, further comprising: a querying interface that includes a tuning interface object that, when adjusted or acted upon by user input, tunes one or more clustering similarity thresholds to increase or decrease a number of members within a target cluster of the plurality of distinct clusters of content data.

5. The method according to claim 4, further comprising: querying, via the querying interface, the plurality of distinct clusters of content data based on the text content query; returning one or more of the plurality of distinct clusters of content data based on the querying; and increasing or decreasing a number of members within the one or more of the plurality of distinct clusters of content data based on an input to the tuning interface object.

6. The method according to claim 1, further comprising: creating a cluster mapping that associates a search grain with at least one cluster of the plurality of distinct clusters of content data.

7. The method according to claim 6, wherein: the search grain comprises the target piece of online text data, and the method further comprising: using the target piece of online text data to query the plurality of distinct clusters of content data; and returning, based on the target piece of online text data, one or more clusters of a plurality of distinct clusters of identifiers of a plurality of distinct clusters of content data.

8. The method according to claim 1, further comprising: deriving, based on the plurality of distinct clusters of content data, a plurality of distinct clusters of identifiers of a plurality of online users that post online content.

9. The method according to claim 8, further comprising: creating a cluster mapping that associates a search grain with at least one cluster of the plurality of distinct clusters of identifiers of the plurality of online users that post online content, wherein the search grain comprises an online user identifier of a user attempting to post online content or posting online content; using the online user identifier to query the plurality of distinct clusters of identifiers of online users; and returning, based on the online user identifier, one or more clusters of the plurality of distinct clusters of identifiers of the plurality of online users.

10. The method according to claim 6, wherein the search grain comprises an identifier of a subscriber to the machine-learning threat mitigation service, the method further comprising: using the identifier of the subscriber to query the plurality of distinct clusters of identifiers of the plurality of online users; and returning, based on the identifier of the subscriber, one or more cluster members from one or more of the plurality of distinct clusters of identifiers of the plurality of online users.

11. The method according to claim 1, wherein the content data relates to text data, communication data, or media data that is posted to a web or Internet-accessible medium, platform, service, system, or channel.

12. The method according to claim 1, wherein associating the metadata includes: associating the classification label, in bulk, to a target cluster of the plurality of distinct clusters of content data, wherein the associating the classification label in bulk causes an association of a single classification label to all members of the target cluster.

13. The method according to claim 1, wherein: the identifying includes identifying the one or more of the plurality of distinct clusters of content data based on a query comprising a metadata tag, the metadata tag identifying a classification of the one or more content clusters;
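The query, network-map, and bulk-mitigation steps recited in claim 1 can be sketched as follows. This is a rough illustration under stated assumptions, not the claimed implementation: the trigram embedding, the 0.8 threshold, the in-memory `ADVERSE_CLUSTERS` data, and the function names `query_adverse`, `network_map`, and `bulk_block` are all hypothetical. The threshold is set below 1.0 so that a query matches both identical text and text with character substitutions.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Hypothetical embedding: L2-normalized character n-gram counts."""
    padded = f" {text.lower()} "
    grams = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    norm = math.sqrt(sum(c * c for c in grams.values()))
    if norm == 0:
        return {}
    return {g: c / norm for g, c in grams.items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors that are already normalized."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(g, 0.0) for g, v in a.items())


# Assumed sample data: clusters that already carry the adverse label.
ADVERSE_CLUSTERS = [
    {
        "summary": "free iphone giveaway",
        "posts": [
            ("u1", "win a free iphone now click here"),
            ("u2", "win a free iph0ne now click here"),
        ],
    },
    {
        "summary": "crypto doubling",
        "posts": [("u3", "send 1 btc and get 2 btc back today")],
    },
]


def query_adverse(text, clusters, threshold=0.8):
    """Return adverse clusters holding a post at least `threshold` similar to the query."""
    query_vec = embed(text)
    return [
        c for c in clusters
        if any(cosine(query_vec, embed(post)) >= threshold for _, post in c["posts"])
    ]


def network_map(cluster):
    """Build a content-to-user map: summary node, account nodes, and connecting edges."""
    accounts = sorted({user for user, _ in cluster["posts"]})
    return {
        "summary": cluster["summary"],
        "accounts": accounts,
        "edges": [(user, cluster["summary"]) for user in accounts],
    }


def bulk_block(maps, blocked):
    """Bulk mitigation: add every account in the returned maps to a block set."""
    for m in maps:
        blocked.update(m["accounts"])
    return blocked


hits = query_adverse("win a free iphone n0w click here", ADVERSE_CLUSTERS)
maps = [network_map(c) for c in hits]
blocked_accounts = bulk_block(maps, set())
```

Lowering `threshold` widens each match and raising it narrows it, which is the same kind of control that the tuning interface object of claims 4 and 5 exposes for cluster membership.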
Combinations of networks · CPC title
Adversarial learning · CPC title
Supervised learning · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Generative networks · CPC title