Model training method, apparatus, and device, and data similarity determining method, apparatus, and device

US11288599B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11288599-B2
Application number	US-202016777659-A
Country	US
Kind code	B2
Filing date	Jan 30, 2020
Priority date	Jul 19, 2017
Publication date	Mar 29, 2022
Grant date	Mar 29, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A model training method includes: acquiring a plurality of user data pairs, wherein data fields of two sets of user data in each user data pair have an identical part; acquiring a user similarity corresponding to each user data pair, wherein the user similarity is a similarity between users corresponding to the two sets of user data in each user data pair; determining, according to the user similarity corresponding to each user data pair and the plurality of user data pairs, sample data for training a preset classification model; and training the classification model based on the sample data to obtain a similarity classification model.
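The first three steps of the abstract (forming user data pairs, attaching a similarity to each pair, and picking sample data) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the record layout, the choice of `address` as the shared field, exact equality as the test for "an identical part", the similarity values, and the rule that pairs at or above the threshold become positive samples are all hypothetical.

```python
from itertools import combinations

def build_pairs(users, field):
    """Pair up different users whose value for `field` is identical.

    Assumption: "an identical part" is modeled as exact equality on one field.
    """
    return [(a, b) for a, b in combinations(users, 2) if a[field] == b[field]]

def select_samples(pairs, similarity, threshold):
    """Split pairs into positive and negative samples by a similarity threshold.

    Assumption: similarity >= threshold means a positive sample.
    """
    positives, negatives = [], []
    for a, b in pairs:
        score = similarity[(a["id"], b["id"])]
        (positives if score >= threshold else negatives).append((a, b))
    return positives, negatives

# Hypothetical user records; none of these values come from the patent.
users = [
    {"id": 1, "surname": "Li", "address": "A"},
    {"id": 2, "surname": "Li", "address": "A"},
    {"id": 3, "surname": "Wang", "address": "B"},
    {"id": 4, "surname": "Li", "address": "A"},
]

# Hypothetical per-pair user similarity, e.g. derived from biological features.
similarity = {(1, 2): 0.93, (1, 4): 0.20, (2, 4): 0.35}

pairs = build_pairs(users, "address")
positives, negatives = select_samples(pairs, similarity, threshold=0.8)

print([(a["id"], b["id"]) for a, b in positives])  # [(1, 2)]
print([(a["id"], b["id"]) for a, b in negatives])  # [(1, 4), (2, 4)]
```

In this sketch the similarity only decides the label of each pair; the features the classifier would later be trained on are extracted separately from the paired user data.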

First claim

Opening claim text (preview).

The invention claimed is:

1. A model training method, comprising: acquiring a plurality of user data pairs, wherein each user data pair is acquired by comparing data fields of acquired user data to find two sets of user data corresponding to two different users, respectively, and having data fields that share an identical part to form the user data pair corresponding to the two different users; acquiring a user similarity corresponding to each user data pair, wherein the user similarity is a similarity between users corresponding to the two sets of user data in each user data pair; determining, according to the user similarity corresponding to each user data pair and the plurality of user data pairs, sample data for training a preset classification model, wherein the determining the sample data comprises: performing feature extraction on each user data pair in the plurality of user data pairs to obtain associated user features between the two sets of user data in each user data pair; and determining, according to the associated user features between the user data in each user data pair and the user similarity corresponding to each user data pair, the sample data for training the classification model, wherein the determining comprises: selecting positive sample features and negative sample features from user features corresponding to the plurality of user data pairs according to the user similarity corresponding to each user data pair and a predetermined similarity threshold; and using the positive sample features and the negative sample features as the sample data for training the classification model; and training the classification model based on the sample data to obtain a similarity classification model.

2. The method according to claim 1, wherein the acquiring the user similarity corresponding to each user data pair comprises: acquiring biological features of users corresponding to a first user data pair, wherein the first user data pair is any user data pair in the plurality of user data pairs; and determining a user similarity corresponding to the first user data pair according to the biological features of the users corresponding to the first user data pair.

3. The method according to claim 2, wherein the biological features comprise a facial image feature; the acquiring the biological features of the users corresponding to the first user data pair comprises: acquiring facial images of the users corresponding to the first user data pair; and performing feature extraction on the facial images to obtain facial image features of the users corresponding to the first user data pair; and the determining the user similarity corresponding to the first user data pair according to the biological features of the users corresponding to the first user data pair comprises: determining the user similarity corresponding to the first user data pair according to the facial image features of the users corresponding to the first user data pair.

4. The method according to claim 2, wherein the biological features comprise a speech feature; the acquiring biological features of users corresponding to the first user data pair comprises: acquiring speech data of the users corresponding to the first user data pair; and performing feature extraction on the speech data to obtain speech features of the users corresponding to the first user data pair; and the determining the user similarity corresponding to the first user data pair according to the biological features of the users corresponding to the first user data pair comprises: determining the user similarity corresponding to the first user data pair according to the speech features of the users corresponding to the first user data pair.

5. The method according to claim 1, wherein the associated user features comprise at least one of a household registration dimension feature, a name dimension feature, a social feature, or an interest feature, wherein the household registration dimension feature comprises a feature of user identity information, the name dimension feature comprises a feature of user name information and a feature of a degree of scarcity of a user surname, and the social feature comprises a feature of social relationship information of a user.

6. The method according to claim 1, wherein the positive sample features comprise the same quantity of features as the negative sample features.

7. The method according to claim 1, wherein the similarity classification model is a binary classifier model.

8. The method according to claim 1, further comprising: acquiring a to-be-detected user data pair, the to-be-detected user data pair including two sets of to-be-detected user data; performing feature extraction on each set of to-be-detected user data in the to-be-detected user data pair to obtain to-be-detected user features; and determining a similarity between users corresponding to the two sets of to-be-detected user data in the to-be-detected user data pair according to the to-be-detected user features and the similarity classification model.

9. The method according to claim 8, further comprising: determining to-be-detected users corresponding to the to-be-detected user data pair as twins if the similarity between the users corresponding to the two sets of to-be-detected user data in the to-be-detected user data pair is greater than a predetermined similarity classification threshold.

10. A model training device, comprising: a processor; and a memory configured to store instructions, wherein the processor is configured to execute the instructions to: acquire a plurality of user data pairs, wherein each user data pair is acquired by comparing data fields of acquired user data to find two sets of user data corresponding to two different users, respectively, and having data fields that share an identical part to form the user data pair corresponding to the two different users; acquire a user similarity corresponding to each user data pair, wherein the user similarity is a similarity between users corresponding to the two sets of user data in each user data pair; determine, according to the user similarity corresponding to each user data pair and the plurality of user data pairs, sample data for training a preset classification model, wherein determining the sample data comprises: performing feature extraction on each user data pair in the plurality of user data pairs to obtain associated user features between the two sets of user data in each user data pair; and determining, according to the associated user features between the user data in each user data pair and the user similarity corresponding to each user data pair, the sample data for training the classification model, wherein the determining comprises: selecting positive sample features and negative sample features from user features corresponding to the plurality of user data pairs according to the user similarity corresponding to each user data pair and a predetermined similarity threshold; and using the positive sample features and the negative sample features as the sample data for training the classification model; and train the classification model based on the sample data to obtain a similarity classification model.

11. The device according to claim 10, wherein the processor is further configured to execute the instructions to: acquire biological features of users corresponding to a first user data pair, wherein the first user data pair is any user data pair in the plurality of user data pairs; and determine a user similarity corresponding to the first user data pair according to the biological features of the users correspond
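As a rough illustration of how claims 6 to 9 fit together, the sketch below balances positive and negative sample features, trains a binary classifier, and flags a to-be-detected pair as twins when the predicted similarity exceeds a threshold. The claims do not name a specific classifier or balancing procedure, so the logistic regression, the random downsampling, the two-element feature vectors, and the 0.5 threshold are all assumptions made for this example.

```python
import math
import random

def balance(positives, negatives, seed=0):
    """Downsample the larger class so both classes have equal size (cf. claim 6)."""
    n = min(len(positives), len(negatives))
    rng = random.Random(seed)
    return rng.sample(positives, n), rng.sample(negatives, n)

def predict(w, b, x):
    """Predicted similarity in (0, 1) for one pair-level feature vector."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(samples, labels, epochs=500, lr=0.5):
    """Fit a logistic-regression binary classifier with plain SGD.

    Assumption: the patent only requires a binary classifier (claim 7);
    logistic regression is chosen here for brevity.
    """
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict(w, b, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def is_twin_pair(w, b, x, threshold=0.5):
    """Flag a pair as twins when predicted similarity exceeds the threshold (cf. claim 9)."""
    return predict(w, b, x) > threshold

# Hypothetical associated user features per pair:
# [same household registration (0 or 1), surname scarcity score].
pos_features = [[1.0, 0.9], [1.0, 0.8]]
neg_features = [[0.0, 0.1], [0.0, 0.3], [0.0, 0.2]]

pos, neg = balance(pos_features, neg_features)
samples = pos + neg
labels = [1] * len(pos) + [0] * len(neg)
w, b = train_logistic(samples, labels)

print(is_twin_pair(w, b, [1.0, 0.85]))
print(is_twin_pair(w, b, [0.0, 0.2]))
```

Downsampling is only one way to obtain equal class sizes; any method that yields the same number of positive and negative sample features would match the wording of claim 6.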

Assignees

Advanced New Technologies Co Ltd

Inventors

Classifications

G06N20/20 · Primary
Ensemble learning · CPC title
G06V10/774
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06V10/761
Proximity, similarity or dissimilarity measures · CPC title
G06N20/00 · Primary
Machine learning · CPC title
G06F18/22
Matching criteria, e.g. proximity measures · CPC title

Patent family

Related publications grouped by family.

Patent family 61059789

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11288599B2 cover?: A model training method includes: acquiring a plurality of user data pairs, wherein data fields of two sets of user data in each user data pair have an identical part; acquiring a user similarity corresponding to each user data pair, wherein the user similarity is a similarity between users corresponding to the two sets of user data in each user data pair; determining, according to the user sim…
Who is the assignee on this patent?: Advanced New Technologies Co Ltd
What technology area does this patent fall under?: Primary CPC classification G06N20/20. Mapped technology areas include Physics.
When was this patent published?: Publication date Mar 29, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Voice control of remote device by disabling wakeword detection

Determining area of interest in a panoramic video or photo

Method and apparatus for verifying user using multiple biometric verifiers

Computer based convolutional processing for image analysis

Methods and apparatus to identify a mood of media

Speaker identification device, speaker identification method, and recording medium

Method and system for authenticating biometric data
