Multi-view deep neural network for lidar perception
US-2021150230-A1 · May 20, 2021 · US
US2022284285A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022284285-A1 |
| Application number | US-202117563812-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 28, 2021 |
| Priority date | Mar 2, 2021 |
| Publication date | Sep 8, 2022 |
| Grant date | — |
Official abstract text for this publication.
A device for training a first machine learning-based model (MLM) for action recognition implements a training method. According to the training method, the training device obtains training data that comprises time sequences of data samples, which represent predefined subjects that are performing predefined actions. The training device trains the first MLM based on the training data, to discriminate between the predefined actions and to be adversarial to discrimination between the predefined subjects by a second MLM, and trains the second MLM based on feature data that is extracted by the first MLM for the training data, to discriminate between the predefined subjects. Thereby, the first MLM is encouraged to extract feature data that is unrelated to individual subjects, which improves action recognition performance of the trained first MLM when encountering new subjects.
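Read as an optimization problem, the abstract describes two coupled objectives. The formulation below is only a sketch under stated assumptions: the symbols $\theta$ (parameters of the first MLM), $\phi$ (parameters of the second MLM), the loss names, and the weighting factor $\lambda \ge 0$ are introduced here for illustration and do not appear in the publication. Combining the two losses of the first MLM as a weighted sum is likewise an assumption.

```latex
\begin{aligned}
\theta^{*} &= \arg\min_{\theta}\; L_{\mathrm{action}}(\theta) + \lambda\, L_{\mathrm{adv}}(\theta;\phi)
  && \text{(first MLM, } \phi \text{ held fixed)} \\
\phi^{*} &= \arg\min_{\phi}\; L_{\mathrm{subject}}(\phi;\theta)
  && \text{(second MLM, } \theta \text{ held fixed)}
\end{aligned}
```

Here $L_{\mathrm{action}}$ measures the mismatch between predicted and reference actions, $L_{\mathrm{subject}}$ measures how poorly the second MLM identifies subjects from the extracted features, and $L_{\mathrm{adv}}$ measures how much subject-related information the features still carry. One option named in the claims is $L_{\mathrm{adv}} = -L_{\mathrm{subject}}$; alternating the two updates with the other model's parameters frozen corresponds to claims 2 and 3.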
Opening claim text (preview).
What is claimed is:
1. A method of training a first machine learning-based model (MLM) for action recognition, said method comprising: obtaining training data comprising time sequences of data samples, wherein the time sequences of data samples represent predefined subjects which are performing predefined actions; training the first MLM based on the training data, to discriminate between the predefined actions; and training a second MLM based on feature data that is extracted by the first MLM for the training data, to discriminate between the predefined subjects; wherein the training of the first MLM is performed to be adversarial to the discrimination between the predefined subjects by the second MLM, wherein the training of the first MLM comprises: determining parameter values of the first MLM that minimizes a first loss function that represents a difference between action data generated by the first MLM and action reference data, which is predefined and associated with the training data, and that minimizes a second loss function that represents how much subject-related information is contained in the feature data.
2. The method of claim 1, wherein all parameter values of the second MLM are fixed during the training of the first MLM.
3. The method of claim 2, wherein all parameter values of the first MLM are fixed during the training of the second MLM.
4. The method of claim 2, wherein the second loss function represents a difference between subject identity data generated by the second MLM and target data, which is predefined and associated with the training data.
5. The method of claim 4, wherein the training of the second MLM comprises determining parameter values of the second MLM that minimizes a third loss function that represents a difference between the subject identity data generated by the second MLM and further target data, which is predefined and associated with the training data.
6. The method of claim 5, wherein the second loss function is a negation of the third loss function.
7. The method of claim 4, wherein the training of the first MLM results in a probability distribution over the predefined subjects, and wherein the target data comprises a reference probability distribution that represents fractional occurrences of the predefined subjects in the training data, and wherein the second loss function operates on the probability distribution and the reference probability distribution.
8. The method of claim 4, wherein the training of the first MLM, for a time sequence associated with a predefined action, results in a probability distribution over the predefined subjects, wherein the target data comprises a reference probability distribution that represents fractional occurrences of the predefined subjects in the training data for each predefined action, wherein the second loss function operates on a difference between the probability distribution and a corresponding reference probability distribution, wherein the corresponding reference probability distribution is associated with the predefined action.
9. The method of claim 8, wherein the second loss function aggregates, for the time sequences, differences between the probability distribution generated for each time sequence and the corresponding reference probability distribution.
10. The method of claim 1, wherein the subject identity data comprises a second probability value for at least one of the predefined subjects, and wherein the second loss function operates on the second probability value.
11. The method of claim 1, further comprising: obtaining deployment data comprising additional time sequences of data samples, wherein the additional time sequences represent additional predefined subjects performing non-categorized actions, and wherein the additional predefined subjects are included among the predefined subjects; including the deployment data in the training data; training the first MLM on at least part of the training data, to discriminate between the predefined actions, while excluding from the first loss function the action data that is generated by the first MLM for the additional time sequences; training the second MLM based on feature data extracted by the first MLM for said at least part of the training data, to discriminate between the predefined subjects; and evaluating the subject identity data and/or the action data generated by the training of the first MLM and the second MLM.
12. The method of claim 11, wherein said evaluating comprises: determining, based on the subject identity data generated by the second MLM, at least one selected subject among the additional predefined subjects; and indicating at least one of the additional group sequences that is performed by said at least one selected subject as a candidate to be categorized by action.
13. The method of claim 1, further comprising: obtaining deployment data comprising additional time sequences of data samples, wherein the additional time sequences represent additional predefined subjects performing non-categorized actions; including the deployment data in the training data; training the first MLM based on at least part of the training data, to discriminate between the predefined actions, while excluding from the first loss function the action data that is generated by the first MLM for the additional time sequences; training a third MLM based on feature data extracted by the first MLM for said at least part of the training data, to determine if the feature data originates from the deployment data; and evaluating output data generated by the third MLM during the training of the first MLM and/or the third MLM.
14. The method of claim 13, wherein the training of the first MLM is performed to be adversarial to the determination by the third MLM.
15. The method of claim 13, wherein said evaluating comprises: determining, based on the output data generated by the third MLM, at least one of the additional time sequences; and indicating the at least one of the additional time sequences as a candidate to be categorized by action.
16. The method of claim 1, wherein the first MLM comprises a sequence of processing layers and an action classification layer, which is directly or indirectly connected to the sequence of processing layers, wherein said feature data represents output data of at least one of the processing layers.
17. The method of claim 16, wherein one or more of the processing layers is a convolutional layer.
18. The method of claim 16, further comprising time-averaging the output data of said at least one of the processing layers, wherein the second MLM is trained based on the time-averaged output data.
19. The method of claim 16, wherein the second MLM is trained based on the output data of two or more processing layers in the sequence of processing layers.
20. The method of claim 1, wherein the second MLM comprises a plurality of subject classification networks which are operable in parallel, and wherein the subject classification networks differ by one or more of initialization values, network structure, or input data.
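Claims 7 to 9 describe a second loss that compares a predicted probability distribution over subjects with a reference distribution built from the fractional occurrences of subjects in the training data, optionally per action. The following is a minimal sketch of that idea, not the publication's implementation. The function names, the toy labels, the use of KL divergence as the difference measure, and the use of a mean as the aggregation are all assumptions made for illustration; the claims do not name a specific divergence or aggregation.

```python
import math
from collections import Counter, defaultdict


def reference_distribution(subject_ids, subjects):
    """Fraction of training sequences contributed by each subject (cf. claim 7)."""
    counts = Counter(subject_ids)
    total = len(subject_ids)
    return [counts[s] / total for s in subjects]


def per_action_reference(subject_ids, action_ids, subjects):
    """One reference distribution per predefined action (cf. claim 8)."""
    grouped = defaultdict(list)
    for subject, action in zip(subject_ids, action_ids):
        grouped[action].append(subject)
    return {a: reference_distribution(ids, subjects) for a, ids in grouped.items()}


def kl_divergence(reference, predicted, eps=1e-12):
    """KL(reference || predicted). Assumed difference measure, zero on a match."""
    return sum(
        r * math.log(r / max(p, eps))
        for r, p in zip(reference, predicted)
        if r > 0
    )


def second_loss(predictions, action_ids, references):
    """Mean divergence over all time sequences (cf. claim 9; mean is an assumption)."""
    terms = [kl_divergence(references[a], p) for p, a in zip(predictions, action_ids)]
    return sum(terms) / len(terms)


# Hypothetical toy data: four time sequences, two subjects, two actions.
subjects = ["s0", "s1"]
subject_ids = ["s0", "s0", "s1", "s0"]
action_ids = ["walk", "walk", "walk", "sit"]

refs = per_action_reference(subject_ids, action_ids, subjects)

# Predictions that only reproduce the label statistics carry no
# subject-specific information, so the loss vanishes.
uninformative = [refs[a] for a in action_ids]
# Predictions that confidently pick out the true subject are penalized.
confident = [[0.99, 0.01], [0.99, 0.01], [0.01, 0.99], [0.99, 0.01]]

loss_uninformative = second_loss(uninformative, action_ids, refs)
loss_confident = second_loss(confident, action_ids, refs)
```

Under these assumptions the loss is lowest when the subject classifier can do no better than guess according to how often each subject appears for a given action, which is the behavior the first MLM is trained toward.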
Static body considered as a whole, e.g. static pedestrian or occupant recognition · CPC title
Organisation of the process, e.g. bagging or boosting · CPC title
using neural networks · CPC title
Recognition of whole body movements, e.g. for sport training · CPC title
Combinations of networks · CPC title