System and method for robot teaching based on rgb-d images and teach pendant
US-2021023694-A1 · Jan 28, 2021 · US
US12437428B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12437428-B2 |
| Application number | US-202318125675-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 23, 2023 |
| Priority date | Jul 4, 2022 |
| Publication date | Oct 7, 2025 |
| Grant date | Oct 7, 2025 |
Official abstract text for this publication.
The present application relates to image processing and provides a method for training an image depth recognition model, a method for recognizing image depth, and an electronic device. The method obtains static objects, dynamic objects, and dynamic positions by performing instance segmentation on a first image and a second image. Target dynamic objects are selected from the dynamic objects of the first image, and feature dynamic objects are selected from the dynamic objects of the second image. A target image and a target projection image are generated according to the target dynamic objects and the feature dynamic objects. A depth recognition model is trained based on the target image and the target projection image. A to-be-recognized image is then recognized by the depth recognition model.
Opening claim text (preview).
What is claimed is:

1. A method for training an image depth recognition model by using an electronic device, the method comprising:
obtaining a first image and a second image;
obtaining a first static object, a plurality of first dynamic objects and a first dynamic position of each first dynamic object by performing an instance segmentation on the first image, obtaining a second static object and a plurality of second dynamic objects by performing an instance segmentation on the second image, through an instance segmentation model;
selecting a plurality of target dynamic objects from the plurality of first dynamic objects based on a number of pixel points in each first dynamic object and preset positions, and selecting a plurality of feature dynamic objects from the plurality of second dynamic objects based on the number of pixel points in each second dynamic object and the preset positions;
recognizing whether each target dynamic object has a corresponding feature dynamic object and determining the target dynamic object and corresponding feature dynamic object as recognition objects;
recognizing an object state of the target dynamic object in the recognition objects according to a dynamic posture matrix corresponding to the recognition objects, a static posture matrix corresponding to the first static object, a static posture matrix corresponding to the second static object, and a preset threshold matrix;
generating a target image according to the object state, the first dynamic position and the first image, and generating a target projection image according to the object state, the first dynamic position and the target image;
obtaining an image depth recognition model by training a preset depth recognition network, based on a gradient error between an initial depth image corresponding to the first image and the target image and a photometric error between the target projection image and the target image.

2. The method for training an image depth recognition model according to claim 1, wherein the instance segmentation model comprises a feature extraction layer, a classification layer, and a mapping layer, wherein obtaining a first static object, a plurality of first dynamic objects and a first dynamic position of each first dynamic object by performing an instance segmentation on the first image comprises:
standardizing the first image and obtaining a standardized image;
performing a feature extraction on the standardized image through the feature extraction layer and obtaining an initial feature map;
segmenting the standardized image to obtain a rectangular area corresponding to each pixel point in the initial feature map, based on a multiple relation between a size of the initial feature image and a size of the standardized image and a convolution step in the feature extraction layer;
classifying the initial feature map and obtains a prediction probability that each pixel point in the initial feature map belongs to a first preset category through the classification layer;
determining a plurality of pixel points corresponding to the prediction probability with a value greater than a preset threshold in the initial feature map as a plurality of target pixel points;
determining a plurality of rectangular areas corresponding to the plurality of target pixel points as a plurality of feature areas;
mapping each feature area into the initial feature map through the mapping layer, and obtaining a plurality of mapping areas;
dividing the plurality of mapping areas based on a preset quantity and obtaining a plurality of partition areas;
determining a center pixel point in each partition area;
calculating a pixel value of the center pixel point;
pooling the pixel value of the center pixel point and obtaining a mapping probability value corresponding to each mapping area;
restoring the plurality of mapping areas and obtaining a target feature map by splicing the plurality of restored mapping areas;
generating the first static object, the plurality of first dynamic objects and the first dynamic position of each first dynamic object according to the target feature map, the mapping probability value, the plurality of restored mapping areas and a second preset category.

3. The method for training an image depth recognition model according to claim 2, wherein generating the first static object, the plurality of first dynamic objects and the first dynamic position of each first dynamic object according to the target feature map, the mapping probability value, the plurality of restored mapping areas and a second preset category comprises:
classifying each pixel point in the target feature map according to the mapping probability value and the second preset category, and obtaining a pixel point category of each pixel point in the restored mapping areas;
determining areas composed of a plurality of pixel points corresponding to the same pixel point category in the restored mapping areas as a first object;
acquiring pixel coordinates of all pixel points in the first object;
determining the pixel coordinates as a first position corresponding to the first object;
determining whether the first object is the first dynamic object or the first static object according to preset rules and determining the first position corresponding to the first dynamic object as the first dynamic position.

4. The method for training an image depth recognition model according to claim 1, wherein selecting a plurality of target dynamic objects from the plurality of first dynamic objects based on the number of pixel points in each first dynamic object and preset positions comprises:
calculating the number of the pixel points in each first dynamic object;
sorting the plurality of first dynamic objects according to the number of pixel points;
selecting the sorted first dynamic object at the preset positions as the plurality of target dynamic objects.

5. The method for training an image depth recognition model according to claim 1, wherein recognizing whether each target dynamic object has a corresponding feature dynamic object comprises:
acquiring a plurality of target element information of each target dynamic object;
acquiring feature element information in the feature dynamic object with a same category as each target element information;
matching each target element information with feature element information of the same category to obtain a matching value;
when the matching value is within a preset interval, determining that the target dynamic object has a corresponding feature dynamic object.

6. The method for training an image depth recognition model according to claim 1, wherein recognizing an object state of the target dynamic object in the recognition objects according to a dynamic posture matrix corresponding to the recognition objects, a static posture matrix corresponding to the first static object, a static posture matrix corresponding to the second static object, and a preset threshold matrix comprises:
performing a subtraction operation on each matrix element in the static posture matrix and a corresponding matrix element in the dynamic posture matrix corresponding to the recognition objects and obtaining a plurality of posture differences;
taking an absolute value of the plurality of posture differences and obtaining posture absolute values corresponding to the static posture matrix;
arranging the posture absolute values and obtaining a posture absolute value matrix, according to an element position of posture absolute values;
comparing each posture absolute value in the posture absolute value matrix with a corresponding posture threshold in the preset threshold matrix;
when there is at least one posture absolute value greater than the corresponding posture threshold in the post
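
For orientation only, the selection step of claim 4 and the element-wise comparison of claim 6 can be sketched in a few lines of Python. This is a minimal illustration and not the patented implementation. The function names (`select_by_pixel_count`, `posture_abs_diff`, `exceeds_threshold`) are hypothetical, and the descending sort order and 0-based preset positions are assumptions, since the claim text fixes neither. Because the preview of claim 6 is cut off, the sketch only reports whether any absolute posture difference exceeds its threshold and does not assign an object state.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def select_by_pixel_count(pixel_counts: Dict[str, int],
                          preset_positions: Sequence[int]) -> List[str]:
    """Claim 4 sketch: rank objects by pixel count and pick preset ranks.

    Assumptions (not stated in the claim): descending order, 0-based
    positions, and positions outside the ranking are silently skipped.
    """
    ranked = sorted(pixel_counts, key=lambda name: pixel_counts[name],
                    reverse=True)
    return [ranked[p] for p in preset_positions if 0 <= p < len(ranked)]


def posture_abs_diff(static_pose: Matrix, dynamic_pose: Matrix) -> Matrix:
    """Claim 6 sketch: element-wise |static - dynamic|, positions preserved."""
    return [[abs(s - d) for s, d in zip(s_row, d_row)]
            for s_row, d_row in zip(static_pose, dynamic_pose)]


def exceeds_threshold(abs_diff: Matrix, thresholds: Matrix) -> bool:
    """True if at least one absolute difference is above its threshold."""
    return any(a > t
               for a_row, t_row in zip(abs_diff, thresholds)
               for a, t in zip(a_row, t_row))


if __name__ == "__main__":
    # Hypothetical pixel counts for three segmented dynamic objects.
    counts = {"car": 500, "person": 120, "bike": 300}
    print(select_by_pixel_count(counts, [0, 1]))

    # Hypothetical 2x2 posture matrices and a uniform threshold matrix.
    diff = posture_abs_diff([[1.0, 0.0], [0.0, 1.0]],
                            [[0.9, 0.3], [0.0, 1.0]])
    print(exceeds_threshold(diff, [[0.2, 0.2], [0.2, 0.2]]))
```

Keeping the comparison separate from any state assignment leaves open what the truncated claim concludes when a threshold is exceeded.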
Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components · CPC title
Range image; Depth image; 3D point clouds · CPC title
Feature selection, e.g. selecting representative features from a multi-dimensional feature space · CPC title
using classification, e.g. of video objects · CPC title
Proximity, similarity or dissimilarity measures · CPC title