Real-Time Object Detection Using Depth Sensors
US-2021089841-A1 · Mar 25, 2021 · US
US11182633B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11182633-B2 |
| Application number | US-201916676397-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 6, 2019 |
| Priority date | Nov 12, 2018 |
| Publication date | Nov 23, 2021 |
| Grant date | Nov 23, 2021 |
Official abstract text for this publication.
A learning method is performed by a computer. The method includes: inputting a first image to a model, which outputs, from an input image, candidates for a specific region and confidences indicating probabilities of the respective candidates being the specific region, to cause the model to output a plurality of candidates for the specific region and confidences for the respective candidates; calculating a first value for each of candidates whose confidences do not satisfy a certain criterion among the candidates output by the model, the first value increasing as the confidence increases; calculating a second value obtained by weighting the first value such that the second value decreases as the confidence increases; and updating the model such that the second value decreases.
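The weighting described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's implementation: the abstract does not say what the "certain criterion", the first value, or the weighting function are. Here the criterion is assumed to be `confidence >= threshold`, the first value is a placeholder `1 + confidence`, and the weight is a placeholder `(1 - confidence) ** gamma`. All function names and parameters (`first_value`, `weight`, `second_value`, `negative_loss`, `threshold`, `gamma`) are hypothetical. The placeholders were picked only because they make the stated properties easy to check: the first value rises with confidence, the weight falls with confidence, and their product also falls with confidence.

```python
def first_value(conf):
    """Placeholder error term (assumption): increases as confidence increases."""
    return 1.0 + conf


def weight(conf, gamma=2.0):
    """Placeholder weighting function (assumption): monotonically decreasing on [0, 1]."""
    return (1.0 - conf) ** gamma


def second_value(conf, gamma=2.0):
    """First value multiplied by the weight; decreases as confidence increases."""
    return first_value(conf) * weight(conf, gamma)


def negative_loss(confidences, threshold=0.5, gamma=2.0):
    """Sum of second values over candidates that fail the assumed criterion.

    A candidate 'does not satisfy the criterion' here when its confidence is
    below `threshold` (an assumption made for this sketch).
    """
    negatives = [c for c in confidences if c < threshold]
    return sum(second_value(c, gamma) for c in negatives)


if __name__ == "__main__":
    for c in (0.1, 0.3, 0.45):
        print(c, first_value(c), second_value(c))
    print(negative_loss([0.9, 0.2, 0.4]))
```

In a training loop, a quantity like `negative_loss` would be the term the model update tries to reduce. The effect of the weighting is that a below-criterion candidate with relatively high confidence contributes less to the update than one with low confidence.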
Opening claim text (preview).
What is claimed is:

1. A non-transitory computer-readable storage medium having stored therein a learning program for causing a computer to execute a process, the process comprising: inputting a first image to a model to cause the model to output a plurality of first candidates for a specific region and confidences for the respective first candidates; calculating a first value for each of candidates whose confidence does not satisfy a certain criterion among the plurality of first candidates output by the model, the first value increasing as the confidence increases; calculating a second value obtained by weighting the first value such that the second value decreases as the confidence increases; and updating the model such that the second value decreases, wherein in the calculating of the second value, the second value is calculated by multiplying the first value by an output value obtained by inputting the confidence to a certain function whose output value monotonically decreases with respect to an input value.

2. The non-transitory computer-readable storage medium according to claim 1, wherein in the calculating of the second value, the second value is calculated for each of candidates ranked in a certain place or higher in ranking of magnitudes of the first values of the candidates whose confidences do not satisfy the certain criterion.

3. The non-transitory computer-readable storage medium according to claim 1, the process further comprising: selecting, as a selected region, a candidate whose confidence satisfies the certain criterion from among the plurality of first candidates output by the model, wherein in the updating, the model is updated such that both the second value and a third value decrease, the third value indicating a magnitude of a difference of the selected region from a region that is set to be true in advance in the first image.

4. The non-transitory computer-readable storage medium according to claim 1, wherein in the inputting, the first image is input to the model to cause the model to output the plurality of first candidates for a gripping position of an object and the confidences for the respective first candidates.

5. The non-transitory computer-readable storage medium according to claim 1, the process further comprising: inputting a second image to the model that has been updated in the updating of the model to cause the model to output second candidates and confidences for the respective second candidates; and detecting, as a detected region, a candidate with a highest confidence among the second candidates.

6. The non-transitory computer-readable storage medium according to claim 5, wherein in the inputting of the first image, the first image is input to the model to cause the model to output the plurality of first candidates for a gripping position of an object and confidences for the respective candidates, in the inputting of the second image, the second image is input to the model to cause the model to output the second candidates for the gripping position and the confidences for the respective second candidates, and in the detecting, the candidate with the highest confidence is detected as the gripping position among the second candidates.

7. The non-transitory computer-readable storage medium according to claim 6, the process further comprising: outputting the gripping position detected in the detecting to a gripping apparatus that controls a robot for gripping the object.

8. The non-transitory computer-readable storage medium according to claim 1, wherein the model is a convolutional neural network implementing teachingless learning.

9. The non-transitory computer-readable storage medium to claim 1, wherein the candidates, whose confidences do not satisfy the certain criteria, are classified as negative example candidates, and the updating the model based on the second values reduces error within the model by correcting error relating to the negative example candidates.

10. A learning method performed by a computer, the method comprising: inputting a first image to a model to cause the model to output a plurality of first candidates for a specific region and confidences for the respective first candidates; calculating a first value for each of candidates whose confidence does not satisfy a certain criterion among the plurality of first candidates output by the model, the first value increasing as the confidence increases; calculating a second value obtained by weighting the first value such that the second value decreases as the confidence increases; and updating the model such that the second value decreases, wherein in the calculating of the second value, the second value is calculated by multiplying the first value by an output value obtained by inputting the confidence to a certain function whose output value monotonically decreases with respect to an input value.

11. The learning method according to claim 10, the method further comprising: inputting a second image to the model that has been updated in the updating of the model to cause the model to output second candidates and confidences for the respective second candidates; and detecting, as a detected region, a candidate with a highest confidence among the second candidates.

12. A learning apparatus comprising: a memory, and a processor coupled to the memory and configured to: input a first image to a model to cause the model to output a plurality of first candidates for a specific region and confidences for the respective first candidates; calculate a first value for each of candidates whose confidence does not satisfy a certain criterion among the plurality of first candidates output by the model, the first value increasing as the confidence increases; calculate a second value obtained by weighting the first value such that the second value decreases as the confidence increases; and update the model such that the second value decreases, wherein the second value is calculated by multiplying the first value by an output value obtained by inputting the confidence to a certain function whose output value monotonically decreases with respect to an input value.

13. The learning apparatus according to claim 12, the processor further configured to: input a second image to the model that has been updated in the update of the model to cause the model to output second candidates and confidences for the respective second candidates; and detect, as a detected region, a candidate with a highest confidence among the second candidates.
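Two of the dependent claims lend themselves to a short illustration: the ranking restriction of claim 2 (only candidates ranked in a certain place or higher by first value contribute) and the detection step of claims 5, 11, and 13 (pick the candidate with the highest confidence). The sketch below is hypothetical throughout. The criterion is assumed to be `confidence >= threshold`, the first value and weight are placeholder functions, and the names `top_k_negative_loss`, `detect`, `threshold`, and `k` do not come from the patent.

```python
import heapq


def first_value(conf):
    """Placeholder error term (assumption): increases as confidence increases."""
    return 1.0 + conf


def weight(conf):
    """Placeholder weighting function (assumption): monotonically decreasing on [0, 1]."""
    return (1.0 - conf) ** 2


def top_k_negative_loss(confidences, threshold=0.5, k=2):
    """Weighted loss over the k below-criterion candidates with the largest first values."""
    # Assumed criterion: a candidate fails it when confidence < threshold.
    negatives = [c for c in confidences if c < threshold]
    # Rank the failing candidates by first value and keep the top k.
    ranked = heapq.nlargest(k, negatives, key=first_value)
    return sum(first_value(c) * weight(c) for c in ranked)


def detect(candidates):
    """Return the (region, confidence) pair with the highest confidence."""
    return max(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    print(top_k_negative_loss([0.9, 0.1, 0.2, 0.4], k=2))
    print(detect([("left edge", 0.2), ("handle", 0.7), ("rim", 0.5)]))
```

In the gripping use case of claims 4, 6, and 7, each region would be a candidate gripping position, and the pair returned by `detect` would be what is passed on to the gripping apparatus.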
characterised by the hand, wrist, grip control · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Scenes; Scene-specific elements (control of digital cameras H04N23/60) · CPC title
using neural networks · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title