Object classification in image data using machine learning models

US10289925B2 · US · B2

Patent metadata
Publication number: US-10289925-B2
Application number: US-201615363835-A
Country: US
Kind code: B2
Filing date: Nov 29, 2016
Priority date: Nov 29, 2016
Publication date: May 14, 2019
Grant date: May 14, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Combined color and depth data for a field of view is received. Thereafter, using at least one bounding polygon algorithm, at least one proposed bounding polygon is defined for the field of view. It can then be determined, using a binary classifier having at least one machine learning model trained using a plurality of images of known objects, whether each proposed bounding polygon encapsulates an object. The image data within each bounding polygon that is determined to encapsulate an object can then be provided to a first object classifier having at least one machine learning model trained using a plurality of images of known objects, to classify the object encapsulated within the respective bounding polygon. Further, the image data within each bounding polygon that is determined to encapsulate an object is provided to a second object classifier having at least one machine learning model trained using a plurality of images of known objects, to classify the object encapsulated within the respective bounding polygon. A final classification for each bounding polygon is then determined based on the output of the first classifier machine learning model and the output of the second classifier machine learning model.
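The steps in the abstract can be sketched as a small pipeline. This is an illustrative sketch only, not the patented implementation: the names `crop` and `classify_frame`, the pixel and box layouts, and the callables passed in are all hypothetical. The abstract also does not say how the two classifier outputs are combined, so averaging per-label scores is an assumption made here for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical data layout: a pixel is (r, g, b, depth); a frame is a 2D grid.
Pixel = Tuple[int, int, int, float]
Frame = List[List[Pixel]]
Box = Tuple[int, int, int, int]   # (row0, col0, row1, col1), end-exclusive
Scores = Dict[str, float]         # label -> score


def crop(frame: Frame, box: Box) -> Frame:
    """Return the image data inside an axis-aligned bounding box."""
    r0, c0, r1, c1 = box
    return [row[c0:c1] for row in frame[r0:r1]]


def classify_frame(
    frame: Frame,
    propose: Callable[[Frame], List[Box]],        # bounding polygon algorithm
    contains_object: Callable[[Frame], bool],     # binary classifier
    classifier_a: Callable[[Frame], Scores],      # first object classifier
    classifier_b: Callable[[Frame], Scores],      # second object classifier
) -> List[Tuple[Box, str]]:
    """Propose boxes, filter them, classify twice, and fuse the two outputs."""
    results: List[Tuple[Box, str]] = []
    for box in propose(frame):
        patch = crop(frame, box)
        if not contains_object(patch):
            continue  # proposals without an object are dropped
        a = classifier_a(patch)
        b = classifier_b(patch)
        # Assumed fusion rule: average the two scores for every label seen.
        fused = {
            label: (a.get(label, 0.0) + b.get(label, 0.0)) / 2
            for label in set(a) | set(b)
        }
        # Sorting first makes ties resolve alphabetically and deterministically.
        best = max(sorted(fused), key=fused.get)
        results.append((box, best))
    return results
```

Passing the stages in as callables keeps the sketch independent of any particular model; axis-aligned boxes stand in for the more general bounding polygons named in the abstract.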

First claim

Opening claim text (preview).

What is claimed is:

1. A method for implementation by one or more data processors forming part of at least one computing system, the method comprising: receiving combined color and depth data for a field of view; defining, using at least one bounding polygon algorithm, at least one proposed bounding polygon for the field of view; determining, using a binary classifier having at least one machine learning model trained using a plurality of images of known objects, whether each proposed bounding polygon encapsulates an object; providing the image data within each bounding polygon that is determined to encapsulate an object to a first object classifier having at least one machine learning model trained using a plurality of images of known objects, to classify the object encapsulated within the respective bounding polygon; providing the image data within each bounding polygon that is determined to encapsulate an object to a second object classifier having at least one machine learning model trained using a plurality of images of known objects, to classify the object encapsulated within the respective bounding polygon; determining a final classification for each bounding polygon based on the output of the first classifier machine learning model and the output of the second classifier machine learning model; and providing data characterizing the final classification for each bounding polygon; wherein at least one of the binary classifier, the first object classifier, or the second object classifier utilizes a machine learning model that is selected amongst a plurality of machine learning models based on a type of object encapsulated within the corresponding bounding polygon.

2. The method of claim 1, wherein the at least one first classifier machine learning model is a region and measurements-based convolutional neural network.

3. The method of claim 1, wherein the combined color and depth image data is RGB-D data.

4. The method of claim 1, wherein the first object classifier uses metadata characterizing each object.

5. The method of claim 4, wherein the metadata is extracted from the combined color and image data.

6. The method of claim 1, wherein the at least one machine learning model of the binary classifier is one or more of: a neural network, a convolutional neural network, a logistic regression model, a support vector machine, decision trees, ensemble model, k-nearest neighbors model, linear regression model, naïve Bayes model, a logistic regression model, and/or a perceptron model.

7. The method of claim 1, wherein the at least one machine learning model of the first object classifier is one or more of: a neural network, a convolutional neural network, a logistic regression model, a support vector machine, decision trees, ensemble model, k-nearest neighbors model, linear regression model, naïve Bayes model, a logistic regression model, and/or a perceptron model.

8. The method of claim 1, wherein the at least one machine learning model of the second object classifier is one or more of: a neural network, a convolutional neural network, a logistic regression model, a support vector machine, decision trees, ensemble model, k-nearest neighbors model, linear regression model, naïve Bayes model, a logistic regression model, and/or a perceptron model.

9. The method of claim 1 further comprising: discarding proposed bounding polygons determined, by the binary classifier, to not include an object.

10. The method of claim 1, wherein the providing data characterizing the final classification for each bounding polygon comprises at least one of: displaying the data characterizing the final classification for each bounding polygon in an electronic visual display, loading the data characterizing the final classification for each bounding polygon into memory, storing the data characterizing the final classification for each bounding polygon in persistence, or transmitting the data characterizing the final classification for each bounding polygon to a remote computing device.

11. A method for implementation by one or more data processors forming part of at least one computing device, the method comprising: receiving RGB-data for a field of view; defining, using at least one bounding polygon algorithm, at least one bounding polygon for the field of view; determining, using a binary classifier machine learning model trained using a plurality of images of known objects, whether each bounding polygon encapsulates one of the known objects; providing the image data within each bounding polygon that is determined to encapsulate one of the known objects to a select one or more a plurality of classifier machine learning models trained using a plurality of images of known objects, to classify the known objects; and providing data characterizing the classification of the known objects; wherein: the select one or more of the plurality of classifier machine learning models to which the image data is provided are selected based on metadata associated with the RGB-data; the metadata associated with the RGB-data acts as a pre-classifier.

12. The method of claim 11, wherein the RGB data is RGB-D data.

13. A system comprising: at least one data processor; and memory storing instructions which, when executed by the at least one data processor, result in operations comprising: receiving combined color and depth data for a field of view; defining, using at least one bounding polygon algorithm, at least one proposed bounding polygon for the field of view; determining, using a binary classifier having at least one machine learning model trained using a plurality of images of known objects, whether each proposed bounding polygon encapsulates an object; providing the image data within each bounding polygon that is determined to encapsulate an object to a first object classifier having at least one machine learning model trained using a plurality of images of known objects, to classify the object encapsulated within the respective bounding polygon; providing the image data within each bounding polygon that is determined to encapsulate an object to a second object classifier having at least one machine learning model trained using a plurality of images of known objects, to classify the object encapsulated within the respective bounding polygon, the first object classifier being a different type than the second object classifier, the at least one machine learning model of the second object classifier comprising a bag-of-word (BoW) model that treats image features as words; determining a final classification for each bounding polygon based on the output of the first classifier machine learning model and the output of the second classifier machine learning model; and providing data characterizing the final classification for each bounding polygon.

14. The system of claim 13, wherein the at least one first classifier machine learning model is a region and measurements-based convolutional neural network.

15. The system of claim 13, wherein the combined color and depth image data is RGB-D data.

16. The system of claim 13, wherein the first object classifier uses metadata characterizing each object, the metadata being extracted from the combined color and image data.

17. The system of claim 13, wherein the at least one machine learning model of the binary classifier is one or more of: a neural network, a convolutional neural network, a logistic regression model, a support vector machine, decision trees, ensemble model, k-nearest neighbors model, linear regression model, naïve Bayes model,
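Claim 13 requires the second object classifier to include a bag-of-word (BoW) model that treats image features as words. A minimal sketch of that general idea follows, assuming local feature descriptors have already been extracted and a codebook of "visual words" already exists. The claim text shown here specifies neither, so `nearest_word`, `bow_histogram`, the Euclidean distance, and the normalization are illustrative assumptions rather than the patent's method.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codebook entry closest to the descriptor (Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: math.dist(descriptor, codebook[i]),
    )


def bow_histogram(
    descriptors: Sequence[Vector], codebook: Sequence[Vector]
) -> List[float]:
    """Count how often each visual word occurs, normalized to sum to 1."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)  # no features found in the patch
    return [count / total for count in counts]
```

The resulting fixed-length histogram could then be fed to any conventional classifier, in the same way a word-count vector is used for text.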

Assignees

  • SAP SE

Inventors

Classifications

  • G06N20/00 · Primary

    Machine learning · CPC title

  • relating to the classification model, e.g. parametric or non-parametric approaches · CPC title

  • Multiple classes · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10289925B2 cover?
Combined color and depth data for a field of view is received. Thereafter, using at least one bounding polygon algorithm, at least one proposed bounding polygon is defined for the field of view. It can then be determined, using a binary classifier having at least one machine learning model trained using a plurality of images of known objects, whether each proposed bounding polygon encapsulates …
Who is the assignee on this patent?
SAP SE
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date May 14, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).