Systems and methods for automated object recognition

US10482345B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10482345-B2
Application number: US-201715630913-A
Country: US
Kind code: B2
Filing date: Jun 22, 2017
Priority date: Jun 23, 2016
Publication date: Nov 19, 2019
Grant date: Nov 19, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for recognizing an object in a video stream may include receiving a video stream from a video source, the video stream comprising a plurality of video frames. The method may also include selecting at least one video frame from the video frames according to a frame selection rate. The method may also include partitioning the selected video frame into a first plurality of image blocks. The method may also include recognizing, out of the first plurality of image blocks, a second plurality of image blocks which comprise an image of an object, the recognition being based on an image recognition parameter determined by a machine-learning algorithm. The method may also include determining that at least one of the second plurality of image blocks corresponds to the object based on a likelihood metric, the likelihood metric being determined by the processor based on at least the frame selection rate. The method may further include displaying, on a display, information identifying the object. A system and non-transitory computer-readable medium may also be provided.
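The first two steps of the abstract (frame selection and block partitioning) can be sketched in a few lines of Python. This is only an illustrative reading, not the patented implementation: it assumes the frame selection rate means "keep every N-th frame", that a frame is a plain grid of pixel values, and that blocks are non-overlapping tiles. The names `select_frames`, `partition_into_blocks`, `selection_rate`, `block_h`, and `block_w` are hypothetical and do not appear in the patent.

```python
from typing import List, Sequence, Tuple

# Assumed representation: a grayscale frame is a list of rows of pixel values.
Frame = List[List[int]]


def select_frames(frames: Sequence[Frame], selection_rate: int) -> List[Frame]:
    """Keep every `selection_rate`-th frame, starting with the first.

    Assumption: the "frame selection rate" is modelled as a sampling step.
    """
    if selection_rate < 1:
        raise ValueError("selection_rate must be >= 1")
    return [frame for i, frame in enumerate(frames) if i % selection_rate == 0]


def partition_into_blocks(
    frame: Frame, block_h: int, block_w: int
) -> List[Tuple[int, int, Frame]]:
    """Split a frame into non-overlapping tiles.

    Returns (top, left, tile) triples. Tiles on the right and bottom
    edges may be smaller than block_h x block_w.
    """
    if block_h < 1 or block_w < 1:
        raise ValueError("block dimensions must be >= 1")
    height = len(frame)
    width = len(frame[0]) if frame else 0
    blocks = []
    for top in range(0, height, block_h):
        for left in range(0, width, block_w):
            tile = [row[left:left + block_w] for row in frame[top:top + block_h]]
            blocks.append((top, left, tile))
    return blocks


if __name__ == "__main__":
    # Ten synthetic 4x6 frames; frame k is filled with the value k.
    video = [[[k] * 6 for _ in range(4)] for k in range(10)]
    selected = select_frames(video, selection_rate=3)
    tiles = partition_into_blocks(selected[0], block_h=2, block_w=4)
    print(len(selected), "frames selected,", len(tiles), "blocks per frame")
```

A recognizer would then score each tile. The patent leaves the machine-learning algorithm and the image recognition parameter unspecified at this level, so they are omitted from the sketch.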

First claim

Opening claim text (preview).

What is claimed is:

1. A system for recognizing an object in a video stream, comprising: a memory storing instructions; and a processor configured to execute the stored instructions to: receive the video stream from a video source, the video stream comprising a first set of video frames; select at least one video frame from the first set of the video frames according to a frame selection rate, wherein the frame selection rate determines a number of the selected video frames; partition the one or more selected video frames into one or more sets of image blocks, each set of image blocks corresponding to a respective video frame; identify, within one or more sets of image blocks, a region which comprise an image of an object, the identification being based on a machine-learning algorithm for determining regions characterized by an image recognition parameter; calculate a likelihood metric that the region corresponds to the object; adjust the frame selection rate when the likelihood metric is less than a predetermined threshold; and display, on a display, information identifying the object.

2. The system of claim 1, wherein the processor is further configured to execute the stored instructions to apply the machine-learning algorithm to a second set of video frames.

3. The system of claim 1, wherein the processor is further configured to execute the stored instructions to adjust the image recognition parameter in response to an input received from a user device.

4. The system of claim 1, wherein determining that the region corresponds to the object comprises: comparing the likelihood metric to a predetermined threshold; and when it is determined that the likelihood metric exceeds or equals a predetermined threshold, determining that the region corresponds to the object.

5. The system of claim 1, wherein the processor is further configured to execute the stored instructions to adjust the frame selection rate in response to an input from a user device.

6. The system of claim 1, wherein the processor is further configured to execute the stored instructions to determine the frame selection rate based on at least one of an image quality of the video stream, a location of the object in the one or more selected video frames, or a viewable angle of the object in the one or more selected video frames.

7. The system of claim 1, wherein the processor is further configured to execute the stored instructions to determine the likelihood metric based additionally on the information identifying the object.

8. The system of claim 7, wherein the information identifying the object comprises information specifying at least one of a price of the object, availability of the object, or a location of the object.

9. The system of claim 1, wherein the processor is further configured to execute the stored instructions to receive the information identifying the object from at least one data server.

10. The system of claim 1, wherein: the processor is further configured to execute the stored instructions to display the video stream; and displaying the information identifying the object comprises displaying a digital shopping cart over the displayed video stream.

11. A computer-implemented method for recognizing an object in a video stream, comprising: receiving the video stream from a video source, the video stream comprising a first set of video frames; selecting at least one video frame from the first set of the video frames according to a frame selection rate, wherein the frame selection rate determines a number of the selected video frames; partitioning the one or more selected video frames into one or more sets of image blocks, each set of image blocks corresponding to a respective video frame; identifying within one or more sets, of image blocks, a region which comprise an image of an object, the identification being based on a machine-learning algorithm for determining regions characterized by an image recognition parameter; calculating a likelihood metric that the region corresponds to the object; adjusting the frame selection rate when the likelihood metric is less than a predetermined threshold; and displaying, on a display, information identifying the object.

12. The method of claim 11, further comprising: applying the machine-learning algorithm to a second set of video frames.

13. The method of claim 11, wherein determining that the region corresponds to the object comprises: comparing the likelihood metric to a predetermined threshold; and when it is determined that the likelihood metric exceeds or equals a predetermined threshold, determining that the region corresponds to the object.

14. The method of claim 11, further comprising: determining the frame selection rate based on at least one of an image quality of the video stream, a location of the object in the one or more selected video frames, or a viewable angle of the object in the one or more selected video frames.

15. The method of claim 11, further comprising: determining the likelihood metric based additionally on the information identifying the object.

16. A non-transitory computer-readable medium storing instructions which, when executed, cause at least one processor to perform operations for recognizing an object in a video stream, the operations comprising: receiving the video stream from a video source, the video stream comprising a first set of video frames; selecting at least one video frame from the first set of the video frames according to a frame selection rate, wherein the frame selection rate determines a number of the selected video frames; partitioning the one or more selected video frames into one or more sets of image blocks, each set of image blocks corresponding to a respective video frame; identifying within one or more sets, of image blocks, a region which comprise an image of an object, the identification being based on a machine-learning algorithm for determining regions characterized by an image recognition parameter; calculating a likelihood metric that the region corresponds to the object; adjusting the frame selection rate when the likelihood metric is less than a predetermined threshold; and displaying, on a display, information identifying the object.

17. The non-transitory computer-readable medium of claim 16, wherein the operations further comprise: applying the machine-learning algorithm to a second set of video frames.

18. The non-transitory computer-readable medium of claim 16, wherein determining that the region corresponds to the object comprises: comparing the likelihood metric to a predetermined threshold; and when it is determined that the likelihood metric exceeds or equals a predetermined threshold, determining that the region corresponds to the object.

19. The non-transitory computer-readable medium of claim 16, wherein the operations further comprise: determining the frame selection rate based on at least one of an image quality of the video stream, a location of the object in the one or more selected video frames, or a viewable angle of the object in the one or more selected video frames.

20. The non-transitory computer-readable medium of claim 16, wherein the operations further comprise: determining the likelihood metric based additionally on the information identifying the object.
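The feedback loop in the independent claims (calculate a likelihood metric, compare it to a threshold, adjust the frame selection rate when the metric falls short) might look roughly like the sketch below. Several things here are assumptions rather than claim language: the claims do not say in which direction the rate is adjusted, so this sketch samples more densely by halving a sampling step; the scoring function `score_frame` stands in for the unspecified machine-learning recognizer; and the names `recognize_with_adaptive_rate`, `Detection`, `initial_step`, and `min_step` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Detection:
    frame_index: int
    likelihood: float
    accepted: bool  # True when likelihood >= threshold (the "exceeds or equals" test)


def recognize_with_adaptive_rate(
    frames: Sequence[object],
    score_frame: Callable[[object], float],
    threshold: float = 0.8,
    initial_step: int = 8,
    min_step: int = 1,
) -> Tuple[List[Detection], int]:
    """Walk a video, sampling more densely while confidence is low.

    `score_frame` is a placeholder for the recognizer and must return a
    likelihood in [0, 1]. Assumption: "adjusting the frame selection rate"
    is modelled as halving the sampling step, never going below `min_step`.
    Returns the per-frame detections and the final sampling step.
    """
    if min_step < 1 or initial_step < min_step:
        raise ValueError("require 1 <= min_step <= initial_step")

    step = initial_step
    index = 0
    detections: List[Detection] = []
    while index < len(frames):
        likelihood = score_frame(frames[index])
        accepted = likelihood >= threshold
        detections.append(Detection(index, likelihood, accepted))
        if not accepted:
            # Low confidence: look at frames more often from here on.
            step = max(min_step, step // 2)
        index += step
    return detections, step


if __name__ == "__main__":
    # Toy recognizer: the object only becomes clearly visible from frame 12 on.
    frames = list(range(20))
    results, final_step = recognize_with_adaptive_rate(
        frames, lambda f: 0.9 if f >= 12 else 0.5
    )
    first_hit = next(d.frame_index for d in results if d.accepted)
    print("first accepted frame:", first_hit, "final step:", final_step)
```

A fuller version could also widen the step again once confidence recovers, or derive the initial step from image quality as the dependent claims on rate determination suggest. Neither behaviour is spelled out in the claim text, so both are left out here.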

Assignees

Capital One Services LLC

Inventors

Classifications

  • using pattern recognition or machine learning (optical pattern recognition or electronic computations therefor G06V10/88) · CPC title

  • using neural networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

  • utilising user interfaces specially adapted for shopping · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10482345B2 cover?
A method for recognizing an object in a video stream may include receiving a video stream from a video source, the video stream comprising a plurality of video frames. The method may also include selecting at least one video frame from the video frames according to a frame selection rate. The method may also include partitioning the selected video frame into a first plurality of image blocks. T…
Who is the assignee on this patent?
Capital One Services LLC
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 19, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).