Extracting audiovisual features from digital components

US11093692B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11093692-B2
Application number  US-201715638304-A
Country             US
Kind code           B2
Filing date         Jun 29, 2017
Priority date       Nov 14, 2011
Publication date    Aug 17, 2021
Grant date          Aug 17, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for extracting audiovisual features from images and other digital components. A data processing system can extract image data and image features from an input image. The data processing system can match the image features to the image features of a plurality of images to identify candidate images. A second image can be selected from the candidate images based on a request that the data processing system received with the input image.
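The two-stage selection described in the abstract (match image features to find candidates, then narrow by the accompanying request) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `ContentItem` record, the representation of features as sets of hashable descriptors, the `min_matches` threshold, and the keyword-overlap scoring.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContentItem:
    """Hypothetical record; the source only says content data includes image features."""
    item_id: str
    features: frozenset  # quantized image-feature descriptors (assumed representation)
    keywords: frozenset = field(default_factory=frozenset)  # assumed text metadata


def select_candidates(query_features, items, min_matches=2):
    """Keep items sharing at least `min_matches` features with the input image."""
    return [it for it in items if len(query_features & it.features) >= min_matches]


def select_item(candidates, query_term):
    """Pick the candidate whose keywords overlap most with the query term's words."""
    if not candidates:
        return None
    words = set(query_term.lower().split())
    return max(candidates, key=lambda it: len(words & it.keywords))


items = [
    ContentItem("a", frozenset({"e1", "c1", "r1"}), frozenset({"shoes", "red"})),
    ContentItem("b", frozenset({"e1", "c1"}), frozenset({"shoes", "blue"})),
    ContentItem("c", frozenset({"e9"}), frozenset({"blue"})),
]
candidates = select_candidates(frozenset({"e1", "c1", "r2"}), items)
chosen = select_item(candidates, "blue shoes")
```

In this sketch the image narrows the pool first and the query term only breaks ties among visually similar items, which mirrors the order of the steps in the abstract. A real system would likely use approximate descriptor matching rather than exact set intersection.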

First claim

Opening claim text (preview).

What is claimed is:

1. A system to extract image features from input requests, comprising: a recognition engine executed by a data processing system comprising at least one hardware processor to: receive, from a computing device, a first request comprising a first query term and a first image that is different from the first query term; retrieve, from a data repository, content data for each of a plurality of content items, the content data comprising image features; extract an image feature from the first image, the image feature including at least one of an edge, a corner feature, or a ridge feature; select candidate content items from the plurality of content items based on matches between the image features of the plurality of content items and the at least one of the edge, the corner feature, or the ridge feature of the first image; select a content item from the candidate content items based at least in part on the first query term; and a network interface of the data processing system to transmit the content item to the computing device.

2. The system of claim 1, wherein the first image is captured by a camera associated with the computing device.

3. The system of claim 1, wherein the first request comprises an audio-based input signal.

4. The system of claim 1, comprising the recognition engine to: apply an image feature detection to the first image to extract the image feature; and apply the image feature detection to each of the plurality of content items.

5. The system of claim 1, comprising the recognition engine to: extract a plurality of image features from the first image; extract a second plurality of image features from each of the plurality of content items; and select the candidate content items from the plurality of content items by matching a first predetermined number of the plurality of image features with a second predetermined number of the second plurality of image features.

6. The system of claim 1, comprising: a natural language processor component executed by the data processing system to: receive data packets comprising the first request; parse the first request to identify a trigger keyword corresponding to the first request; and the recognition engine to select the content item from the candidate content items based on the trigger keyword.

7. The system of claim 1, comprising: a direct action application programming interface executed by the data processing system to: generate an action data structure based on the first request; transmit the action data structure to a service provider computing device to cause the service provider computing device to invoke a conversational application programming interface and establish a communication session between the service provider computing device and the computing device; and receive an indication that the service provider computing device established the communication session with the computing device.

8. The system of claim 1, comprising: a natural language processor component executed by the data processing system to: parse the first request to identify a trigger keyword corresponding to the first request; the data processing system to: select a template based on the trigger keyword; and populate a field of the template with the first image.

9. The system of claim 8, comprising the data processing system to: request a value from a sensor associated with the computing device; and populate a second field of the template with the value.

10. The system of claim 8, comprising: the data processing system configured to generate, based on the template, an action data structure.

11. A method to extract image features from input requests, comprising: receiving, from a computing device by a data processing system, a first request comprising a first query term and a first image that is different from the first query term; retrieving, from a data repository by the data processing system, content data for each of a plurality of content items, the content data comprising image features; extracting, by a recognition engine executed by the data processing system, an image feature from the first image, the image feature including at least one of an edge, a corner feature, or a ridge feature; selecting, by the data processing system, candidate content items from the plurality of content items by determining matches between the image features of the plurality of content items and the at least one of the edge, the corner feature, or the ridge feature of the first image; selecting, by the data processing system, a content item from the candidate content items based at least in part on the first query term; and transmitting, via a network interface, the content item to the computing device.

12. The method of claim 11, wherein the first image is captured by a camera associated with the computing device.

13. The method of claim 11, wherein the first request comprises an audio-based input signal.

14. The method of claim 11, comprising: applying an image feature detection to the first image to extract the image feature; and applying the image feature detection to each of the plurality of content items.

15. The method of claim 11, comprising: extracting, by the recognition engine, a plurality of image features from the first image; extracting, by the recognition engine, a second plurality of image features from each of the plurality of content items; and selecting, by the data processing system, the candidate content items from the plurality of content items by matching a first predetermined number of the plurality of image features with a second predetermined number of the second plurality of image features.

16. The method of claim 11, comprising: receiving, by a natural language processor component executed by the data processing system, data packets comprising the first request; parsing, by the natural language processor component, the first request to identify a trigger keyword corresponding to the first request; and selecting, by the data processing system, the content item from the candidate content items based on the trigger keyword.

17. The method of claim 11, comprising: generating, by a direct action application programming interface, an action data structure based on the first request; transmitting, by the direct action application programming interface, the action data structure to a service provider computing device to cause the service provider computing device to invoke a conversational application programming interface and establish a communication session between the service provider computing device and the computing device; and receiving, by the data processing system, an indication that the service provider computing device established the communication session with the computing device.

18. The method of claim 11, comprising: parsing, by a natural language processor component, the first request to identify a trigger keyword corresponding to the first request; selecting, by the data processing system, a template based on the trigger keyword; and populating, by the data processing system, a field of the template with the first image.

19. The method of claim 18, comprising: requesting, by the data processing system from the computing device, a value from a sensor associated with the computing device; and populating, by the data processing system, a second field of the template with the value.

20. The method of claim 18, comprising: generating
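
The template steps recited in claims 8 to 10 (identify a trigger keyword, select a template, populate one field with the image and a second field with a sensor value, then produce an action data structure) might look roughly like the Python sketch below. The template contents, the field names `image` and `location`, and the first-matching-word keyword rule are hypothetical and chosen only for illustration; the claims do not specify them.

```python
import copy

# Hypothetical templates keyed by trigger keyword; field names are illustrative only.
TEMPLATES = {
    "identify": {"action": "identify", "image": None, "location": None},
    "buy": {"action": "purchase", "image": None, "location": None},
}


def find_trigger_keyword(request_text, templates=TEMPLATES):
    """Return the first word of the request that names a known template, else None."""
    for word in request_text.lower().split():
        if word in templates:
            return word
    return None


def build_action_data(request_text, image_bytes, sensor_value=None, templates=TEMPLATES):
    """Select a template by trigger keyword and populate its fields."""
    keyword = find_trigger_keyword(request_text, templates)
    if keyword is None:
        return None
    action = copy.deepcopy(templates[keyword])  # leave the shared template untouched
    action["image"] = image_bytes               # first field: the image from the request
    action["location"] = sensor_value           # second field: a value from a device sensor
    return action


action = build_action_data("Buy this lamp", b"\x89PNG", sensor_value=(37.4, -122.1))
```

Copying the template before filling it keeps one request's data from leaking into the next, which is a design choice of this sketch and not something the claims require.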

Assignees

Inventors

Classifications

  • G06F40/279 (Primary)

    Recognition of textual entities · CPC title

  • G06F40/134 (Primary)

    Hyperlinking · CPC title

  • Matching criteria, e.g. proximity measures · CPC title

  • Extraction of image or video features · CPC title

  • Matching; Classification · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11093692B2 cover?
Systems and methods for extracting audiovisual features from images and other digital components. A data processing system can extract image data and image features from an input image. The data processing system can match the image features to the image features of a plurality of images to identify candidate images. A second image can be selected from the candidate images based on a request tha…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F40/279. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 17, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).