Method and system for image search and cropping

US11195046B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11195046-B2
Application number  US-201916441918-A
Country             US
Kind code           B2
Filing date         Jun 14, 2019
Priority date       Jun 14, 2019
Publication date    Dec 7, 2021
Grant date          Dec 7, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for processing an image are described. A saliency map is generated from the image. The saliency map indicates one or more salient portions of the image that have saliency values satisfying a saliency criterion. A scene graph is generated for at least the one or more salient portions. The scene graph represents a plurality of objects detected in the image. The scene graph further represents one or more relationships between each respective object pairs. One or more dataset entries associated with the image are generated. Each of the one or more relationships for each of the one or more object pairs is indicated by a respective dataset entry. The one or more dataset entries are stored in a first dataset.
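The pipeline in the abstract (saliency map, scene graph, dataset entries) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the saliency criterion is taken to be a simple threshold, the scene graph is reduced to a list of (subject, predicate, object) relationships, and `DatasetEntry`, `salient_mask`, and `entries_from_scene_graph` are hypothetical names that do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One relationship of one object pair, stored as a labelled triplet."""
    image_id: str
    subject_label: str
    predicate_label: str
    object_label: str


def salient_mask(saliency, threshold=0.5):
    """Mark positions whose saliency value meets the criterion.

    Assumption: the criterion is `value >= threshold`; the patent leaves
    the criterion open.
    """
    return [[value >= threshold for value in row] for row in saliency]


def entries_from_scene_graph(image_id, relationships):
    """Turn scene-graph connectors into dataset entries.

    `relationships` is an iterable of (subject, predicate, object) labels,
    one per connector between a pair of nodes.
    """
    return [DatasetEntry(image_id, s, p, o) for s, p, o in relationships]


# Toy inputs (made up for the example).
saliency = [[0.1, 0.7], [0.9, 0.2]]
mask = salient_mask(saliency)

first_dataset = entries_from_scene_graph(
    "img-001",
    [("person", "riding", "horse"), ("horse", "on", "grass")],
)
```

Each relationship produces its own entry, so an image with several object pairs contributes several rows to the first dataset.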

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for processing an image to generate a scene graph, the method comprising: receiving an image; generating a saliency map from the image, wherein the saliency map indicates one or more salient portions of the image that have saliency values satisfying a saliency criterion; generating a scene graph for at least the one or more salient portions of the image, the scene graph comprising a plurality of nodes representing a respective plurality of objects detected in the image, one or more object pairs being formed by the plurality of objects, and the scene graph further comprising one or more connectors representing respective one or more relationships between each respective object pairs; generating one or more dataset entries associated with the image, wherein each of the one or more relationships for each of the one or more object pairs is indicated by a respective dataset entry, the respective dataset entry including a data triplet comprising labels for each object in the respective object pair and a label for a respective predicate indicating the respective relationship between the respective object pair; and storing the one or more dataset entries in a first dataset.

2. The method of claim 1, wherein: the one or more dataset entries stored in the first dataset include dataset entries associated with different respective images.

3. The method of claim 2, wherein the method further comprises: searching the first dataset to identify a dataset entry that satisfies a query criterion; selecting the image associated with the identified dataset entry; cropping the selected image according to the object pair indicated by the identified dataset entry by: identifying, using the identified dataset entry, a bounding box for each object in the object pair indicated by the identified dataset entry; defining a union box by combining the respective bounding boxes; and cropping the selected image around the defined union box; and outputting the cropped image.

4. The method of claim 3, wherein the cropping further comprises: prior to outputting the cropped image, applying one or more aesthetic rules to the cropped image to determine corresponding aesthetic regions of the cropped image; and further adjusting the cropped image according to the determined corresponding aesthetic regions.

5. The method of claim 1, wherein generating the scene graph comprises: defining the plurality of nodes of the scene graph by detecting the plurality of objects within the image and defining each node of the scene graph corresponding to a respective detected object within the image; identifying the one or more object pairs formed by the detected objects; and defining the one or more connectors of the scene graph by performing relationship extraction between each respective object pair within the salient portions of the image indicated by the saliency map to extract the respective relationship, and defining each connector, corresponding to each respective relationship, between each respective pair of nodes corresponding to each respective object pair.

6. The method of claim 5, wherein detecting the plurality of objects further comprises: performing object localization for each detected object to generate a bounding box of the detected object; for each respective bounding box: generating corresponding location parameters of the respective bounding box; and storing an identifier of the respective bounding box and the location parameters in a second dataset.

7. The method of claim 6, wherein the method further comprising: searching the first dataset to identify a dataset entry that satisfies a query criterion; searching the second dataset to identify location parameters that correspond to respective bounding boxes of the respective objects identified in the object pair in the one or more dataset entry; cropping the image according to the identified location parameters by: combining the respective bounding boxes to define a union box; and cropping the image around the union box; and outputting the cropped image.

8. The method of claim 1, wherein generating the scene graph comprises: generating one or more proposed regions of the image using a region proposal network (RPN); defining the plurality of nodes of the scene graph by detecting the plurality of objects in the one or more proposed regions and defining each node of the scene graph corresponding to a respective detected object within the one or more proposed regions; identifying the one or more object pairs formed by the detected objects; and defining the one or more connectors of the scene graph by performing relationship extraction between each respective object pair within the salient portions of the image indicated by the saliency map to extract the respective relationship, and defining each connector, corresponding to each respective relationship, between each respective pair of nodes corresponding to each respective object pair.

9. A method comprising: receiving a query criterion; searching a first dataset comprising one or more dataset entries, to identify a dataset entry that satisfies the query criterion, the identified dataset entry being associated with an image, the identified dataset entry including a data triplet comprising labels for each object in an object pair formed by a pair of objects detected in the image, the data triplet also comprising a label for a predicate indicating a relationship between the objects in the object pair, the objects and the relationship being represented by a scene graph; selecting the image associated with the identified dataset entry; cropping the selected image according to the object pair indicated by the identified dataset entry; and outputting the cropped image.

10. The method of claim 9, wherein the cropping including: prior to outputting the cropped image, applying one or more aesthetic rules to the cropped image to determine corresponding aesthetic regions of the cropped image; and further adjusting the cropped image according to the determined corresponding aesthetic regions.

11. The method of claim 9, wherein the method further comprises cropping the selected image by: identifying, using the identified dataset entry, a bounding box for each object in the object pair indicated by the identified dataset entry; defining a union box by combining the respective bounding boxes; and cropping the selected image around the defined union box.

12. The method of claim 9, wherein the method further comprises: receiving at least one image; and the dataset is generated by: for each received image, generating a saliency map for the received image, wherein the saliency map indicates one or more salient portions of the image that have saliency values satisfying a saliency criterion; generating a respective scene graph for at least the one or more salient portions of the received image by: defining a plurality of nodes of the scene graph by detecting a respective plurality of objects within the image and defining each node of the scene graph corresponding to a respective detected object within the image; identifying one or more object pairs formed by the detected objects; and defining one or more connectors of the scene graph by performing relationship extraction between each respective object pair within the salient portions of the image indicated by the saliency map to extract the respective relationship, and defining each connector, corresponding to the respective relationship, between each respective pair of nodes corresponding to each respective object pair; generating the one or more dataset entrie
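The search-and-crop steps recited in claims 3, 7, and 11 (find a matching triplet, look up the two bounding boxes, combine them into a union box, crop around it) can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the patented implementation: the dictionary layout, the exact-match query criterion, the `(x_min, y_min, x_max, y_max)` box format with exclusive maxima, and the names `search`, `union_box`, and `crop` are all hypothetical.

```python
# Hypothetical data layout; the claims do not prescribe one.
first_dataset = [
    {
        "image_id": "img-001",
        "triplet": ("person", "riding", "horse"),
        "box_ids": ("b1", "b2"),
    },
]

# Second dataset: bounding-box identifier -> location parameters, assumed
# to be (x_min, y_min, x_max, y_max) with exclusive maxima.
second_dataset = {"b1": (0, 0, 2, 2), "b2": (3, 1, 5, 4)}


def search(entries, query_triplet):
    """Return the first entry whose triplet equals the query, else None.

    Assumption: the query criterion is exact equality of the triplet.
    """
    for entry in entries:
        if entry["triplet"] == query_triplet:
            return entry
    return None


def union_box(box_a, box_b):
    """Smallest axis-aligned box that contains both input boxes."""
    return (
        min(box_a[0], box_b[0]),
        min(box_a[1], box_b[1]),
        max(box_a[2], box_b[2]),
        max(box_a[3], box_b[3]),
    )


def crop(image, box):
    """Crop a row-major image (list of rows) to the given box."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


# Toy 4x6 "image" whose pixel value encodes its row and column.
image = [[10 * r + c for c in range(6)] for r in range(4)]

entry = search(first_dataset, ("person", "riding", "horse"))
box = union_box(*(second_dataset[b] for b in entry["box_ids"]))
cropped = crop(image, box)
```

The union box is computed from coordinates alone, so the crop keeps both objects of the pair in frame regardless of how far apart they are. The aesthetic adjustment of claims 4 and 10 is not modelled here.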

Assignees

Inventors

Classifications

  • G06F16/538 (Primary)

    Presentation of query results · CPC title

  • Detection; Localisation; Normalisation · CPC title

  • Detecting or recognising potential candidate objects based on visual cues, e.g. shapes · CPC title

  • using neural networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11195046B2 cover?
Methods and systems for processing an image are described. A saliency map is generated from the image. The saliency map indicates one or more salient portions of the image that have saliency values satisfying a saliency criterion. A scene graph is generated for at least the one or more salient portions. The scene graph represents a plurality of objects detected in the image. The scene graph fur…
Who is the assignee on this patent?
Rao Varshanth Ravindra, Ahmad Uzair, Dai Peng, and 4 more
What technology area does this patent fall under?
Primary CPC classification G06F16/538. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 7, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).