Systems and methods for object detection using image tiling

US12444168B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12444168-B2
Application number: US-201917622462-A
Country: US
Kind code: B2
Filing date: Aug 5, 2019
Priority date: Aug 5, 2019
Publication date: Oct 14, 2025
Grant date: Oct 14, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computing system for detecting objects in an image can perform operations including generating an image pyramid that includes a first level corresponding with the image at a first resolution and a second level corresponding with the image at a second resolution. The operations can include tiling the first level and the second level by dividing the first level into a first plurality of tiles and the second level into a second plurality of tiles; inputting the first plurality of tiles and the second plurality of tiles into a machine-learned object detection model; receiving, as an output of the machine-learned object detection model, object detection data that includes bounding boxes respectively defined with respect to individual ones of the first plurality of tiles and the second plurality of tiles; and generating image object detection output by mapping the object detection data onto an image space of the image.
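
The flow described in the abstract (build pyramid levels, tile each level, run the detector per tile, map boxes back to image space) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the fixed tile size, the non-overlapping grid, the scale factors, and the names `tile_level`, `map_box_to_image`, and `detect_on_pyramid` are all hypothetical, and `detector` is a placeholder callable standing in for the machine-learned object detection model. Levels are represented only by their dimensions; pixel resampling is omitted, and the sketch does not model the claimed variant in which levels are derived from an intermediate feature representation.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Tile = Tuple[int, int, int, int]         # (x0, y0, width, height) within a level


def tile_level(width: int, height: int, tile_size: int) -> List[Tile]:
    """Divide one pyramid level into a grid of non-overlapping tiles (assumed layout)."""
    tiles = []
    for y0 in range(0, height, tile_size):
        for x0 in range(0, width, tile_size):
            # Tiles on the right and bottom edges may be smaller than tile_size.
            tiles.append((x0, y0, min(tile_size, width - x0), min(tile_size, height - y0)))
    return tiles


def map_box_to_image(box: Box, tile: Tile, scale: float) -> Box:
    """Map a tile-relative box to image space.

    `scale` is the level resolution divided by the image resolution
    (for example 0.5 for a half-size level).
    """
    x0, y0, _, _ = tile
    x_min, y_min, x_max, y_max = box
    # Shift by the tile origin to get level coordinates, then undo the level scaling.
    return ((x_min + x0) / scale, (y_min + y0) / scale,
            (x_max + x0) / scale, (y_max + y0) / scale)


def detect_on_pyramid(
    image_size: Tuple[int, int],
    scales: Sequence[float],
    tile_size: int,
    detector: Callable[[float, Tile], List[Box]],  # hypothetical model stand-in
) -> List[Box]:
    """Run the detector on every tile of every level and collect image-space boxes."""
    width, height = image_size
    output = []
    for scale in scales:
        level_w, level_h = round(width * scale), round(height * scale)
        for tile in tile_level(level_w, level_h, tile_size):
            for box in detector(scale, tile):
                output.append(map_box_to_image(box, tile, scale))
    return output
```

For instance, with a 400×200 image and a half-resolution level, a box at (10, 20, 30, 40) inside the tile whose origin is (100, 0) maps to (220, 40, 260, 80) in image space: the tile offset is added first, and the result is then divided by the scale.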

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system comprising: at least one processor; a preliminary machine-learned object detection model configured to receive an image, and, in response to receipt of the image, output an intermediate feature representation; a machine-learned object detection model configured to receive a plurality of tiles, and, in response to receipt of the plurality of tiles, output object detection data for the plurality of tiles, the object detection data comprising a plurality of bounding boxes respectively defined with respect to individual ones of the plurality of tiles; and at least one tangible, non-transitory computer-readable medium that stores instructions that, when executed by the at least one processor, cause the at least one processor to perform operations, the operations comprising: generating an image pyramid based on the image having an image space, the image pyramid comprising a first level corresponding with the image at a first resolution and a second level corresponding with the image at a second resolution that is different than the first resolution, wherein generating the image pyramid based on the image comprises: inputting the image into the preliminary machine-learned object detection model, the image being input as a plurality of preliminary tiles; receiving, as an output of the preliminary machine-learned object detection model, the intermediate feature representation, the intermediate feature representation corresponding with the plurality of preliminary tiles; and generating the first level and the second level of the image pyramid based on the intermediate feature representation; tiling the first level and the second level by dividing the first level into a first plurality of tiles and the second level into a second plurality of tiles; inputting the first plurality of tiles and the second plurality of tiles into the machine-learned object detection model; receiving, as an output of the machine-learned object detection model, the object detection data comprising the plurality of bounding boxes respectively defined with respect to individual ones of the first plurality of tiles and the second plurality of tiles; and generating an image object detection output by mapping the object detection data onto the image space of the image.

2. The computing system of claim 1, wherein the operations further comprise: identifying at least one bounding box of the image object detection output based on the at least one bounding box intersecting a border of one or more of the first plurality of tiles or second plurality of tiles; and removing the at least one bounding box from the image object detection output.

3. The computing system of claim 2, wherein the at least one bounding box is identified based on the at least one bounding box spanning across the one or more of the first plurality of tiles or the second plurality of tiles such that the at least one bounding box intersects the border and an opposite border of the one or more of the first plurality of tiles or the second plurality of tiles that is parallel with the border.

4. The computing system of claim 2, wherein the at least one bounding box is identified based on the at least one bounding box intersecting the border of the one or more of the first plurality of tiles or the second plurality of tiles and intersecting an edge of the respective level of the image pyramid.

5. The computing system of claim 2, wherein removing the at least bounding box from the image object detection output comprises removing each bounding box that intersects any of a plurality of borders of the first plurality of tiles or second plurality of tiles.

6. A method for training a machine learned object detection model, the method comprising: for each training image of a plurality of training images: generating, by one or more computing devices, an image pyramid based on the respective training image having a respective image space, the image pyramid comprising a first level corresponding with the respective training image at a first resolution and a second level corresponding with the respective training image at a second resolution that is different than the first resolution, wherein generating the image pyramid based on the training image comprises: inputting the training image into a preliminary machine-learned object detection model, the training image being input as a plurality of preliminary tiles; receiving, as an output of the preliminary machine-learned object detection model, an intermediate feature representation, the intermediate feature representation corresponding to the plurality of preliminary tiles; and generating the first level and the second level of the image pyramid based on the intermediate feature representation corresponding; tiling, by the one or more computing devices, the first level and the second level by dividing the first level into a first plurality of tiles and the second level into a second plurality of tiles; inputting, by the one or more computing devices, the first plurality of tiles and the second plurality of tiles into a machine-learned object detection model; receiving, by the one or more computing devices and as an output of the machine-learned object detection model, object detection data comprising the plurality of bounding boxes respectively defined with respect to individual ones of the first plurality of tiles and the second plurality of tiles; generating, by the one or more computing devices, an image object detection output by mapping the object detection data onto the respective image space of the respective training image; and adjusting, by the one or more computing devices, parameters of the machine-learned object detection model based on a comparison of the image object detection output with ground truth object location data that corresponds to the respective training image of the plurality of training images.

7. The method of claim 6, further comprising: identifying, by the one or more computing devices, at least one bounding box of the image object detection output based on the at least one bounding box intersecting a border of one or more of the first plurality of tiles or second plurality of tiles; and removing, by the one or more computing devices, the at least bounding box from the image object detection output.

8. The method of claim 7, wherein the at least one bounding box is identified, by the one or more computing devices, based on the at least one bounding box spanning across the one or more of the first plurality of tiles or the second plurality of tiles such that the at least one bounding box intersects the border and an opposite border of the one or more of the first plurality of tiles or the second plurality of tiles that is parallel with the border.

9. The method of claim 7, wherein the at least one bounding box is identified, by the one or more computing devices, based on the at least one bounding box intersecting the border of the one or more of the first plurality of tiles or the second plurality of tiles and intersecting an edge of the respective level of the image pyramid.

10. The method of claim 7, wherein removing, by the one or more computing devices, the at least bounding box from the image object detection output comprises removing, by the one or more computing devices, each bounding box that intersects any of a plurality of borders of the first plurality of tiles or second plurality of tiles.

11. The method of claim 6, wherein: inputting the training image into the preliminary machine-learned object detection model comprises: tiling the respective training image into the plurality of preliminary tiles; inputting the pl…
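
The border-based removal recited in claims 2, 3, and 5 (and mirrored in claims 7, 8, and 10) can be illustrated with a short Python sketch. It is one possible reading, not the patent's implementation: it assumes boxes are expressed in tile-relative coordinates, treats a box as intersecting a border when it reaches or crosses that tile edge, and uses hypothetical names (`touched_borders`, `spans_tile`, `filter_boxes`) and a hypothetical `policy` parameter. The level-edge condition of claims 4 and 9 is not modeled.

```python
from typing import List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), tile-relative


def touched_borders(box: Box, tile_w: float, tile_h: float, tol: float = 0.0) -> Set[str]:
    """Return the tile borders that the box reaches or crosses (within `tol`)."""
    x_min, y_min, x_max, y_max = box
    sides = set()
    if x_min <= tol:
        sides.add("left")
    if x_max >= tile_w - tol:
        sides.add("right")
    if y_min <= tol:
        sides.add("top")
    if y_max >= tile_h - tol:
        sides.add("bottom")
    return sides


def spans_tile(sides: Set[str]) -> bool:
    """True when the box meets a border and the opposite, parallel border."""
    return {"left", "right"} <= sides or {"top", "bottom"} <= sides


def filter_boxes(boxes: List[Box], tile_w: float, tile_h: float, policy: str = "any") -> List[Box]:
    """Drop boxes according to an assumed policy.

    policy="any":  drop every box that intersects any tile border.
    policy="span": drop only boxes that span the tile between opposite borders.
    """
    kept = []
    for box in boxes:
        sides = touched_borders(box, tile_w, tile_h)
        drop = bool(sides) if policy == "any" else spans_tile(sides)
        if not drop:
            kept.append(box)
    return kept
```

With a 100×100 tile, a box at (0, 20, 40, 60) touches only the left border, so it is dropped under `policy="any"` but kept under `policy="span"`; a box at (0, 20, 100, 60) reaches both the left and right borders and is dropped under either policy.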

Assignees

Google LLC

Inventors

Classifications

  • Region-based matching · CPC title

  • Active pattern-learning, e.g. online learning of image or video features · CPC title

  • G06V10/765 · Primary

    using rules for classification or partitioning the feature space · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12444168B2 cover?
It covers a computing system that detects objects in an image by generating an image pyramid with levels at different resolutions, dividing each level into tiles, inputting the tiles into a machine-learned object detection model, and mapping the resulting per-tile bounding boxes back onto the image space of the original image.
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06V10/765. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 14, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).