What technology area does this patent fall under?

Primary CPC classification G06F17/24. Mapped technology areas include Physics.

When was this patent published?

Publication date: Apr 21, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Enhancing documents portrayed in digital images

US10628519B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10628519-B2
Application number	US-201715658289-A
Country	US
Kind code	B2
Filing date	Jul 24, 2017
Priority date	Jul 22, 2016
Publication date	Apr 21, 2020
Grant date	Apr 21, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods that efficiently and effectively generate an enhanced document image of a displayed document in an image frame captured from a live image feed are disclosed. For example, systems and methods described herein apply a document enhancement process to a displayed document in an image frame that result in an enhanced document image that is cropped, rectified, un-shadowed, and with dark text against a mostly white background. Additionally, systems and method described herein determine whether a stored digital content item includes a displayed document. In response to determining that a stored digital content item does include a displayed document, systems and methods described herein generate an enhanced document image of a displayed document included in the stored digital content item.
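The abstract lists rectification among the enhancement steps but does not say how it is computed. One common approach, assumed here purely for illustration and not taken from the patent text, is to fit a projective transform (homography) that maps the four detected document corners onto an upright rectangle. The function names and corner coordinates below are hypothetical.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Fit the 3x3 projective transform that maps four src points onto four dst points."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the eight unknowns.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
    rhs = np.asarray(dst, dtype=float).reshape(-1)
    h = np.linalg.solve(np.asarray(rows, dtype=float), rhs)
    # The ninth entry is fixed to 1 to remove the scale ambiguity.
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, point):
    """Map one (x, y) point through H, dividing out the projective scale."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return (x / w, y / w)

# Hypothetical corners of a skewed page and the upright rectangle it should fill.
skewed = [(12, 8), (205, 20), (190, 290), (5, 270)]
upright = [(0, 0), (200, 0), (200, 280), (0, 280)]
H = homography_from_corners(skewed, upright)
```

A full rectifier would then resample every output pixel through the inverse of `H`; this sketch stops at estimating the transform.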

First claim

Opening claim text (preview).

What is claimed is:
1. A computing device comprising: at least one processor; and a non-transitory computer-readable medium storing instructions thereon that, when executed by the at least one processor, cause the computing device to: provide a graphical user interface comprising a live camera image feed in response to a user selection of a first option of a set of selectable options, the set of selectable options comprising the first option for scanning a document to a cloud-computing environment and a second option for uploading a file to the cloud-computing environment; detect, within the live camera image feed, a displayed document as a visual representation of a physical document; in response to detecting the displayed document within the live camera image feed and prior to an image frame capture, provide for display, within the graphical user interface, a live document boundary indicator associated with the displayed document within the live camera image feed; detect a user interaction with the graphical user interface while providing the live document boundary indicator associated with the displayed document; based on detecting the user interaction while providing the live document boundary indicator, capture from the live camera image feed an image frame that comprises the displayed document and excludes one or more portions displayed in the live camera image feed outside of the live document boundary indicator; process the image frame to generate, for upload to a user account in the cloud-computing environment, an enhanced document image corresponding to the displayed document within the live document boundary indicator; provide, for presentation on a display of the computing device, the enhanced document image; and convert the enhanced document image to a document file format.
2. The computing device as recited in claim 1, wherein generating the enhanced document image comprises modifying the image frame with respect to the displayed document within the image frame.
3. The computing device as recited in claim 2, wherein modifying the image frame comprises: detecting, without receiving user input and based on the live document boundary indicator, portions of the image frame that are not part of the displayed document; and cropping the image frame to remove the portions of the image frame that are not part of the displayed document.
4. The computing device as recited in claim 3, wherein processing the image frame to generate the enhanced document image further comprises altering the displayed document within the cropped imaged frame.
5. The computing device of claim 4, wherein altering the displayed document comprises at least one of: rectifying the displayed document, converting the displayed document to grayscale, or denoising the displayed document.
6. The computing device as recited in claim 4, wherein altering the displayed document comprises correcting a background of the displayed document.
7. The computing device as recited in claim 6, wherein correcting the background of the displayed document comprises: creating a subsampled version of the displayed document; and optimizing the subsampled version of the displayed document by solving an objective function that penalizes deviations from white within the subsampled version and penalizes deviations in gradient within the subsampled version to generate an optimized subsampled version.
8. The computing device as recited in claim 7, wherein the non-transitory computer-readable medium further comprises instructions thereon that, when executed by the at least one processor, cause the computing device to: perform a Fourier Domain transfer of the subsampled version of the displayed document; solve the objective function in the Fourier Domain; and perform an inverse Fourier Domain transfer to generate the optimized subsampled version of the displayed document.
9. The computing device as recited in claim 7, wherein the non-transitory computer-readable medium further comprises instructions thereon that, when executed by the at least one processor, cause the computing device to upsample the optimized subsampled version of the displayed document to generate a tri-map version of the displayed document that identifies background pixels, foreground pixels, and unknown pixels.
10. The computing device as recited in claim 9, wherein the non-transitory computer-readable medium further comprises instructions thereon that, when executed by the at least one processor, cause the computing device to assign each of the unknown pixels as either a background pixel or a foreground pixel by estimating a background color of each of the unknown pixels.
11. A non-transitory computer-readable medium storing instructions thereon that, when executed by at least one processor, cause a computer system to: provide a graphical user interface comprising a live camera image feed in response to a user selection of a first option of a set of selectable options, the set of selectable options comprising the first option for scanning a document to a cloud-computing environment and a second option for uploading a file to the cloud-computing environment; detect, within the live camera image feed, a displayed document as a visual representation of a physical document; in response to detecting the displayed document within the live camera image feed and prior to an image frame capture, provide for display, within the graphical user interface, a live document boundary indicator associated with the displayed document within the live camera image feed; detect a user interaction with the graphical user interface while providing the live document boundary indicator associated with the displayed document; based on detecting the user interaction while providing the live document boundary indicator, capture from the live camera image feed an image frame that comprises the displayed document and excludes one or more portions displayed in the live camera image feed outside of the live document boundary indicator; process the image frame to generate, for upload to a user account in the cloud-computing environment, an enhanced document image corresponding to the displayed document within the live document boundary indicator; and provide, for presentation on a display of the computer system, the enhanced document image; and convert the enhanced document image to a document file format.
12. The non-transitory computer-readable medium recited in claim 11, further comprising instructions that, when executed by the at least one processor, cause the computer system to: receive user input indicating one or more edits to the enhanced document image; and modify the enhanced document image in accordance with the one or more edits.
13. The non-transitory computer-readable medium recited in claim 11, wherein processing the image frame to generate the enhanced document image comprises altering a border of the displayed document to create a rectangular enhanced document image.
14. The non-transitory computer-readable medium recited in claim 11, wherein generating the enhanced document image comprises: converting the displayed document from a color version to a grayscale version; and recoloring the displayed document prior to providing the enhanced document image.
15. A method comprising: receiving, at an online content management system and from a client device, a digital content item; determining, by at least one processor, that the digital content item comprises a displayed document; associating metadata that includes a digital tag or line item with the digital content it…
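Claims 7 and 8 describe correcting the background by solving an objective that penalizes deviations from white and deviations in gradient, with the solve carried out in the Fourier domain. The claims do not give the exact objective, so the sketch below assumes one plausible quadratic form, lam * ||u - 1||^2 + ||grad(u) - grad(s)||^2, where s is a grayscale image scaled to [0, 1] and white is 1. The function name, the weight `lam`, and the periodic boundary handling are illustrative assumptions, not details from the patent.

```python
import numpy as np

def whiten_background(gray, lam=0.05):
    """Minimise lam*||u - 1||^2 + ||grad(u) - grad(gray)||^2 (assumed objective)."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    # Forward differences with wrap-around, so the FFT diagonalises them.
    gx = np.roll(gray, -1, axis=1) - gray
    gy = np.roll(gray, -1, axis=0) - gray
    # Frequency responses of the two forward-difference operators.
    fx = np.exp(2j * np.pi * np.fft.fftfreq(w))[None, :] - 1.0
    fy = np.exp(2j * np.pi * np.fft.fftfreq(h))[:, None] - 1.0
    laplacian = np.abs(fx) ** 2 + np.abs(fy) ** 2
    # Normal equations in the Fourier domain:
    # (lam + |D|^2) U = lam * F(1) + conj(Dx) F(gx) + conj(Dy) F(gy)
    white = np.fft.fft2(np.ones_like(gray))
    rhs = lam * white + np.conj(fx) * np.fft.fft2(gx) + np.conj(fy) * np.fft.fft2(gy)
    u = np.real(np.fft.ifft2(rhs / (lam + laplacian)))
    return np.clip(u, 0.0, 1.0)
```

Under this assumed objective the solve acts as a high-pass filter whose mean is pinned to white: slowly varying shading is pulled toward 1 while sharp edges such as text strokes keep most of their contrast. The claims apply the optimization to a subsampled version of the document, a step this sketch leaves out.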

Assignees

Dropbox Inc

Inventors

Classifications

G06F40/103: Formatting, i.e. changing of presentation of documents (automatic justification G06F40/189; automatic line break hyphenation G06F40/191)
G06F40/106: Display of layout of documents; Previewing
G06T2210/22: Cropping
G06T2207/20056: Discrete and fast Fourier transform, [DFT, FFT]
G06T2207/20084: Artificial neural networks [ANN]

Patent family

Related publications grouped by family.

View patent family 60988062

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10628519B2 cover?: Systems and methods that efficiently and effectively generate an enhanced document image of a displayed document in an image frame captured from a live image feed are disclosed. For example, systems and methods described herein apply a document enhancement process to a displayed document in an image frame that result in an enhanced document image that is cropped, rectified, un-shadowed, and wit…
Who is the assignee on this patent?: Dropbox Inc

Related patents

Live document detection in a captured video stream

Causation of rendering of information indicative of a printed document interaction attribute

Synchronized, interactive augmented reality displays for multifunction devices

Method to use augumented reality to function as hmi display

Prepopulating application forms using real-time video analysis of identified objects
