What technology area does this patent fall under?

Primary CPC classification G06T3/4053. Mapped technology areas include Physics.

When was this patent published?

Published Nov 17, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Creating augmented reality self-portraits using machine learning

US10839577B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10839577-B2
Application number	US-201816177408-A
Country	US
Kind code	B2
Filing date	Oct 31, 2018
Priority date	Sep 8, 2017
Publication date	Nov 17, 2020
Grant date	Nov 17, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, apparatuses and non-transitory, computer-readable storage mediums are disclosed for generating AR self-portraits or “AR selfies.” In an embodiment, a method comprises: capturing, by a first camera of a mobile device, image data, the image data including an image of a subject in a physical, real-world environment; receiving, by a depth sensor of the mobile device, depth data indicating a distance of the subject from the camera in the physical, real-world environment; receiving, by one or more motion sensors of the mobile device, motion data indicating at least an orientation of the first camera in the physical, real-world environment; generating a virtual camera transform based on the motion data, the camera transform for determining an orientation of a virtual camera in a virtual environment; and generating a composite image data, using the image data, a matte and virtual background content selected based on the virtual camera orientation.
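
The final compositing step named in the abstract can be pictured as a per-pixel blend of the camera image and the virtual background, weighted by the matte. The sketch below is only an illustration under stated assumptions: matte values lie in [0, 1] (1 = subject, 0 = background), images are small grayscale grids, and the function name `composite` and the sample data are hypothetical rather than taken from the patent.

```python
def composite(foreground, background, matte):
    """Blend foreground over background, pixel by pixel, using the matte.

    Assumption: matte values are in [0, 1]; 1 keeps the camera pixel,
    0 shows the virtual background pixel.
    """
    out = []
    for fg_row, bg_row, m_row in zip(foreground, background, matte):
        out.append([m * fg + (1.0 - m) * bg
                    for fg, bg, m in zip(fg_row, bg_row, m_row)])
    return out


# Hypothetical 2x3 grayscale frames (values 0-255).
camera_frame = [[200, 200, 50], [200, 200, 50]]
virtual_background = [[10, 10, 10], [10, 10, 10]]
matte = [[1.0, 0.5, 0.0], [1.0, 0.5, 0.0]]

blended = composite(camera_frame, virtual_background, matte)
print(blended[0])
```

A fractional matte value, as in the middle column, mixes the two sources, which is why a clean, high-resolution matte matters most along the subject's outline.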

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: capturing, by a first camera of a mobile device, image data, the image data including an image of a subject in a physical, real-world environment; capturing, by a depth sensor of the mobile device, depth data indicating a distance of the subject from the camera in the physical, real-world environment; capturing, by one or more motion sensors of the mobile device, motion data indicating at least an orientation of the first camera in the physical, real-world environment; generating, by one or more processors of the mobile device, a virtual camera transform based on the motion data, the camera transform for determining an orientation of a virtual camera in a virtual environment; generating, by the one or more processors, a matte from the image data and the depth data, wherein generating the matte includes: inputting the image data and the depth data into a neural network; generating, by the neural network, a low-resolution matte using the image data and the depth data; and processing the low-resolution matte to remove artifacts in the low-resolution matte; generating a high-resolution matte from the processed low-resolution matte, where the high-resolution matte has higher resolution than the low-resolution matte; generating, by the one or more processors, a composite image data, using the image data, the high-resolution matte and a virtual background content, the virtual background content selected from the virtual environment using the camera transform; and causing to display, by the one or more processors, the composite image data on a display of the mobile device.

2. The method of claim 1, wherein processing the low-resolution matte to remove artifacts in the low-resolution matte, further comprises: generating an inner matte and an outer matte from at least one of a bounding box including a face of the subject or a histogram of the depth data; generating a hole-filled matte from the inner matte; generating a shoulder/torso matte from the hole-filled inner matte; dilating the inner matte using a first kernel; dilating the outer matte using a second kernel smaller than the first kernel; generating a garbage matte from an intersection of the dilated inner matte and the dilated outer matte; combining the low-resolution matte with the garbage matte to create a face matte; combining the face matte and the shoulder/torso matte into a denoised matte; and generating the high-resolution matte from the denoised matte.

3. The method of claim 2, further comprising: applying a temporal filter to the high-resolution matte to generate a final matte; and generating the composite image data, using the image data, the final matte and the virtual background content.

4. The method of claim 3, wherein applying the temporal filter to the high-resolution matte to generate a final matte further comprises: generating a per-pixel similarity map based on the image data and previous image data; and applying the temporal filter to the high-resolution matte using the similarity map and a previous final matte.

5. The method of claim 4, wherein the temporal filter is a linear weighted average of two frames with weights calculated per-pixel dependent on pixel similarity represented by the per-pixel similarity map.

6. The method of claim 2, wherein generating the high-resolution matte from the processed low-resolution matte, further comprises: generating a luma image from the image data; and upsampling, using a guided filter and the luma image, the denoised matte to the high-resolution matte.

7. The method of claim 2, wherein generating the shoulder/torso matte from the hole-filled inner matte further comprises: dilating the inner matte to generate the hole-filled matte; and eroding the hole-filled matte to generate the shoulder/torso matte.

8. The method of claim 2, wherein the inner matte includes depth data that is less than a depth threshold, the outer matte includes depth data that is less than the depth threshold or is unknown, and the depth threshold is determined by an average depth of a center region of the subject's face detected in the image data and an offset to include the back of the subject's head.

9. The method of claim 1, wherein the neural network is a convolutional neural network for image segmentation.

10. A method comprising: presenting a preview on a display of a mobile device, the preview including sequential frames of preview image data captured by a forward-facing camera of a mobile device positioned in close range of a subject, the sequential frames of preview image data including close range image data of the subject and image data of a background behind the subject in a physical, real world environment; receiving a first user input to apply a virtual environment effect; capturing, by a depth sensor of the mobile device, depth data indicating a distance of the subject from the forward-facing camera in the physical, real-world environment; capturing, by one or more sensors of the mobile device, orientation data indicating at least an orientation of the forward-facing camera in the physical, real-world environment; generating, by one or more processors of the mobile device, a camera transform based on the orientation data, the camera transform describing an orientation of a virtual camera in a virtual environment; generating, by the one or more processors, a matte from the sequential frames of image data and the depth data, wherein generating the matte includes: inputting the image data and the depth data into a neural network; generating, by the neural network, a low-resolution matte using the image data and the depth data; and processing the low-resolution matte to remove artifacts in the low-resolution matte; generating a high-resolution matte from the processed low-resolution matte, where the high-resolution matte has higher resolution than the low-resolution matte; generating, by the one or more processors, composite sequential frames of image data, including the sequential frames of image data, the high-resolution matte and a virtual background content, the virtual background content selected from the virtual environment using the camera transform; and causing display, by the one or more processors, of the composite sequential frames of image data.

11. A system comprising: a display; a camera; a depth sensor; one or more motion sensors; one or more processors; memory coupled to the one or more processors and storing instructions that when executed by the one or more processors, cause the one or more processors to perform operations comprising: capturing, by the camera, image data, the image data including an image of a subject in a physical, real-world environment; capturing, by the depth sensor, depth data indicating a distance of the subject from the camera in the physical, real-world environment; capturing, by the one or more motion sensors, motion data indicating at least an orientation of the camera in the physical, real-world environment; generating a virtual camera transform based on the motion data, the camera transform for determining an orientation of a virtual camera in a virtual environment; generating a matte from the image data and the depth data, wherein generating the matte includes: inputting the image data and the depth data into a neural network; generating, by the neural network, a low-resolution matte using the image data and the depth data; and processing the low-resolution matte to remove artifacts in the low-resolution matte; generating a high-resolution matte from the processed low-resolution matte, where the high-resolution matte has higher re…
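
The depth-based cleanup recited in claims 2 and 8 (depth threshold, inner and outer mattes, two dilations, and their intersection as a garbage matte) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the square dilation kernel, the use of `None` for unknown depth, the kernel radii, the masking used to combine the low-resolution matte with the garbage matte, and all function names and sample values are assumptions made for the example.

```python
UNKNOWN = None  # assumption: missing depth samples are stored as None


def depth_threshold(depth, face_center_box, offset):
    """Average depth of the face's center region plus an offset (claim 8)."""
    top, left, bottom, right = face_center_box
    samples = [depth[r][c]
               for r in range(top, bottom) for c in range(left, right)
               if depth[r][c] is not UNKNOWN]
    return sum(samples) / len(samples) + offset


def inner_outer(depth, threshold):
    """Inner: known depth below threshold. Outer: below threshold or unknown."""
    inner = [[d is not UNKNOWN and d < threshold for d in row] for row in depth]
    outer = [[d is UNKNOWN or d < threshold for d in row] for row in depth]
    return inner, outer


def dilate(mask, radius):
    """Binary dilation with a square kernel of side 2 * radius + 1."""
    h, w = len(mask), len(mask[0])
    return [[any(mask[rr][cc]
                 for rr in range(max(0, r - radius), min(h, r + radius + 1))
                 for cc in range(max(0, c - radius), min(w, c + radius + 1)))
             for c in range(w)] for r in range(h)]


def garbage_matte(inner, outer, inner_radius=2, outer_radius=1):
    """Intersection of the dilated inner and dilated outer mattes.

    The outer kernel is smaller than the inner kernel, as in claim 2.
    """
    di, do = dilate(inner, inner_radius), dilate(outer, outer_radius)
    return [[a and b for a, b in zip(ra, rb)] for ra, rb in zip(di, do)]


def face_matte(low_res, garbage):
    """Assumed combination: zero the network matte outside the garbage matte."""
    return [[m if g else 0.0 for m, g in zip(mr, gr)]
            for mr, gr in zip(low_res, garbage)]


# Hypothetical 3x9 depth map in meters; the subject sits near column 4.
far = [3.0] * 9
depth = [far, [3.0, 3.0, 3.0, 3.0, 0.5, None, 3.0, 3.0, 3.0], far]
threshold = depth_threshold(depth, (1, 4, 2, 5), offset=0.25)
inner, outer = inner_outer(depth, threshold)
garbage = garbage_matte(inner, outer)

noisy = [0.2] * 9
low_res = [noisy, [0.2, 0.2, 0.2, 0.2, 0.9, 0.8, 0.2, 0.2, 0.2], noisy]
face = face_matte(low_res, garbage)
print(threshold, garbage[1], face[1])
```

In this toy input the spurious 0.2 responses far from the subject are zeroed, while pixels near the subject keep the network's values.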

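Claims 4 and 5 describe the temporal filter as a linear weighted average of two frames, with per-pixel weights that depend on a similarity map. The claims do not give the weighting formula, so the sketch below assumes one: similarity is 1 minus the normalized absolute pixel difference, and the weight on the previous final matte is that similarity scaled by a hypothetical `max_history` factor. All names and values are illustrative.

```python
def similarity_map(frame, prev_frame, max_value=255.0):
    """Per-pixel similarity in [0, 1]; 1 means the pixel did not change.

    Assumption: similarity = 1 - |current - previous| / max_value.
    """
    return [[1.0 - abs(a - b) / max_value for a, b in zip(ra, rb)]
            for ra, rb in zip(frame, prev_frame)]


def temporal_filter(matte, prev_final, similarity, max_history=0.5):
    """Linear weighted average of the current matte and the previous final matte.

    Assumption: the weight on the previous frame is max_history * similarity,
    so unchanged pixels are smoothed and changed pixels follow the new matte.
    """
    out = []
    for m_row, p_row, s_row in zip(matte, prev_final, similarity):
        row = []
        for m, p, s in zip(m_row, p_row, s_row):
            w = max_history * s
            row.append(w * p + (1.0 - w) * m)
        out.append(row)
    return out


# Hypothetical one-row frames: the first pixel is static, the second changed.
frame = [[100, 255]]
prev_frame = [[100, 0]]
similarity = similarity_map(frame, prev_frame)

high_res_matte = [[1.0, 1.0]]
prev_final_matte = [[0.0, 0.0]]
final_matte = temporal_filter(high_res_matte, prev_final_matte, similarity)
print(similarity, final_matte)
```

Tying the weight to similarity limits ghosting: where the image changed between frames, the previous matte contributes little or nothing.
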
Assignees

Apple Inc

Inventors

Classifications

H04N23/632
for displaying or modifying preview images prior to image capturing, e.g. variety of image resolutions or capturing parameters
H04N23/631
Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters
G06T3/4053 (Primary)
based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
H04N5/272
Means for inserting a foreground image in a background image, i.e. inlay, outlay
G06T11/00
Two-dimensional [2D] image generation

Patent family

Related publications grouped by family.

Patent family 65631377

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10839577B2 cover?: Systems, methods, apparatuses and non-transitory, computer-readable storage mediums are disclosed for generating AR self-portraits or “AR selfies.” In an embodiment, a method comprises: capturing, by a first camera of a mobile device, image data, the image data including an image of a subject in a physical, real-world environment; receiving, by a depth sensor of the mobile device, depth data in…
Who is the assignee on this patent?: Apple Inc