Device with speaker and image sensor

US12393396B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12393396-B2
Application number  US-202318211515-A
Country             US
Kind code           B2
Filing date         Jun 19, 2023
Priority date       Jun 21, 2022
Publication date    Aug 19, 2025
Grant date          Aug 19, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one implementation, a method of playing audio data is performed at a device including a frame configured for insertion into an outer ear, a speaker coupled to the frame, an image sensor coupled to the frame, one or more processors, and non-transitory memory. The method includes capturing, using the image sensor, one or more images of a physical environment. The method includes generating audio data based on the one or more images of the physical environment. The method includes playing, via the speaker, the audio data.
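
The three steps in the abstract (capture, generate, play) can be pictured as a simple pipeline. The following is a minimal sketch only, not the patented implementation: the `EarDevice` class, the `Image` and `AudioData` type aliases, and the `brightness_tone` generator are all hypothetical names introduced for illustration, and the abstract does not say how audio is derived from images.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical representations, chosen only to keep the sketch self-contained.
Image = List[List[int]]   # grayscale image as rows of 0..255 pixel values
AudioData = List[float]   # mono samples in [0.0, 1.0]


@dataclass
class EarDevice:
    """Stand-in for the in-ear device: three pluggable stages."""
    capture_images: Callable[[], Sequence[Image]]
    generate_audio: Callable[[Sequence[Image]], AudioData]
    play: Callable[[AudioData], None]

    def step(self) -> AudioData:
        images = self.capture_images()        # capture, using the image sensor
        audio = self.generate_audio(images)   # generate audio data from the images
        self.play(audio)                      # play, via the speaker
        return audio


def brightness_tone(images: Sequence[Image]) -> AudioData:
    """Toy generator (an assumption): one sample per image, equal to mean brightness."""
    samples = []
    for img in images:
        pixels = [p for row in img for p in row]
        mean = sum(pixels) / len(pixels) if pixels else 0.0
        samples.append(mean / 255.0)
    return samples


# Usage with fake hardware: a fixed 2x2 image and a list standing in for the speaker.
played: List[AudioData] = []
device = EarDevice(
    capture_images=lambda: [[[0, 255], [255, 0]]],
    generate_audio=brightness_tone,
    play=played.append,
)
audio = device.step()
```

Passing the stages in as callables mirrors the way the claims leave the generation step open: it could run on the device itself or be delegated to a peripheral device, as claim 2 describes.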

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: at a device including a frame configured for insertion into an outer ear, a speaker coupled to the frame, an image sensor coupled to the frame, one or more processors, and non-transitory memory: capturing, using the image sensor, one or more images of a physical environment; generating audio data based on the one or more images of the physical environment, wherein the image sensor has a device field-of-view different than a user field-of-view and the audio data is based on portions of the one or more images of the physical environment outside the user field-of-view; and playing, via the speaker, the audio data.
2. The method of claim 1, wherein generating the audio data based on the one or more images of the physical environment includes transmitting, to a peripheral device, the one or more images of the physical environment and receiving, from the peripheral device, the audio data.
3. The method of claim 1, wherein generating the audio data includes creating an audio signal.
4. The method of claim 1, wherein generating the audio data includes altering an audio stream.
5. The method of claim 1, wherein the device further includes a microphone configured to generate ambient sound data and wherein generating the audio data is further based on the ambient sound data.
6. The method of claim 5, wherein the ambient sound data includes a vocal input.
7. The method of claim 1, wherein the device further includes a microphone configured to generate ambient sound data and wherein generating the audio data is independent of the ambient sound data.
8. The method of claim 1, wherein the device further includes an inertial measurement unit (IMU) configured to generate pose data and wherein generating the audio data is further based on the pose data.
9. The method of claim 1, wherein the audio data is played spatially from a location based on the one or more images of the physical environment.
10. A device comprising: a frame configured for insertion into an outer ear; one or more processors coupled to the frame; a speaker coupled to the frame and configured to output sound based on audio data received from the one or more processors; and an image sensor coupled to the frame and configured to provide one or more images of a physical environment to the one or more processors, wherein the one or more processors are configured to generate the audio data based on the one or more images of the physical environment, and wherein the image sensor has a device field-of-view different than a user field-of-view and the audio data is based on portions of the one or more images of the physical environment outside the user field-of-view.
11. The device of claim 10, wherein the one or more processors are configured to generate the audio data based on the one or more images of the physical environment by transmitting, to a peripheral device, the one or more images of the physical environment and receiving, from the peripheral device, the audio data.
12. The device of claim 10, further comprising a microphone configured to generate ambient sound data, wherein the one or more processors are configured to generate the audio data further based on the ambient sound data.
13. The device of claim 10, further comprising a microphone configured to generate ambient sound data, wherein the one or more processors are configured to generate the audio data independent of the ambient sound data.
14. The device of claim 10, further comprising an inertial measurement unit (IMU) configured to generate pose data, wherein the one or more processors are configured to generate the audio data further based on the pose data.
15. The device of claim 10, wherein the audio data is played spatially from a location based on the one or more images of the physical environment.
16. The device of claim 10, wherein the image sensor includes a fisheye lens.
17. The device of claim 10, wherein the frame is not physically coupled to a display.
18. A non-transitory memory storing one or more programs, which, when executed by one or more processors of a device including a frame configured for insertion into an outer ear, a speaker coupled to the frame, and an image sensor coupled to the frame cause the device to: capture, using the image sensor, one or more images of a physical environment; generate audio data based on the one or more images of the physical environment, wherein the image sensor has a device field-of-view different than a user field-of-view and the audio data is based on portions of the one or more images of the physical environment outside the user field-of-view; and play, via the speaker, the audio data.
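
Two limitations in the claims lend themselves to a small illustration: the audio data is based on image portions outside the user field-of-view (claims 1, 10, 18), and the audio may be played spatially from a location derived from the images (claims 9, 15). The sketch below is illustrative only and rests on stated assumptions: the `Detection` record, the azimuth convention, the 120-degree user field-of-view, and the constant-power stereo pan are all hypothetical choices, none of which appear in the claim text.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical result of analysing an image from the ear-worn sensor."""
    label: str
    azimuth_deg: float  # angle from the user's forward direction; positive = right


def outside_user_fov(detections: List[Detection],
                     user_fov_deg: float = 120.0) -> List[Detection]:
    """Keep only detections outside the (assumed) user field-of-view."""
    half = user_fov_deg / 2.0
    return [d for d in detections if abs(d.azimuth_deg) > half]


def stereo_gains(azimuth_deg: float) -> Tuple[float, float]:
    """Constant-power pan to (left, right) gains.

    Simplification: azimuths are clamped to [-90, 90], so sources behind
    the user are panned fully to the nearer side.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    theta = (clamped + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)


detections = [
    Detection("car", 100.0),    # right, beyond the assumed user field-of-view
    Detection("door", 10.0),    # in front, visible to the user
    Detection("bike", -150.0),  # behind and to the left
]
hidden = outside_user_fov(detections)
cues = [(d.label, stereo_gains(d.azimuth_deg)) for d in hidden]
```

Only the objects the user cannot already see produce cues here, which is one plausible reading of why the claims tie the audio to portions outside the user field-of-view.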

Assignees

Inventors

Classifications

  • for receiving images from a single remote source

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback

  • Head tracking input arrangements

  • the I/O peripheral being integrated loudspeakers

  • Wearable computers, e.g. on a belt

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12393396B2 cover?
In one implementation, a method of playing audio data is performed at a device including a frame configured for insertion into an outer ear, a speaker coupled to the frame, an image sensor coupled to the frame, one or more processors, and non-transitory memory. The method includes capturing, using the image sensor, one or more images of a physical environment. The method includes generating aud…
Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/165. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 19, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).