Deep learning for three dimensional (3D) gaze prediction
US-2021042520-A1 · Feb 11, 2021 · US
US11527082B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11527082-B2 |
| Application number | US-201916764313-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 12, 2019 |
| Priority date | Jun 17, 2019 |
| Publication date | Dec 13, 2022 |
| Grant date | Dec 13, 2022 |
Official abstract text for this publication.
According to the techniques of this disclosure, a method includes capturing, using a camera system of a vehicle, at least one image of an occupant of the vehicle, determining, based on the at least one image of the occupant, a location of one or more eyes of the occupant within the vehicle, and determining, based on the at least one image of the occupant, an eye gaze vector. The method may also include determining, based on the eye gaze vector, the location of the one or more eyes of the occupant, and a vehicle data file of the vehicle, a region of interest from a plurality of regions of interests of the vehicle at which the occupant is looking, wherein the vehicle data file specifies respective locations of each of the plurality of regions of interest, and selectively performing, based on the region of interest, an action.
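The region-of-interest step in the abstract can be read as a ray cast: project the eye gaze vector from the eye location and check which region it intersects. The sketch below is one possible reading, not the patented implementation. It assumes each region is a planar rectangle stored as a corner plus two perpendicular edge vectors, and that the eye location and regions already share one coordinate system; the names `find_region_of_interest` and `EXAMPLE_REGIONS` and all coordinates are hypothetical.

```python
from typing import Dict, Optional, Tuple

Vec = Tuple[float, float, float]
# Hypothetical region layout: one corner plus the two edge vectors leaving it.
Rect = Tuple[Vec, Vec, Vec]


def dot(a: Vec, b: Vec) -> float:
    return sum(x * y for x, y in zip(a, b))


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def find_region_of_interest(eye: Vec, gaze: Vec,
                            regions: Dict[str, Rect]) -> Optional[str]:
    """Return the nearest region hit by the gaze ray, or None if none is hit."""
    best_name, best_t = None, float("inf")
    for name, (corner, edge_u, edge_v) in regions.items():
        normal = cross(edge_u, edge_v)
        denom = dot(normal, gaze)
        if abs(denom) < 1e-9:
            continue  # gaze ray is parallel to this region's plane
        t = dot(normal, sub(corner, eye)) / denom
        if t <= 0 or t >= best_t:
            continue  # region is behind the eye or farther than the best hit
        hit = (eye[0] + t * gaze[0], eye[1] + t * gaze[1], eye[2] + t * gaze[2])
        rel = sub(hit, corner)
        # Rectangle coordinates of the hit point (assumes perpendicular edges).
        u = dot(rel, edge_u) / dot(edge_u, edge_u)
        v = dot(rel, edge_v) / dot(edge_v, edge_v)
        if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
            best_name, best_t = name, t
    return best_name


# Illustrative regions only; the coordinates are made up for this example.
EXAMPLE_REGIONS: Dict[str, Rect] = {
    "infotainment": ((-0.5, -0.5, 2.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    "side_mirror": ((2.0, -0.5, -0.5), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)),
}
```

Returning the nearest hit is a design choice made here so that overlapping regions resolve to whichever one the occupant would see first; the abstract itself only requires that a region be determined from the gaze vector, the eye location, and the vehicle data file.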
Opening claim text (preview).
What is claimed is:

1. A method comprising: obtaining, via a camera system of a vehicle, at least one image of an occupant of the vehicle; identifying one or more facial landmarks in the at least one image; determining, based on the one or more facial landmarks, a pitch angle, a roll angle, and a yaw angle of a facial plane of the occupant; determining, based the facial plane, a first initial eye gaze vector; determining, based on the at least one image of the occupant, a location of one or more eyes of the occupant within the vehicle; determining, based on the location of the one or more eyes, a second initial eye gaze vector; determining an eye gaze vector by at least combining the first initial eye gaze vector and the second initial eye gaze vector; determining, based on a projection of the eye gaze vector from the location of the one or more eyes, and a vehicle data file of the vehicle, a region of interest at which the occupant is looking from a plurality of regions of interests of the vehicle, wherein the vehicle data file specifies respective locations of each of the plurality of regions of interest, and wherein the projection of the eye gaze vector intersects the region of interest; and selectively performing, by a computing system of the vehicle and based on the region of interest at which the occupant is looking, an action.
2. The method of claim 1, wherein determining the second initial eye gaze vector comprises: determining, based on the at least one image, an angle of at least one pupil the occupant; and determining, based on the angle of the at least one pupil, the second initial eye gaze vector.
3. The method of claim 1, wherein determining the eye gaze vector comprises: applying at least one machine-learned model to the at least one image, wherein the machine-learned model outputs the eye gaze vector.
4. The method of claim 1, wherein the at least one image comprises at least one respective image captured by each of two or more different cameras of the camera system, and wherein determining the location of the one or more eyes of the occupant within the vehicle comprises: determining, based on the at least one respective image captured by each of the two or more different cameras, a parallax angle; determining, based on respective locations of each of the two or more different cameras and the parallax angle, a distance from at least one of the two or more different cameras to the one or more eyes of the occupant; and determining, based on the distance and the respective locations of each of the two or more different cameras, the location of the one or more eyes of the occupant.
5. The method of claim 1, wherein the at least one image comprises an image captured using an infrared camera of the camera system, and wherein determining the location of the one or more eyes of the occupant within the vehicle comprises: determining, based on distortion of the image, a distance from the infrared camera to the one or more eyes of the occupant; and determining, based on the location of the infrared camera and the distance, the location of the one or more eyes of the occupant.
6. The method of claim 1, wherein the location of the one or more eyes of the occupant within the vehicle is specified using a camera-based coordinate system having one camera of the camera system as a centroid, wherein the respective locations of each of the plurality of regions of interest are specified using a vehicle-based coordinate system having a centroid located in an interior of the vehicle and is different from the location of the one camera, and wherein determining the region of interest at which the occupant is looking comprises: transforming the location of the one or more eyes from the camera-based coordinate system to the vehicle-based coordinate system; determining whether the projection of the eye gaze vector from the location of the one or more eyes specified using the vehicle-based coordinate system intersects with any of the plurality of regions of interest; and responsive to determining that the eye gaze vector intersects a particular region of interest from the plurality of regions of interest, determining that the particular region of interest is the region of interest at which the occupant is looking.
7. The method of claim 1, wherein the vehicle data file includes data structured in accordance with extensible markup language, wherein the vehicle data file includes a respective set of coordinates for each region of interest from the plurality of regions of interest, wherein each of the respective coordinate sets are defined relative to a centroid of a sphere that encompasses an interior of the vehicle, and wherein each of the respective sets of coordinate define a two-dimensional plane.
8. A computing device comprising: at least one processor; a camera system; and memory comprising instructions that, when executed by the at least one processor, cause the at least one processor to: obtain, via the camera system, at least one image of an occupant of a vehicle; identify one or more facial landmarks in the at least one image; determine, based on the one or more facial landmarks, a pitch angle, a roll angle, and a yaw angle of a facial plane of the occupant; determine, based the facial plane, a first initial eye gaze vector; determine, based on the at least one image of the occupant, a location of one or more eyes of the occupant within the vehicle; determine, based on the location of the one or more eyes, a second initial eye gaze vector; determine an eye gaze vector by at least combining the first initial eye gaze vector and the second initial eye gaze vector; determine, based on a projection of the eye gaze vector from the location of the one or more eyes, and a vehicle data file of the vehicle, a region of interest at which the occupant is looking from a plurality of regions of interests of the vehicle, wherein the vehicle data file specifies respective locations of each of the plurality of regions of interest, and wherein the projection of the eye gaze vector intersects the region of interest; and selectively perform, based on the region of interest at which the occupant is looking, an action.
9. The computing device of claim 8, wherein the instructions are executable by the at least one processor to determine the second initial eye gaze vector by at least being executable to: determine, based on the at least one image, an angle of at least one pupil the occupant; and determine, based on the angle of the at least one pupil, the second initial eye gaze vector.
10. The computing device of claim 8, wherein the instructions are executable by the at least one processor to determine the eye gaze vector by at least being executable to: apply at least one machine-learned model to the at least one image, wherein the machine-learned model outputs the eye gaze vector.
11. The computing device of claim 8, wherein: the camera system includes two or more different cameras; the at least one image comprises at least one respective image captured by each of the two or more different cameras; and the instructions are executable by the at least one processor to determine the location of the one or more eyes of the occupant within the vehicle by at least being executable to: determine, based on the at least one respective image captured by each of the two or more different cameras, a parallax angle; determine, based on respective locations of each of the two or more different cameras and the parallax angle, a distance from at least one of the two or more different cameras to the one or more eyes of the occupant; and determine, based on the dis
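
Two of the geometric steps recited in claims 1 and 4 can be illustrated with a short sketch. The claims do not say how the facial-plane angles become a vector, how the two initial gaze vectors are combined, or which parallax model is used, so everything below is an assumption: the axis convention (the face looks along +z at zero pitch and yaw, and roll leaves that direction unchanged), the weighted average in `combine_gaze`, and the symmetric two-camera layout in `distance_from_parallax` are all hypothetical choices made for illustration.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def face_normal(pitch: float, yaw: float) -> Vec:
    """Unit vector normal to the facial plane, angles in radians.

    Assumed convention: zero pitch and yaw looks along +z, positive yaw
    turns toward +x, positive pitch tilts toward +y. Roll is a rotation
    about this vector, so it does not change the result.
    """
    cos_pitch = math.cos(pitch)
    return (math.sin(yaw) * cos_pitch, math.sin(pitch), math.cos(yaw) * cos_pitch)


def combine_gaze(first: Vec, second: Vec, weight: float = 0.5) -> Vec:
    """Blend two initial gaze vectors and renormalize (assumed combination rule)."""
    mixed = tuple(weight * a + (1.0 - weight) * b for a, b in zip(first, second))
    norm = math.sqrt(sum(c * c for c in mixed))
    if norm == 0.0:
        raise ValueError("initial gaze vectors cancel out; no direction defined")
    return (mixed[0] / norm, mixed[1] / norm, mixed[2] / norm)


def distance_from_parallax(baseline: float, parallax_angle: float) -> float:
    """Distance from the midpoint between two cameras to the eyes.

    Assumes the eyes lie on the perpendicular bisector of the camera
    baseline, so each half of the parallax angle spans half the baseline.
    """
    return (baseline / 2.0) / math.tan(parallax_angle / 2.0)
```

For example, `combine_gaze(face_normal(0.0, 0.0), (1.0, 0.0, 0.0))` blends a straight-ahead head pose with a fully sideways eye direction and yields a unit vector halfway between the two.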
Sensors therefor · CPC title
Recognising the driver's state or behaviour, e.g. attention or drowsiness · CPC title
Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title
Sensing or illuminating at different wavelengths · CPC title
Infrared image · CPC title