US12380640B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12380640-B2 |
| Application number | US-202218071821-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 30, 2022 |
| Priority date | Nov 30, 2022 |
| Publication date | Aug 5, 2025 |
| Grant date | Aug 5, 2025 |
Official abstract text for this publication.
A three-dimensional (3D) scene is generated from non-aligned generic camera priors by producing a tri-plane representation for an input scene received in random latent code, obtaining a camera posterior including posterior parameters representing color and density data from the random latent code and from generic camera priors without alignment assumptions, and volumetrically rendering an image of the input scene from the color and density data to provide a scene having pixel colors and depth values from an arbitrary camera viewpoint. A depth adaptor processes depth values to generate an adapted depth map that bridges domains of rendered and estimated depth maps for the image of the input scene. The adapted depth map, color data, and scene geometry information from an external dataset are provided to a discriminator for selection of a 3D representation of the input scene.
Opening claim text (preview).
What is claimed is:

1. A method of generating a three-dimensional (3D) object or scene from non-aligned generic camera priors, the method comprising: producing, by a 3D scene generator, a tri-plane representation for an input scene received in random latent code; obtaining, by a camera generator, a camera posterior including posterior parameters representing color and density data from the random latent code and from generic camera priors without alignment assumptions of the generic camera priors; volumetrically rendering, by a volume renderer, an image of the input scene from the color and density data to provide a scene having pixel colors and depth values from an arbitrary camera viewpoint; processing, by a depth adaptor, the depth values to generate an adapted depth map that bridges domains of rendered and estimated depth maps for the image of the input scene; providing the adapted depth map and color data to a discriminator; providing external scene geometry information from an external dataset to the discriminator; and selecting, by the discriminator, a 3D representation of the input scene based on the color data, adapted depth map, and external scene geometry information.

2. The method of claim 1, wherein obtaining color and density data from the random latent code and generic camera priors comprises using a shallow 2-layer multi-layer perceptron (MLP) decoder to sample arbitrary camera viewpoints captured from ball-in-sphere camera parameterizations provided to the camera generator, the ball-in-sphere camera parameterization having four additional degrees of freedom including a field of view and pitch, yaw and radius of an inner sphere specifying a look-at point within an outer sphere of the ball-in-sphere camera parameterizations.

3. The method of claim 1, further comprising learning the arbitrary camera viewpoint during training for each input dataset.

4. The method of claim 1, further comprising pushing derivatives of predicted camera parameters with respect to prior camera parameters to either one or minus one to arrive at a camera gradient penalty $\mathcal{L}_{\varphi_i}$:

$$\mathcal{L}_{\varphi_i} = \left|\frac{\partial \varphi_i}{\partial \varphi'_i}\right| + \left|\frac{\partial \varphi_i}{\partial \varphi'_i}\right|^{-1},$$

where $\varphi'_i \in \varphi'$ is a camera sampled from a prior camera distribution and $\varphi_i \in \varphi$ is produced by the camera generator.

5. The method of claim 1, wherein volumetrically rendering the image of the input scene from the color and density data comprises rendering depths $d$ by volumetric rendering as follows:

$$d = \int_{t_n}^{t_f} T(t)\,\sigma(r(t))\,t\,dt,$$

where $t_n$ and $t_f$ are near/far planes, $T(t)$ is accumulated transmittance, and $r(t)$ is a ray.

6. The method of claim 5, wherein volumetrically rendering the image of the input scene from the color and density data further comprises shifting and scaling a depth $d$ from a range of $[t_n, t_f]$ into $[-1, 1]$ to obtain normalized depth $\bar{d}$:

$$\bar{d} = 2 \cdot \frac{d - (t_n + t_f + b)/2}{t_f - t_n - b},$$

where $b \in [0, (t_n + t_f)/2]$ is an additional learnable shift that accounts for empty space in front of a camera.

7. The method of claim 1, wherein processing the depth values comprises producing the adapted depth map as a function of a normalized depth where the depth values are concatenated with RGB color data input and passed to the discriminator.

8. The method of claim 7, wherein processing the depth values comprises using a convolutional network to
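
The formulas recited in claims 4 through 6 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names are hypothetical, the depth integral of claim 5 is approximated with the piecewise-constant quadrature commonly used in volume rendering (the claim itself gives only the integral), and the learnable shift b of claim 6 is passed in as a plain number rather than trained.

```python
import math


def render_depth(ts, sigmas):
    """Approximate d = integral of T(t) * sigma(r(t)) * t dt along one ray.

    ts:     increasing sample distances between the near and far planes
    sigmas: densities sigma(r(t)) evaluated at those samples
    Assumption: density is treated as constant within each interval.
    """
    depth = 0.0
    transmittance = 1.0  # T(t): fraction of light not yet absorbed
    for i in range(len(ts) - 1):
        delta = ts[i + 1] - ts[i]
        alpha = 1.0 - math.exp(-sigmas[i] * delta)  # opacity of interval i
        depth += transmittance * alpha * ts[i]
        transmittance *= 1.0 - alpha
    return depth


def normalize_depth(d, t_near, t_far, b=0.0):
    """Shift and scale depth so that [t_near + b, t_far] maps to [-1, 1].

    b stands in for the learnable shift of claim 6; here it is a fixed value.
    """
    return 2.0 * (d - (t_near + t_far + b) / 2.0) / (t_far - t_near - b)


def camera_gradient_penalty(g):
    """Penalty |g| + 1/|g| for a scalar derivative g of a predicted camera
    parameter with respect to its prior sample; smallest when g is +1 or -1.
    """
    a = abs(g)
    return a + 1.0 / a
```

With a ray sampled at distances 1 through 5 and a single very dense sample at distance 3, `render_depth` returns a depth close to 3, and `normalize_depth` then maps that value into [-1, 1] relative to the chosen near and far planes. The penalty function takes its minimum value of 2 at a derivative of 1 or -1, which is the behavior claim 4 describes as pushing derivatives to either one or minus one.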
Color image · CPC title
using neural networks · CPC title
Determination of colour characteristics · CPC title
Depth or shape recovery · CPC title
Perspective computation · CPC title