Techniques for interactive image segmentation networks

US12112482B2 · US · B2

Patent metadata
Publication number: US-12112482-B2
Application number: US-202117161139-A
Country: US
Kind code: B2
Filing date: Jan 28, 2021
Priority date: Jan 28, 2021
Publication date: Oct 8, 2024
Grant date: Oct 8, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various embodiments are generally directed to techniques for image segmentation utilizing context, such as with a machine learning (ML) model that injects context into various training stages. Many embodiments utilize one or more of an encoder-decoder model topology and select criteria and parameters in hyper-parameter optimization (HPO) to conduct the best model neural architecture search (NAS). Some embodiments are particularly directed to resizing context frames to a resolution that corresponds with a particular stage of decoding. In several embodiments, the context frames are concatenated with one or more of data from a previous decoding stage and data from a corresponding encoding stage prior to being provided as input to a next decoding stage.
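
The context-injection step described in the abstract (resize a context frame to the resolution of a decoding stage, then concatenate it with the previous decoding output and the matching encoder output) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the channels-by-height-by-width nested-list layout, and the use of nearest-neighbour resizing are all assumptions made for the example, since the page does not specify them.

```python
from typing import List

# Assumed layout for this sketch: frame[channel][row][col]
Frame = List[List[List[float]]]


def resize_nearest(frame: Frame, out_h: int, out_w: int) -> Frame:
    """Nearest-neighbour resize of every channel to (out_h, out_w).

    The resizing method is an assumption; the abstract only says the
    context frame is resized to match a decoding stage.
    """
    in_h, in_w = len(frame[0]), len(frame[0][0])
    return [
        [
            [channel[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
            for y in range(out_h)
        ]
        for channel in frame
    ]


def concat_channels(*frames: Frame) -> Frame:
    """Concatenate frames along the channel axis; resolutions must match."""
    h, w = len(frames[0][0]), len(frames[0][0][0])
    for f in frames:
        if len(f[0]) != h or len(f[0][0]) != w:
            raise ValueError("all frames must share one resolution")
    return [channel for f in frames for channel in f]


def build_decoder_input(prev_decoding: Frame, skip_encoding: Frame,
                        context_original: Frame) -> Frame:
    """Hypothetical helper: assemble the input for the next decoding stage."""
    h, w = len(prev_decoding[0]), len(prev_decoding[0][0])
    context = resize_nearest(context_original, h, w)  # match the stage resolution
    return concat_channels(prev_decoding, skip_encoding, context)


if __name__ == "__main__":
    context = [[[float(y * 4 + x) for x in range(4)] for y in range(4)]]  # 1 x 4 x 4
    prev = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]              # 2 x 2 x 2
    skip = [[[1.0] * 2 for _ in range(2)] for _ in range(3)]              # 3 x 2 x 2
    decoder_input = build_decoder_input(prev, skip, context)
    print(len(decoder_input), "channels at",
          len(decoder_input[0]), "x", len(decoder_input[0][0]))
```

Concatenating along the channel axis keeps the spatial resolution unchanged, which is why the context frame has to be resized first.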

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: interface circuitry; instructions; and one or more first processor circuits to utilize the instructions to: conduct a neural architecture search to select respective values for hyperparameters of a machine learning (ML) model, the ML model associated with a neural architecture having one or more encoders, one or more decoders, and one or more skip connections between the one or more encoders and the one or more decoders, the hyperparameters indicative of a first number of the one or more encoders and the one or more decoders to be included in the neural architecture, a second number of convolutional layers within respective ones of the one or more decoders, and a size of dilation in the respective ones of the one or more decoders; and cause the interface circuitry to deploy the ML model with the respective values for the hyperparameters, the ML model to cause one or more second processor circuits to: based on a first encoding frame with a first resolution, generate, with a first encoder, first output data including a second encoding frame with a second resolution; and generate, with a first decoder, second output data including a first decoding frame with the first resolution based on a second decoding frame with the second resolution and a context frame with the second resolution.

2. The apparatus of claim 1, wherein the context frame is a first context frame, and the ML model is to cause at least one of the one or more second processor circuits to resize a second context frame with an original resolution to produce the first context frame with the second resolution.

3. The apparatus of claim 1, wherein the ML model is to cause at least one of the one or more second processor circuits to concatenate the second decoding frame with the second resolution and the context frame with the second resolution prior to generation of the second output data with the first decoder.

4. The apparatus of claim 1, wherein the ML model is to cause at least one of the one or more second processor circuits to generate, with the first decoder, the second output data including the first decoding frame with the first resolution based on the second decoding frame with the second resolution, the context frame with the second resolution, and one or more portions of the first output data generated by the first encoder.

5. The apparatus of claim 4, wherein the ML model is to cause at least one of the one or more second processor circuits to: concatenate the second decoding frame with the second resolution, the context frame with the second resolution, and the one or more portions of the first output data generated by the first encoder to produce decoder input data; and provide the decoder input data to the first decoder for generation of the second output data including the first decoding frame with the first resolution.

6. The apparatus of claim 1, wherein at least one of the first encoding frame with the first resolution, the second encoding frame with the second resolution, the first decoding frame with the first resolution, or the second decoding frame with the second resolution includes at least one feature map.

7. The apparatus of claim 1, wherein the ML model is to cause at least one of the one or more second processor circuits to perform a convolution operation on the first encoding frame with the first resolution to generate the first output data including the second encoding frame with the second resolution.

8. The apparatus of claim 1, wherein the ML model is to cause at least one of the one or more second processor circuits to perform a de-convolution operation on the second decoding frame with the second resolution to generate the second output data including the first decoding frame with the first resolution.

9. At least one non-transitory computer-readable medium comprising instructions that cause one or more first processor circuits to: conduct a neural architecture search to select respective values for hyperparameters of a machine learning (ML) model, the ML model associated with a neural architecture having one or more encoders, one or more decoders, and one or more skip connections between the one or more encoders and the one or more decoders, the hyperparameters indicative of a first number of the one or more encoders and the one or more decoders to be included in the neural architecture, a second number of convolutional layers within respective ones of the one or more decoders, and a size of dilation in the respective ones of the one or more decoders; and cause interface circuitry to deploy the ML model with the respective values for the hyperparameters, the ML model to cause one or more second processor circuits to: based on a first encoding frame with a first resolution, generate, with a first encoder, first output data including a second encoding frame with a second resolution; and generate, with a first decoder, second output data including a first decoding frame with the first resolution based on a second decoding frame with the second resolution and a context frame with the second resolution.

10. The at least one non-transitory computer-readable medium of claim 9, wherein the context frame is a first context frame, and the ML model is to cause at least one of the one or more second processor circuits to resize a second context frame with an original resolution to produce the first context frame with the second resolution.

11. The at least one non-transitory computer-readable medium of claim 9, wherein the ML model is to cause at least one of the one or more second processor circuits to concatenate the second decoding frame with the second resolution and the context frame with the second resolution prior to generation of the second output data with the first decoder.

12. The non-transitory computer-readable medium of claim 9, wherein the ML model is to cause at least one of the one or more second processor circuits to generate, with the first decoder, the second output data including the first decoding frame with the first resolution based on the second decoding frame with the second resolution, the context frame with the second resolution, and one or more portions of the first output data generated by the first encoder.

13. The non-transitory computer-readable medium of claim 12, wherein the ML model is to cause at least one of the one or more second processor circuits to: concatenate the second decoding frame with the second resolution, the context frame with the second resolution, and the one or more portions of the first output data generated by the first encoder to produce decoder input data; and provide the decoder input data to the first decoder for generation of the second output data including the first decoding frame with the first resolution.

14. An apparatus, comprising: means for optimizing respective values for hyperparameters of a machine learning (ML) model, the ML model associated with a neural architecture having one or more encoders, one or more decoders, and one or more skip connections between the one or more encoders and the one or more decoders, the hyperparameters indicative of a first number of the one or more encoders and the one or more decoders to be included in the neural architecture, a second number of convolutional layers within respective ones of the one or more decoders, and a size of dilation in the respective ones of the one or more decoders; and means for deploying the ML model with the respective values for the hyperparameters, the ML model to cause: means for generating, with a first encoder, first output data based on a first encodi
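
The three hyperparameters named in the independent claims (the number of encoders and decoders, the number of convolutional layers within each decoder, and the dilation size in each decoder) can be pictured as a small search space. The sketch below is illustrative only: the claims do not say how the neural architecture search is carried out, so the exhaustive grid search, the `ArchConfig` type, the single `num_stages` field standing in for the encoder and decoder count, and the placeholder `toy_score` function are all assumptions. A real search would train and validate each candidate model instead of calling a toy scoring function.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple


@dataclass(frozen=True)
class ArchConfig:
    """Hypothetical container for the hyperparameters named in the claims."""
    num_stages: int           # number of encoders and decoders (assumed paired)
    decoder_conv_layers: int  # convolutional layers within each decoder
    decoder_dilation: int     # dilation size within each decoder


def search_space(stages: Iterable[int], conv_layers: Iterable[int],
                 dilations: Iterable[int]) -> Iterator[ArchConfig]:
    """Enumerate every combination of the candidate values."""
    for s, c, d in itertools.product(stages, conv_layers, dilations):
        yield ArchConfig(s, c, d)


def grid_search(candidates: Iterable[ArchConfig],
                score: Callable[[ArchConfig], float]
                ) -> Tuple[Optional[ArchConfig], float]:
    """Return the highest-scoring candidate (exhaustive search, an assumption)."""
    best, best_score = None, float("-inf")
    for cfg in candidates:
        value = score(cfg)
        if value > best_score:
            best, best_score = cfg, value
    return best, best_score


def toy_score(cfg: ArchConfig) -> float:
    """Placeholder objective; stands in for a validation metric."""
    return -(abs(cfg.num_stages - 4)
             + abs(cfg.decoder_conv_layers - 2)
             + abs(cfg.decoder_dilation - 2))


if __name__ == "__main__":
    space = search_space(stages=[3, 4, 5], conv_layers=[1, 2, 3], dilations=[1, 2, 4])
    best, best_score = grid_search(space, toy_score)
    print(best, best_score)
```

The selected values would then be used to build and deploy the model, which corresponds to the "deploy the ML model with the respective values for the hyperparameters" step in the claims.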

Assignees

Intel Corp

Inventors

Classifications

  • Auto-encoder networks; Encoder-decoder networks

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn

  • Supervised learning

  • Convolutional networks [CNN, ConvNet]

  • Combinations of networks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12112482B2 cover?
Various embodiments are generally directed to techniques for image segmentation utilizing context, such as with a machine learning (ML) model that injects context into various training stages. Many embodiments utilize one or more of an encoder-decoder model topology and select criteria and parameters in hyper-parameter optimization (HPO) to conduct the best model neural architecture search (NAS…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T7/11. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 8, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).