Method of dynamically updating points of interest, electronic map system and cloud server device
US-2024328817-A1 · Oct 3, 2024 · US
US12555371B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12555371-B2 |
| Application number | US-202218080993-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 14, 2022 |
| Priority date | Dec 14, 2022 |
| Publication date | Feb 17, 2026 |
| Grant date | Feb 17, 2026 |
Official abstract text for this publication.
A mobile vision transformer network for use on mobile devices, such as smart eyewear devices and other augmented reality (AR) and virtual reality (VR) devices. The mobile vision transformer network considers factors including number of parameters, latency, and model performance, as they reflect disk storage, mobile frames per second (FPS), and application quality, respectively. The mobile vision transformer network processes images, e.g., for image classification, segmentation, and detection. The mobile vision transformer network has a fine-grained architecture including a search algorithm performing latency-driven slimming that jointly improves model size and speed.
Opening claim text (preview).
What is claimed is:

1. A system comprising a vision transformer, comprising:
a convolution stem configured to embed an image, wherein the convolutional stem is represented by:

$$\mathbb{X}_{i|_{i=1},\,j|_{j=1}}^{B,\,C_{j|_{j=1}},\,\frac{H}{4},\,\frac{W}{4}} = \mathrm{stem}\left(\mathbb{X}_{0}^{B,\,3,\,H,\,W}\right)$$

where $B$ denotes a batch size, $C$ refers to a channel dimension, $H$ and $W$ are a height and a width of a feature, $j$ is a feature in stage $j$, $j \in \{1, 2, 3, 4\}$, and $i$ indicates the $i$-th layer;
a unified feed forward network (FNN) coupled to the convolution stem and configured to capture local information, wherein the unified FNN is represented by:

$$\mathbb{X}_{i+1,\,j}^{B,\,C_j,\,\frac{H}{2^{j+1}},\,\frac{W}{2^{j+1}}} = S_{i,j} \cdot \mathrm{FFN}^{C_j,\,E_{i,j}}\left(\mathbb{X}_{i,j}\right) + \mathbb{X}_{i,j}$$

where $S_{i,j}$ is a learnable layer scale and the unified FNN is constructed by a stage width $C_j$ and a per-block expansion ratio $E_{i,j}$;
global multi head self attention (MHSA) blocks coupled to the FNN and configured to model spatial dependencies of the image; and
a learnable attention bias coupled to the MHSA blocks and configured to perform position encoding.

2. The system of claim 1, wherein the vision transformer comprises a fine-grained architecture including a search algorithm configured to perform latency-driven slimming that jointly improves model size and speed.

3. The system of claim 1, wherein the vision transformer network has a 4-stage hierarchical design.

4. The system of claim 3, wherein the vision transformer is configured to obtain feature sizes in ¼, ⅛, 1/16 and 1/32 of input resolution of the image.

5. The system of claim 1, wherein the global MHSA blocks are represented by:

$$\mathbb{X}_{i+1,\,j}^{B,\,C_j,\,\frac{H}{2^{j+1}},\,\frac{W}{2^{j+1}}} = S_{i,j} \cdot \mathrm{MHSA}\left(\mathrm{Proj}\left(\mathbb{X}_{i,j}\right)\right) + \mathbb{X}_{i,\,\dots}$$
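The shape bookkeeping in claims 1 and 4, and the residual form shared by the claimed feed forward and MHSA updates, can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names `stage_shapes` and `residual_layer_scale`, the channel widths in the usage example, and the stand-in block function are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def stage_shapes(batch: int, height: int, width: int,
                 channels: Sequence[int]) -> List[Tuple[int, int, int, int]]:
    """Return (B, C_j, H / 2^(j+1), W / 2^(j+1)) for stages j = 1..len(channels).

    With four stages this gives feature sizes of 1/4, 1/8, 1/16 and 1/32
    of the input resolution, as recited in claim 4.
    """
    shapes = []
    for j, c_j in enumerate(channels, start=1):
        stride = 2 ** (j + 1)  # 4, 8, 16, 32 for j = 1..4
        if height % stride or width % stride:
            raise ValueError(f"input size must be divisible by {stride}")
        shapes.append((batch, c_j, height // stride, width // stride))
    return shapes


def residual_layer_scale(x: Sequence[float],
                         block: Callable[[Sequence[float]], Sequence[float]],
                         scale: Sequence[float]) -> List[float]:
    """Compute scale * block(x) + x element-wise.

    `block` stands in for the FFN or MHSA(Proj(.)) term and `scale` for the
    learnable layer scale S_{i,j}; both are placeholders here, not trained
    components.
    """
    y = block(x)
    if not (len(x) == len(y) == len(scale)):
        raise ValueError("x, block(x) and scale must have the same length")
    return [s * y_k + x_k for s, y_k, x_k in zip(scale, y, x)]


if __name__ == "__main__":
    # Hypothetical stage widths; the claims do not fix these values.
    for shape in stage_shapes(1, 224, 224, [32, 48, 96, 176]):
        print(shape)
    # Toy block that doubles its input, with a layer scale of 0.5.
    print(residual_layer_scale([1.0, 2.0], lambda v: [2 * t for t in v], [0.5, 0.5]))
```

For a 224 x 224 input, the four stages come out at 56, 28, 14 and 7 pixels per side. The residual helper shows why a layer scale near zero leaves the input almost unchanged: the block's contribution is scaled before it is added back to the input.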
Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods · CPC title
structured as a network, e.g. client-server architectures · CPC title
using neural networks · CPC title
Feedforward networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title