Machine-Learned Models for User Interface Prediction, Generation, and Interaction Understanding

US2022382565A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2022382565-A1
Application number  US-202117335596-A
Country             US
Kind code           A1
Filing date         Jun 1, 2021
Priority date       Jun 1, 2021
Publication date    Dec 1, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Generally, the present disclosure is directed to user interface understanding. More particularly, the present disclosure relates to training and utilization of machine-learned models for user interface prediction and/or generation. A machine-learned interface prediction model can be pre-trained using a variety of pre-training tasks for eventual downstream task training and utilization (e.g., interface prediction, interface generation, etc.).
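The arrangement the abstract describes, one model pre-trained on several tasks and later reused for downstream tasks, can be pictured as a shared encoder with swappable task heads. The following is a minimal sketch under that reading, not the patent's implementation. The class names, head names, and toy arithmetic are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]


@dataclass
class SharedEncoder:
    """Toy stand-in for the interface prediction model (hypothetical)."""
    weights: Vector

    def encode(self, features: Vector) -> Vector:
        # Element-wise scaling; a real model would be a deep network.
        return [w * x for w, x in zip(self.weights, features)]


@dataclass
class TaskHead:
    """A small task-specific function applied to the shared embedding."""
    name: str
    predict: Callable[[Vector], float]


def run_heads(encoder: SharedEncoder, heads: List[TaskHead],
              features: Vector) -> Dict[str, float]:
    """Encode once, then let every head read the same embedding."""
    embedding = encoder.encode(features)
    return {head.name: head.predict(embedding) for head in heads}


encoder = SharedEncoder(weights=[0.5, 2.0, 1.0])

# Hypothetical pre-training heads; the names are illustrative only.
pretraining_heads = [
    TaskHead("consecutive_ui", lambda e: sum(e)),
    TaskHead("masked_view_hierarchy", lambda e: max(e)),
]
# A downstream head added later reuses the same encoder unchanged.
downstream_heads = [TaskHead("interface_prediction", lambda e: e[0] - e[1])]

features = [2.0, 1.0, 3.0]
pretraining_outputs = run_heads(encoder, pretraining_heads, features)
downstream_outputs = run_heads(encoder, downstream_heads, features)
```

The point of the design is that the encoder is shared: whatever the pre-training heads teach it is available to any downstream head attached afterwards.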

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for training and utilization of machine-learned models for user interface interaction understanding, comprising: obtaining, by a computing system comprising one or more computing devices, interface data that comprises a sequence of two or more user interfaces obtained through performance of one or more user interactions which result in generation of the sequence of two or more user interfaces, wherein, for each user interface in the sequence of two or more user interfaces, the interface data comprises one or more interface images depicting the user interface; determining, by the computing system, a plurality of intermediate embeddings based at least in part on the interface data; processing, by the computing system, the plurality of intermediate embeddings with a machine-learned interface prediction model to obtain one or more user interface embeddings; and performing, by the computing system, a pre-training task based at least in part on the one or more user interface embeddings to obtain a pre-training output.

2. The computer-implemented method of claim 1, wherein the method further comprises: evaluating, by the computing system, a loss function that evaluates a difference between ground truth data and the pre-training output; and adjusting, by the computing system, one or more parameters of the machine-learned interface prediction model based at least in part on the loss function.

3. The computer-implemented method of claim 1, wherein the interface data further comprises data descriptive of one or more link components, wherein the one or more link components comprise elements of the user interfaces which are the targets of the user interactions.

4. The computer-implemented method of claim 3, wherein the data descriptive of one or more link components comprises images of the one or more link components.

5. The computer-implemented method of claim 1, wherein: performing the pre-training task comprises processing, by the computing system, the one or more user interface embeddings with the machine-learned interface prediction model or a separate pre-training prediction head to obtain the pre-training output; and the pre-training output comprises a predicted link component for a sequential pair of the two or more user interfaces, the predicted link component comprising an element of a first user interface of the sequential pair that was a target of the user interaction to result in a second user interface of the sequential pair.

6. The computer-implemented method of claim 1, wherein: prior to determining the plurality of intermediate embeddings, the method comprises replacing, by the computing system, one or more of user interfaces in the sequence of two or more user interfaces with a replacement user interface; and performing the one or more pre-training tasks comprises processing, by the computing system, the one or more user interface embeddings with the machine-learned interface prediction model or a separate pre-training prediction head to obtain the pre-training output; and the pre-training output indicates whether each pair of user interfaces in the sequence of two or more user interfaces are consecutive user interfaces achievable via a single user interaction.

7. The computer-implemented method of claim 1, wherein the interface data further comprises structural data that is indicative of one or more positions of each of a plurality of interface elements included in the user interface, and wherein the structural data for each user interface comprises view hierarchy data associated with the user interface.

8. The computer-implemented method of claim 7, wherein: prior to determining the plurality of intermediate embeddings, the method comprises masking, by the computing system, one or more portions of the view hierarchy data; and performing the pre-training task comprises processing, by the computing system, the one or more user interface embeddings with the machine-learned interface prediction model or a separate pre-training prediction head to obtain the pre-training output; and the pre-training output comprises a predicted completion for the one or more portions of the view hierarchy data that have been masked.

9. The computer-implemented method of claim 8, wherein the predicted completion for the one or more portions of the view hierarchy data that have been masked comprises a textual completion.

10. The computer-implemented method of claim 1, wherein the plurality of intermediate embeddings comprises one or more image embeddings, one or more textual embeddings, and one or more positional embeddings.

11. The computer-implemented method of claim 10, wherein determining the plurality of intermediate embeddings comprises: determining, by the computing system, the one or more image embeddings from the one or more interface images, wherein the one or more image embeddings are respectively associated with at least one interface element of a plurality of interface elements included in the user interfaces; and determining, by the computing system based at least in part on the interface data, the one or more textual embeddings from textual content depicted in the one or more interface images.

12. The computer-implemented method of claim 1, wherein: determining the plurality of intermediate embeddings comprises processing, by the computing system, one or more of the one or more interface images, or textual content depicted in the one or more interface images with an embedding portion of the machine-learned interface prediction model to obtain the plurality of intermediate embeddings; and processing the plurality of intermediate embeddings with the machine-learned interface prediction model comprises processing, by the computing system, the plurality of intermediate embeddings with a transformer portion of the machine-learned interface prediction model to obtain the one or more user interface embeddings.

13. A computing system, comprising: one or more processors; one or more tangible, non-transitory computer readable media storing computer-readable instructions that store a machine-learned interface prediction model configured to generate learned representations for user interfaces, the machine-learned interface prediction model having been trained by performance of operations, the operations comprising: obtaining interface data that comprises a sequence of two or more user interfaces obtained through performance of one or more user interactions which result in generation of the sequence of two or more user interfaces, wherein, for each user interface in the sequence of two or more user interfaces, the interface data comprises one or more interface images depicting the user interface; determining a plurality of intermediate embeddings based at least in part on the interface data; processing the plurality of intermediate embeddings with a machine-learned interface prediction model to obtain one or more user interface embeddings; and performing a pre-training task based at least in part on the one or more user interface embeddings to obtain a pre-training output.

14. The computing system of claim 13, wherein the operations further comprise: evaluating, by the computing system, a loss function that evaluates a difference between ground truth data and the pre-training output; and adjusting, by the computing system, one or more parameters of the machine-learned interface prediction model based at least in part on the loss function.

15. The computing system of claim 13, wherein the inter
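To make the claimed steps easier to follow, here is a minimal sketch of how claims 1, 2, and 6 could fit together: replace a screen in the sequence, build intermediate embeddings, encode them, predict whether each adjacent pair is consecutive, score the prediction, and adjust parameters. Everything in it is an assumption made for illustration. The Screen fields, the scalar toy features, the distance-based prediction head, the rule that a swapped screen is never consecutive, and the finite-difference update are not taken from the patent, which claims a transformer-based model without fixing these details.

```python
import math
from dataclasses import dataclass
from typing import List, Set, Tuple

Vector = List[float]


@dataclass
class Screen:
    """One user interface in the sequence (fields are illustrative)."""
    image: Vector  # flattened pixel values standing in for an interface image
    text: str      # textual content depicted in the image


def intermediate_embedding(screen: Screen, position: int) -> Vector:
    """Toy image, textual, and positional features (compare claim 10)."""
    image_feature = sum(screen.image) / len(screen.image)
    text_feature = len(screen.text) / 10.0
    return [image_feature, text_feature, float(position)]


def replace_screens(sequence: List[Screen], replacement: Screen,
                    indices: Set[int]) -> Tuple[List[Screen], List[float]]:
    """Swap screens at `indices`; label a pair 1.0 only if neither was swapped."""
    corrupted = [replacement if i in indices else s
                 for i, s in enumerate(sequence)]
    labels = [0.0 if (i in indices or i + 1 in indices) else 1.0
              for i in range(len(sequence) - 1)]
    return corrupted, labels


def pretraining_loss(weights: Vector, screens: List[Screen],
                     labels: List[float]) -> float:
    """Embed, encode, predict 'consecutive?' per pair, score with cross-entropy."""
    embeddings = [intermediate_embedding(s, i) for i, s in enumerate(screens)]
    # Stand-in for the model: one scalar user interface embedding per screen.
    ui = [sum(w * x for w, x in zip(weights, e)) for e in embeddings]
    total = 0.0
    for (a, b), y in zip(zip(ui, ui[1:]), labels):
        # Hypothetical head: nearby embeddings mean "likely consecutive".
        p = 1.0 / (1.0 + math.exp(abs(a - b) - 1.0))
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(labels)


def adjust(weights: Vector, screens: List[Screen], labels: List[float],
           lr: float = 0.5, eps: float = 1e-5) -> Vector:
    """One gradient step using central finite differences (compare claim 2)."""
    grads = []
    for k in range(len(weights)):
        up = weights[:k] + [weights[k] + eps] + weights[k + 1:]
        down = weights[:k] + [weights[k] - eps] + weights[k + 1:]
        grads.append((pretraining_loss(up, screens, labels)
                      - pretraining_loss(down, screens, labels)) / (2 * eps))
    return [w - lr * g for w, g in zip(weights, grads)]


sequence = [Screen([0.1, 0.2], "Home"),
            Screen([0.2, 0.3], "Settings"),
            Screen([0.3, 0.4], "Wi-Fi")]
unrelated = Screen([0.9, 0.9], "Unrelated checkout page")

screens, labels = replace_screens(sequence, unrelated, {2})
weights = [1.0, 1.0, 0.1]
loss_before = pretraining_loss(weights, screens, labels)
weights = adjust(weights, screens, labels)
loss_after = pretraining_loss(weights, screens, labels)
```

In this toy run the third screen is swapped out, so the first pair keeps its "consecutive" label and the second pair loses it. A single update step lowers the loss on this example. A practical system would compute gradients by backpropagation rather than finite differences.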

Assignees

Google LLC

Inventors

Classifications

  • G06F9/451 (Primary)

    Execution arrangements for user interfaces · CPC title

  • Combinations of networks · CPC title

  • nonlinear criteria, e.g. embedding a manifold in a Euclidean space · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Machine learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F9/451. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 1, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).