Computer-vision based execution of graphical user interface (GUI) application actions

US10474440B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10474440-B2
Application number  US-201816206625-A
Country             US
Kind code           B2
Filing date         Nov 30, 2018
Priority date       Sep 2, 2015
Publication date    Nov 12, 2019
Grant date          Nov 12, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Using computer-vision based training information, a user interface (UI) component of an application-level user interface of an application and rendering coordinates of the UI component within the application-level user interface are recognized. A functional class that is mapped within the computer-vision based training information to the UI component and that is used to instantiate the UI component as part of the application-level user interface is identified in accordance with the computer-vision based training information. A replica object of the identified functional class is instantiated within a user interface container separately from the application. An operating system-level event that specifies a functional operation of the UI component and the recognized rendering coordinates of the UI component is generated from the instantiated replica object on an operating system event queue that provides inputs to the application.
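The runtime flow in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: `Recognition`, `OsEvent`, `ReplicaButton`, `FUNCTIONAL_CLASSES`, and `execute_action` are hypothetical names, the recognition result is hard-coded rather than produced by a vision model, and a plain `deque` stands in for the operating system event queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Recognition:
    """Result of the (assumed) vision step: a class label plus rendering coordinates."""
    class_label: str
    x: int
    y: int


@dataclass(frozen=True)
class OsEvent:
    """Stand-in for an operating system-level event: an operation and where it applies."""
    operation: str
    x: int
    y: int


class ReplicaButton:
    """Hypothetical replica object, created outside the target application."""

    def __init__(self, x: int, y: int) -> None:
        self.x, self.y = x, y

    def click(self, queue: deque) -> None:
        # The replica emits the event; the target application would read it from the queue.
        queue.append(OsEvent("click", self.x, self.y))


# Assumed mapping from recognized labels to functional classes (part of the training information).
FUNCTIONAL_CLASSES = {"Button": ReplicaButton}


def execute_action(recognition: Recognition, operation: str, queue: deque):
    """Identify the functional class, instantiate a replica, and enqueue the event."""
    functional_class = FUNCTIONAL_CLASSES[recognition.class_label]
    replica = functional_class(recognition.x, recognition.y)
    getattr(replica, operation)(queue)
    return replica


os_event_queue: deque = deque()  # simulated OS event queue
execute_action(Recognition("Button", 120, 48), "click", os_event_queue)
print(os_event_queue[0])
```

The point of the sketch is the ordering: recognition supplies a label and coordinates, the label selects a class, and the replica object (not the application itself) places the event on the queue.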

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: recognizing, using computer-vision based training information, a user interface (UI) component of an application-level user interface of an application and rendering coordinates of the UI component within the application-level user interface, where the computer-vision based training information is created by: generating, by processing UI component training screenshot images, digitized image vectors as a training user interface data set; applying machine learning to the training user interface data set; training a graphical user interface (GUI) classifier module to recognize user interface images by use of the digitized image vectors; mapping the trained digitized image vectors to user interface classes and user interface functional actions of the user interface classes; and generating user interface execution scripts that specify the user interface classes and the user interface functional actions of replica object instances instantiated from the user interface classes; and where the method further comprises: identifying, in accordance with the computer-vision based training information, a functional class that is mapped within the computer-vision based training information to the UI component and that is used to instantiate the UI component as part of the application-level user interface; instantiating a replica object of the identified functional class within a user interface container separately from the application; and generating, from the instantiated replica object on an operating system event queue that provides inputs to the application, an operating system-level event that specifies a functional operation of the UI component and the recognized rendering coordinates of the UI component.

2. The computer-implemented method of claim 1, where recognizing, using the computer-vision based training information, the user interface (UI) component of the application-level user interface of the application and the rendering coordinates of the UI component within the application-level user interface comprises: capturing rendering data of the UI component; classifying the captured rendering data of the UI component according to a user interface component type in accordance with the computer-vision based training information; identifying, by the user interface component type, a class label and an instance identifier of the UI component mapped within the computer-vision based training information; and identifying screen coordinates of the UI component within the rendering data of the UI component.

3. The computer-implemented method of claim 1, where identifying, in accordance with the computer-vision based training information, the functional class that is mapped within the computer-vision based training information to the UI component and that is used to instantiate the UI component as part of the application-level user interface comprises: identifying the functional class within the computer-vision based training information using an instance identifier of the UI component identified from captured rendering data of the UI component.

4. The computer-implemented method of claim 1, where the application comprises a first application that is not in focus within a graphical user interface (GUI) to receive operating system inputs generated by an input device, where a second application is in focus within the GUI to receive operating system inputs generated by the input device, and where: recognizing, using the computer-vision based training information, the UI component of the application-level user interface of the application and the rendering coordinates of the UI component within the application-level user interface comprises: recognizing the UI component of the application-level user interface within one of a plurality of display buffers that is associated with a non-focused display view; and generating, from the instantiated replica object on the operating system event queue that provides the inputs to the application, the operating system-level event that specifies the functional operation of the UI component and the recognized rendering coordinates of the UI component comprises: generating the operating system-level event from the instantiated replica object as a non-focus-based input to the application without bringing the application-level user interface into focus within the GUI, where the second application retains the focus within the GUI to receive operating system inputs generated by the input device.

5. The computer-implemented method of claim 1, further comprising creating the computer-vision based training information.

6. A system, comprising: a memory; and a processor programmed to: recognize, using computer-vision based training information in the memory, a user interface (UI) component of an application-level user interface of an application and rendering coordinates of the UI component within the application-level user interface, where the computer-vision based training information is created by the processor being programmed to: generate, by processing UI component training screenshot images, digitized image vectors as a training user interface data set; apply machine learning to the training user interface data set; train a graphical user interface (GUI) classifier module to recognize user interface images by use of the digitized image vectors; map the trained digitized image vectors to user interface classes and user interface functional actions of the user interface classes; and generate user interface execution scripts that specify the user interface classes and the user interface functional actions of replica object instances instantiated from the user interface classes; and where the processor is further programmed to: identify, in accordance with the computer-vision based training information, a functional class that is mapped within the computer-vision based training information to the UI component and that is used to instantiate the UI component as part of the application-level user interface; instantiate a replica object of the identified functional class within a user interface container separately from the application; and generate, from the instantiated replica object on an operating system event queue that provides inputs to the application, an operating system-level event that specifies a functional operation of the UI component and the recognized rendering coordinates of the UI component.

7. The system of claim 6, where, in being programmed to recognize, using the computer-vision based training information in the memory, the user interface (UI) component of the application-level user interface of the application and the rendering coordinates of the UI component within the application-level user interface, the processor is programmed to: capture rendering data of the UI component; classify the captured rendering data of the UI component according to a user interface component type in accordance with the computer-vision based training information; identify, by the user interface component type, a class label and an instance identifier of the UI component mapped within the computer-vision based training information; and identify screen coordinates of the UI component within the rendering data of the UI component.

8. The system of claim 6, where, in being programmed to identify, in accordance with the computer-vision based training information, the functional class that is mapped within the computer-vision based training information to the UI component and that is used to instantiate the UI component as part of the application-level user interface, the processor is programmed to: identify the functional class with
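The training steps recited in claims 1 and 6 (screenshots to digitized vectors, a trained classifier, a mapping to classes and actions, and generated execution scripts) can be sketched roughly as follows. The claims do not name a learning algorithm, so a nearest-centroid classifier is swapped in here purely for illustration. The 2x2 grayscale "screenshots", the `CLASS_ACTIONS` table, and names such as `ButtonClass` are all assumptions made for the example.

```python
from math import dist
from statistics import fmean


def digitize(image):
    """Flatten a grayscale crop (rows of 0-255 ints) into a vector of values in [0, 1]."""
    return [pixel / 255 for row in image for pixel in row]


def train_centroids(samples):
    """Nearest-centroid 'training': average the digitized vectors for each label."""
    grouped = {}
    for label, image in samples:
        grouped.setdefault(label, []).append(digitize(image))
    return {
        label: [fmean(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(centroids, image):
    """Return the label whose centroid is closest (Euclidean) to the digitized image."""
    vector = digitize(image)
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Hypothetical mapping from labels to UI classes and their functional actions.
CLASS_ACTIONS = {
    "button": ("ButtonClass", ["click"]),
    "checkbox": ("CheckBoxClass", ["toggle"]),
}


def execution_script(label):
    """Build a simple script listing the class and actions for a recognized label."""
    ui_class, actions = CLASS_ACTIONS[label]
    return [{"class": ui_class, "action": action} for action in actions]


# Toy training set: bright crops stand for buttons, dark crops for checkboxes.
training_samples = [
    ("button", [[255, 255], [255, 255]]),
    ("button", [[250, 250], [250, 250]]),
    ("checkbox", [[0, 0], [0, 0]]),
    ("checkbox", [[10, 10], [10, 10]]),
]
centroids = train_centroids(training_samples)
label = classify(centroids, [[240, 240], [240, 240]])
print(label, execution_script(label))
```

A real system would use far richer image features and a stronger model; the sketch only shows how the four training outputs relate to one another.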

Assignees

Inventors

Classifications

  • Execution arrangements for user interfaces · CPC title

  • G06F8/38 (Primary)

    for implementing user interfaces · CPC title

  • for test version control, e.g. updating test cases to a new software version · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10474440B2 cover?
Using computer-vision based training information, a user interface (UI) component of an application-level user interface of an application and rendering coordinates of the UI component within the application-level user interface are recognized. A functional class that is mapped within the computer-vision based training information to the UI component and that is used to instantiate the UI compo…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F8/38. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 12, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).