Systems and Methods for Capturing Images and Annotating the Captured Images with Information
US-2016167226-A1 · Jun 16, 2016 · US
US9555544B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9555544-B2 |
| Application number | US-201615094063-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 8, 2016 |
| Priority date | Jul 2, 2015 |
| Publication date | Jan 31, 2017 |
| Grant date | Jan 31, 2017 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for automating a manual process. The methods, systems, and apparatus include actions of identifying a process that (i) is manually performed by a user interacting with a computer, and (ii) is to be automated for performance by a robot that is configured to interact with another computer. Additional actions include obtaining one or more images taken of a display of the computer while the user is interacting with the computer in manually performing the process and applying a computer vision technique to identify one or more activities associated with the process. Further actions include, for each of the one or more identified activities, generating activity information associated with the activity and generating a process definition for use in causing the robot to automatically perform the process.
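The recording step described in the abstract can be illustrated with a minimal sketch. It assumes frames are small grayscale images held as nested lists, that user input has already been detected and keyed by frame index, and that a scene change is declared when the fraction of changed pixels meets a threshold. The names `Activity`, `changed_fraction`, `build_process_definition`, `pixel_tol`, and `scene_threshold` are hypothetical and do not come from the patent, which does not prescribe a particular computer vision technique or data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumption: a frame is a grayscale image stored as rows of 0-255 intensities.
Frame = List[List[int]]


@dataclass
class Activity:
    kind: str          # e.g. "mouse_click", "key_press", or "scene_change"
    frame_index: int   # frame at which the activity was observed
    info: dict = field(default_factory=dict)  # activity information (screens, etc.)


def changed_fraction(prev: Frame, curr: Frame, pixel_tol: int = 16) -> float:
    """Return the fraction of pixels whose intensity differs by more than pixel_tol."""
    total = 0
    changed = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_tol:
                changed += 1
    return changed / total if total else 0.0


def build_process_definition(
    frames: List[Frame],
    inputs: Dict[int, str],
    scene_threshold: float = 0.5,
) -> List[Activity]:
    """Build an ordered activity list: each user input followed by its scene change.

    `inputs` maps a frame index to the kind of input seen at that frame
    (a hypothetical stand-in for whatever input detection is actually used).
    """
    definition: List[Activity] = []
    pending = None  # most recent input not yet paired with a scene change
    for i in range(1, len(frames)):
        if i - 1 in inputs:
            # Keep the screen as it looked before the input was applied.
            pending = Activity(inputs[i - 1], i - 1, {"screen_before": frames[i - 1]})
        if changed_fraction(frames[i - 1], frames[i]) >= scene_threshold:
            if pending is not None:
                definition.append(pending)
                pending = None
            # Keep the screen as it looked after the change.
            definition.append(Activity("scene_change", i, {"screen_after": frames[i]}))
    return definition
```

The returned list plays the role of the process definition: an ordered sequence of activities, each carrying its own activity information. A real recorder would likely persist this to storage and use a more robust change measure than per-pixel differencing.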
Opening claim text (preview).
The invention claimed is:
1. A computer-implemented method comprising: identifying a process that (i) is manually performed by a user interacting with a computer, and (ii) is to be automated for performance by a robot that is configured to interact with the computer or another computer; obtaining images taken of a display of the computer while the user is interacting with the computer in manually performing the process; applying a computer vision technique, to the images taken of the display of the computer while the user is interacting with the computer in manually performing the process, to determine that a change in images taken of the display of the computer while the user is interacting with the computer in manually performing the process satisfies a predetermined threshold corresponding to a scene change; in response to determining that the change in images taken of the display of the computer while the user is interacting with the computer in manually performing the process satisfies the predetermined threshold corresponding to a scene change, identifying a first activity corresponding to receipt of input from the user during the process followed by a second activity corresponding to a scene change during the process; for each of the identified activities, generating activity information associated with the activity; generating a process definition for use in causing the robot to automatically perform the process by interacting with the computer or the other computer, where the process definition indicates the first activity corresponding to receipt of the input from the user followed by the second activity corresponding to the scene change and, for each indicated activity, indicates the activity information associated with the activity; and storing the process definition for later use in causing the robot to automatically perform the process by interacting with the computer or the other computer.
2. The method of claim 1, wherein obtaining images taken of a display of the computer while the user is interacting with the computer in manually performing the process comprises: obtaining images taken of the display of the computer from a camera.
3. The method of claim 1, wherein the one or more images comprise a video.
4. The method of claim 1, wherein generating activity information associated with the activity comprises: generating a snapshot of a portion of a screen shown on the display before the activity is performed.
5. The method of claim 1, wherein generating activity information associated with the activity comprises: generating a screenshot of a screen shown on the display after the activity is performed.
6. The method of claim 1, wherein the activities comprise two or more of a key press, a mouse click, a screen touch, a process in a foreground change, or a scene change.
7. The method of claim 1, comprising: accessing the process definition; and automatically instructing the robot to physically interact with the computer or the other computer based on the activities and activity information indicated by the process definition.
8. The method of claim 7, wherein automatically instructing the robot to interact with the computer or the other computer based on the activities and activity information indicated by the process definition comprises: identifying a portion of a screen shown on another display of the other computer that visually matches a snapshot indicated by the activity information for a particular activity; and instructing the robot to physically touch the other display at a location corresponding to the center of the portion.
9. The method of claim 7, wherein automatically instructing the robot to interact with the other computer based on the activities and activity information indicated by the process definition comprises: determining that a screen shown on the other display corresponds to a screenshot indicated by the activity information for a particular activity; and in response to determining that the screen shown on the other display corresponds to a screenshot indicated by the activity information for a particular activity, instructing the robot to physically interact with the other computer based on a subsequent activity indicated by the process definition.
10. The method of claim 7, wherein automatically instructing the robot to interact with the computer or the other computer based on the activities and activity information indicated by the process definition comprises: identifying a portion of a screen shown on another display of the other computer that visually matches a snapshot indicated by the activity information for a particular activity; and instructing the robot to provide an electronic signal to the other computer to receive a click on coordinates corresponding to the center of the portion of the screen that visually matches the snapshot.
11. The method of claim 1, comprising: generating a graphical version of the process definition that a person can use to modify the process.
12. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: identifying a process that (i) is manually performed by a user interacting with a computer, and (ii) is to be automated for performance by a robot that is configured to interact with the computer or another computer; obtaining images taken of a display of the computer while the user is interacting with the computer in manually performing the process; applying a computer vision technique, to the images taken of the display of the computer while the user is interacting with the computer in manually performing the process, to determine that a change in images taken of the display of the computer while the user is interacting with the computer in manually performing the process satisfies a predetermined threshold corresponding to a scene change; in response to determining that the change in images taken of the display of the computer while the user is interacting with the computer in manually performing the process satisfies the predetermined threshold corresponding to a scene change, identifying a first activity corresponding to receipt of input from the user during the process followed by a second activity corresponding to a scene change during the process; for each of the identified activities, generating activity information associated with the activity; generating a process definition for use in causing the robot to automatically perform the process by interacting with the computer or the other computer, where the process definition indicates the first activity corresponding to receipt of the input from the user followed by the second activity corresponding to the scene change and, for each indicated activity, indicates the activity information associated with the activity; and storing the process definition for later use in causing the robot to automatically perform the process by interacting with the computer or the other computer.
13. The system of claim 12, wherein obtaining images taken of a display of the computer while the user is interacting with the computer in manually performing the process comprises: obtaining images taken of the display of the computer from a camera.
14. The system of claim 12, wherein the one or more images comprise a video.
15. The system of claim 12, wherein generating activity information associated with the activity comprises: generating a snapshot of a portion of a screen shown on the dis
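The replay step recited in claims 8 and 10 (find the portion of a screen that visually matches a stored snapshot, then act at the center of that portion) can be sketched as follows. This is an illustrative assumption, not the patented implementation: it uses brute-force mean absolute difference on grayscale nested lists as the matching criterion, and the names `find_snapshot`, `click_target`, and `max_mean_diff` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: images are grayscale, stored as rows of 0-255 intensities.
Image = List[List[int]]


def find_snapshot(
    screen: Image, snapshot: Image, max_mean_diff: float = 8.0
) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the top-left corner of the best match, or None.

    Every placement of the snapshot on the screen is scored by mean absolute
    pixel difference; the lowest score wins if it is within max_mean_diff.
    """
    snap_h, snap_w = len(snapshot), len(snapshot[0])
    best_pos = None
    best_score = None
    for r in range(len(screen) - snap_h + 1):
        for c in range(len(screen[0]) - snap_w + 1):
            diff = sum(
                abs(screen[r + dr][c + dc] - snapshot[dr][dc])
                for dr in range(snap_h)
                for dc in range(snap_w)
            )
            score = diff / (snap_h * snap_w)
            if best_score is None or score < best_score:
                best_pos, best_score = (r, c), score
    if best_score is None or best_score > max_mean_diff:
        return None  # nothing on the screen looks enough like the snapshot
    return best_pos


def click_target(screen: Image, snapshot: Image) -> Optional[Tuple[int, int]]:
    """Return the (x, y) pixel containing the center of the matched portion."""
    match = find_snapshot(screen, snapshot)
    if match is None:
        return None
    row, col = match
    return (col + len(snapshot[0]) // 2, row + len(snapshot) // 2)
```

The coordinates returned by `click_target` could then be handed either to a physical actuator (claim 8) or to an electronic click signal (claim 10). A production system would more plausibly use an optimized template-matching routine from an image library than this quadratic scan.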
learning, adaptive, model based, rule based expert control · CPC title
Input/output · CPC title
Record actions of human expert, teach by showing · CPC title
Vision controlled systems · CPC title
Learn by operator observation, symbiosis, show, watch · CPC title