Method and apparatus for interception of synchronization objects in graphics application programming interfaces for frame debugging

US9910760B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9910760-B2
Application number	US-201514845123-A
Country	US
Kind code	B2
Filing date	Sep 3, 2015
Priority date	Aug 7, 2015
Publication date	Mar 6, 2018
Grant date	Mar 6, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An aspect of the present invention proposes a solution for correctly intercepting, capturing, and replaying tasks (such as functions and methods) in an interception layer operating between an application programming interface (API) and the driver of a processor by using synchronization objects such as fences. According to one or more embodiments of the present invention, the application will use what appears to the application to be a single synchronization object to signal (from a processor) and to wait (on a processor), but will actually be two separate synchronization objects in the interception layer. According to one or more embodiments, the solution proposed herein may be implemented as part of an module or tool that works as an interception layer between an application and an API exposed by a device driver of a resource, and allows for an efficient and effective approach to frame-debugging and live capture and replay of function bundles.
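The split-fence idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `Fence`, `InterceptedFence`, `get_completed_value`, and `propagate` are hypothetical, and the fence is modeled as a simple monotonic counter on CPU threads rather than a real graphics API object. The application holds one handle; signals go to one internal fence, waits go to another, and only the interception layer moves the value across.

```python
import threading


class Fence:
    """Hypothetical monotonic fence: signal(v) raises the value, wait(v) blocks until value >= v."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def signal(self, value):
        with self._cond:
            self._value = max(self._value, value)
            self._cond.notify_all()

    def wait(self, value, timeout=None):
        # Returns True once the fence reached `value`, False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= value, timeout)

    @property
    def value(self):
        with self._cond:
            return self._value


class InterceptedFence:
    """What the application sees as a single fence; internally two separate fences."""

    def __init__(self):
        self.signaling = Fence()  # receives signal commands
        self.waiting = Fence()    # serves wait and query commands

    # Application-facing calls, routed by the (assumed) interception layer.
    def signal(self, value):
        self.signaling.signal(value)

    def wait(self, value, timeout=None):
        return self.waiting.wait(value, timeout)

    def get_completed_value(self):
        return self.waiting.value

    # Layer-internal call: copy progress from the signaling side to the waiting side.
    def propagate(self):
        self.waiting.signal(self.signaling.value)


if __name__ == "__main__":
    fence = InterceptedFence()
    fence.signal(1)                       # first processor finished its tasks
    print(fence.get_completed_value())    # application still perceives 0
    fence.propagate()                     # layer forwards the signal
    print(fence.wait(1, timeout=1.0))     # second processor may now proceed
```

Keeping the two fences separate gives the layer a window between the signal and its propagation, during which the application's view of progress is held back.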

First claim

Opening claim text (preview).

What is claimed is:
1. A method for performing application-based synchronization between two or more processors, the method comprising: in a computing system comprising a plurality of processors, the plurality of processors comprising at least a first processor and a second processor, performing, in the first and second processor, a first and second plurality of tasks, respectively, the first and second plurality of tasks being comprised from a sequence of commands issued by an application executing in the computing system; suspending, via a waiting synchronization object, a performance of a third plurality of tasks in the second processor when the second plurality of tasks is completed by the second processor; signaling a signaling synchronization object when the first plurality of tasks is completed by the first processor; propagating a signal from the signaling synchronization object to the waiting synchronization object; performing the third plurality of tasks in the second processor based on the propagated signal, wherein the waiting synchronization object and the signaling synchronization object are generated in an interception layer and appear as a single synchronization object to the application, the interception layer executing between an Application Programming Interface (API) of the application and a driver of at least one of the first and second processors.
2. The method according to claim 1, wherein the third plurality of tasks comprises at least one task that is dependent on a completion of at least one task of the first plurality of tasks performed by the first processor.
3. The method according to claim 1, wherein the waiting synchronization object and the signaling synchronization object are generated internally in the interception layer in response to a request by the application to create a single synchronization object with both waiting and signaling functionality.
4. The method according to claim 1, wherein the waiting synchronization object comprises a waiting fence object and the signaling synchronization object comprises a signaling fence object.
5. The method according to claim 1, wherein the signaling synchronization object has a value corresponding to a state of a progress of a performance of an assigned plurality of tasks in at least one of the first and second processors.
6. The method according to claim 1, wherein the waiting synchronization object has a value corresponding to a state of a progress of a performance of an assigned plurality of tasks in at least one of the first and second processors as indicated by the signaling synchronization object after processing and propagation by an interception layer.
7. The method according to claim 6, wherein a current state of the performance of the assigned plurality of tasks in the application corresponds to the value of the waiting synchronization layer.
8. The method according to claim 1, wherein an interception layer performs an operation after the signaling synchronization object is signaled but before propagating the signal to the waiting synchronization object.
9. The method according to claim 8, wherein the operation is comprised from a group of operations consisting of: data verification; task verification; data logging; data analysis; consistency checking; and data profiling.
10. The method according to claim 1, further wherein the first processor is operable to perform additional tasks from the plurality of tasks after signaling the signaling synchronization object.
11. A system for frame debugging and synchronization, the system comprising: a memory device comprising a plurality of programmed instructions; a first processor; a second processor; an application executing on at least one of the first and second processors based on the programmed instructions, the application using an Application Programming Interface (API); and an interception layer operating between the API and a driver of at least one of the first and second processors, the interception layer being configured to: generate a first signaling synchronization object and a separate first waiting synchronization object, to intercept signal commands and wait commands from the application, to apply the signal commands to the first signaling synchronization object and to propagate wait commands to the first waiting synchronization object, further wherein the first signaling synchronization object and the first waiting synchronization object appear as a single synchronization object to the application.
12. The system according to claim 11, wherein at least one of the first and second processors is a central processing unit (CPU).
13. The system according to claim 11, wherein at least one of the first and second processors is a graphics processing unit (GPU).
14. The system according to claim 11, wherein the first signaling synchronization object comprises a signaling fence primitive and the first waiting synchronization object comprises a waiting fence primitive.
15. The system according to claim 11, wherein the interception layer is further configured to apply at least one of: a signal operation to the first signaling synchronization object, a query operation to the first waiting synchronization object, and a wait operation to the first waiting synchronization object, based on a command from the application.
16. The system according to claim 11, further comprising: a first value corresponding to a state of progress of the application in submitting the first and second plurality of tasks to be performed by the first and second processors; a second value corresponding to a value of the first signaling synchronization object; a second value corresponding to the first waiting synchronization object; and a third value corresponding to the state of progress perceived by the application for performed tasks of the first and second plurality of tasks.
17. The system according to claim 16, further wherein the first value is indicative of a state of progress of a performance of a plurality of tasks in at least one of the first and second processors, the second value corresponds to the state of progress indicated by the first value and propagated by the interception layer to the first waiting synchronization object, and the third value corresponds to a state of progress of the performance of the plurality of tasks as perceived by the application and is based on the second value.
18. The system according to claim 11, wherein the interception layer is further configured to generate a second signaling synchronization object and a second waiting synchronization object, and to record a plurality of parameters and a state of a performance of a plurality of tasks by redirecting commands intended for the first signaling synchronization object to the second signaling synchronization object and commands intended for the first waiting synchronization object to the second waiting synchronization object.
19. The system according to claim 18, wherein the interception layer is further configured to replay the recorded plurality of parameters and the state of the plurality of tasks based on user input.
20. A method for performing application-based frame debugging, the method comprising: in a computing system comprising a first processor and a second processor, generating a first pair of synchronization objects and a second pair of synchronization objects, the first pair of synchronization objects comprising a first signaling synchronization object and a first wa
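Claims 8 and 9 describe an operation (for example data logging) that runs after the signaling object is signaled but before the signal reaches the waiting object, and claim 19 mentions replaying recorded state. The sketch below is one loose reading of those ideas, not the claimed method: `FencePair`, `InterceptionLayer`, `on_signal`, `on_query`, and `replay` are hypothetical names, fence values are plain integers, and the command log is an assumed recording format.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FencePair:
    signaled: int = 0  # value of the signaling synchronization object
    waiting: int = 0   # value of the waiting object, i.e. what the application perceives


@dataclass
class InterceptionLayer:
    pair: FencePair = field(default_factory=FencePair)
    # Hooks run between the signal and its propagation (logging, verification, profiling).
    hooks: List[Callable[[FencePair], None]] = field(default_factory=list)
    # Recorded commands as (operation, value) tuples; an assumed format.
    log: List[Tuple[str, int]] = field(default_factory=list)

    def on_signal(self, value: int) -> None:
        self.log.append(("signal", value))
        self.pair.signaled = max(self.pair.signaled, value)
        for hook in self.hooks:
            hook(self.pair)  # waiting side has not been updated yet
        self.pair.waiting = self.pair.signaled  # propagate to the waiting object

    def on_query(self) -> int:
        self.log.append(("query", self.pair.waiting))
        return self.pair.waiting


def replay(log: List[Tuple[str, int]]) -> FencePair:
    """Re-apply the recorded signal commands to a fresh, second pair of objects."""
    second = InterceptionLayer()
    for op, value in log:
        if op == "signal":
            second.on_signal(value)
    return second.pair


if __name__ == "__main__":
    observed = []
    layer = InterceptionLayer(hooks=[lambda p: observed.append((p.signaled, p.waiting))])
    layer.on_signal(1)
    layer.on_signal(3)
    print(observed)           # each hook call sees the waiting value lagging behind
    print(replay(layer.log))  # state rebuilt from the recording
```

The hook receives the pair itself so it can observe the gap between the signaled value and the value the application would see at that moment.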

Assignees

Nvidia Corp

Inventors

Classifications

G06F11/3419
by assessing time · CPC title
G06F11/3636 · Primary
by tracing the execution of the program · CPC title
G06F11/3688
for test execution, e.g. scheduling of test suites · CPC title
G06F9/526
Mutual exclusion algorithms · CPC title
G06F9/52
Program synchronisation; Mutual exclusion, e.g. by means of semaphores · CPC title

Patent family

Related publications grouped by family.

View patent family 58052624

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9910760B2 cover?: An aspect of the present invention proposes a solution for correctly intercepting, capturing, and replaying tasks (such as functions and methods) in an interception layer operating between an application programming interface (API) and the driver of a processor by using synchronization objects such as fences. According to one or more embodiments of the present invention, the application will us…
Who is the assignee on this patent?: Nvidia Corp
What technology area does this patent fall under?: Primary CPC classification G06F11/3636. Mapped technology areas include Physics.
When was this patent published?: Publication date Mar 6, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Method and system for implementing a multi-threaded API stream replay

Suspending and resuming a graphics application executing on a target device for debugging

System, method, and computer program product for debugging graphics programs locally utilizing a system with a single GPU

Debugging and perfomance analysis of applications

Extracting Rich Performance Analysis from Simple Time Measurements

User interface with real-time visual playback along with synchronous textual analysis log display and event/time index for anomalous behavior detection in applications