Techniques for resonant rotary clocking for die-to-die communication
US-2024429865-A1 · Dec 26, 2024 · US
US2025291692A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025291692-A1 |
| Application number | US-202418747404-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 18, 2024 |
| Priority date | Mar 17, 2024 |
| Publication date | Sep 18, 2025 |
| Grant date | — |
Official abstract text for this publication.
Computing system performance monitors provide on-chip control, selection, collection, coalescing and communication of behavior and other processing-indicating data of high performance single- and multi-die computing and processing systems, such as for use in multi-chip-module and/or multi-instanced graphics processing units (GPUs) and/or systems-on-chips (SOCs). Commands and data records can be forwarded between modules to abstract the processing system from profilers and other data report consumers. Quality of Service and security isolation for different command and data report streams is maintained.
Opening claim text (preview).
1. A graphics processor comprising: a first semiconductor die including a first control path circuit or processor that determines a first monitoring parameter and sends a forwarding command packet indicating the first monitoring parameter to a second semiconductor die; and the second semiconductor die including a second control path circuit or processor that determines a second monitoring parameter and determines a global monitoring parameter in response to the forwarding command packet and the determined second monitoring parameter.
2. The graphics processor of claim 1 wherein the first and second monitoring parameters are each temporal.
3. A method comprising: determining a first temporal region of interest local to a first semiconductor die; determining a second temporal region of interest local to a second semiconductor die; forwarding information indicating the first temporal region of interest from the first semiconductor die to the second semiconductor die; and determining a global temporal region of interest in response to the forwarded information and the determined second temporal region of interest.
4. The method of claim 3 wherein determining the first temporal region of interest is based on an engine start command and an engine stop command, the engine disposed on the first semiconductor die.
5. The method of claim 4 wherein determining the first temporal region of interest is also based on a further engine start command and a further engine stop command, the further engine also disposed on the first semiconductor die.
6. The method of claim 5 wherein determining the first temporal region of interest comprises selecting the first temporal region of interest relative to the engine start command, the engine stop command, the further engine start command and the further engine stop command.
7. The method of claim 6 wherein selecting comprises defining the global temporal region of interest between a first start command from any engine and a last stop command from any engine.
8. The method of claim 6 wherein selecting comprises defining the global temporal region of interest between a first start command from any engine and a first stop command from any engine.
9. The method of claim 3 wherein determining the global temporal region of interest comprises selecting the global temporal region of interest relative to the first temporal region of interest and the second temporal region of interest.
10. The method of claim 3 further including triggering to snapshot performance data or propagating command and control information to a first data generator on the first semiconductor die during the global temporal region of interest, and triggering to snapshot performance data or propagating command and control information to a second data generator on the second semiconductor die during the global temporal region of interest.
11. The method of claim 10 wherein at least one of the first data generator and the second data generator comprises a performance data monitor.
12. A processing system comprising: a first semiconductor die including a first control path circuit or processor that determines a first temporal region of interest local to the first die and forwards information indicating the first temporal region of interest to a second semiconductor die; and the second semiconductor die including a second control path circuit or processor that determines a second temporal region of interest local to the second semiconductor die and determines a global temporal region of interest in response to the forwarded information and the determined second temporal region of interest.
13. The processing system of claim 12 wherein the first control path circuit or processor determines the first temporal region of interest based on an engine start command and an engine stop command, the engine disposed on the first semiconductor die.
14. The processing system of claim 13 wherein the first control path circuit or processor determines the first temporal region of interest also based on a further engine start command and a further engine stop command, the further engine also disposed on the first semiconductor die.
15. The processing system of claim 13 wherein the first control path circuit or processor determines the first temporal region of interest by selecting the first temporal region of interest relative to the engine start command, the engine stop command, the further engine start command and the further engine stop command.
16. The processing system of claim 15 wherein selecting comprises defining the global temporal region of interest between a first start command from any engine and a last stop command from any engine.
17. The processing system of claim 15 wherein selecting comprises defining the global temporal region of interest between a first start command from any engine and a first stop command from any engine.
18. The processing system of claim 12 wherein the second control path circuit or processor selects the global temporal region of interest relative to the first temporal region of interest and the second temporal region of interest.
19. The processing system of claim 12 further including a first trigger that triggers a first data generator on the first semiconductor die to monitor a first engine on the first semiconductor die during the global temporal region of interest, and a second trigger that triggers a second data generator on the second semiconductor die to monitor a second engine on the second semiconductor die during the global temporal region of interest.
20. The processing system of claim 19 wherein at least one of the first data generator and the second data generator comprises a counter, a workload execution timeline data or a performance monitor.
21. A GPU comprising: a first virtualizer that enables a first tenant to use first fractional parts of a first die and a second die, and enables a second tenant to use second fractional parts of the first die and the second die, wherein at least some of the first fractional parts are distinct from the second fractional parts; a controller that enables the first tenant to issue first performance monitoring commands for the first fractional parts and enables the second tenant to issue second performance monitoring commands for the second fractional parts; and communication paths on the first die and the second die that keep the first monitoring commands and the second monitoring commands separate while communicating the first monitoring commands to the first fractional parts on the first die and the second die and communicating the second monitoring commands to the second fractional parts on the first die and the second die.
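As an informal illustration of the temporal region of interest described in claims 3 to 9, the sketch below models each engine's start and stop commands as integer timestamps and derives a per-die region and a combined region. It is not taken from the publication: the names `EngineWindow`, `local_region` and `global_region`, the integer tick unit, and the two policy labels are all hypothetical. The two policies loosely mirror the "first start to last stop" and "first start to first stop" selections in claims 7 and 8. How the forwarded region and the local region are combined is not specified in the claim text shown here, so taking the span that covers both is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple

Region = Tuple[int, int]  # (start_tick, stop_tick); the time unit is an assumption


@dataclass(frozen=True)
class EngineWindow:
    """Start and stop command times for one engine on a die (hypothetical type)."""
    engine: str
    start: int
    stop: int


def local_region(windows: List[EngineWindow], policy: str = "last_stop") -> Region:
    """Select a die-local region of interest from its engines' start/stop commands."""
    if not windows:
        raise ValueError("at least one engine window is required")
    first_start = min(w.start for w in windows)
    if policy == "last_stop":   # first start from any engine to last stop from any engine
        return (first_start, max(w.stop for w in windows))
    if policy == "first_stop":  # first start from any engine to first stop from any engine
        return (first_start, min(w.stop for w in windows))
    raise ValueError(f"unknown policy: {policy}")


def global_region(forwarded: Region, local: Region) -> Region:
    """Combine a forwarded region with the local one.

    Assumption: the global region is the span covering both inputs.
    """
    return (min(forwarded[0], local[0]), max(forwarded[1], local[1]))


# Die 0 has two engines; die 1 has one. All timestamps are made up.
die0 = [EngineWindow("copy", 10, 40), EngineWindow("compute", 25, 60)]
die1 = [EngineWindow("compute", 5, 30)]

forwarded = local_region(die0)  # die 0 determines its region and forwards it
combined = global_region(forwarded, local_region(die1))  # die 1 combines both
print(forwarded, combined)
```

With these made-up timestamps the script prints `(10, 60) (5, 60)`. Passing `policy="first_stop"` for die 0 yields `(10, 40)` instead, which shows how the choice between the two selections narrows the window.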
Processor architectures; Processor configuration, e.g. pipelining · CPC title
System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package · CPC title
where the computing system component is a central processing unit [CPU] · CPC title