Methods and apparatus for a vector memory subsystem for use with a programmable mixed-radix DFT/IDFT processor

US11829322B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11829322-B2
Application number  US-202017124442-A
Country             US
Kind code           B2
Filing date         Dec 16, 2020
Priority date       Dec 31, 2015
Publication date    Nov 28, 2023
Grant date          Nov 28, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A vector memory subsystem for use with a programmable mix-radix vector processor (“PVP”) capable of calculating discrete Fourier transform (“DFT/IDFT”) values. In an exemplary embodiment, an apparatus includes a vector memory bank and a vector memory system (VMS) that generates input memory addresses that are used to store input data into the vector memory bank. The VMS also generates output memory addresses that are used to unload vector data from the memory banks. The input memory addresses are used to shuffle the input data in the memory bank based on a radix factorization associated with an N-point DFT, and the output memory addresses are used to unload the vector data from the memory bank to compute radix factors of the radix factorization.
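The abstract does not give the address formula used for the input shuffle. As a minimal sketch, assuming the textbook decimation-in-time (Cooley-Tukey) digit-reversal ordering, the Python below shows how storing inputs at shuffled addresses lets each radix factor be computed on a contiguous block of the memory bank. The names `input_order`, `dft_from_shuffled`, and `naive_dft` are hypothetical and do not appear in the patent.

```python
import cmath

def input_order(radices, offset=0, stride=1):
    """Return the input sample index to store at each bank address.

    Assumes a decimation-in-time split: the first radix r partitions the
    input into r interleaved subsequences, each handled by the remaining radices.
    """
    if not radices:
        return [offset]
    r, rest = radices[0], radices[1:]
    order = []
    for q in range(r):
        order += input_order(rest, offset + q * stride, stride * r)
    return order

def dft_from_shuffled(bank, radices):
    """Compute the DFT of data already stored in shuffled order."""
    n = len(bank)
    if not radices:
        return list(bank)
    r, rest = radices[0], radices[1:]
    m = n // r
    # Each contiguous block of m words holds one interleaved subsequence.
    subs = [dft_from_shuffled(bank[q * m:(q + 1) * m], rest) for q in range(r)]
    # Twiddle multiplication and the radix-r combination in one step.
    return [
        sum(subs[q][k % m] * cmath.exp(-2j * cmath.pi * q * k / n) for q in range(r))
        for k in range(n)
    ]

def naive_dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]

# 12-point example with the radix factorization 12 = 3 * 4 (illustrative choice).
radices = [3, 4]
order = input_order(radices)
x = [complex(t, -t) for t in range(12)]
bank = [x[i] for i in order]          # shuffled store into the "memory bank"
result = dft_from_shuffled(bank, radices)
```

Here `order` lists which input sample lands at each address, so the three radix-4 sub-transforms each read four adjacent words. A hardware address generator would produce these addresses on the fly rather than build a list.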

First claim

Opening claim text (preview).

What is claimed is:
1. A programmable network processing unit configured to facilitate discrete Fourier transform (“DFT”) operations for data processing, comprising: a memory containing a “ping” and “pong” memory banks for facilitating selection of read or write operation to enhance efficiency of data flows; a vector load unit coupled to the memory and configured to retrieve a data flow from the memory in accordance with address information from an address generator; a vector dynamic scaling unit coupled to the vector loading unit and configure to scale the data flow to generate parallel scaled samples operating within a predefined amplitude in a bit-width of a data-path for facilitating radix computation; and a vector data twiddle multiplier coupled to the vector dynamic scaling unit and configured to multiply scaled samples with twiddle factors.
2. The programmable network processing unit of claim 1 further comprising a vector staging buffer coupled to the vector dynamic scaling unit and operable to facilitate parallel data output to the vector data twiddle multiplier.
3. The programmable network processing unit of claim 2, wherein the vector staging buffer stores scaled vector data in a temporary memory in a first order, and outputs the scaled vector data from the temporary memory in a second order.
4. The programmable network processing unit of claim 1, wherein the vector load unit, vector dynamic scaling unit, and the vector data twiddle multiplier form a sequence of operations as a vector data-path pipeline.
5. The programmable network processing unit of claim 1, further comprising a finite state machine controller coupled to the memory and configured to generate radix engine control signals in accordance an input of index.
6. The programmable network processing unit of claim 1, further comprising a mixed radix engine coupled to the memory and capable of being reconfigurable based on radix engine control signals for facilitating generation of a radix result in accordance with scaled vector data from a vector data-path pipeline.
7. The programmable network processing unit of claim 1, further comprising an output staging buffer coupled to a mixed radix engine and configured to buffer intermediate radix results generated by the mixed radix engine.
8. The programmable network processing unit of claim 1, further comprising an output interface streamer coupled to the memory and configured to retrieve result from a staging buffer for ordering data sequence.
9. The programmable network processing unit of claim 1, further comprising an output vector ping-pong buffer coupled to an output interface streamer and configured to generate discrete Fourier transform (“DFT”) and inverse DFT (“IDFT”) data for a downstream entity in a sequential order.
10. The programmable network processing unit of claim 1, further comprising a programmable vector mixed-radix engine coupled to the vector data twiddle multiplier and operable to perform a selected radix computation selected from a plurality of radix computations for generating a radix result.
11. The programmable network processing unit of claim 10, wherein the plurality of radix computations includes radix3, radix4, radix5, and radix6 computations.
12. The programmable network processing unit of claim 1, further comprising a configuration look up table (“LUT”) coupled to the memory and configured to store index values selectable by an index.
13. A method for processing unit configured to facilitate discrete Fourier transform (“DFT”) and inverse DFT (“IDFT”) operations for data processing, comprising: loading a data stream from a ping-pong memory bank and forwarding the data stream to vector load unit for passing through a vector data-path pipeline; generating multiple samples in response to the data stream and forwarding the samples to a vector dynamic scaling unit; scaling the samples to keep amplitudes with in a predefined bit-width of a data-path for radix computation; and forwarding scaled samples to a vector data twiddle multiplier for multiplying the scaled samples with twiddle factors.
14. The method of claim 13, wherein loading the data stream includes transmitting parallel data to a vector data twiddle multiplier for facilitating a vector mixed-radix computation.
15. The method of claim 13, further comprising: receiving an index from an external component; and selecting one of index values representing size of DFT/IDFT stored in a configuration look up table (“LUT”) based on the index.
16. The method of claim 13, further comprising instructing a vector input shuffling controller to store input data in a vector memory bank based on selected index value.
17. The method of claim 13, further comprising facilitating to program a programmable vector mixed-radix engine in accordance with an index value.
18. The method of claim 13, further comprising: generating input memory addresses for storing input data into a vector memory bank; and generating output memory addresses for retrieving vector data from the vector memory bank to compute radix factors of a radix factorization.
19. The method of claim 13, further comprising staging vector data from a memory bank and staging outputs of vector data from a temporary memory.
20. An apparatus for processing unit configured to facilitate discrete Fourier transform operations for data processing, comprising: means for loading a data stream from a ping-pong memory bank and forwarding the data stream to vector load unit for passing through a vector data-path pipeline; means for generating multiple samples in response to the data stream and forwarding the samples to a vector dynamic scaling unit; means for scaling the samples to keep amplitudes with in a predefined bit-width of a data-path for radix computation; and means for forwarding scaled samples to a vector data twiddle multiplier for multiplying the scaled samples with twiddle factors.
21. The apparatus of claim 20, wherein means for loading the data stream includes means for transmitting parallel data to a vector data twiddle multiplier for facilitating a vector mixed-radix computation.
22. The apparatus of claim 20, further comprising: means for receiving an index from an external component; and means for selecting one of index values representing size of discrete Fourier transform (“DFT”) and inverse DFT (“IDFT”) stored in a configuration look up table (“LUT”) based on the index.
23. The apparatus of claim 20, further comprising means for instructing a vector input shuffling controller to store input data in a vector memory bank based on selected index value.
24. The apparatus of claim 20, further comprising means for facilitating to program a programmable vector mixed-radix engine in accordance with an index value.
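The claims describe scaling samples so they stay within the data-path bit-width and then multiplying them by twiddle factors, but they do not state the scaling rule. The sketch below assumes a block-floating-point style rule (one common right shift for the whole vector) and a twiddle exponent of `i * stride` for the i-th sample. Both are illustrative assumptions, and `dynamic_scale` and `twiddle_multiply` are hypothetical names.

```python
import cmath

def dynamic_scale(samples, bit_width):
    """Shift a block of integer (re, im) samples right by a common amount
    so every component fits in a signed bit_width-bit word.

    Assumption: one shared shift per vector; the patent does not specify this.
    Returns (scaled_samples, shift).
    """
    limit = (1 << (bit_width - 1)) - 1
    peak = max(max(abs(re), abs(im)) for re, im in samples)
    shift = 0
    while (peak >> shift) > limit:
        shift += 1
    # Python's >> floors, like an arithmetic shift in hardware.
    scaled = [(re >> shift, im >> shift) for re, im in samples]
    return scaled, shift

def twiddle_multiply(scaled, n, stride):
    """Multiply the i-th scaled sample by exp(-2*pi*j * i * stride / n)."""
    return [
        complex(re, im) * cmath.exp(-2j * cmath.pi * (i * stride) / n)
        for i, (re, im) in enumerate(scaled)
    ]

# Illustrative values only: three samples scaled into an 8-bit data path.
samples = [(1000, -2000), (30, 40), (-500, 700)]
scaled, shift = dynamic_scale(samples, bit_width=8)
rotated = twiddle_multiply(scaled, n=12, stride=1)
```

The returned `shift` would have to be tracked alongside the data so that a later stage can restore the overall magnitude. How the patented unit tracks it is not covered by the claim text shown here.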

Assignees

Inventors

Classifications

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • Details on data memory access · CPC title

  • Discrete Fourier transforms · CPC title

  • Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11829322B2 cover?
A vector memory subsystem for use with a programmable mix-radix vector processor (“PVP”) capable of calculating discrete Fourier transform (“DFT/IDFT”) values. In an exemplary embodiment, an apparatus includes a vector memory bank and a vector memory system (VMS) that generates input memory addresses that are used to store input data into the vector memory bank. The VMS also generates output me…
Who is the assignee on this patent?
Cavium LLC, Marvell Asia Pte Ltd
What technology area does this patent fall under?
Primary CPC classification G06F15/8061. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 28, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).