US11449538B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11449538-B2 |
| Application number | US-201916259326-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 28, 2019 |
| Priority date | Nov 13, 2006 |
| Publication date | Sep 20, 2022 |
| Grant date | Sep 20, 2022 |
Official abstract text for this publication.
Disclosed herein are methods and systems for integrating an enterprise's structured and unstructured data to provide users and enterprise applications with efficient and intelligent access to that data. In accordance with exemplary embodiments, the generation of feature vectors about unstructured data can be hardware-accelerated by processing streaming unstructured data through a reconfigurable logic device, a graphics processor unit (GPU), or chip multi-processor (CMP) to determine features that can aid clustering of similar data objects.
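As a rough software illustration of the word-frequency feature extraction summarized above, the following sketch fans streaming data objects out to a pool of worker threads and records one feature vector per object. It is only an analogy under stated assumptions: the threads stand in for the hardware parallel processing engines (FPGA, GPU, or CMP), and the names `extract_features`, `process_stream`, and the regex tokenizer are hypothetical and do not come from the patent.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Assumption: a "word" is a run of letters, digits, or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def extract_features(data_object: str) -> Counter:
    """Return a word-frequency feature vector for one data object."""
    return Counter(WORD_RE.findall(data_object.lower()))


def process_stream(stream, engines: int = 4) -> dict:
    """Analyze (object_id, text) pairs with `engines` parallel workers.

    The returned dict is the in-memory association between each data
    object and its determined features.
    """
    associations = {}
    with ThreadPoolExecutor(max_workers=engines) as pool:
        # Submit each object as it arrives from the stream.
        pending = [(obj_id, pool.submit(extract_features, text))
                   for obj_id, text in stream]
        for obj_id, future in pending:
            associations[obj_id] = future.result()
    return associations


if __name__ == "__main__":
    docs = [("doc-1", "The cat sat on the mat."),
            ("doc-2", "Cat and dog")]
    for obj_id, features in process_stream(docs).items():
        print(obj_id, dict(features))
```

The resulting vectors could then be passed to any clustering routine. Python threads do not speed up CPU-bound work, so this shows only the data flow, not the latency or throughput characteristics the disclosure targets.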
Opening claim text (preview).
What is claimed is:
1. A method for low latency and high throughput feature vector extraction, the method comprising: receiving streaming unstructured data into a member of the group consisting of (1) a reconfigurable logic device, (2) a graphics processor unit (GPU), and (3) a chip multi-processor (CMP), the streaming unstructured data comprising a plurality of data objects, wherein the data objects include a plurality of words, and wherein the member has a plurality of parallel processing engines deployed thereon; the parallel processing engines analyzing the data objects while the data objects stream through the member to perform a plurality of feature vector extraction operations on the streaming data objects that determine a plurality of features of the streaming data objects, wherein the determined features include a frequency of words within the data objects; and creating an association that is physically represented in memory between the determined features and the data objects.
2. The method of claim 1 wherein the analyzing step comprises generating a word count for a plurality of the words in the streaming data objects.
3. The method of claim 1 wherein the analyzing step comprises generating histograms with respect to a plurality of the words in the streaming data objects.
4. The method of claim 1 further comprising: performing clustering of the data objects based on the determined features to find clusters of the data objects that share similar features according to clustering criteria.
5. The method of claim 1 wherein the member comprises the reconfigurable logic device.
6. The method of claim 5 wherein the reconfigurable logic device comprises a field programmable gate array (FPGA).
7. The method of claim 6 wherein the FPGA comprises a plurality of FPGAs.
8. The method of claim 7 wherein the parallel processing engines are partitioned across a plurality of the FPGAs.
9. The method of claim 1 wherein the member comprises the GPU.
10. The method of claim 1 wherein the member comprises the CMP.
11. The method of claim 1 further comprising the parallel processing engines creating an index of the streaming data objects based on the determined features.
12. The method of claim 11 wherein the index is stored as structured data in the database, the method further comprising: storing the streaming unstructured data in a data store of unstructured data; receiving a query that is directed toward a combination of structured data and unstructured data; accessing structured data in the database according to the classification index in response to the query to identify a subset of the unstructured data that is to be analyzed against the query; and performing a query-specified data analysis operation on the identified subset of unstructured data to thereby generate data for a response to the query; wherein the accessing step is conducted by a processor; and wherein the step of performing the query-specified data analysis operation is conducted by the member.
13. An apparatus for low latency and high throughput feature vector extraction, the apparatus comprising: a member of the group consisting of (1) a reconfigurable logic device, (2) a graphics processor unit (GPU), and (3) a chip multi-processor (CMP), the member configured to receive streaming unstructured data, the streaming unstructured data comprising a plurality of data objects, wherein the data objects include a plurality of words, and wherein the member has a plurality of parallel processing engines deployed thereon; the parallel processing engines configured to (1) analyze the data objects while the data objects stream through the member to perform a plurality of feature vector extraction operations on the streaming data objects that determine a plurality of features of the streaming data objects, wherein the determined features include a frequency of words within the data objects, and (2) create an association that is physically represented in memory between the determined features and the data objects.
14. The apparatus of claim 13 wherein the parallel processing engines include a parallel processing engine configured to generate a word count for a plurality of the words in the streaming data objects.
15. The apparatus of claim 13 wherein the parallel processing engines include a parallel processing engine configured to generate histograms with respect to a plurality of the words in the streaming data objects.
16. The apparatus of claim 13 further comprising: a processor configured to cluster the data objects based on the determined features to find clusters of the data objects that share similar features according to clustering criteria.
17. The apparatus of claim 13 wherein the member comprises the reconfigurable logic device.
18. The apparatus of claim 17 wherein the reconfigurable logic device comprises a field programmable gate array (FPGA).
19. The apparatus of claim 18 wherein the FPGA comprises a plurality of FPGAs.
20. The apparatus of claim 19 wherein the parallel processing engines are partitioned across a plurality of the FPGAs.
21. The apparatus of claim 13 wherein the member comprises the GPU.
22. The apparatus of claim 13 wherein the member comprises the CMP.
23. The apparatus of claim 13 wherein the parallel processing engines include a parallel processing engine configured to create an index of the streaming data objects based on the determined features.
24. The apparatus of claim 23 further comprising: a database in which the index is stored; a data store in which the streaming unstructured data is stored; and a processor configured to (1) receive a query that is directed toward a combination of structured data and unstructured data and (2) access structured data in the database according to the index in response to the query to identify a subset of the unstructured data that is to be analyzed against the query; and wherein the member is configured to perform a query-specified data analysis operation on the identified subset of unstructured data to thereby generate data for a response to the query.
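The index-then-analyze flow recited in claims 11, 12, 23, and 24 can be loosely sketched in software as follows. This is a simplified model under stated assumptions, not the claimed apparatus: the class `FeatureIndex`, its methods, and the use of in-memory dictionaries for the "database" and "data store" are all hypothetical, and the analysis step runs on the CPU rather than on a reconfigurable logic device, GPU, or CMP.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of letters, digits, or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


class FeatureIndex:
    """Toy model: a structured word index over an unstructured data store."""

    def __init__(self):
        self.data_store = {}            # unstructured data: id -> raw text
        self.index = defaultdict(set)   # structured data: word -> object ids

    def ingest(self, obj_id: str, text: str) -> None:
        """Store the raw object and index it by the words it contains."""
        self.data_store[obj_id] = text
        for word in set(WORD_RE.findall(text.lower())):
            self.index[word].add(obj_id)

    def query(self, index_term: str, analysis) -> dict:
        """Use the index to pick a subset, then analyze only that subset.

        `analysis` is the query-specified operation; in the claims this
        step is performed by the hardware member.
        """
        subset = sorted(self.index.get(index_term.lower(), ()))
        return {obj_id: analysis(self.data_store[obj_id]) for obj_id in subset}


if __name__ == "__main__":
    idx = FeatureIndex()
    idx.ingest("a", "Invoice 1001 overdue; invoice 1002 paid")
    idx.ingest("b", "Meeting notes, nothing else")
    # Find four-digit numbers, but only in objects indexed under "invoice".
    print(idx.query("invoice", lambda t: re.findall(r"\b\d{4}\b", t)))
```

The point of the split is that the structured lookup narrows the candidate set cheaply, so the more expensive analysis touches only the objects that can match.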
Query execution (filtering based on additional data G06F16/335) · CPC title
Indexing structures · CPC title
Approximate or statistical queries · CPC title
Relational databases · CPC title