Intelligent data storage and processing using FPGA devices
US-9176775-B2 · Nov 3, 2015 · US
US9396222B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9396222-B2 |
| Application number | US-201414531255-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 3, 2014 |
| Priority date | Nov 13, 2006 |
| Publication date | Jul 19, 2016 |
| Grant date | Jul 19, 2016 |
Official abstract text for this publication.
Disclosed herein is a method and system for integrating an enterprise's structured and unstructured data to provide users and enterprise applications with efficient and intelligent access to that data. In accordance with exemplary embodiments, the generation of metadata indexes about unstructured data can be hardware-accelerated by processing streaming unstructured data through a reconfigurable logic device to generate the metadata about the unstructured data for the index.
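As a rough software analogy for the flow the abstract describes (stream data items in, detect name terms, emit metadata saying where each item can be found, build an index from that metadata), the following is a minimal Python sketch. It is not the patented implementation: the patent performs the analysis in pipelined firmware modules on a reconfigurable logic device, whereas this runs on a CPU. The names `DataItem`, `Metadata`, `detect_names`, and `build_index`, and the `location` strings, are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class DataItem:
    source: str    # kind of data source, e.g. "email" or "news" (assumed labels)
    location: str  # hypothetical pointer to where the item is stored
    text: str      # the unstructured content


@dataclass(frozen=True)
class Metadata:
    term: str      # the detected name
    source: str
    location: str  # indicates where the item containing the term can be located


def detect_names(items: Iterable[DataItem], names: set[str]) -> Iterator[Metadata]:
    """Pipeline stage 1: emit one metadata record per watched name found in an item."""
    canonical = {n.lower(): n for n in names}
    if not canonical:
        return
    # Longest names first so overlapping alternatives prefer the longer match.
    alternatives = sorted(canonical, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in alternatives) + r")\b",
        re.IGNORECASE,
    )
    for item in items:
        found = {m.group(0).lower() for m in pattern.finditer(item.text)}
        for key in sorted(found):
            yield Metadata(canonical[key], item.source, item.location)


def build_index(records: Iterable[Metadata]) -> dict[str, list[tuple[str, str]]]:
    """Pipeline stage 2: group metadata by name into an index for later lookups."""
    index: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for rec in records:
        index[rec.term].append((rec.source, rec.location))
    return dict(index)


# Illustrative stream drawn from several kinds of sources (made-up content).
items = [
    DataItem("email", "mail/0001", "Alice Smith asked Acme Corp for a quote."),
    DataItem("news", "news/0042", "Acme Corp announced a new product."),
    DataItem("web", "web/0007", "Nothing relevant here."),
]
index = build_index(detect_names(items, {"Alice Smith", "Acme Corp"}))
```

Because `detect_names` is a generator, items flow through the two stages one at a time, which loosely mirrors the streaming, pipelined arrangement described above. A lookup such as `index["Acme Corp"]` then returns the sources and locations of the matching items.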
Opening claim text (preview).
What is claimed is:
1. A method for building a metadata index for unstructured data for a plurality of different data sources, the method comprising: receiving streaming unstructured data into a reconfigurable logic device, the streaming unstructured data comprising a plurality of data items for a plurality of different sources, wherein the reconfigurable logic device has a plurality of pipelined firmware application modules deployed thereon; the pipelined firmware application modules analyzing the streaming unstructured data to generate metadata about the streaming unstructured data at hardware processing speeds, the analyzing including detecting whether a term relating to a name is found in any of the data items, the generated metadata comprising data associated with the data item that is indicative of where a data item having the detected term can be located; and generating an index about the streaming unstructured data from the generated metadata, the index for subsequent querying to locate data items of interest based on associations between the metadata and the data items.
2. The method of claim 1 wherein the data items comprise at least two members of the group consisting of (1) a plurality of news reports, (2) a plurality of web pages, (3) a plurality of market analyses, (4) a plurality of emails, (5) a plurality of social network communications, and (6) a plurality of corporate documents.
3. The method of claim 2 wherein the reconfigurable logic device comprises a field programmable gate array (FPGA), the FPGA having the pipelined firmware application modules deployed thereon.
4. The method of claim 3 wherein the analyzing step comprises the pipelined firmware application modules performing a classification operation on the streaming unstructured data to determine classification information for the data items, the generated metadata including the determined classification information; and wherein the index generating step further comprises generating an index about the streaming unstructured data based on the determined classification information.
5. The method of claim 4 wherein the classification operation performing step includes the pipelined firmware application modules generating word counts for the data items, the determined classification information being based on the generated word counts.
6. The method of claim 2 further comprising storing the generated index in a database for subsequent querying.
7. The method of claim 2 further comprising: streaming a plurality of the data items into the reconfigurable logic device from a plurality of remote data sources via a network interface.
8. The method of claim 2 further comprising: performing a lookup using the generated index as part of a reputation analysis operation for an enterprise.
9. The method of claim 2 wherein the detecting comprises: the pipelined firmware application modules identifying a plurality of names that are found within the data items, the index indexing the data items by the found names; wherein the method further comprises: performing a plurality of lookups relating to a plurality of the names using the generated index; and determining a connectedness for a plurality of individuals based on the lookups.
10. The method of claim 9 wherein the at least two members comprise the emails and the social network communications.
11. The method of claim 9 the data items comprise at least three members of the group consisting of (1) a plurality of news reports, (2) a plurality of web pages, (3) a plurality of market analyses, (4) a plurality of emails, (5) a plurality of social network communications, and (6) a plurality of corporate documents.
12. The method of claim 2 wherein the detecting comprises: the pipelined firmware application modules identifying a plurality of names that are found within the data items, the index indexing the data items by the found names; wherein the method further comprises: performing a plurality of lookups relating to a plurality of the names using the generated index; and determining a connectedness for a plurality of organizations based on the lookups.
13. The method of claim 12 wherein the at least two members comprise the emails and the corporate documents.
14. The method of claim 12 the data items comprise at least three members of the group consisting of (1) a plurality of news reports, (2) a plurality of web pages, (3) a plurality of market analyses, (4) a plurality of emails, (5) a plurality of social network communications, and (6) a plurality of corporate documents.
15. The method of claim 2 wherein the detecting comprises: the pipelined firmware application modules identifying a plurality of names that are found within the data items, the index indexing the data items by the found names; wherein the method further comprises: performing a plurality of lookups relating to a plurality of the names using the generated index; and determining a connectedness for a plurality of individuals and organizations based on the lookups.
16. The method of claim 2 further comprising: integrating the index with structured data relating to the name in a structured database.
17. An apparatus for building a metadata index for unstructured data for a plurality of different data sources, the apparatus comprising: a reconfigurable logic device; and a memory; wherein the reconfigurable logic device is configured to receive streaming unstructured data, the streaming unstructured data comprising a plurality of data items for a plurality of different sources, wherein the reconfigurable logic device has a plurality of pipelined firmware application modules deployed thereon; the pipelined firmware application modules configured to perform analysis of the streaming unstructured data to generate metadata about the streaming unstructured data at hardware processing speeds, the analysis including a detection by the pipelined firmware application modules whether a term relating to a name is found in any of the data items, the generated metadata comprising data associated with the data item that is indicative of where a data item having the detected term can be located; and the memory configured to store an index about the streaming unstructured data from the generated metadata, the index for querying to locate data items of interest based on associations between the metadata and the data items.
18. A method for integrating unstructured data for a plurality of different data sources, the method comprising: streaming unstructured data through a field programmable gate array (FPGA), the unstructured data comprising at least two members of the group consisting of (1) a plurality of emails, (2) a plurality of social network communications, (3) a plurality of corporate documents, and (4) a plurality of news reports; the FPGA performing a metadata generation operation on the unstructured data streamed therethrough to thereby generate metadata about the unstructured data; storing the unstructured data in a data store of unstructured data; storing the metadata about the unstructured data in a database of structured data; and determining a connectedness of a plurality of subjects based on an analysis of the stored unstructured data and the stored metadata.
19. The method of claim 18 wherein the metadata includes an identification of where the unstructured data is stored in the data store of unstructured data.
20. The method of claim 19 wherein the performing step further comprises: the
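Claims 9, 12, and 15 recite performing lookups against the name index and determining a "connectedness" from those lookups, but the claim text shown here does not define how connectedness is measured. The sketch below assumes one simple possibility, counting the data items in which two names both appear. The function name `connectedness`, the shape of `name_index`, and the sample entries are all hypothetical and serve only as an illustration.

```python
from __future__ import annotations

from itertools import combinations
from typing import Iterable


def connectedness(
    name_index: dict[str, list[str]], names: Iterable[str]
) -> dict[tuple[str, str], int]:
    """Score each pair of names by how many indexed data items mention both.

    ASSUMPTION: shared-item count is an illustrative metric only; the claims
    do not specify how connectedness is computed.
    """
    scores: dict[tuple[str, str], int] = {}
    for a, b in combinations(sorted(set(names)), 2):
        # One index lookup per name, then intersect the item locations.
        shared = set(name_index.get(a, ())) & set(name_index.get(b, ()))
        scores[(a, b)] = len(shared)
    return scores


# Hypothetical index: name -> locations of data items mentioning that name.
name_index = {
    "Alice": ["mail/1", "mail/2", "social/9"],
    "Bob": ["mail/2", "social/9"],
    "Carol": ["news/5"],
}
scores = connectedness(name_index, ["Alice", "Bob", "Carol"])
```

With this sample index, Alice and Bob share two items (one email and one social network communication), while Carol shares none with either. Other measures, such as a normalized overlap, would fit the same lookup pattern.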
Physics · mapped topic
Query augmenting and refining, e.g. inexact access · CPC title
into predefined classes · CPC title