Inferring entity attribute values
US-9501503-B2 · Nov 22, 2016 · US
US2016019272A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016019272-A1 |
| Application number | US-201414557347-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 1, 2014 |
| Priority date | Jul 15, 2014 |
| Publication date | Jan 21, 2016 |
| Grant date | — |
Official abstract text for this publication.
The present invention extends to methods, systems, and computer program products for managing data ingestion. Aspects of the invention include a pluggable architecture channel service (e.g., a push/pull channel service) to ingest raw data. Aspects of the invention also include a pluggable architecture formatter to convert ingested raw data into a common format, such as, for example, key value pairs. Aspects of the invention also include an EAV storage with functionality allowing consumers to define multiple entities on (and spanning) ingested data sets. Accordingly, data can be ingested without data loss, without having to define extraction logic, and without having to define a storage schema.
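The flow described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: `format_csv`, `format_xml`, `FORMATTERS`, `EavStore`, and `ingest` are hypothetical names, the channel service is reduced to a raw string handed in by the caller, and entity identifiers are assumed to be the source name plus the record position.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Tuple

# All names here are hypothetical; the abstract does not specify an API.
KeyValues = Dict[str, str]
Formatter = Callable[[str], List[KeyValues]]


def format_csv(raw: str) -> List[KeyValues]:
    """Formatter plug-in: each CSV row becomes one set of key-value pairs."""
    return [dict(row) for row in csv.DictReader(io.StringIO(raw))]


def format_xml(raw: str) -> List[KeyValues]:
    """Formatter plug-in: each child of the root becomes one set of key-value pairs."""
    root = ET.fromstring(raw)
    return [{field.tag: (field.text or "") for field in record} for record in root]


# Registry of formatter plug-ins keyed by raw data format (assumed design).
FORMATTERS: Dict[str, Formatter] = {"csv": format_csv, "xml": format_xml}


class EavStore:
    """In-memory entity-attribute-value storage: one row per (entity, attribute, value)."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, str, str]] = []

    def store(self, source: str, records: List[KeyValues]) -> None:
        for index, record in enumerate(records):
            entity_id = f"{source}:{index}"  # assumed identifier scheme
            for attribute, value in record.items():
                self.rows.append((entity_id, attribute, value))


def ingest(store: EavStore, source: str, raw_format: str, raw: str) -> None:
    """Convert raw data to the common format and store it; no storage schema is declared."""
    store.store(source, FORMATTERS[raw_format](raw))


store = EavStore()
ingest(store, "orders", "csv", "id,city\n1,Oslo\n2,Lima\n")
ingest(store, "users", "xml", "<users><user><name>Ada</name><city>Oslo</city></user></users>")
```

Because every field is kept as its own row, no attribute is dropped at ingestion time, and supporting a new raw format only requires registering another formatter in `FORMATTERS`.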
Opening claim text (preview).
What is claimed:
1. At a computer system, a method for managing data ingestion, the method comprising: receiving raw data from a data source, the raw data in a raw data format; ingesting the raw data in the raw data format, ingesting the raw data including using a pluggable channel adaptor configured for an access mechanism and a security context associated with data source; formatting the raw data into formatted data using a formatting plug-in configured to understand the raw data format, the formatted data in a common format including key-value pairs; storing the formatted data into an entity-attribute-value storage, the entity-attribute-value storage including other data that was formatted into the common format from one or more other raw data formats.
2. The method of claim 1, further comprising: receiving second raw data from a second data source, the second raw data in a second different raw data format; ingesting the second raw data in the second raw data format, ingesting the second raw data including using a pluggable channel adaptor configured for an access mechanism and a security context associated with the second data source; formatting the second raw data into second formatted data using a second formatting plug-in configured to understand the second raw data format, the second formatted data in the common format including key-value pairs; and storing the second formatted data into the entity-attribute-value storage along with the formatted data.
3. The method of claim 2, wherein the first raw data format is eXtensible Markup Language (XML) and the second raw data format is Character Separated Value (CSV).
4. The method of claim 1, wherein storing the formatted data into an entity-attribute-value storage comprises storing the formatted data into an entity-attribute-value set, the entity-attribute-value storage including other data stored as a plurality of other entity-attribute-value data sets.
5. The method of claim 4, further comprising enriching the entity-attribute-value set using a pluggable enrichment service.
6. The method of claim 4, further comprising: receiving a consumer selection of attributes spanning one or more entity-attribute-value data sets; and defining one or more entities of interest to the consumer based on the selected attributes.
7. The method of claim 6, wherein defining one or more entities of interest to the consumer comprises formulating one or more schemas defining a data layout for the one or more entities.
8. The method of claim 7, further comprising: receiving an application request for data associated with at least one entity of interest selected from among the defined one or more entities of interest; and returning the requested data to the application in accordance with the one or more schemas.
9. A computer program product for use at a computer system, the computer program product for implementing a method for managing data ingestion, the computer program product comprising one or more computer storage media having stored thereon computer-executable instructions that, when executed at a processor, cause the computer system to perform the method, including the following: receive raw data from a data source, the raw data in a raw data format; ingest the raw data in the raw data format, ingesting the raw data including using a pluggable channel adaptor configured for an access mechanism and a security context associated with data source; format the raw data into formatted data using a formatting plug-in configured to understand the raw data format, the formatted data in a common format including key-value pairs; and store the formatted data into an entity-attribute-value storage, the entity-attribute-value storage including other data that was formatted into the common format from one or more other raw data formats.
10. The computer program product of claim 9, further comprising computer-executable instructions that, when executed, cause the computer system to: receive second raw data from a second data source, the second raw data in a second different raw data format; ingest the second raw data in the second raw data format, ingesting the second raw data including using a pluggable channel adaptor configured for an access mechanism and a security context associated with the second data source; format the second raw data into second formatted data using a second formatting plug-in configured to understand the second raw data format, the second formatted data in the common format including key-value pairs; and store the second formatted data into the entity-attribute-value storage along with the formatted data.
11. The computer program product of claim 9, wherein computer-executable instructions that, when executed, cause the computer system to store the formatted data into an entity-attribute-value storage comprise computer-executable instructions that, when executed, cause the computer system to store the formatted data into an entity-attribute-value set, the entity-attribute-value storage including other data stored as a plurality of other entity-attribute-value data sets.
12. The computer program product of claim 11, further comprising computer-executable instructions that, when executed, cause the computer system to enrich the entity-attribute-value set using a pluggable enrichment service.
13. The computer program product of claim 11, further comprising computer-executable instructions that, when executed, cause the computer system to: receive a consumer selection of attributes spanning one or more entity-attribute-value data sets; and define one or more entities of interest to the consumer based on the selected attributes.
14. The computer program product of claim 13, wherein computer-executable instructions that, when executed, cause the computer system to define one or more entities of interest to the consumer comprise computer-executable instructions that, when executed, cause the computer system to formulate one or more schemas defining a data layout for reading data from the one or more entities.
15. The computer program product of claim 14, further comprising computer-executable instructions that, when executed, cause the computer system to: receive an application request for data associated with at least one entity of interest selected from among the defined one or more entities of interest; and return the requested data to the application in accordance with the one or more schemas.
16. A computer system, the computer system comprising: one or more processors; system memory; entity-attribute-value (EAV) storage for storing data in a common format including key-value pairs; and one or more computer storage devices having stored thereon computer executable instructions representing one or more channels and a formatter, the one or more channels configured to: receive raw data from a data source, the raw data in a raw data format; and ingest the raw data in the raw data format, ingesting the raw data including using a pluggable channel adaptor configured for an access mechanism and a security context associated with data source; and wherein the formatter is configured to: format the raw data into formatted data using a formatting plug-in configured to understand the raw data format, the formatted data in the common format; and store the formatted data into the entity-attribute-value (EAV) storage, the entity-attribute-value (EAV) storage including other data that was formatted into the common format from one or more other raw data formats
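Claims 6 to 8 describe a consumer selecting attributes across entity-attribute-value data sets, a schema being formulated for the resulting entity, and data being returned according to that schema. The following Python sketch shows one way this could look. Everything in it is an assumption made for illustration: the names `EntitySchema`, `define_entity`, and `read_entity`, the sample rows, and in particular the idea that rows from different data sets are matched on a shared entity identifier, which the claims do not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical row layout: (data_set, entity_id, attribute, value).
EavRow = Tuple[str, str, str, str]

# Made-up sample data spanning two EAV data sets.
EAV_ROWS: List[EavRow] = [
    ("crm", "42", "name", "Ada"),
    ("crm", "42", "email", "ada@example.com"),
    ("billing", "42", "balance", "17.50"),
    ("billing", "43", "balance", "0.00"),
]


@dataclass(frozen=True)
class EntitySchema:
    """Data layout for a consumer-defined entity: an ordered list of selected attributes."""
    name: str
    attributes: Tuple[Tuple[str, str], ...]  # (data_set, attribute) pairs


def define_entity(name: str, selection: List[Tuple[str, str]]) -> EntitySchema:
    """Turn a consumer's attribute selection into a schema (hypothetical helper)."""
    return EntitySchema(name, tuple(selection))


def read_entity(schema: EntitySchema, rows: List[EavRow]) -> List[Dict[str, Optional[str]]]:
    """Return one record per entity id, laid out according to the schema.

    Assumes rows from different data sets share entity ids; attributes that
    are missing for an entity are returned as None.
    """
    wanted = set(schema.attributes)
    found: Dict[str, Dict[str, str]] = {}
    for data_set, entity_id, attribute, value in rows:
        if (data_set, attribute) in wanted:
            found.setdefault(entity_id, {})[attribute] = value
    columns = [attribute for _, attribute in schema.attributes]
    return [
        {"id": entity_id, **{column: values.get(column) for column in columns}}
        for entity_id, values in sorted(found.items())
    ]


customer = define_entity("customer", [("crm", "name"), ("billing", "balance")])
result = read_entity(customer, EAV_ROWS)
```

The schema is applied only when data is read, so a second consumer could define a different entity over the same stored rows without re-ingesting anything. A fuller version would need to handle attribute names that collide across data sets, which this sketch ignores.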
Physics · mapped topic
Data format conversion from or to a database · CPC title
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title