US9870411B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9870411-B2 |
| Application number | US-201414557347-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 1, 2014 |
| Priority date | Jul 15, 2014 |
| Publication date | Jan 16, 2018 |
| Grant date | Jan 16, 2018 |
Official abstract text for this publication.
The present invention extends to methods, systems, and computer program products for managing data ingestion. Aspects of the invention include a pluggable architecture channel service (e.g., a push/pull channel service) to ingest raw data. Aspects of the invention also include a pluggable architecture formatter to convert ingested raw data into a common format, such as, for example, key value pairs. Aspects of the invention also include an EAV storage with functionality allowing consumers to define multiple entities on (and spanning) ingested data sets. Accordingly, data can be ingested without data loss, without having to define extraction logic, and without having to define a storage schema.
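The flow in the abstract (pluggable formatters that turn raw data into key-value pairs, followed by schema-free storage as entity-attribute-value rows) can be illustrated with a minimal sketch. Everything named here (`format_csv`, `format_xml`, `FORMATTERS`, `ingest`, the `source:index` entity identifiers) is hypothetical and chosen for illustration; the patent does not prescribe these names or this layout, and the channel service, access mechanisms, and security contexts are not modeled.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Tuple

# A record in the common format: a flat mapping of key-value pairs.
KeyValues = Dict[str, str]
# One entity-attribute-value row: (entity id, attribute name, value).
EavRow = Tuple[str, str, str]


def format_csv(raw: str) -> List[KeyValues]:
    """Hypothetical formatter plug-in: each CSV row becomes one key-value record."""
    return [dict(row) for row in csv.DictReader(io.StringIO(raw))]


def format_xml(raw: str) -> List[KeyValues]:
    """Hypothetical formatter plug-in: each child of the root becomes one record."""
    root = ET.fromstring(raw)
    return [{field.tag: (field.text or "") for field in record} for record in root]


# Registry of formatter plug-ins keyed by raw format name (names are assumptions).
FORMATTERS: Dict[str, Callable[[str], List[KeyValues]]] = {
    "csv": format_csv,
    "xml": format_xml,
}


def ingest(store: List[EavRow], source: str, raw_format: str, raw: str) -> None:
    """Convert raw data to key-value pairs and append every pair as an EAV row.

    No extraction logic or storage schema is needed: every key found in the
    raw data is kept, so nothing is dropped at ingestion time.
    """
    records = FORMATTERS[raw_format](raw)
    for index, record in enumerate(records):
        entity_id = f"{source}:{index}"
        for attribute, value in record.items():
            store.append((entity_id, attribute, value))


store: List[EavRow] = []
ingest(store, "crm", "csv", "id,name\n1,Ada\n")
ingest(store, "erp", "xml", "<rows><row><id>1</id><city>Paris</city></row></rows>")
```

Supporting another raw format in this sketch only requires registering one more function in `FORMATTERS`; the storage code is unchanged, which is the point of converting to a common format first.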
Opening claim text (preview).
What is claimed:
1. At a computer system, a method for supplementing a data consumer defined data entity with additional data from a new data source, the data consumer defined data entity spanning one or more ingested data sets from one or more data sources, the method comprising: receiving a new data set from the new data source, the new data set in a raw data format used by the new data source, the new data source in addition to the one or more data sources; responsive to receiving the new data set from the new data source: ingesting the new data set in the raw data format, ingesting the new data set including utilizing a combined access mechanism and security context matched to the new data source; converting the new data set into a common format using a formatting plug-in configured to understand the raw data format; and storing new data set into storage, the storage including the one or more ingested data sets, the one or more ingested data sets having been previously formatted into the common format from one or more other raw data formats used by the one or more data sources, the one or more data sets previously ingested from the one or more data sources using combined access mechanisms and security contexts matched to each of the one or more data sources; and applying a schema to stored data in the common format, the stored data from the new ingested data set and the one or more ingested data sets, the stored data associated with data consumer selected attributes included in the data consumer defined data entity.
2. The method of claim 1, further comprising: receiving a further data set from a further data source, the further data set in a further raw data format used by the further data source, the further raw data format differing from the raw data format; ingesting the further data set in the second raw data format, ingesting the second raw data including using a further combined access mechanism and a security context matched to the further data source, the further combined access mechanism and security context differing from the combined access mechanism and security context; converting the further data set into the common format using a further formatting plug-in configured to understand the further raw data format; and supplementing the stored data by storing the further data set into the storage along with the new data set.
3. The method of claim 2, wherein the first raw data format is eXtensible Markup Language (XML) and the second raw data format is Character Separated Value (CSV).
4. The method of claim 1, wherein storing the new data set into storage comprises storing the new data set into an entity-attribute-value data set.
5. The method of claim 4, further comprising enriching the entity-attribute-value data set with additional data from a pluggable enrichment service.
6. The method of claim 1, further comprising: receiving the consumer selection of attributes indicating that the consumer defined data entity is to span a plurality of data sets.
7. The method of claim 6, further comprising formulating the schema, the schema defining a data layout for returning data associated with the data consumer defined entity.
8. The method of claim 7, further comprising: receiving an application request for data associated with the data consumer defined data entity; and returning the requested data from the storage to the application in the defined data layout in accordance with the schema.
9. A computer program product for use at a computer system, the computer program product for implementing a method for supplementing a data consumer defined data entity with additional data from a new data source, the data consumer defined data entity spanning one or more ingested data sets from one or more data sources, the computer program product comprising one or more computer storage media having stored thereon computer-executable instructions that, when executed at a processor, cause the computer system to perform the method, including the following: receive a new data set from the new data source, the new data set in a raw data format used by the new data source, the new data source in addition to the one or more data sources; responsive to receiving the new data set from the new data source: ingest the new data set in the raw data format, ingesting the new data set including utilizing a combined access mechanism and security context matched to the new data source; convert the new data set into a common format using a formatting plug-in configured to understand the raw data format; and store new data set into storage, the storage including the one or more ingested data sets, the one or more ingested data sets having been previously formatted into the common format from one or more other raw data formats used by the one or more data sources, the one or more data sets previously ingested from the one or more data sources using combined access mechanisms and security contexts matched to each of the one or more data sources; and apply a schema to stored data in the common format, the stored data from the new ingested data set and the one or more ingested data sets, the stored data associated with data consumer selected attributes included in the data consumer defined data entity.
10. The computer program product of claim 9, further comprising computer-executable instructions that, when executed, cause the computer system to: receive a further data set from a further data source, the further data set in a further raw data format used by the further data source, the further raw data format differing from the raw data format; ingest the further data set in the second raw data format, ingesting the second raw data including using a further combined access mechanism and a security context matched to the further data source, the further combined access mechanism and security context differing from the combined access mechanism and security context; convert the further data set into the common format using a further formatting plug-in configured to understand the further raw data format; and supplement the stored data by storing the further data set into the storage along with the new data set.
11. The computer program product of claim 9, wherein computer-executable instructions that, when executed, cause the computer system to store the data set into storage comprise computer-executable instructions that, when executed, cause the computer system to store the data set into an entity-attribute-value data set.
12. The computer program product of claim 11, further comprising computer-executable instructions that, when executed, cause the computer system to enrich the entity-attribute-value data set with additional data from a pluggable enrichment service.
13. The computer program product of claim 11, further comprising computer-executable instructions that, when executed, cause the computer system to: receive the consumer selection of attributes indicating that the consumer defined data entity is to span a plurality of data sets.
14. The computer program product of claim 13, further comprising computer-executable instructions that, when executed, cause the computer system to formulate the schema, the schema defining a data layout for returning data associated with the data consumer defined entity.
15. The computer program product of claim 14, further comprising computer-executable instructions that, when executed, cause the computer system to: receive an application request for data associated with the data consumer defined data entity; and return
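The claims describe applying a schema to stored common-format data so that a consumer-defined entity can span several ingested data sets (claims 1 and 6 to 8). A minimal sketch of that read-time step follows. The function `apply_schema`, the use of a shared `key_attribute` to link rows from different sources, and the sample rows are all assumptions made for illustration; the claims do not state how rows from different data sets are matched.

```python
from typing import Dict, List, Optional, Tuple

# One entity-attribute-value row: (entity id, attribute name, value).
EavRow = Tuple[str, str, str]


def apply_schema(
    store: List[EavRow],
    key_attribute: str,
    selected_attributes: List[str],
) -> List[Dict[str, Optional[str]]]:
    """Return consumer-defined entities laid out according to a schema.

    The schema here is simply the ordered list of consumer-selected
    attributes. Rows from different data sets are linked when they share
    the same value for key_attribute (an assumption of this sketch).
    """
    # Rebuild each per-source record from its EAV rows.
    by_entity: Dict[str, Dict[str, str]] = {}
    for entity_id, attribute, value in store:
        by_entity.setdefault(entity_id, {})[attribute] = value

    # Merge records from all data sets that share the same key value.
    merged: Dict[str, Dict[str, str]] = {}
    for record in by_entity.values():
        key = record.get(key_attribute)
        if key is None:
            continue  # record cannot be linked to the entity
        merged.setdefault(key, {}).update(record)

    # Project each merged record onto the selected attributes, in order.
    return [
        {name: record.get(name) for name in selected_attributes}
        for record in merged.values()
    ]


# Rows previously ingested from two hypothetical sources.
stored_rows: List[EavRow] = [
    ("crm:0", "id", "1"),
    ("crm:0", "name", "Ada"),
    ("erp:0", "id", "1"),
    ("erp:0", "city", "Paris"),
]
entities = apply_schema(stored_rows, "id", ["id", "name", "city"])

# A new data source supplements the entity without any change to storage.
stored_rows += [("hr:0", "id", "1"), ("hr:0", "title", "Engineer")]
supplemented = apply_schema(stored_rows, "id", ["id", "name", "city", "title"])
```

Because the schema is applied only when data is read, adding the third source changes the result of the query but not the stored rows or the ingestion code.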
Physics · mapped topic
Data format conversion from or to a database · CPC title
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title