Providing a search service including updating aspects of a document using a configurable schema
US-2015370791-A1 · Dec 24, 2015 · US
US9251182B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9251182-B2 |
| Application number | US-201313798229-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 13, 2013 |
| Priority date | May 29, 2012 |
| Publication date | Feb 2, 2016 |
| Grant date | Feb 2, 2016 |
Official abstract text for this publication.
A method for supplementing structured information within a data system for entities based on unstructured data analyzes a document with unstructured data and extracts attribute values from the unstructured data for one or more entities of the data system. Entity records with structured information are retrieved from the data system based on the extracted attribute values. Entity references for corresponding entities of the data system are constructed based on a comparison of the retrieved entity records and the extracted attribute values. The entity references are linked to the corresponding entities within the data system, with the entity references including extracted attributes from the unstructured data for corresponding linked entities.
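The retrieve, compare, and link steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the record layout, the `RECORDS` sample data, the helper names (`retrieve_candidates`, `score`, `build_reference`), and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical structured entity records standing in for the data system.
RECORDS = [
    {"id": 1, "name": "Jane Smith", "city": "Boston"},
    {"id": 2, "name": "John Smyth", "city": "Austin"},
]

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def retrieve_candidates(extracted: dict, records: list) -> list:
    """Return records sharing at least one attribute value (ignoring case)."""
    return [
        r for r in records
        if any(str(r.get(k, "")).lower() == v.lower() for k, v in extracted.items())
    ]

def score(extracted: dict, record: dict) -> float:
    """Average similarity across the extracted attributes."""
    sims = [similarity(v, str(record.get(k, ""))) for k, v in extracted.items()]
    return sum(sims) / len(sims)

def build_reference(extracted: dict, records: list, doc_id: str):
    """Compare extracted values with retrieved records; return a reference or None."""
    candidates = retrieve_candidates(extracted, records)
    if not candidates:
        return None
    best = max(candidates, key=lambda r: score(extracted, r))
    return {
        "entity_id": best["id"],
        "doc_id": doc_id,
        "attributes": dict(extracted),  # attributes taken from the unstructured text
        "score": score(extracted, best),
    }

# Attribute values as they might come out of a document (name slightly misspelled).
extracted = {"name": "Jane Smithe", "city": "Boston"}
ref = build_reference(extracted, RECORDS, doc_id="doc-17")

# Link the reference to its entity: entity id -> list of references.
links = {}
if ref is not None:
    links.setdefault(ref["entity_id"], []).append(ref)
```

Here the exact city match retrieves the candidate record, and the fuzzy name comparison tolerates the misspelling. A real system would likely use indexed retrieval and a tuned matching function rather than a linear scan.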
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method of supplementing structured information within a data system for entities based on unstructured data comprising: analyzing documents with unstructured data specifying two or more entities of the structured information and interactions between those two or more entities; identifying from the interactions within the unstructured data of the documents one or more relationships between entities of the structured information; extracting attribute values from the unstructured data for one or more entities of the structured information based on a comparison of the unstructured data with one or more dictionaries each including values for a corresponding attribute of an entity within the data system, wherein extracting attribute values from the unstructured data includes: generating tokens from the unstructured data and comparing the tokens to the values within the one or more dictionaries, wherein at least one value within a dictionary includes a plurality of tokens; retrieving entity records with structured information from the data system based on the extracted attribute values; constructing entity references for corresponding one or more entities of the data system based on a comparison of the retrieved entity records and the extracted attribute values; linking the entity references to the corresponding one or more entities within the data system to supplement the structured information for the corresponding one or more entities with information extracted from the unstructured data, wherein the entity references include extracted attributes from the unstructured data for corresponding linked entities; and linking entities of the structured information to each other within the structured information to indicate related entities based on the one or more relationships between those entities identified from the interactions specified within the unstructured data of the documents.
2. The computer-implemented method of claim 1, wherein the data system includes a master data management system and the documents are received from a content management system.
3. The computer-implemented method of claim 1, wherein extracting attribute values from the unstructured data further includes: extracting the attribute values from the unstructured data based on an attribute value within the unstructured data and a dictionary value including a common portion of an attribute value and being within a certain distance.
4. The computer-implemented method of claim 1, wherein an attribute of an entity includes a plurality of atomic attributes, and extracting attribute values from the unstructured data includes: extracting attribute values for each of the individual atomic attributes from the unstructured data.
5. The computer-implemented method of claim 1, wherein constructing entity references includes: constructing entity references for corresponding one or more entities of the data system based on a fuzzy match of the retrieved entity records and the extracted attribute values.
6. The computer-implemented method of claim 1, wherein linking the entity references includes: merging the entity references with records of the corresponding one or more entities within the data system.
7. The computer-implemented method of claim 1, wherein linking the entity references includes: inserting the entity references into one of the data system and an external data source based on a comparison of matching scores for the entity references with corresponding thresholds.
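The dictionary-based extraction recited in claim 1 (tokens compared against dictionary values, where a value may span several tokens) and the threshold comparison of claim 7 can be sketched as follows. This is a hedged illustration only: the `DICTIONARIES` contents, the longest-match-first strategy, the `max_len` window, the 0.85 threshold, and the choice to send high-scoring references to the data system are assumptions, not details taken from the patent.

```python
import re

# Hypothetical dictionaries: attribute name -> known values (lowercased).
DICTIONARIES = {
    "city": {"new york", "boston", "york"},
    "name": {"jane smith", "john"},
}

def tokenize(text: str) -> list:
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def extract_attributes(text: str, dictionaries: dict, max_len: int = 3) -> list:
    """Scan tokens left to right, preferring the longest dictionary match."""
    tokens = tokenize(text)
    found = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the widest window first so "new york" wins over "york".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            for attr, values in dictionaries.items():
                if phrase in values:
                    match = (attr, phrase, n)
                    break
            if match:
                break
        if match:
            found.append((match[0], match[1]))
            i += match[2]  # skip the tokens consumed by the match
        else:
            i += 1
    return found

def route_reference(score: float, threshold: float = 0.85) -> str:
    """Assumed policy: scores at or above the threshold go to the data system."""
    return "data_system" if score >= threshold else "external_source"

attributes = extract_attributes(
    "Jane Smith moved from Boston to New York.", DICTIONARIES
)
```

Longest-match-first is one simple way to handle multi-token values. Claim 3's distance-based matching would require an approximate comparison in place of the exact set lookup used here.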
Physics · mapped topic
Querying · CPC title
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
Document management systems · CPC title