Methods and systems for automatically identifying keywords of very large text datasets
US-2018239741-A1 · Aug 23, 2018 · US
US11314784B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11314784-B2 |
| Application number | US-201916696916-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 26, 2019 |
| Priority date | Nov 26, 2019 |
| Publication date | Apr 26, 2022 |
| Grant date | Apr 26, 2022 |
Official abstract text for this publication.
Relating data in various distributed data sources for use in data analysis is described. The data sources are generally related by first generating a keyword model for a plurality of data sources, which includes a plurality of weighted keywords, and providing a visual representation of the keyword model, such as a word cloud, to a user. The user interacts with the visual representation to modify, update, and select various aspects of the visual representation. The user also identifies keywords and data sources of interest such that a plurality of relational models are generated based on the user interest. Relating the data sources also includes providing the plurality of relational models to the user, receiving a user selection of the plurality of relational models, and generating a combined dataset model which relates one or more of the data sources according to the selected relational models.
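The abstract's first step, a keyword model made of weighted keywords, can be illustrated with a small sketch. The abstract does not say how keywords are weighted, so the smoothed TF-IDF score below is an assumption chosen only for illustration, and the function name `build_keyword_model` and the sample source names are hypothetical.

```python
import math
import re
from collections import Counter


def build_keyword_model(sources: dict[str, str]) -> list[tuple[str, float]]:
    """Return (keyword, score) pairs sorted by descending score.

    ASSUMPTION: the scoring is a smoothed TF-IDF stand-in; the patent
    abstract does not specify a particular weighting method.
    """
    # Tokenize each data source into lowercase alphabetic words.
    tokenized = {
        name: re.findall(r"[a-z]+", text.lower())
        for name, text in sources.items()
    }
    n_sources = len(tokenized)

    # Count in how many sources each word appears.
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    # Sum term frequency times smoothed inverse document frequency.
    scores = Counter()
    for tokens in tokenized.values():
        for word, count in Counter(tokens).items():
            idf = math.log((1 + n_sources) / (1 + doc_freq[word])) + 1.0
            scores[word] += count * idf

    # Sort by score (highest first), breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    demo_sources = {  # hypothetical data sources
        "orders": "customer order customer invoice",
        "crm": "customer contact email",
    }
    for keyword, score in build_keyword_model(demo_sources):
        print(f"{keyword}: {score:.3f}")
```

The sorted list returned here corresponds to the "sorted list" rendering mentioned in the claims; a word cloud would map the same scores to font sizes.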
Opening claim text (preview).
What is claimed is:
1. A method comprising: generating a keyword model for a plurality of data sources comprising a plurality of weighted keywords sorted according to a weighted score of each of the plurality of weighted keywords; providing a visual representation of the keyword model to a user; receiving a user query identifying keywords and data sources of interest to the user; generating a plurality of relational models for the plurality of data sources using the plurality of weighted keywords and the user query, wherein each relational model of the plurality of relational models comprises a relational link between at least two data sources of the plurality of data sources; providing the plurality of relational models to the user; receiving a relational model update from the user; updating the plurality of relational models based on the relational model update; receiving a user selection input comprising a selection of at least one of the plurality of relational models; and generating a combined dataset model from the plurality of data sources using the user selection input by associating the plurality of data sources according to at least a relational link in the at least one of the plurality of relational models in the user selection input.
2. The method of claim 1, further comprising: upon providing the keyword model to the user, receiving a keyword model update from the user; removing one or more weighted keywords from the plurality of weighted keywords in the keywords model based on the keyword model update; and updating a weighting method for the removed weighted keywords.
3. The method of claim 1, wherein providing the keyword model to the user comprises at least one of: rendering the keyword model as a word cloud to the user, wherein the word cloud comprises the plurality of weighted keyword, wherein the word cloud comprises a visual distinction between the plurality of weighted keywords based on relative assigned scores of the plurality of weighted keywords; and rendering the keyword model as a sorted list to the user, wherein the sorted listed comprises the plurality of weighted keyword sorted according to the relative assigned scores; and wherein receiving the user query comprises at least one of: receiving a word cloud selection from the user via the word cloud; receiving a sorted list selection from the user via the sorted list; and receiving a user query string input.
4. The method of claim 1, wherein generating the plurality of relational models comprises: determining a set of related keywords from the user query; identifying a subset of data sources associated with the set of related keywords from the plurality of data sources; and generating the plurality of relational models between data sources in the subset of data sources based on the set of related keywords and the plurality of weighted keywords.
5. The method of claim 4, wherein providing the plurality of relational models to the user comprises: rendering a visual representation of each of data source in the subset of data sources; and rendering a plurality of visual links between the visual representations based on the generated plurality of relational models.
6. The method of claim 5, wherein the relational model update comprises at least one of: removing a data source from the subset of data sources; adding a data source to the subset of data sources; removing a relational model between data sources in the subset of data sources; and adding a relational model between data sources in the subset of data sources.
7. The method of claim 1, wherein generating the plurality of relational models comprises: generating a primary set of relational models based on the plurality of weighted keywords and the user query; and generating one or more alternate relational models based on the plurality of weighted keywords and the user query.
8. A system, comprising: a processor; and a memory comprising instructions which, when executed on the processor, performs an operation, the operation comprising: generating a keyword model for a plurality of data sources comprising a plurality of weighted keywords sorted according to a weighted score of each of the plurality of weighted keywords; providing a visual representation of the keyword model to a user; receiving a user query identifying keywords and data sources of interest to the user; generating a plurality of relational models for the plurality of data sources using the plurality of weighted keywords and the user query, wherein each relational model of the plurality of relational models comprises a relational link between at least two data sources of the plurality of data sources; providing the plurality of relational models to the user; receiving a relational model update from the user; updating the plurality of relational models based on the relational model update; receiving a user selection input comprising a selection of at least one of the plurality of relational models; and generating a combined dataset model from the plurality of data sources using the user selection input by associating the plurality of data sources according to at least a relational link in the at least one of the plurality of relational models in the user selection input.
9. The system of claim 8, wherein the operation further comprises: upon providing the keyword model to the user, receiving a keyword model update from the user; removing one or more weighted keywords from the plurality of weighted keywords in the keywords model based on the keyword model update; and updating a weighting method for the removed weighted keywords.
10. The system of claim 8, wherein providing the keyword model to the user comprises at least one of: rendering the keyword model as a word cloud to the user, wherein the word cloud comprises the plurality of weighted keyword, wherein the word cloud comprises a visual distinction between the plurality of weighted keywords based on relative assigned scores of the plurality of weighted keywords; and rendering the keyword model as a sorted list to the user, wherein the sorted listed comprises the plurality of weighted keyword sorted according to the relative assigned scores; and wherein receiving the user query comprises at least one of: receiving a word cloud selection from the user via the word cloud; receiving a sorted list selection from the user via the sorted list; and receiving a user query string input.
11. The system of claim 8, wherein generating the plurality of relational models comprises: determining a set of related keywords from the user query; identifying a subset of data sources associated with the set of related keywords from the plurality of data sources; and generating the plurality of relational models between data sources in the subset of data sources based on the set of related keywords and the plurality of weighted keywords.
12. The system of claim 11, wherein providing the plurality of relational models to the user comprises: rendering a visual representation of each of data source in the subset of data sources; and rendering a plurality of visual links between the visual representations based on the generated plurality of relational models.
13. The system of claim 12, wherein the relational model update comprises at least one of: removing a data source from the subset of data sources; adding a data source to the subset of data sources; removing a relational model between data sources in the subset of data sources; and adding a relational model between data sources in the subset of data sources.
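The three steps recited in claims 4 and 11 (determine related keywords, identify a subset of data sources, generate relational models between them), followed by the combined dataset model of claims 1 and 8, might be sketched as follows. This is only one possible reading: the claims do not define how keywords are matched or how a relational link is represented, so the shared-keyword intersection, the `RelationalModel` class, and both function names are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class RelationalModel:
    """Hypothetical representation of one relational link between two sources."""
    left: str
    right: str
    shared_keywords: frozenset


def generate_relational_models(source_keywords, weighted_keywords, query_keywords):
    """Sketch of claim 4: related keywords -> source subset -> relational models."""
    # Step 1 (ASSUMPTION): related keywords are the query keywords that
    # also appear in the weighted keyword model.
    related = {k for k in query_keywords if k in weighted_keywords}

    # Step 2: keep only data sources associated with a related keyword.
    subset = sorted(
        name for name, kws in source_keywords.items() if kws & related
    )

    # Step 3 (ASSUMPTION): link two sources when they share a related keyword.
    models = []
    for left, right in combinations(subset, 2):
        shared = source_keywords[left] & source_keywords[right] & related
        if shared:
            models.append(RelationalModel(left, right, frozenset(shared)))

    # Rank links by the total weight of the keywords they share.
    models.sort(key=lambda m: -sum(weighted_keywords[k] for k in m.shared_keywords))
    return models


def combine(selected_models):
    """Build a combined dataset model as an adjacency map of linked sources."""
    combined = {}
    for model in selected_models:
        combined.setdefault(model.left, set()).add(model.right)
        combined.setdefault(model.right, set()).add(model.left)
    return combined
```

In this sketch the user's relational model update (claims 6 and 13) would amount to adding or removing entries in the list returned by `generate_relational_models` before passing the selected ones to `combine`.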
Natural language query formulation · CPC title
Interactive query statement specification based on a database schema · CPC title
with adaptation to user needs · CPC title
Visualization; Browsing · CPC title
Presentation of query results · CPC title