Deep mining of enterprise data sources

US2024184793A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2024184793-A1
Application number: US-202218073314-A
Country: US
Kind code: A1
Filing date: Dec 1, 2022
Priority date: Dec 1, 2022
Publication date: Jun 6, 2024
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatus are disclosed for deep mining of data sources. A deep miner provides extended reach into available structured databases and/or unstructured data sources. Direct evaluation of columns for relevance to a client query provides a wider array of columns having potential relevance, compared to conventional tools relying on table evaluation. Direct column evaluation is extended to unstructured data sources. A broad interface extends the reach of search seamlessly across a wide range of structured and unstructured data sources. Disclosed techniques provide superior results with reduced computing resource utilization. Limitations of human expertise are overcome. Further efficiencies are achieved through caching, ranking of columns or results, search refinement, and customized responses.
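The contrast the abstract draws between table evaluation and direct column evaluation can be illustrated with a toy example. The sketch below is not the patented implementation: the catalog entries, the `Column` type, and the keyword-overlap relevance test are all assumptions made up for illustration. It only shows why scoring columns directly can surface candidates that a table-first filter would discard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    source: str       # hypothetical data source name
    table: str        # encompassing table or document section
    name: str         # column name
    description: str  # dictionary or metadata text for the column


# Hypothetical catalog entries, invented for illustration only.
CATALOG = [
    Column("erp_db", "customer_master", "name", "legal name of the customer"),
    Column("erp_db", "sales_orders", "sold_to", "customer placing the order"),
    Column("erp_db", "sales_orders", "ship_date", "date the order shipped"),
    Column("crm_docs", "meeting_notes", "attendees", "customer contacts present"),
]


def tokens(text: str) -> set[str]:
    """Lower-case word set; underscores are treated as word breaks."""
    return set(text.lower().replace("_", " ").split())


def column_is_relevant(col: Column, attrs: set[str]) -> bool:
    """Assumed relevance test: any search attribute appears in the column's name or description."""
    return bool(attrs & (tokens(col.name) | tokens(col.description)))


def table_first(attrs: set[str], catalog: list[Column]) -> list[Column]:
    """Table evaluation: a column is considered only if its table name matches first."""
    return [c for c in catalog if attrs & tokens(c.table) and column_is_relevant(c, attrs)]


def direct(attrs: set[str], catalog: list[Column]) -> list[Column]:
    """Direct column evaluation: every column is scored, regardless of its table."""
    return [c for c in catalog if column_is_relevant(c, attrs)]


attrs = {"customer"}
print([c.name for c in table_first(attrs, CATALOG)])  # ['name']
print([c.name for c in direct(attrs, CATALOG)])       # ['name', 'sold_to', 'attendees']
```

In this toy catalog the table-first filter finds one column, while direct evaluation also reaches `sold_to` and `attendees`, whose tables have names unrelated to the search attribute.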

First claim

Opening claim text (preview).

1. A computer-implemented method comprising: based one or more search attributes extracted from a client query, directly identifying one or more columns relevant to the client query, from a search universe comprising one or more data sources; executing respective database queries on the identified one or more columns to obtain results for the client query; determining respective hierarchy paths for the results; formulating a response to the client query based on the results and the respective hierarchy paths; and transmitting the response.

2. The computer-implemented method of claim 1, wherein the client query is a natural language query.

3. The computer-implemented method of claim 1, wherein the one or more data sources comprise a structured database and an unstructured data source.

4. The computer-implemented method of claim 1, wherein the search universe comprises an unstructured data source, and the method further comprises: prior to the directly identifying, learning a map of the unstructured data source; wherein a given column, among the one or more columns and within the unstructured data source, is directly identified using the map.

5. The computer-implemented method of claim 1, wherein: the search universe comprises a structured database having one or more dictionaries; and a given column, among the one or more columns and within the structured database, is directly identified using the one or more dictionaries.

6. The computer-implemented method of claim 1, further comprising: receiving the client query from a client; and determining the search universe from one or more certificates storing authorization of the client for the one or more data sources.

7. The computer-implemented method of claim 1, wherein a given one of the results is obtained from a given column of the identified one or more columns, and the determining the hierarchy path of the given result comprises tracing upwards from the given column to a corresponding data source within the search universe.

8. The computer-implemented method of claim 1, wherein the identified one or more columns comprise a plurality of columns and the method further comprises: assigning ranks to the columns according to a predetermined criterion; wherein the formulating is further based on the ranks.

9. The computer-implemented method of claim 1, wherein the database queries are first database queries, the results comprise a given result, and the method further comprises: generating a second database query targeting a column distinct from the identified columns, to search for additional results similar to the given result; and executing the second database query to obtain the additional results; wherein the response includes the additional results and additional hierarchy paths of the additional results.

10. The computer-implemented method of claim 1, further comprising: maintaining a cache for future client queries; and updating the cache with on one or more of: the client query, the search attributes, the identified columns, the database queries, the results, or the hierarchy paths; wherein the search universe comprises a plurality of data sources having distinct data formats, and the results from the plurality of data sources are stored in the cache in a common intermediate format.

11. The computer-implemented method of claim 10, wherein the client query is a first client query, the results are first results, a given one of the database queries targets a first data source, and the method further comprises: receiving a second client query overlapping the first client query; executing the given database query on an increment of the first data source to obtain incremental results; retrieving the first results and at least a first one of the hierarchy paths from the cache; and formulating a second response based on the first results, the incremental results, and the first hierarchy path.

12. The computer-implemented method of claim 1, wherein the formulating further comprises: associating ranks with the results; and for one or more highest ranking results among the results, incorporating one or more encompassing data structures into the response.

13-15. (canceled)

16. A system comprising: one or more hardware processors with memory coupled thereto; and computer-readable media storing instructions which, when executed by the one or more hardware processors, cause the one or more hardware processors to perform operations comprising: based one or more search attributes extracted from a client query, directly identifying one or more columns relevant to the client query; executing respective database queries on the identified one or more columns to obtain results for the client query; determining respective hierarchy paths for the results; formulating a response to the client query based on the results and the respective hierarchy paths; and transmitting the response.

17. A host computing environment, comprising: the system of claim 16; and one or more software applications configured to: provide a plurality of client queries, including the client query, to the system; and receive a corresponding plurality of responses, including the response, from the system; wherein the instructions are inaccessible from outside the host computing environment.

18. The system of claim 17, wherein a given column of the identified one or more columns is external to the host computing environment.

19. The system of claim 16 operated as a stand-alone service.

20. The system of claim 16, wherein the one or more columns are identified from a search universe comprising at least one structured database and at least one unstructured data source.

21. The computer-implemented method of claim 3, wherein the client query is associated with a search key and a search range, the identified one or more columns comprise a first column of the structured database and a second column of the unstructured data source, and further wherein: the first and second columns are identified as being relevant to the search key; the identifying the first column is performed using one or more dictionaries of the structured database; the identifying the second column is performed using metadata of the unstructured data source; the executing the database queries for the first and second columns are performed over the search range and lead respectively to first and second results of the results; the formulating the response comprises collating the first and second results with the hierarchy paths for the first and second results; and the response is transmitted to a client from which the client query was received.

22. The computer-implemented method of claim 21, wherein the directly identifying the first column bypasses evaluation of any data structures encompassing the first column, for relevance to the search key.

23. The computer-implemented method of claim 21, wherein the directly identifying the second column bypasses evaluation of any data structures encompassing the second column, for relevance to the search key.
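The steps recited in claim 1, together with the dictionary lookup of claim 5 and the upward path tracing of claim 7, can be sketched end to end against a small in-memory SQLite database. This is an illustrative sketch under stated assumptions, not the applicant's implementation: the demo tables, the source name `demo_db`, the substring match on column names, and the function names are all hypothetical. SQLite's own catalog (`sqlite_master` and `PRAGMA table_info`) stands in for the "one or more dictionaries" of a structured database.

```python
import sqlite3


def build_demo_source() -> sqlite3.Connection:
    """Create a hypothetical structured data source with two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER, customer TEXT, city TEXT);
        CREATE TABLE visits (id INTEGER, customer TEXT, site TEXT);
        INSERT INTO orders VALUES (1, 'Acme', 'Berlin'), (2, 'Globex', 'Paris');
        INSERT INTO visits VALUES (1, 'Acme', 'Plant 7');
        """
    )
    return conn


def identify_columns(conn: sqlite3.Connection, search_key: str) -> list[tuple[str, str]]:
    """Find (table, column) pairs whose column name contains the search key.

    Columns are matched directly from the database dictionary; table names
    are never scored for relevance.
    """
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    found = []
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        for info in conn.execute(f'PRAGMA table_info("{table}")'):
            if search_key in info[1].lower():
                found.append((table, info[1]))
    return found


def answer(conn: sqlite3.Connection, source_name: str, search_key: str, value: str) -> list[dict]:
    """Query each identified column and attach a hierarchy path to every result."""
    response = []
    for table, column in identify_columns(conn, search_key):
        sql = f'SELECT * FROM "{table}" WHERE "{column}" = ?'
        for record in conn.execute(sql, (value,)):
            # Trace upwards: column -> table -> data source.
            path = f"{source_name}/{table}/{column}"
            response.append({"path": path, "record": tuple(record)})
    return response


for item in answer(build_demo_source(), "demo_db", "customer", "Acme"):
    print(item["path"], item["record"])
```

Each entry in the response pairs a matching record with the path from its column up to the data source, so a client can see where a hit came from without knowing the schema in advance. Ranking, caching, certificates, and unstructured sources from the dependent claims are left out to keep the sketch short.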

Assignees

Inventors

Classifications

  • Data mining · CPC title

  • Presentation of query results · CPC title

  • Query processing support for facilitating data mining operations in structured databases · CPC title

  • Query execution · CPC title

  • in federated or virtual databases · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024184793A1 cover?
Methods and apparatus are disclosed for deep mining of data sources. A deep miner provides extended reach into available structured databases and/or unstructured data sources. Direct evaluation of columns for relevance to a client query provides a wider array of columns having potential relevance, compared to conventional tools relying on table evaluation. Direct column evaluation is extended t…
Who is the assignee on this patent?
SAP SE
What technology area does this patent fall under?
Primary CPC classification G06F16/2465. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 6, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).