Natural language database interface

US2024346021A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2024346021-A1
Application number: US-202318301739-A
Country: US
Kind code: A1
Filing date: Apr 17, 2023
Priority date: Apr 17, 2023
Publication date: Oct 17, 2024
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides an approach for training a machine learning model. Embodiments include receiving text comprising a natural language request. Embodiments include providing one or more inputs to a source machine learning model based on the text, wherein the source machine learning model has been trained using source training data corresponding to a plurality of databases. Embodiments include receiving, from the source machine learning model in response to the one or more inputs, a database query in a syntax corresponding to a target database. Embodiments include generating training data for a target machine learning model based on the text and the database query received from the source machine learning model, wherein the target machine learning model has been trained using a smaller amount of training data than the source training data that was used to train the source machine learning model.
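The data-generation step described in the abstract can be pictured as pairing each natural language request with the query the source model returns for it. The following is a minimal sketch under stated assumptions, not the implementation disclosed in the patent: `TrainingExample`, `build_training_data`, `fake_source_model`, and the optional `accept` callback (standing in for the user feedback mentioned in claim 2) are hypothetical names, and the query strings are made-up syntax for an unnamed target database.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TrainingExample:
    text: str   # the natural language request
    query: str  # the database query returned by the source model


def build_training_data(
    requests: List[str],
    source_model: Callable[[str], str],
    accept: Optional[Callable[[str, str], bool]] = None,
) -> List[TrainingExample]:
    """Pair each request with the source model's query.

    `accept` is a hypothetical hook for user feedback: when it is given
    and returns False for a (text, query) pair, that pair is dropped.
    """
    examples: List[TrainingExample] = []
    for text in requests:
        query = source_model(text)
        if accept is not None and not accept(text, query):
            continue
        examples.append(TrainingExample(text=text, query=query))
    return examples


def fake_source_model(text: str) -> str:
    """Stand-in for a large source model; the returned syntax is invented."""
    if "error" in text.lower():
        return 'level = "error"'
    return "*"


if __name__ == "__main__":
    data = build_training_data(
        ["show error logs", "list everything"], fake_source_model
    )
    for example in data:
        print(example)
```

The resulting list of pairs is what a smaller target model would then be trained on; the training step itself depends on the model family and is left out of this sketch.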

First claim

Opening claim text (preview).

We claim:

1. A method of training a machine learning model, the method comprising: receiving text comprising a natural language request; providing one or more inputs to a source machine learning model based on the text, wherein the source machine learning model has been trained using source training data corresponding to a plurality of databases; receiving, from the source machine learning model in response to the one or more inputs, a database query in a syntax corresponding to a target database; generating training data for a target machine learning model based on the text and the database query received from the source machine learning model, wherein the target machine learning model has been trained using a smaller amount of training data than the source training data that was used to train the source machine learning model; and training the target machine learning model based on the training data.

2. The method of claim 1, wherein the generating of the training data is further based on user feedback with respect to the database query received from the source machine learning model.

3. The method of claim 1, wherein the target machine learning model is used to determine a new query in the syntax corresponding to the target database based on new text comprising a new natural language request.

4. The method of claim 1, wherein the source machine learning model was trained by a third party, and wherein internal logic of the source machine learning model is not available for analysis with respect to the database query received from the source machine learning model.

5. The method of claim 4, wherein respective internal logic of the target machine learning model is available for analysis.

6. The method of claim 1, wherein the source machine learning model utilizes a larger amount of physical computing resources than the target machine learning model.

7. The method of claim 6, further comprising determining whether to use the source machine learning model or the target machine learning model for subsequently-received text based on a comparison of the subsequently-received text with the text.

8. The method of claim 7, wherein the comparison is based on generating embeddings of the subsequently-received text and the text.

9. The method of claim 1, further comprising determining whether to stop using the source machine learning model based on a determined accuracy of the target machine learning model.

10. A system for training a machine learning model, comprising: at least one memory; and at least one processor coupled to the at least one memory, the at least one processor and the at least one memory configured to: receive text comprising a natural language request; provide one or more inputs to a source machine learning model based on the text, wherein the source machine learning model has been trained using source training data corresponding to a plurality of databases; receive, from the source machine learning model in response to the one or more inputs, a database query in a syntax corresponding to a target database; generate training data for a target machine learning model based on the text and the database query received from the source machine learning model, wherein the target machine learning model has been trained using a smaller amount of training data than the source training data that was used to train the source machine learning model; and train the target machine learning model based on the training data.

11. The system of claim 10, wherein the generating of the training data is further based on user feedback with respect to the database query received from the source machine learning model.

12. The system of claim 10, wherein the target machine learning model is used to determine a new query in the syntax corresponding to the target database based on new text comprising a new natural language request.

13. The system of claim 10, wherein the source machine learning model was trained by a third party, and wherein internal logic of the source machine learning model is not available for analysis with respect to the database query received from the source machine learning model.

14. The system of claim 13, wherein respective internal logic of the target machine learning model is available for analysis.

15. The system of claim 10, wherein the source machine learning model utilizes a larger amount of physical computing resources than the target machine learning model.

16. The system of claim 15, wherein the at least one processor and the at least one memory are further configured to determine whether to use the source machine learning model or the target machine learning model for subsequently-received text based on a comparison of the subsequently-received text with the text.

17. The system of claim 16, wherein the comparison is based on generating embeddings of the subsequently-received text and the text.

18. The system of claim 10, wherein the at least one processor and the at least one memory are further configured to determine whether to stop using the source machine learning model based on a determined accuracy of the target machine learning model.

19. A non-transitory computer readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to: receive text comprising a natural language request; provide one or more inputs to a source machine learning model based on the text, wherein the source machine learning model has been trained using source training data corresponding to a plurality of databases; receive, from the source machine learning model in response to the one or more inputs, a database query in a syntax corresponding to a target database; generate training data for a target machine learning model based on the text and the database query received from the source machine learning model, wherein the target machine learning model has been trained using a smaller amount of training data than the source training data that was used to train the source machine learning model; and train the target machine learning model based on the training data.

20. The non-transitory computer readable medium of claim 19, wherein the generating of the training data is further based on user feedback with respect to the database query received from the source machine learning model.
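Claims 7 to 9 describe choosing between the two models by comparing embeddings of new text with earlier text, and retiring the source model once the target model is accurate enough. The sketch below is one possible reading, offered under assumptions rather than as the claimed method: the bag-of-words `embed` function stands in for a learned embedding model, the rule "route to the target model when the new text is similar to text it was trained on" is an assumed direction for the comparison, and the names `choose_model`, `should_retire_source`, and both threshold values are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Dict[str, int]:
    """Toy bag-of-words vector; a real system would likely use a learned encoder."""
    return Counter(text.lower().split())


def cosine(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def choose_model(new_text: str, seen_texts: List[str], threshold: float = 0.5) -> str:
    """Return "target" if new_text resembles any earlier text, else "source".

    Assumption: the smaller target model is trusted only for requests
    close to the text used to generate its training data.
    """
    new_vec = embed(new_text)
    best = max((cosine(new_vec, embed(t)) for t in seen_texts), default=0.0)
    return "target" if best >= threshold else "source"


def should_retire_source(target_accuracy: float, min_accuracy: float = 0.95) -> bool:
    """Assumed stop rule: drop the source model once the target is accurate enough."""
    return target_accuracy >= min_accuracy


if __name__ == "__main__":
    seen = ["show error logs from last hour"]
    print(choose_model("show error logs from last day", seen))
    print(choose_model("count users by region", seen))
    print(should_retire_source(0.97))
```

Because claim 6 states that the source model uses more physical computing resources than the target model, a routing rule of this kind would send only unfamiliar requests to the more expensive model.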

Assignees

Vmware Inc

Inventors

Classifications

  • Translation of natural language queries to structured queries · CPC title

  • G06F40/20 · Primary

    Natural language analysis (semantic analysis of natural language G06F40/30) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024346021A1 cover?
The present disclosure provides an approach for training a machine learning model. Embodiments include receiving text comprising a natural language request. Embodiments include providing one or more inputs to a source machine learning model based on the text, wherein the source machine learning model has been trained using source training data corresponding to a plurality of databases. Embodime…
Who is the assignee on this patent?
Vmware Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/24522. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 17, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).