Data retrieval using machine learning

US12067068B1 · US · B1

Patent metadata
Field	Value
Publication number	US-12067068-B1
Application number	US-202318309512-A
Country	US
Kind code	B1
Filing date	Apr 28, 2023
Priority date	Apr 28, 2023
Publication date	Aug 20, 2024
Grant date	Aug 20, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides techniques for data retrieval using machine learning. One example method includes receiving a plurality of training episodes associated with different environments, wherein each training episode of the plurality of training episodes includes a sequence of states, computing, based on the plurality of training episodes, total counts of a plurality of values in the states, initializing, for each state of the sequence of states in each training episode of the plurality of training episodes, a reward based on the total counts of the plurality of values, and training a reinforcement learning agent using the rewards.
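The counting and reward-initialization steps in the abstract can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the publication text shown here does not give a formula, so the rule used below (sum of `1 / total_count` over the dictionary values present in a state, with counts taken across all episodes) is an assumption, as are the function names `total_counts` and `initial_rewards`, the whitespace tokenization, and the sample episodes. The keywords are the examples named in claim 5.

```python
from collections import Counter

# Example dictionary values; claim 5 names these keywords as examples.
DICTIONARY = ("welcome", "login", "transaction", "status")


def total_counts(episodes, dictionary=DICTIONARY):
    """Count each dictionary value across all states of all episodes."""
    counts = Counter({value: 0 for value in dictionary})
    for episode in episodes:
        for state in episode:
            tokens = state.lower().split()
            for value in dictionary:
                counts[value] += tokens.count(value)
    return counts


def initial_rewards(episode, counts, dictionary=DICTIONARY):
    """Assumed rule: sum 1/total_count over dictionary values present in a state."""
    rewards = []
    for state in episode:
        tokens = set(state.lower().split())
        rewards.append(
            sum(1.0 / counts[v] for v in dictionary if v in tokens and counts[v] > 0)
        )
    return rewards


# Hypothetical episodes: each is a sequence of states, here plain strings.
episodes = [
    ["welcome page", "login form", "transaction list"],
    ["welcome back", "status page"],
]
counts = total_counts(episodes)
rewards = [initial_rewards(episode, counts) for episode in episodes]
```

Under this assumed rule, a value that appears in many states (here "welcome") contributes less to a state's initial reward than a value that appears once.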

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving, at a dictionary builder, a plurality of training episodes associated with different environments, wherein each training episode of the plurality of training episodes includes a sequence of states; identifying, by the dictionary builder, a set of dictionary values in the training episodes, the set of dictionary values being specified based on a task and being predetermined for the task; computing, based on the plurality of training episodes, for each dictionary value of the set of dictionary values, a total count of the dictionary value in all states of the plurality of training episodes; initializing a reward for each state in the sequence of states based on the dictionary values present in the state and, for each dictionary value from the set of dictionary values present in the state, a total count of the dictionary value in the sequence of states; and training a reinforcement learning agent to retrieve requested data by providing the training episodes and the initialized reward for each state of the sequence of states as inputs to a reinforcement learning framework, wherein the training comprises: using a reward function to learn intermediate rewards for intermediate states of the sequence of states based on the initialized reward for each state of the sequence of states, wherein the reward function is smoothed using a filter function; and suppressing one or more actions available to perform in the state based on reward values determined using the reward function for the one or more actions such that the one or more actions are ignored for purposes of reward calculation.

2. The method of claim 1, further comprising encoding each state of the sequence of states in each training episode using a neural network.

3. The method of claim 2, wherein the neural network comprises one or more of a convolutional neural network (CNN) or Bidirectional Encoder Representations from Transformers (BERT).

4. The method of claim 1, wherein each dictionary value of the set of dictionary values comprises a keyword and a count of the keyword in one or more training episodes.

5. The method of claim 4, wherein the keyword comprises a transaction code associated with a transaction or a selection from: “welcome,” “login,” “transaction,” or “status”.

6. A system, comprising: a memory including computer executable instructions; and a processor configured to execute the computer executable instructions and cause the system to: receive a plurality of training episodes associated with different environments, wherein each training episode of the plurality of training episodes includes a sequence of states; compute, based on the plurality of training episodes, total counts of a plurality of values in the states corresponding to a plurality of dictionary values of a dictionary; initialize, for each state of the sequence of states in each training episode of the plurality of training episodes, a reward based on the dictionary and on the total counts of the plurality of dictionary values; and train a reinforcement learning agent using the rewards; receive, at a dictionary builder, a plurality of training episodes associated with different environments, wherein each training episode of the plurality of training episodes includes a sequence of states; identify, by the dictionary builder, a set of dictionary values in the training episodes, the set of dictionary values being specified based on a task and being predetermined for the task; compute, based on the plurality of training episodes, for each dictionary value of the set of dictionary values, a total count of the dictionary value in all states of the plurality of training episodes; initialize a reward for each state in the sequence of states based on the dictionary values present in the state and, for each dictionary value from the set of dictionary values present in the state, a total count of the dictionary value in the sequence of states; and train a reinforcement learning agent to retrieve requested data using by providing the training episodes and the initialized reward for each state of the sequence of states as inputs to a reinforcement learning framework, wherein the training comprises: using a reward function to learn intermediate rewards for intermediate states of the sequence of states based on the initialized reward for each state of the sequence of states, wherein the reward function is smoothed using a filter function; and suppressing one or more actions available to perform in the state based on reward values determined using the reward function for the one or more actions such that the one or more actions are ignored for purposes of reward calculation.

7. The system of claim 6, wherein the computer executable instructions further cause the system to encode each state of the sequence of states in each training episode using a neural network.

8. The system of claim 7, wherein the neural network comprises one or more of a convolutional neural network (CNN) or Bidirectional Encoder Representations from Transformers (BERT).

9. A non-transitory computer readable medium comprising instructions to be executed in a computer system, wherein the instructions when executed in the computer system perform a method on a computing device, comprising: receiving, at a dictionary builder, a plurality of training episodes associated with different environments, wherein each training episode of the plurality of training episodes includes a sequence of states; identifying, by the dictionary builder, a set of dictionary values in the training episodes, the set of dictionary values being specified based on a task and being predetermined for the task; computing, based on the plurality of training episodes, for each dictionary value of the set of dictionary values, a total count of the dictionary value in all states of the plurality of training episodes; initializing a reward for each state in the sequence of states based on the dictionary values present in the state and, for each dictionary value from the set of dictionary values present in the state, a total count of the dictionary value in the sequence of states; and training a reinforcement learning agent to retrieve requested data by providing the training episodes and the initialized reward for each state of the sequence of states as inputs to a reinforcement learning framework, wherein the training comprises: using a reward function to learn intermediate rewards for intermediate states of the sequence of states based on the initialized reward for each state of the sequence of states, wherein the reward function is smoothed using a filter function; and suppressing one or more actions available to perform in the state based on reward values determined using the reward function for the one or more actions such that the one or more actions are ignored for purposes of reward calculation.

10. The non-transitory computer readable medium of claim 9, wherein the method further comprises encoding each state of the sequence of states in each training episode using a neural network.

11. The non-transitory computer readable medium of claim 10, wherein the neural network comprises one or more of a convolutional neural network (CNN) or Bidirectional Encoder Representations from Transformers (BERT).
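The two training-time steps recited in the independent claims, smoothing the reward function with a filter function and suppressing actions based on their reward values, might look roughly like the sketch below. The claims do not specify the filter or the suppression criterion, so the three-tap weighted moving average, the edge padding, the threshold rule, and the names `smooth_rewards` and `suppress_actions` are all assumptions made for illustration.

```python
def smooth_rewards(rewards, kernel=(0.25, 0.5, 0.25)):
    """Smooth a per-state reward sequence with a small filter.

    Assumed filter: a weighted moving average, with the sequence padded
    at both ends by repeating the first and last reward.
    """
    if not rewards:
        return []
    half = len(kernel) // 2
    padded = [rewards[0]] * half + list(rewards) + [rewards[-1]] * half
    return [
        sum(weight * padded[i + j] for j, weight in enumerate(kernel))
        for i in range(len(rewards))
    ]


def suppress_actions(action_rewards, threshold):
    """Assumed rule: keep only actions whose reward value reaches the threshold."""
    return {
        action: reward
        for action, reward in action_rewards.items()
        if reward >= threshold
    }


# A sparse reward on the third state is spread to its neighbours.
smoothed = smooth_rewards([0.0, 0.0, 1.0, 0.0])

# Hypothetical action names and reward values; low-reward actions are dropped.
allowed = suppress_actions({"click_login": 0.5, "open_help": 0.05}, threshold=0.1)
```

Smoothing gives intermediate states a nonzero signal even when only one state in the sequence carries an initial reward, and the suppressed actions simply do not take part in later reward calculations.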

Assignees

Intuit Inc

Inventors

Margolin Itay

Classifications

G06F16/954 · Primary
Navigation, e.g. using categorised browsing · CPC title

Patent family

Related publications grouped by family.

View patent family 92305694

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12067068B1 cover?: The present disclosure provides techniques for data retrieval using machine learning. One example method includes receiving a plurality of training episodes associated with different environments, wherein each training episode of the plurality of training episodes includes a sequence of states, computing, based on the plurality of training episodes, total counts of a plurality of values in the …
Who is the assignee on this patent?: Intuit Inc
What technology area does this patent fall under?: Primary CPC classification G06F16/954. Mapped technology areas include Physics.
When was this patent published?: Publication date Aug 20, 2024 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

A crawler of web automation scripts

Device, computer program and computer-implemented method for machine learning

Automatic hyperlinking of documents

Systems providing a learning controller utilizing indexed memory and methods thereto

Automatically constructing training sets for electronic sentiment analysis
