US2024256433A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024256433-A1 |
| Application number | US-202418418835-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 22, 2024 |
| Priority date | Jan 25, 2023 |
| Publication date | Aug 1, 2024 |
| Grant date | — |
Official abstract text for this publication.
Modern reinforcement learning systems produce many high-quality policies throughout the learning process. However, to choose which policy to actually deploy in the real world, they must be tested under an intractable number of environmental conditions. A process, called Robust Population Optimization for a Small Set of Test cases (“RPOSST”), can select a small set of test cases from a larger pool based on a relatively small number of sample evaluations. RPOSST can treat the test case selection problem as a two-player game and can optimize a solution with provable k-of-N robustness, bounding the error relative to a test that used all the test cases in the pool. Empirical results demonstrate that RPOSST finds a small set of test cases that identify high quality policies in a toy one-shot game, poker datasets, and a high-fidelity racing simulator.
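The k-of-N idea in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: the squared-error loss, the uniform target over the full pool, and the names `k_of_n_loss`, `results`, `test_idx`, and `weights` are assumptions made for this example only. The function samples N policies, measures how far a small weighted test disagrees with a test over the whole pool, and averages the k largest disagreements.

```python
import random


def k_of_n_loss(results, test_idx, weights, n, k, rng):
    """Average the k largest errors over n sampled policies.

    results[p][t] is the score of policy p on test case t (hypothetical layout).
    test_idx and weights describe the small test being judged.
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    num_tests = len(results[0])
    # Sample N policy rows with replacement (assumed uncertainty distribution: uniform).
    sampled = [rng.choice(results) for _ in range(n)]
    errors = []
    for row in sampled:
        # Assumed reference: uniform average over every test case in the pool.
        full_score = sum(row) / num_tests
        small_score = sum(w * row[t] for t, w in zip(test_idx, weights))
        errors.append((small_score - full_score) ** 2)
    worst_k = sorted(errors, reverse=True)[:k]
    return sum(worst_k) / k


# Two policies scored on four test cases (illustrative numbers only).
example_results = [[1.0, 0.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
loss = k_of_n_loss(example_results, (0, 1), (0.5, 0.5), n=10, k=3, rng=random.Random(0))
```

Setting k equal to N gives a plain average over the sampled policies, while k equal to 1 keeps only the single worst case, so k controls how pessimistic the measure is.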
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for determining a subset of test cases, selected from a set of test cases, that identify candidate deployment policies from a set of policies, comprising: evaluating each tuning policy of a subset of tuning policies, from the set of policies, with each test case, from the set of test cases, to generate a result matrix of test case results; utilizing a two-player game formulation for determining a loss function for each of m test cases sampled on each of N sampled policies from the set of policies; and selecting the subset of test cases based on a plurality of rounds of the two-player game formulation, wherein the subset of test cases are operable for determining the candidate deployment policies.
2. The computer-implemented method of claim 1, wherein the two-player formulation includes: choosing an m-tuple of test cases, from the set of test cases, and weights for each of the m-tuple of test cases; sampling N policies to test and target distributions for each of the N policies from an uncertainty distribution; choosing the k worst policies and respective target distributions that maximize the loss function; and sampling one of the k worst policies and respective target distribution to provide a payoff.
3. The computer-implemented method of claim 1, wherein the selected test case and the tuning policy choice are performed simultaneously.
4. The computer-implemented method of claim 1, wherein the selected test case and the tuning policy choice are performed in sequence.
5. The computer-implemented method of claim 1, wherein the selected test case and the tuning policy choice are performed in sequence and hyperparameter values and constraints are applied, resulting in deterministic behavior.
6. The computer-implemented method of claim 1, wherein the tuning policies are selected from a reinforcement learning process.
7. The computer-implemented method of claim 1, wherein the tuning policies are selected to include a collection of skilled and unskilled policies with random variations.
8. The computer-implemented method of claim 1, wherein the tuning policies are selected with architectural and algorithmic similarities to future development candidate policies.
9. The computer-implemented method of claim 2, wherein the target distribution is based on a fixed uniform distribution over the m-tuple of test cases.
10. The computer-implemented method of claim 2, wherein test case selection is robust against differences between the tuning policies and the candidate deployment policies.
11. The computer-implemented method of claim 2, wherein test case selection is robust against differences between the target distribution used during training and an actual target distribution.
12. The computer-implemented method of claim 1, further comprising determining a weighting for each of the subset of test cases.
13. The computer-implemented method of claim 12, wherein the weighting on each of the subset of test cases is determined by expert guidance.
14. The computer-implemented method of claim 1, wherein the candidate deployment policies are policies for an artificial agent in a competitive racing simulation.
15. A method for selecting policies to use in a racing simulation, comprising: accessing, by a development server, a set of candidate policies, where each policy is a collection of data stored in a policy database and represents at least one behavior for an agent operating a car in a racing simulation; selecting, by the development server, one or more tuning policies from the set of candidate policies; accessing, by the development server, a set of candidate test cases, where each test case is a collection of data stored in a test case database and represents at least one condition in an environment in the racing simulation; selecting, by the development server, one or more test cases from the candidate cases; first reviewing, by the development server, a performance of using the selected test cases with the tuning policies, where the first reviewing includes iteratively using machine learning; selecting, by the development server, one or more test cases as application test cases based on results of the first reviewing; second reviewing, by the development server, performance of using the application test cases with one or more candidate policies; and selecting, by the development server, one or more policies as deployment policies based on the results of the second reviewing.
16. The method of claim 15, where the first reviewing includes using a protagonist and an adversary.
17. The method of claim 15, further comprising sending the deployment policies to a game system.
18. A computer-implemented method for identifying candidate deployment policies from a set of policies, comprising: evaluating each tuning policy of a subset of tuning policies, from the set of policies, with each test case, from a set of test cases, to generate a result matrix of test case results; utilizing a two-player game formulation for determining a loss function for each of m test cases sampled on each of N sampled policies from the set of policies; selecting a subset of test cases based on a plurality of rounds of the two-player game formulation; and sampling the set of policies on the subset of test cases to determine the candidate deployment policies.
19. The computer-implemented method of claim 18, further comprising determining a weighting for each of the subset of test cases.
20. The computer-implemented method of claim 18, wherein the tuning policies are selected to include a collection of skilled and unskilled policies with random variations.
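The two-stage flow recited in claims 15 and 18 (pick test cases using tuning policies, then judge candidate policies on the picked test cases) can be sketched as follows. This is a simplified stand-in under stated assumptions: exhaustive search over m-tuples replaces the iterative two-player game, test cases are weighted equally, the error is a squared gap against the full-pool average, and all function and variable names are hypothetical.

```python
from itertools import combinations


def subset_error(row, test_idx):
    """Squared gap between a policy's mean score on a subset and on the full pool."""
    full = sum(row) / len(row)
    small = sum(row[t] for t in test_idx) / len(test_idx)
    return (small - full) ** 2


def worst_k_mean(tuning_results, test_idx, k):
    """Mean of the k largest subset errors across all tuning policies."""
    errors = sorted((subset_error(r, test_idx) for r in tuning_results), reverse=True)
    return sum(errors[:k]) / k


def select_test_cases(tuning_results, m, k):
    """Stage 1: brute-force the m-tuple of test cases with the smallest worst-k error.

    tuning_results is the result matrix: one row per tuning policy,
    one column per test case. Ties go to the first tuple in iteration order.
    """
    num_tests = len(tuning_results[0])
    return min(
        combinations(range(num_tests), m),
        key=lambda idx: worst_k_mean(tuning_results, idx, k),
    )


def rank_policies(candidate_results, test_idx):
    """Stage 2: order candidate policy names by mean score on the selected tests."""
    def score(name):
        row = candidate_results[name]
        return sum(row[t] for t in test_idx) / len(test_idx)
    return sorted(candidate_results, key=score, reverse=True)


# Illustrative data only: 3 tuning policies and 2 candidates on 4 test cases.
tuning = [[1.0, 0.0, 0.5, 1.0], [0.5, 0.5, 0.0, 0.0], [0.0, 0.5, 1.0, 0.0]]
candidates = {"strong": [0.9, 0.8, 0.9, 0.7], "weak": [0.1, 0.2, 0.3, 0.1]}
chosen = select_test_cases(tuning, m=2, k=2)
ranking = rank_policies(candidates, chosen)
```

Exhaustive search is only practical for small pools, since the number of m-tuples grows combinatorially with the pool size; it is used here purely to keep the example short.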
Probabilistic graphical models, e.g. probabilistic networks · CPC title
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
for test design, e.g. generating new test cases · CPC title
Machine learning · CPC title