US2025086471A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025086471-A1 |
| Application number | US-202418733226-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 4, 2024 |
| Priority date | Sep 11, 2023 |
| Publication date | Mar 13, 2025 |
| Grant date | — |
Official abstract text for this publication.
Systems and methods for generating a small language model are provided. In particular, a computing device may obtain a general dataset including a plurality of general data, annotate a subset of the general dataset based on one or more classifier metrics indicative of a quality of the general dataset, train a classifier based on the annotated subset of the general dataset and the one or more classifier metrics, analyze each general data of the general dataset to determine a score for each of the one or more classifier metrics associated with the respective general data using the trained classifier, generate a filtered general dataset by filtering the general dataset based on one or more filters, train the small language model with the filtered general dataset, generate a synthetic dataset for refining the small language model, and train the small language model with the synthetic dataset.
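The scoring and filtering steps in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Filter` class, the `filter_dataset` helper, the metric names used as dictionary keys, the threshold values, and the `toy_score` function, which merely stands in for the trained classifier. The publication does not specify data structures, score ranges, or the direction of each threshold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical representation: one score per classifier metric for each item.
Scores = Dict[str, float]


@dataclass
class Filter:
    """One filter per classifier metric, carrying a threshold score (assumed form)."""
    metric: str
    threshold: float
    keep_if_at_least: bool = True  # set False for metrics where lower is better

    def passes(self, scores: Scores) -> bool:
        value = scores[self.metric]
        if self.keep_if_at_least:
            return value >= self.threshold
        return value <= self.threshold


def filter_dataset(
    dataset: List[str],
    score_fn: Callable[[str], Scores],
    filters: List[Filter],
) -> List[str]:
    """Keep only items whose scores satisfy every filter."""
    kept = []
    for item in dataset:
        scores = score_fn(item)
        if all(f.passes(scores) for f in filters):
            kept.append(item)
    return kept


def toy_score(text: str) -> Scores:
    """Stand-in for the trained classifier; NOT the method in the publication."""
    words = text.split()
    return {
        "completeness": min(len(words) / 5.0, 1.0),
        "toxicity": 1.0 if "idiot" in text.lower() else 0.0,
    }


corpus = [
    "Water boils at 100 degrees Celsius at sea level.",
    "Buy now",
    "You are an idiot for reading this whole sentence.",
]
filters = [
    Filter("completeness", 0.8),
    Filter("toxicity", 0.5, keep_if_at_least=False),
]
filtered = filter_dataset(corpus, toy_score, filters)
print(filtered)
```

In this toy run only the first sentence survives: the second fails the completeness threshold and the third fails the toxicity threshold. A real pipeline would replace `toy_score` with the classifier trained on the annotated subset.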
Opening claim text (preview).
What is claimed is:
1. A method for generating a small language model, the method comprising: obtaining a general dataset, the general dataset including a plurality of general data; annotating a subset of the general dataset based on one or more classifier metrics indicative of a quality of the general dataset, the subset of the general dataset being representative of the general dataset; training a classifier based on the annotated subset of the general dataset and the one or more classifier metrics; analyzing each general data of the general dataset to determine a score for each of the one or more classifier metrics associated with the respective general data using the trained classifier; generating a filtered general dataset by filtering the general dataset based on one or more filters, the one or more filters indicative of threshold scores for corresponding classifier metrics; training the small language model with the filtered general dataset; generating a synthetic dataset for refining the small language model; and subsequent to training the small language model with the filtered general dataset, training the small language model with the synthetic dataset.
2. The method of claim 1, wherein each general data of the general dataset is associated with a score for each of the one or more classifier metrics.
3. The method of claim 1, wherein the one or more classifier metrics comprise factual knowledge, everyday knowledge, scientific knowledge, human behavior, toxicity, completeness, obscenity, obscurity, commonality, reasoning, promotional content, and/or unwanted content.
4. The method of claim 1, wherein generating the filtered general dataset by filtering the general dataset based on the one or more filters comprises: generating the one or more filters for the one or more classifier metrics, each filter corresponding to a respective classifier metric and indicative of a threshold score assigned for the respective classifier metric; and filtering the general dataset based on the one or more filters.
5. The method of claim 1, wherein generating a synthetic dataset for refining the small language model comprises: identifying one or more deficit skills in the small language model; determining one or more data formats to address the one or more deficit skills; generating the one or more prompts for generating the one or more data formats; injecting sources of randomization and diversity in the one or more prompts; and generating the synthetic dataset based on the one or more prompts using a generative transformer, the synthetic dataset including the one or more data formats.
6. The method of claim 5, wherein the one or more deficit skills include any skill or topic for boosting the capability of the small language model.
7. The method of claim 5, wherein the generative transformer is a multimodal large language model.
8. The method of claim 1, further comprising: prior to training the small language model with the filtered general dataset, performing a warm start by copying weights from an existing trained model into the small language model.
9. A computing device for generating a small language model, the computing device comprising: a processor; and a memory having a plurality of instructions stored thereon that, when executed by the processor, causes the computing device to: generate a small language model, the method comprising: obtain a general dataset, the general dataset including a plurality of general data; annotate a subset of the general dataset based on one or more classifier metrics indicative of a quality of the general dataset, the subset of the general dataset being representative of the general dataset; train a classifier based on the annotated subset of the general dataset and the one or more classifier metrics; analyze each general data of the general dataset to determine a score for each of the one or more classifier metrics associated with the respective general data using the trained classifier; generate a filtered general dataset by filtering the general dataset based on one or more filters, the one or more filters indicative of threshold scores for corresponding classifier metrics; train the small language model with the filtered general dataset; generate a synthetic dataset for refining the small language model; and subsequent to training of the small language model with the filtered general dataset, train the small language model with the synthetic dataset.
10. The computing device of claim 9, wherein each general data of the general dataset is associated with a score for each of the one or more classifier metrics.
11. The computing device of claim 9, wherein the one or more classifier metrics comprise factual knowledge, everyday knowledge, scientific knowledge, human behavior, toxicity, completeness, obscenity, obscurity, commonality, reasoning, promotional content, and/or unwanted content.
12. The computing device of claim 9, wherein to generate the filtered general dataset by filtering the general dataset based on the one or more filters comprises to: generate the one or more filters for the one or more classifier metrics, each filter corresponding to a respective classifier metric and indicative of a threshold score assigned for the respective classifier metric; and filter the general dataset based on the one or more filters.
13. The computing device of claim 9, to generate a synthetic dataset for refining the small language model comprises to: identify one or more deficit skills in the small language model; determine one or more data formats to address the one or more deficit skills; generate the one or more prompts for generating the one or more data formats; inject sources of randomization and diversity in the one or more prompts; and generate the synthetic dataset based on the one or more prompts using a generative transformer, the synthetic dataset including the one or more data formats.
14. The computing device of claim 13, wherein the one or more deficit skills include any skill or topic for boosting the capability of the small language model.
15. The computing device of claim 9, wherein the plurality of instructions, when executed, further cause the computing device to: prior to training of the small language model with the filtered general dataset, perform a warm start by copying weights from an existing trained model into the small language model.
16. A computer storage medium storing computer-executable instructions that when executed cause at least one processor to perform operations, comprising: obtaining a general dataset, the general dataset including a plurality of general data; annotating a subset of the general dataset based on one or more classifier metrics indicative of a quality of the general dataset, the subset of the general dataset being representative of the general dataset; training a classifier based on the annotated subset of the general dataset and the one or more classifier metrics; analyzing each general data of the general dataset to determine a score for each of the one or more classifier metrics associated with the respective general data using the trained classifier; generating a filtered general dataset by filtering the general dataset based on one or more filters, the one or more filters indicative of threshold scores for corresponding classifier metrics; training the small language model with the filtered general dataset; generating a synthetic dataset for refining the small language model; and subsequent t
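The synthetic-dataset steps recited in the dependent claims (identify deficit skills, choose data formats, build prompts, inject randomization and diversity, then generate with a generative transformer) might look roughly like the sketch below. It is only an illustration under stated assumptions: `build_prompts`, `generate_synthetic_dataset`, the prompt template, and the `audience` and `topic` diversity pools are all hypothetical, and the `generate` callable is a placeholder for whatever generative model is actually used. The claims do not say how randomization is injected; sampling from small pools with a seeded generator is one simple possibility.

```python
import random
from typing import Callable, Dict, List


def build_prompts(
    deficit_skills: List[str],
    formats_by_skill: Dict[str, List[str]],
    diversity_pool: Dict[str, List[str]],
    n_per_skill: int,
    seed: int = 0,
) -> List[str]:
    """Build prompts for each deficit skill, varying format, topic, and audience."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    prompts = []
    for skill in deficit_skills:
        for _ in range(n_per_skill):
            # Assumed sources of randomization and diversity.
            fmt = rng.choice(formats_by_skill[skill])
            topic = rng.choice(diversity_pool["topic"])
            audience = rng.choice(diversity_pool["audience"])
            prompts.append(
                f"Write a {fmt} that exercises {skill}, about {topic}, for {audience}."
            )
    return prompts


def generate_synthetic_dataset(
    prompts: List[str], generate: Callable[[str], str]
) -> List[str]:
    """Call the (placeholder) generative model once per prompt."""
    return [generate(p) for p in prompts]


skills = ["multi-step arithmetic", "everyday reasoning"]
formats = {
    "multi-step arithmetic": ["worked example", "word problem"],
    "everyday reasoning": ["short dialogue", "question and answer pair"],
}
pool = {
    "topic": ["cooking", "travel", "gardening"],
    "audience": ["a child", "a college student", "a busy professional"],
}

prompts = build_prompts(skills, formats, pool, n_per_skill=2, seed=42)


def fake_model(prompt: str) -> str:
    """Placeholder generator; a real system would call a generative transformer."""
    return f"[synthetic text for] {prompt}"


synthetic = generate_synthetic_dataset(prompts, fake_model)
print(len(synthetic))
```

Passing the generator in as a callable keeps the prompt-building logic independent of any particular model or API, so the same prompts can be replayed against a different generator.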
Combinations of networks · CPC title
Generative networks · CPC title
Active learning · CPC title