Sign language processing

US2025166417A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2025166417-A1
Application number  US-202418957195-A
Country             US
Kind code           A1
Filing date         Nov 22, 2024
Priority date       Nov 22, 2023
Publication date    May 22, 2025
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method including obtaining, during a communication session between a first device and a second device, video data that includes sign language content. In these and other embodiments, the sign language content may include one or more video frames of a figure performing sign language. The method may further include obtaining audio data that represents the sign language content in the video data and providing, during the communication session, the video data and the audio data to a sign language processing system that includes a machine learning model. In these and other embodiments, the video data and the audio data may be generated independent of the sign language processing system. The method may also include training the machine learning model during the communication session using the video data and the audio data.
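The flow in the abstract can be pictured with a minimal sketch. It is illustrative only and is not the implementation disclosed in the patent: the class names `SignLanguageModel` and `CommunicationSession`, the `train_step` placeholder, and the in-memory buffer are all hypothetical. The one point it shows is that training happens while the session is still active, on video and audio that originate outside the sign language processing system.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# One training example: video frames of a signer plus audio that
# represents the same content (for instance, an interpreter's speech).
Sample = Tuple[List[bytes], bytes]


@dataclass
class SignLanguageModel:
    """Hypothetical stand-in for the machine learning model."""
    updates: int = 0

    def train_step(self, video_frames: List[bytes], audio_chunk: bytes) -> None:
        # Placeholder: a real system would compute a loss and update weights.
        self.updates += 1


@dataclass
class CommunicationSession:
    """Hypothetical session between a first device and a second device."""
    active: bool = True
    buffer: List[Sample] = field(default_factory=list)

    def receive(self, video_frames: List[bytes], audio_chunk: bytes) -> None:
        # Data is only obtained while the session is in progress.
        if not self.active:
            raise RuntimeError("session has ended")
        self.buffer.append((video_frames, audio_chunk))

    def end(self) -> None:
        self.active = False


def train_during_session(session: CommunicationSession,
                         model: SignLanguageModel) -> int:
    """Consume buffered samples while the session is active; return step count."""
    steps = 0
    while session.active and session.buffer:
        video_frames, audio_chunk = session.buffer.pop(0)
        model.train_step(video_frames, audio_chunk)
        steps += 1
    return steps
```

In this sketch, calling `train_during_session` after `session.end()` performs no training steps, which mirrors the requirement that training occur during the communication session.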

First claim

Opening claim text (preview).

1. A method comprising: obtaining, during a communication session between a first device and a second device, video data that includes sign language content, the sign language content including one or more video frames of a figure performing sign language; obtaining audio data that represents the sign language content in the video data; providing, during the communication session, the video data and the audio data to a sign language processing system that includes a machine learning model, the video data and the audio data being generated independent of the sign language processing system; and training the machine learning model during the communication session using the video data and the audio data.

2. The method of claim 1, wherein the audio data and the video data are obtained from different devices.

3. The method of claim 1, wherein one of the first device and the second device provides one of the audio data and the video data and the other of the first device and the second device does not provide the video data and does not provide the audio data.

4. The method of claim 1, wherein the machine learning model is part of a sign language generation system or a sign language recognition system.

5. The method of claim 1, wherein the audio data is obtained before the video data.

6. The method of claim 1, wherein training the machine learning model during the communication session using the video data and the audio data includes directing the audio data to an automatic speech recognition system configured to generate first text data that includes a transcription of spoken words in the audio data, the first text data used in training the machine learning model.

7. The method of claim 6, wherein training the machine learning model during the communication session includes: generating, by the sign language processing system, second text data by providing the video data to the machine learning model, the second text data representing the sign language content in the video data; comparing the first text data and the second text data; and adjusting the machine learning model based on the comparison.

8. The method of claim 7, wherein the steps of generating, comparing, and adjusting occur before an end of the communication session.

9. The method of claim 6, wherein training the machine learning model during the communication session includes: generating, by the sign language processing system, second video data by providing the first text data to the machine learning model, the second video data including sign language representing the first text data; comparing the video data and the second video data; and adjusting the machine learning model based on the comparison.

10. The method of claim 1, wherein training the machine learning model during the communication session using the video data and the audio data includes training the machine learning model using data that is not obtained from the communication session in conjunction with the video data and the audio data from the communication session.

11. The method of claim 1, wherein the video data and the audio data are deleted at an end of the communication session.

12. The method of claim 1, wherein the video data and the audio data are deleted after the machine learning model is trained using the video data and the audio data.

13. The method of claim 1, wherein the video data and the audio data are deleted within a predetermined amount of time after an end of the communication session.

14. At least one non-transitory computer-readable media configured to store one or more instructions that, in response to being executed by a system, cause or direct the system to perform the method of claim 1.

15. A system comprising: one or more computer readable mediums including instructions; one or more computing systems coupled to the one or more computer readable mediums and configured to execute the instructions to cause or direct the system to perform operations, the operations comprising: obtaining, during a communication session between a first device and a second device, video data that includes sign language content, the sign language content including one or more video frames of a figure performing sign language; obtaining audio data that represents the sign language content in the video data; providing, during the communication session, the video data and the audio data to a sign language processing system that includes a machine learning model, the video data and the audio data being generated independent of the sign language processing system; and training the machine learning model during the communication session using the video data and the audio data.

16. The system of claim 15, wherein the machine learning model is part of a sign language generation system or a sign language recognition system.

17. The system of claim 15, wherein training the machine learning model during the communication session using the video data and the audio data includes directing the audio data to an automatic speech recognition system configured to generate first text data that includes a transcription of spoken words in the audio data, the first text data used in training the machine learning model.

18. The system of claim 17, wherein training the machine learning model during the communication session includes: generating, by the sign language processing system, second text data by providing the video data to the machine learning model, the second text data representing the sign language content in the video data; comparing the first text data and the second text data; and adjusting the machine learning model based on the comparison.

19. The system of claim 18, wherein the steps of generating, comparing, and adjusting occur before an end of the communication session.

20. The system of claim 17, wherein training the machine learning model during the communication session includes: generating, by the sign language processing system, second video data by providing the first text data to the machine learning model, the second video data including sign language representing the first text data; comparing the video data and the second video data; and adjusting the machine learning model based on the comparison.
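Claims 7 and 18 call for comparing first text data (the speech recognition transcript of the audio) with second text data (the model's output for the video) and adjusting the model based on that comparison. The claims do not say how the comparison is made. As one possible illustration, the sketch below uses word error rate computed from a word-level edit distance; that metric and the function names `word_edit_distance` and `word_error_rate` are assumptions made here, not details taken from the patent.

```python
from typing import List


def word_edit_distance(reference: List[str], hypothesis: List[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    # prev[j] is the distance between the reference prefix processed so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                       # delete a reference word
                curr[j - 1] + 1,                   # insert a hypothesis word
                prev[j - 1] + substitution_cost,   # match or substitute
            ))
        prev = curr
    return prev[-1]


def word_error_rate(first_text: str, second_text: str) -> float:
    """Compare the transcript (first text) with the model output (second text)."""
    reference = first_text.lower().split()
    hypothesis = second_text.lower().split()
    if not reference:
        # Assumption: an empty transcript matches only an empty model output.
        return 0.0 if not hypothesis else 1.0
    return word_edit_distance(reference, hypothesis) / len(reference)
```

A training loop could treat a lower value as closer agreement between the two texts. How such a score would drive the "adjusting" step is left open here, since the claims do not specify it.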

Assignees

Sorenson IP Holdings LLC

Inventors

Classifications

  • Data-driven translation

  • Machine-assisted translation, e.g. using translation memory

  • Speech to text systems (G10L15/08 takes precedence)

  • Active pattern-learning, e.g. online learning of image or video features

  • Transforming into visible information

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025166417A1 cover?
A method including obtaining, during a communication session between a first device and a second device, video data that includes sign language content. In these and other embodiments, the sign language content may include one or more video frames of a figure performing sign language. The method may further include obtaining audio data that represents the sign language content in the video data…
Who is the assignee on this patent?
Sorenson IP Holdings LLC
What technology area does this patent fall under?
Primary CPC classification G11B27/02. Mapped technology areas include Physics.
When was this patent published?
Publication date May 22, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).