Information processing apparatus, information processing system, information processing method, and non-transitory recording medium to generate sound data including beats based on user behavior information in a scheduled conference

US12462832B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12462832-B2
Application number: US-202218049369-A
Country: US
Kind code: B2
Filing date: Oct 25, 2022
Priority date: Jan 21, 2022
Publication date: Nov 4, 2025
Grant date: Nov 4, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An information processing apparatus includes circuitry to acquire behavior information of a plurality of users having a conversation, generate sound data based on the behavior information, and cause an output device to output an ambient sound based on the sound data, wherein the sound data is generated by combining a number of sounds, a number of beats and a tone based on the behavior information.
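
As a rough illustration of the abstract, the sketch below turns one window of behavior information into a sound description made of a number of sounds, a number of beats, and a tone. The `BehaviorInfo` and `SoundData` types, the `classify` helper, every threshold, and the rule that picks the tone are hypothetical; the abstract gives no numeric values and does not say which measurement drives which parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BehaviorInfo:
    """Hypothetical container for measurements over one time window."""
    utterance_seconds: float  # seconds of speech observed in the window
    speaker_changes: int      # speaker changes observed in the window
    participants: int         # number of users in the conversation


@dataclass(frozen=True)
class SoundData:
    """Hypothetical description of the ambient sound to render."""
    num_sounds: int
    num_beats: int
    tone: str


def classify(value: float, thresholds: list[float]) -> int:
    """Return how many thresholds the value meets or exceeds."""
    return sum(value >= t for t in thresholds)


def generate_sound_data(info: BehaviorInfo) -> SoundData:
    # All thresholds below are assumed for illustration only.
    num_sounds = 1 + classify(info.utterance_seconds, [3.0, 6.0])
    num_beats = 1 + classify(info.speaker_changes, [2, 4])
    # Assumed rule: larger groups get a brighter tone.
    tone = "bright" if info.participants >= 4 else "soft"
    return SoundData(num_sounds, num_beats, tone)
```

For example, `generate_sound_data(BehaviorInfo(7.0, 3, 5))` yields three sounds, two beats, and the "bright" tone under these assumed thresholds.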

First claim

Opening claim text (preview).

The invention claimed is:

1. An information processing system apparatus, comprising: an information processing apparatus including circuitry; an input device configured to measure behavior information of a plurality of users and input the behavior information to the information processing apparatus; and an output device configured to output a sound, the information processing apparatus, the input device, and the output device being connected to each other via a communication network, wherein the circuitry of the information processing apparatus is configured to: store reservation information of a scheduled conference in a memory; determine that the plurality of users start a conversation in the scheduled conference based on the behavior information acquired from the input device; automatically assign a melody to each of the plurality of users having the conversation in the scheduled conference based on the reservation information stored in the memory; identify a user who speaks among the plurality of users; determine whether the identified user speaks continuously for a predetermined time or more based on measurement by the input device; and in response to determination that the identified user speaks continuously for the predetermined time or more, generate sound data by combining a number of sounds, a number of beats and a tone based on the behavior information and the melody assigned to the identified user, and cause the output device to output an ambient sound based on the sound data, so that the ambient sound changing according to a state of the conversation is output for the plurality of users.

2. The information processing system of claim 1, wherein the behavior information includes at least one of a speech utterance amount of the plurality of users or posture information of the plurality of users, the posture information being information in relation to postures of the plurality of users, the plurality of users being present in a space.

3. The information processing system of claim 2, wherein the behavior information includes at least one of frequency of speaker changes among the plurality of users, a screen change amount of one or more information processing terminals operated by the plurality of users, a number of users corresponding to the plurality of users, and heartbeats of the plurality of users.

4. The information processing system of claim 2, wherein the circuitry is further configured to: acquire surroundings-dependent information that is information on surroundings of at least one of inside the space or outside the space; and generate the sound data based on the behavior information, the melody, and the surroundings-dependent information.

5. The information processing system of claim 2, wherein the speech utterance amount of the plurality of users included in the behavior information is measured based on an output signal from a microphone.

6. The information processing system of claim 2, wherein the posture information of the plurality of users included in the behavior information is obtained based on an output signal from a camera.

7. The information processing system of claim 2, wherein the circuitry is further configured to output the ambient sound that varies depending on each of a plurality of areas of the space.

8. The information processing system of claim 1, wherein the circuitry acquires, via the communication network, the behavior information that includes a speech utterance amount of the plurality of users having the conversation.

9. The information processing system of claim 1, wherein the circuitry is further configured to determine a state of the plurality of users based on the behavior information, and cause the output device to output the ambient sound based on the state.

10. The information processing system of claim 1, wherein the circuitry is further configured to change the sound data generated based on the behavior information as time passes in a case where a conversation time of the conversation is determined in advance.

11. The information processing system of claim 1, wherein the output device includes at least one of a speaker provided in a space where the plurality of users are present or an information processing terminal operated by one of the plurality of users.

12. The information processing system of claim 1, wherein the scheduled conference is an online conference in which the plurality of users respectively operate information processing terminals and the conversation occurs via a communication network connecting the information processing apparatus and the information processing terminals, and the circuitry is further configured to: acquire the behavior information in the online conference through the information processing terminals operated by the plurality of users, the behavior information including a speech utterance amount of the plurality of users, frequency of speaker changes among the plurality of users, and a screen change amount of the information processing terminals; and generate the sound data based on the speech utterance amount, the frequency of speaker changes, the screen change amount, and the melody assigned to the identified user.

13. The information processing system of claim 1, wherein the number of sounds is classified based on a speech utterance amount of at least one of the plurality of users, and the speech utterance amount is determined by how many seconds, in a predetermined number of seconds, a state in which the at least one of the plurality of users speaks lasts.

14. The information processing system of claim 1, wherein the number of beats is classified based on frequency of speaker changes among the plurality of users, and the frequency of speaker changes is determined by a number of times that the speaker changes occur during a predetermined number of seconds.

15. The information processing system of claim 1, wherein the behavior information includes posture information of the plurality of users, and the posture information is acquired based on a posture bounding box of each of the plurality of users recognized by image processing on video data captured by the input device.

16. An information processing method using an information processing apparatus, an input device, and an output device connected to each other via a communication network, comprising: storing reservation information of a scheduled conference in a memory of the information processing apparatus; determining that a plurality of users start a conversation in the scheduled conference based on behavior information of the plurality of users acquired from the input device; automatically assigning a melody to each of the plurality of users having the conversation in the scheduled conference based on the reservation information stored in the memory; identifying a user who speaks among the plurality of users; determining whether the identified user speaks continuously for a predetermined time or more based on measurement by the input device; and in response to determination that the identified user speaks continuously for the predetermined time or more, generating sound data by combining a number of sounds, a number of beats and a tone based on the behavior information and the melody assigned to the identified user, and causing the output device to output an ambient sound based on the sound data, so that the ambient sound changing according to a state of the conversation is output for the plurality of users.
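
The measurements named in claims 1, 13, and 14 can be sketched in a few lines. This is a minimal illustration, not the patented implementation: it assumes the input device has already been reduced to a per-second timeline whose entries are a speaker label or `None` for silence. The timeline format, the function names, and the decision to ignore silence when counting speaker changes are all assumptions made for this example.

```python
from itertools import groupby
from typing import Optional, Sequence

# Hypothetical input: one entry per second, a speaker label or None for silence.
Timeline = Sequence[Optional[str]]


def utterance_amount(timeline: Timeline, user: str) -> int:
    """Seconds within the window during which `user` is speaking (cf. claim 13)."""
    return sum(1 for speaker in timeline if speaker == user)


def speaker_change_count(timeline: Timeline) -> int:
    """Number of speaker changes within the window (cf. claim 14).

    Assumption: silent seconds are skipped, so a pause followed by the
    same speaker does not count as a change.
    """
    speakers = [s for s in timeline if s is not None]
    return sum(1 for prev, cur in zip(speakers, speakers[1:]) if prev != cur)


def speaks_continuously(timeline: Timeline, user: str, min_seconds: int) -> bool:
    """True if `user` has an unbroken run of at least `min_seconds` (cf. claim 1)."""
    longest = max(
        (len(list(run)) for who, run in groupby(timeline) if who == user),
        default=0,
    )
    return longest >= min_seconds


# A ten-second example window with two speakers and two pauses.
window = ["a", "a", "a", None, "b", "b", "a", None, "a", "a"]
```

With this window, user "a" speaks for 6 of the 10 seconds, there are 2 speaker changes, and "a" has a longest unbroken run of 3 seconds, so a 3-second threshold is met while a 4-second threshold is not.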

Assignees

Inventors

Classifications

  • Synthesis of acoustic waves (synthesis of speech G10L13/00) · CPC title

  • Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L19/00 takes precedence) · CPC title

  • Sound category-dependent sound synthesis processes [Gensound] for musical use; Sound category-specific synthesis-controlling parameters or control means therefor · CPC title

  • Rhythm · CPC title

  • Geolocation input, i.e. control of musical parameters based on location or geographic position, e.g. provided by GPS, Wi-Fi® network location databases or mobile phone base station position databases · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12462832B2 cover?
An information processing apparatus includes circuitry to acquire behavior information of a plurality of users having a conversation, generate sound data based on the behavior information, and cause an output device to output an ambient sound based on the sound data, wherein the sound data is generated by combining a number of sounds, a number of beats and a tone based on the behavior information.
Who is the assignee on this patent?
Murata Haruki, Katoh Yuuya, Ricoh Co Ltd
What technology area does this patent fall under?
Primary CPC classification G10L25/63. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 4, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).