Cross-modal training of a machine-learning model that identifies abuse in audio streams
US-2025069596-A1 · Feb 27, 2025 · US
US12457256B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12457256-B2 |
| Application number | US-202318238414-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 25, 2023 |
| Priority date | Aug 25, 2023 |
| Publication date | Oct 28, 2025 |
| Grant date | Oct 28, 2025 |
Official abstract text for this publication.
Systems and methods of intelligent reporting within online communities are provided. A current communication session associated with a plurality of user devices may be monitored. The current communication session may include a stream of audio-visual content generated in real-time based on interactions between the user devices. A recording trigger may be detected within the current communication session. A recording of a portion of the stream of audio-visual content may be recorded in response to the detected trigger event. The recording may be analyzed to attribute one or more sub-portions within the recording to one or more of the user devices. At least one of the sub-portions attributed to an identified one of the user devices may be determined to meet a moderation event. A report regarding the identified user device may be generated that includes the at least one sub-portion that meets the moderation event.
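For readers who think in code, the last two steps of the abstract (deciding which attributed sub-portions meet a moderation event, then building a per-device report) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the names `Segment`, `Report`, `BLOCKLIST`, `meets_moderation_event`, and `generate_reports` are hypothetical, the sub-portions are assumed to arrive already attributed to a device, and a simple word blocklist stands in for whatever moderation-event detection a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A sub-portion of the recording, already attributed to one device (assumed)."""
    device_id: str
    start: float  # seconds from the start of the recording
    end: float
    text: str     # hypothetical transcript of the sub-portion


@dataclass
class Report:
    """A report about one device, holding the sub-portions that were flagged."""
    device_id: str
    segments: list = field(default_factory=list)


# Hypothetical stand-in for moderation-event detection; not from the patent.
BLOCKLIST = {"insult"}


def meets_moderation_event(segment):
    """Return True if any blocklisted word appears in the segment transcript."""
    return any(word in BLOCKLIST for word in segment.text.lower().split())


def generate_reports(attributed_segments):
    """Group flagged sub-portions into one report per attributed device."""
    reports = {}
    for seg in attributed_segments:
        if not meets_moderation_event(seg):
            continue
        if seg.device_id not in reports:
            reports[seg.device_id] = Report(seg.device_id)
        reports[seg.device_id].segments.append(seg)
    return reports


segments = [
    Segment("dev-a", 0.0, 2.0, "hello there"),
    Segment("dev-b", 2.0, 4.0, "that is an insult"),
    Segment("dev-b", 4.0, 6.0, "good game"),
]
reports = generate_reports(segments)
```

In this sketch only the device with a flagged sub-portion gets a report, and the report carries just the flagged sub-portion rather than the whole recording.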
Opening claim text (preview).
What is claimed is:
1. A method of intelligent reporting within online communities, the method comprising: monitoring a current communication session associated with a plurality of user devices, wherein the current communication session is associated with a stream of audio-visual content generated in real-time based on interactions between the plurality of user devices; detecting a recording trigger within the current communication session; capturing a recording of at least a portion of the stream of audio-visual content in response to the detected recording trigger; analyzing the recording to attribute one or more sub-portions of the recording of the at least a portion of the stream of audio-visual content to one or more user devices of the plurality of user devices free of a user input attributing the one or more sub-portions to an identified user device; determining at least one sub-portion of the one or more sub-portions attributed to the identified user device of the one or more user devices meets a moderation event; and generating a report regarding the identified user device, wherein the report includes the at least one sub-portion that meets the moderation event.
2. The method of claim 1, wherein the recording trigger is a signal from a signaling user device of the plurality of user devices in the current communication session, and further comprising making the report accessible to the signaling user device.
3. The method of claim 2, further comprising sending a reminder notification to the signaling user device at an end of the current communication session, wherein the reminder notification includes a link to the report.
4. The method of claim 1, further comprising flagging the sub-portions of the recording, and wherein the report includes the recording with the flagged sub-portions.
5. The method of claim 1, further comprising: receiving annotation input from one of the user devices in the current communication session, the annotation input associated with one or more of the sub-portions within the recording; adding the annotation input to the report; and sending the report with the added annotation input to a designated moderator device.
6. The method of claim 5, wherein the annotation input includes a voice memo recording, wherein the voice memo recording is also recorded during the current communication session.
7. The method of claim 5, further comprising identifying one or more different possible moderation events associated with the report, and generating a menu of one or more options corresponding to different moderation events, wherein the annotation input corresponds to a selection from the menu.
8. The method of claim 7, wherein the annotation input corresponds to a plurality of selections from a set of the user devices, and further comprising tallying different selections from the set of the user devices, wherein the tallied selections are included in the report sent to the designated moderator device.
9. The method of claim 8, wherein the set of the user devices includes less than all of the user devices in the current communication session, and further comprising restricting access to the menu to the set of the user devices.
10. The method of claim 1, further comprising buffering the stream of audio-visual content generated in real-time, wherein the captured recording includes a predetermined portion of the buffered stream prior to detection of the recording trigger.
11. The method of claim 1, wherein attributing the sub-portions within the recording to one or more of the user devices is based on one or more learning models trained to disambiguate among different user voices of users of the user devices.
12. A system of intelligent reporting within online communities, the system comprising: a communication interface that communicates over a communication network, wherein the communication interface establishes a current communication session associated with a plurality of user device, wherein the current communication session associated with a stream of audio-visual content generated in real-time based on interactions between the user devices; a processor that executes instructions stored in memory, wherein the processor executes the instructions to: monitor the current communication session, detect a recording trigger within the current communication session; capture a recording of at least a portion of the stream of audio-visual content in response to the detected recording trigger, analyze the recording to attribute one or more sub-portions of the recording of the at least a portion of the stream of audio-visual content to one or more user devices of the plurality of user devices free of a user input attributing the one or more sub-portions to an identified user device, determine at least one sub-portion of the one or more sub-portions attributed to the identified user device of the one or more user devices meets a moderation event, and generate a report regarding the identified user device, wherein the report includes the at least one sub-portion that meets the moderation event; and memory that stores the captured recording and the report.
13. The system of claim 12, wherein the recording trigger is a signal from a signaling user device of the plurality of user devices in the current communication session, and wherein the processor executes further instructions to make the report accessible to the signaling user device.
14. The system of claim 13, wherein the communication interface further sends a reminder notification to the signaling user device at an end of the current communication session, wherein the reminder notification includes a link to the report.
15. The system of claim 12, wherein the processor executes further instructions to flag the sub-portions of the recording, and wherein the report includes the recording with the flagged sub-portions.
16. The system of claim 12, wherein the communication interface further receives annotation input from one of the user devices in the current communication session, the annotation input associated with one or more of the sub-portions within the recording, wherein the processor executes further instructions to add the annotation input to the report; and the communication interface further sends the report with the annotation input to a designated moderator device.
17. The system of claim 16, wherein the annotation input includes a voice memo recording, wherein the voice memo recording is also recorded during the current communication network.
18. The system of claim 16, wherein the processor executes further instructions to identify one or more different possible moderation events associated with the report, and to generate a menu of one or more options corresponding to different moderation events, wherein the annotation input corresponds to a selection from the menu.
19. The system of claim 18, wherein the annotation input corresponds to a plurality of selections from a set of the user devices, and wherein the processor executes further instructions to tally different selections from the set of the user devices, wherein the tallied selections are included in the report sent to the designated moderator device.
20. The system of claim 19, wherein the set of the user devices includes less than all of the user devices in the current communication session, and wherein the processor executes further instructions to restrict access to the menu to the set of the user devices.
21. The system of cl
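Two of the dependent claims describe mechanisms that are easy to picture in code: claim 10 (a recording that includes a predetermined portion of the buffered stream from before the trigger) and claim 8 (tallying menu selections from several devices). The sketch below is one possible reading under stated assumptions, not the claimed implementation: `PreTriggerBuffer`, `tally_selections`, the chunk-based stream model, and the example category labels are all hypothetical. It uses a bounded `collections.deque` as a rolling buffer and `collections.Counter` for the tally.

```python
from collections import Counter, deque


class PreTriggerBuffer:
    """Keeps the last `max_chunks` stream chunks so a recording can start
    before the trigger was detected (one reading of claim 10)."""

    def __init__(self, max_chunks):
        # A bounded deque silently drops the oldest chunk when full.
        self._chunks = deque(maxlen=max_chunks)
        self._recording = None  # None until a trigger is detected

    def push(self, chunk):
        """Feed one chunk of the live stream."""
        if self._recording is not None:
            self._recording.append(chunk)  # recording in progress
        else:
            self._chunks.append(chunk)     # rolling pre-trigger history

    def trigger(self):
        """Start recording, seeded with the buffered pre-trigger chunks."""
        if self._recording is None:
            self._recording = list(self._chunks)

    def recording(self):
        """Return a copy of everything captured so far (empty if not triggered)."""
        return list(self._recording or [])


def tally_selections(selections):
    """Count menu selections per option across devices (one reading of claim 8).

    `selections` maps a device id to the option that device selected.
    """
    return Counter(selections.values())


buf = PreTriggerBuffer(max_chunks=3)
for chunk in [1, 2, 3, 4, 5]:
    buf.push(chunk)
buf.trigger()          # chunks 3, 4, 5 are still in the buffer
buf.push(6)
buf.push(7)

votes = tally_selections({"dev-a": "harassment", "dev-b": "harassment", "dev-c": "spam"})
```

The buffer size plays the role of the "predetermined portion": a larger `max_chunks` preserves more of the stream from before the trigger at the cost of more memory per session.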
Interaction with lists of selectable items, e.g. menus · CPC title
Annotation, e.g. comment data or footnotes · CPC title
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
Execution arrangements for user interfaces · CPC title
Selection of displayed objects or displayed text elements (G06F3/0482 takes precedence) · CPC title