What technology area does this patent fall under?

Primary CPC classification G10L25/51. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 7, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Processing speech signals in voice-based profiling

US10529328B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10529328-B2
Application number	US-201615739085-A
Country	US
Kind code	B2
Filing date	Jun 22, 2016
Priority date	Jun 22, 2015
Publication date	Jan 7, 2020
Grant date	Jan 7, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

1. Title: What the patent document calls the invention.
2. Abstract: A short plain-language summary of the technical disclosure.
3. Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
4. Key dates: Filing, priority, publication, and grant dates set the timeline.
5. First independent claim: The legal scope of protection; read this for what is actually claimed.
6. CPC / IPC classifications: Technology tags used to group this patent with similar filings.
7. Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This document describes a data processing system for processing a speech signal for voice-based profiling. The data processing system segments the speech signal into a plurality of segments, with each segment representing a portion of the speech signal. For each segment, the data processing system generates a feature vector comprising data indicative of one or more features of the portion of the speech signal represented by that segment and determines whether the feature vector comprises data indicative of one or more features with a threshold amount of confidence. For each of a subset of the generated feature vectors, the system processes data in that feature vector to generate a prediction of a value of a profile parameter and transmits an output responsive to machine executable code that generates a visual representation of the prediction of the value of the profile parameter.
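The flow in the abstract (segment, extract a feature vector per segment, keep only confident vectors, predict a profile parameter) can be sketched as follows. This is a minimal illustration, not the patented method: the features (RMS energy, zero-crossing rate), the confidence heuristic, the predictor, and every function name are assumptions chosen only to make the sketch runnable.

```python
import math
from dataclasses import dataclass


@dataclass
class FeatureVector:
    start: int          # index of the first sample of the segment
    features: dict      # feature name -> value
    confidence: dict    # feature name -> confidence in [0, 1]


def segment_signal(samples, segment_len):
    """Split the samples into consecutive, non-overlapping segments."""
    return [(i, samples[i:i + segment_len])
            for i in range(0, len(samples), segment_len)]


def extract_features(start, segment):
    """Toy features (assumed, not from the patent): RMS energy and zero-crossing rate."""
    n = len(segment)
    rms = math.sqrt(sum(x * x for x in segment) / n)
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0))
    zcr = crossings / max(n - 1, 1)
    # Hypothetical confidence heuristic: louder segments are trusted more, capped at 1.
    conf = min(rms / 0.5, 1.0)
    return FeatureVector(start, {"rms": rms, "zcr": zcr}, {"rms": conf, "zcr": conf})


def is_confident(fv, threshold):
    """True when every feature in the vector meets the confidence threshold."""
    return all(c >= threshold for c in fv.confidence.values())


def predict_profile_parameter(fv):
    """Placeholder predictor: maps zero-crossing rate to an arbitrary score."""
    return 100.0 * fv.features["zcr"]


def profile(samples, segment_len=4, threshold=0.6):
    """Return (segment start, prediction) pairs for the confident segments only."""
    vectors = [extract_features(s, seg) for s, seg in segment_signal(samples, segment_len)]
    subset = [fv for fv in vectors if is_confident(fv, threshold)]
    return [(fv.start, predict_profile_parameter(fv)) for fv in subset]
```

For example, `profile([0.5, -0.5, 0.5, -0.5, 0.01, 0.01, -0.01, 0.01])` keeps the loud first segment and drops the quiet second one, so only one prediction is produced.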

First claim

Opening claim text (preview).

What is claimed is:

1. A data processing system for processing a speech signal, the data processing system comprising:
   a detection system front end server that routes data representing a speech signal to one or more processing devices that generate a response to the speech signal; and
   a segmentation server that includes the one or more processing devices and that processes the speech signal to segment the data into a plurality of segments, with each segment representing a portion of the speech signal, and with the segmentation server further performing operations comprising:
   for each segment, generating a feature vector comprising data indicative of one or more features of the portion of the speech signal represented by that segment; determining confidence values for the one or more features of the feature vector; and comparing the confidence values for the one or more features to respective threshold values;
   for each of a subset of the generated feature vectors, processing data in that feature vector to generate a prediction of a value of a profile parameter, with the subset comprising one or more feature vectors determined to have respective features that have confidence values exceeding the respective threshold values;
   wherein the segmentation server generates and transmits an output responsive to machine executable code that generates a visual representation of the prediction of the value of the profile parameter, and wherein the detection system front end server uses the output responsive to the machine executable code to remotely update a display of a client device that submitted a request to present the visual representation of the prediction of the value of the profile parameter; and
   wherein the segmentation server executes a first selected prediction algorithm on detection data processed in accordance with a second selected prediction algorithm, wherein the detection data represents two or more features having a predetermined correlation or a predetermined dependency between the two or more features, the two or more features each having respective confidence values that exceed the respective threshold values.

2. The data processing system of claim 1, wherein the output comprises the value of the profile parameter and forensic profile data through one or more application interfaces.

3. The data processing system of claim 1, wherein processing data in the feature vector to generate a prediction comprises selecting a predictor algorithm, based on the data indicative of the one or more features with the confidence value that exceeds the respective threshold, for processing the data in each of the subset of the generated feature vectors.

4. The data processing system of claim 1, wherein the profile parameter comprises one or more of a bio-relevant parameter, a socio-personal parameter, and an environmental parameter.

5. The data processing system of claim 4, wherein the bio-relevant parameter comprises one of a physical parameter, a physiological parameter, a medical parameter, or a psychological parameter.

6. The data processing system of claim 4, wherein the socio-personal parameter comprises one of a behavioral parameter, a demographic parameter, or a sociological parameter.

7. The data processing system of claim 1, wherein one or more features comprise one or more micro-properties of the speech signal, the micro-properties comprising one or more of formants, pitch, harmonicity, jitter, shimmer, formant bandwidths, harmonic bandwidths, voicing onset and offset times, glottal pulse shape, pitch onset pattern, aphonicity, biphonicity, flutter, wobble, breathiness, and resonance.

8. The data processing system of claim 1, wherein the one or more features comprise a spectral feature characterizing time-frequency characteristics of the signal, the time-frequency characteristics comprising one or more of short-time Fourier transforms, segmental cepstral features and power-normalized cepstra.

9. The data processing system of claim 1, wherein the one or more features comprise a trend feature, the trend feature comprising a modulation feature, long-term formant statistics, and a formant trajectory feature.

10. The data processing system of claim 1, wherein the one or more features comprise one or more of phonetic and linguistic features, the phonetic and linguistic features comprising phoneme durations and timing patterns.

11. The data processing system of claim 1, the segmentation server further performing operations comprising: generating, based on data of a feature vector of the subset, a category for the data segment associated with that feature vector; and assigning the category to a forensic profile.

12. The data processing system of claim 1, the segmentation server further performing operations comprising: comparing the speech signal to an additional speech signal by comparing one or more feature vectors of the subset of the generated feature vectors to one or more feature vectors of an additional subset of generated feature vectors of the additional speech signal, the additional subset comprising one or more additional feature vectors determined to have one or more features having respective confidence values that exceed the respective threshold values.

13. The data processing system of claim 1, wherein generating the prediction of the value comprises executing a machine learning algorithm to determine a strength of an association between the feature vector and the profile parameter.

14. The data processing system of claim 1, the segmentation server further performing operations comprising determining which of the one or more features having respective confidence values that exceed the respective threshold values in the feature vector represents a masking-invariant pattern in a segment.

15. The data processing system of claim 1, the segmentation server further performing operations comprising recovering data in a segment by modifying the segment.

16. The data processing system of claim 1, wherein the value of the profile parameter is determined in real-time or near real-time based on execution of a predictive algorithm.

17. The data processing system of claim 1, the segmentation server further performing operations comprising identifying a source based on the value of the profile parameter.

18. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
   processing a speech signal to segment the speech signal into a plurality of segments, with each segment of the plurality of segments representing a portion of the speech signal;
   for each segment, generating a feature vector comprising data indicative of one or more features of the portion of the speech signal represented by that segment; determining confidence values for the one or more features of the feature vector; and comparing the confidence values for the one or more features to respective threshold values;
   for each of a subset of the generated feature vectors, processing data in that feature vector to generate a prediction of a value of a profile parameter, with the subset comprising one or more feature vectors determined to have respective features that have confidence values exceeding the respective threshold values; and
   generating an output responsive to machine executable code as a program-returned response including the prediction of the value of the profile parameter; and executing a first selected prediction
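Two elements of claim 1 are easy to misread in prose: each feature is compared against its own ("respective") threshold, and one prediction algorithm runs on data already processed by another when two dependent features both pass. The sketch below is one possible reading for illustration only, not a legal interpretation or the patentee's implementation. The threshold values, the jitter/shimmer pairing, and both predictor functions are hypothetical.

```python
# Hypothetical per-feature thresholds; the claim says only "respective threshold values".
THRESHOLDS = {"pitch": 0.7, "jitter": 0.8, "shimmer": 0.8}

# Hypothetical feature pairs treated as having a "predetermined dependency".
DEPENDENT_PAIRS = [("jitter", "shimmer")]


def confident_features(values, confidences, thresholds=THRESHOLDS):
    """Keep only features whose confidence exceeds that feature's own threshold."""
    return {name: values[name]
            for name, conf in confidences.items()
            if name in thresholds and conf > thresholds[name]}


def second_predictor(a, b):
    """Stand-in for the 'second selected prediction algorithm': combines a dependent pair."""
    return (a + b) / 2.0


def first_predictor(intermediate, pitch=None):
    """Stand-in for the 'first selected prediction algorithm', run on the second's output."""
    score = 10.0 * intermediate
    return score if pitch is None else score + pitch / 100.0


def predict(values, confidences):
    """Chain the two predictors when a dependent pair passes its thresholds."""
    kept = confident_features(values, confidences)
    for a, b in DEPENDENT_PAIRS:
        if a in kept and b in kept:
            intermediate = second_predictor(kept[a], kept[b])
            return first_predictor(intermediate, kept.get("pitch"))
    return None  # in this sketch, no confident dependent pair means no prediction
```

Note that the comparison is strict (`>`), matching the claim's "exceed", so a confidence exactly equal to its threshold does not qualify.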

Assignees

Carnegie Mellon University

Inventors

Rita Singh

Classifications

G10L25/63: for estimating an emotional state
G10L17/02: Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
G10L17/26: Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
G10L25/51 (Primary): for comparison or discrimination
G10L25/66: for extracting parameters related to health condition (detecting or measuring for diagnostic purposes A61B5/00)
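A CPC symbol such as G10L25/51 encodes a hierarchy: section (G), class (G10), subclass (G10L), main group (25), and subgroup (51). Section G is titled Physics, which is why this patent is mapped to that technology area. The parser below is a minimal sketch of that breakdown; the function name, the returned dictionary layout, and the regular expression are assumptions for illustration, not an official CPC tool.

```python
import re

# Titles of the CPC sections (top level of the hierarchy).
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging of New Technological Developments",
}

# Section letter, two-digit class, subclass letter, main group, "/", subgroup.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,})$")


def parse_cpc(symbol):
    """Split a CPC symbol like 'G10L25/51' into its hierarchical parts."""
    match = CPC_PATTERN.match(symbol.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognizable CPC symbol: {symbol!r}")
    section, cls, subclass, group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS[section],
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": int(group),
        "subgroup": subgroup,  # kept as text so leading zeros such as '00' survive
    }
```

Calling `parse_cpc("G10L25/51")` yields section `G` with title `Physics` and subclass `G10L`; the cross-referenced symbol `A61B5/00` parses the same way into section `A`.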

Patent family

Related publications grouped by family.

View patent family 57586621

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10529328B2 cover?: This document describes a data processing system for processing a speech signal for voice-based profiling. The data processing system segments the speech signal into a plurality of segments, with each segment representing a portion of the speech signal. For each segment, the data processing system generates a feature vector comprising data indicative of one or more features of the portion of th…
Who is the assignee on this patent?: Carnegie Mellon University