Systems and methods to improve trust in conversations

US12412562B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12412562-B2
Application number  US-202217732944-A
Country             US
Kind code           B2
Filing date         Apr 29, 2022
Priority date       Apr 29, 2022
Publication date    Sep 9, 2025
Grant date          Sep 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to a system, a method, and a product for using machine learning models to quantify and/or improve trust in conversations. The system includes a non-transitory memory; and a processor in communication with the non-transitory memory. The processor executes the instructions to cause the system to: obtain a set of vocal features and a set of text features for each sample in audio samples; obtain a trust score for each sample; perform a preprocess to obtain a set of input features for each sample; determine a type of machine-learning algorithm for the machine-learning network; tune a set of hyper parameters for the machine-learning network; generate a predicated trust score by the machine-learning network with the sets of input features for each sample; and train the machine-learning network based on the predicated trust score and the trust score for each sample to obtain the training result.

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a non-transitory memory storing instructions executable to construct a machine learning network to quantify a trust score and to automate trust delivery with a digital avatar by generating a trustworthy voice for the digital avatar; and a processor in communication with the non-transitory memory, wherein, the processor executes the instructions to cause the system to: obtain a set of vocal features and a set of text features for each sample in audio samples; obtain a trust score for each sample; perform a preprocess on the set of vocal features and the set of text features to obtain a set of input features for each sample; determine a type of machine-learning algorithm for the machine-learning network based on a training result of the machine-learning network; tune a set of hyper parameters for the machine-learning network based on a cross validation according to the machine-learning network; generate a predicated trust score by the machine-learning network with the sets of input features for each sample; train the machine-learning network based on the predicated trust score and the trust score for each sample to obtain the training result; generate a set of trust components for a user by the machine-learning network; concatenate the set of trust components with a user profile of the user to obtain an expanded user profile; train a second machine-learning network by input the expanded user profile to recommend features for improving trust scores; and generate a list of recommended features for the user by the trained second machine learning network based on the expanded user profile, wherein generating the trustworthy voice for the digital avatar comprises: receiving an input text and a reference trustworthy tone sample; collecting a sequence of phonemes and a Mel spectrogram from the input text using a text to speech module; encoding the Mel spectrogram with an input encoder to generate an input embedding; encoding the reference trustworthy tone sample with a trust encoder and concatenating with the input embedding to generate an output; processing the output of the concatenation through a location-sensitive attention layer using cumulative attention weights to generate an encoded input sequence; predicting a Mel spectrogram with a decoder from the encoded input sequence; and generating the trustworthy voice for the digital avatar from the Mel spectrogram using a vocoder, wherein the digital avatar is configured to replace the user in a conversation.

2. The system according to claim 1, wherein the instructions to cause the system to obtain the set of vocal features and the set of text features for each sample in the audio samples, comprises instructions to cause the system to: input each sample in the audio samples into a transcribe analytics module to obtain a transcribed result for each sample; input the transcribed result for each sample into a vocal feature module to obtain the set of vocal features for each sample; and input the transcribed result for each sample into a text feature module to obtain the set of text features for each sample.

3. The system according to claim 1, wherein the instructions to cause the system to obtain the trust score for each sample, comprises the instructions to cause the system to: obtain a set of scores based on human annotation for each sample, each score in the set of scores corresponding to a variable in a pre-defined trust calculation function; and calculate the trust score for each sample based on the pre-defined trust calculation function with the set of scores.

4. The system according to claim 3, wherein: the pre-defined trust calculation function comprises a plurality of variables comprising a credibility variable, a reliability variable, an intimacy variable, and a self-orientation variable.

5. The system according to claim 4, wherein: the pre-defined trust calculation function comprises one of the following: a summation of the credibility variable, the reliability variable, and the intimacy variable, the summation being divided by the self-orientation variable; the summation of the credibility variable, the reliability variable, and the intimacy variable, the summation being subtracted by the self-orientation variable; or the summation of the credibility variable, the reliability variable, and the intimacy variable, the summation being subtracted by three times the self-orientation variable.

6. The system according to claim 1, wherein the instructions to cause the system to perform the preprocess on the set of vocal features and the set of text features to obtain the set of input features for each sample, comprises the instructions to cause the system to: remove correlated features from the set of vocal features and the set of text features to obtain a reduced set of vocal features and a reduced set of text features; and encode categorical variables based on the reduced set of vocal features and the reduced set of text features to obtain the set of input features.

7. The system according to claim 1, wherein the type of machine-learning algorithm comprises one of a gradient boosting, a random forest, a ridge with principal component analysis (PCA), or a linear regression.

8. The system according to claim 1, wherein the set of hyper parameters comprises at least one of a learning rate, a minimum sample split, a minimum sample leaf, or a number of estimators.

9. The system according to claim 1, wherein: the training result comprises a mean absolute error (MAE) between the predicated trust score and the trust score for each sample; and the instructions to cause the system to train the machine-learning network based on the predicated trust score and the trust score for each sample, comprises the instructions to cause the system to: train the machine-learning network based on the predicated trust score and the trust score for each sample to minimize the MAE based on a gradient boosting regressor.

10. The system according to claim 1, wherein: the list of recommended features comprises features from a recommendation library; the second machine-learning network comprises a softmax module for training and a nearest neighbor index module to generate a recommendation probability for each feature in the recommendation library; and the list of recommended features comprises top N features with highest recommendation probability, N being a positive integer.

11. A method comprising: obtaining, by a computing device comprising a memory storing instructions executable to construct a machine-learning network to quantify a trust score and to automate trust delivery with a digital avatar by generating a trustworthy voice for the digital avatar and a processor in communication with the memory, a set of vocal features and a set of text features for each sample in audio samples; obtaining, by the computing device, a trust score for each sample; performing, by the computing device, a preprocess on the set of vocal features and the set of text features to obtain a set of input features for each sample; determining, by the computing device, a type of machine-learning algorithm for the machine-learning network based on a training result of the machine-learning network; tuning, by the computing device, a set of hyper parameters for the machine learning network based on a cross validation according to the machine-learning network; generating, by the computing device, a predicated trust score by the machine learning network with the sets of input features for each sample; training, by the computing device, the machine-learning network based on the predicated trus
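
The pre-defined trust calculation function of claims 4 and 5 and the mean absolute error (MAE) of claim 9 can be illustrated with a short sketch. This is an assumption-laden illustration rather than the patented code: the function names, the variant labels, and the sample scores are hypothetical, and the patent text shown here does not state the scale on which annotators score each variable.

```python
from statistics import fmean


def trust_score(credibility, reliability, intimacy, self_orientation,
                variant="ratio"):
    """Combine four human-annotated scores into one trust score.

    The three variants mirror the alternatives listed in claim 5;
    the variant names themselves are made up for this sketch.
    """
    total = credibility + reliability + intimacy
    if variant == "ratio":                 # (C + R + I) / S
        if self_orientation == 0:
            raise ValueError("self_orientation must be non-zero for 'ratio'")
        return total / self_orientation
    if variant == "difference":            # (C + R + I) - S
        return total - self_orientation
    if variant == "weighted_difference":   # (C + R + I) - 3 * S
        return total - 3 * self_orientation
    raise ValueError(f"unknown variant: {variant!r}")


def mean_absolute_error(predicted, annotated):
    """Average absolute gap between predicted and annotated trust scores."""
    if len(predicted) != len(annotated):
        raise ValueError("score lists must have the same length")
    return fmean(abs(p - a) for p, a in zip(predicted, annotated))


# Hypothetical annotations: (credibility, reliability, intimacy, self-orientation).
annotations = [(8, 7, 6, 3), (5, 5, 5, 5)]
targets = [trust_score(*row) for row in annotations]   # [7.0, 3.0]

# Hypothetical model outputs for the same two samples.
predictions = [6.5, 3.5]
print(targets, mean_absolute_error(predictions, targets))  # [7.0, 3.0] 0.5
```

In all three variants a higher self-orientation score lowers the trust score; the variants differ only in how strongly it is penalized. The MAE computed here is the quantity that claim 9 says training seeks to minimize.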

Assignees

Accenture Global Solutions Ltd

Inventors

Classifications

  • using artificial neural networks

  • driven by audio data

  • Feature extraction for speech recognition; Selection of recognition unit

  • the extracted parameters being spectral information of each sub-band

  • Machine learning

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12412562B2 cover?
The present disclosure relates to a system, a method, and a product for using machine learning models to quantify and/or improve trust in conversations. The system includes a non-transitory memory; and a processor in communication with the non-transitory memory. The processor executes the instructions to cause the system to: obtain a set of vocal features and a set of text features for each sam…
Who is the assignee on this patent?
Accenture Global Solutions Ltd
What technology area does this patent fall under?
Primary CPC classification G10L15/063. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 9, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).