System and method for quality evaluation of collaborative text inputs

US10482176B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10482176-B2
Application number: US-201815920243-A
Country: US
Kind code: B2
Filing date: Mar 13, 2018
Priority date: Oct 17, 2017
Publication date: Nov 19, 2019
Grant date: Nov 19, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This disclosure relates generally to quality evaluation of collaborative text input, and more particularly to system and method for quality evaluation of collaborative text inputs using Long Short Term Memory (LSTM) networks. In one embodiment, the method includes receiving an input data associated with a task to be accomplished collaboratively and sequentially by a plurality of contributors. The input data includes task-wise data sequence of contributor's post-edit submissions. A plurality of features are extracted from the input data. Based on the plurality of features, a plurality of input sequences are constructed. The input sequences include a plurality of concatenated feature vectors, where each of the concatenated feature vectors includes a post-edit feature vector and a contributor representation feature vector. The input sequences are modelled as a LSTM network, where the LSTM network is utilized to train a binary classifier for quality evaluation of the post-edit submission.
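The pipeline in the abstract can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names (`one_hot`, `build_sequence`, `init_lstm`, `lstm_classify`), the feature values, the hidden size, and the random untrained weights are all assumptions made for illustration. The sketch concatenates each post-edit feature vector with a one-hot contributor vector, then runs the resulting sequence through a standard LSTM cell with a sigmoid output to obtain one class-1 probability per submission.

```python
import math
import random

def one_hot(index, size):
    """Contributor representation: 1.0 at the contributor's index, 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def build_sequence(submissions, num_contributors):
    """Concatenate each post-edit feature vector with its contributor one-hot vector."""
    return [list(feats) + one_hot(cid, num_contributors) for feats, cid in submissions]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_lstm(input_size, hidden_size, seed=0):
    """Random, untrained weights (illustrative only): one (W, b) pair per gate
    plus an output weight vector for the binary classifier."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    gates = {name: (mat(hidden_size, input_size + hidden_size), [0.0] * hidden_size)
             for name in ("i", "f", "o", "g")}
    out_w = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
    return gates, out_w

def lstm_classify(sequence, gates, out_w):
    """Return one probability per time step that the submission belongs to class 1."""
    hidden_size = len(out_w)
    h = [0.0] * hidden_size  # hidden state
    c = [0.0] * hidden_size  # cell state
    probs = []
    for x in sequence:
        z = x + h  # current input concatenated with the previous hidden state
        def gate(name, act):
            w, b = gates[name]
            return [act(sum(wi * zi for wi, zi in zip(row, z)) + bj)
                    for row, bj in zip(w, b)]
        i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
        g = gate("g", math.tanh)
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        probs.append(sigmoid(sum(w * hj for w, hj in zip(out_w, h))))
    return probs

# Hypothetical task: two sequential submissions, each (post-edit features, contributor id).
submissions = [((0.10, 0.05, 0.20, 12.0, 0.3), 0),
               ((0.00, 0.10, 0.05, 12.0, 0.6), 2)]
seq = build_sequence(submissions, num_contributors=3)
gates, out_w = init_lstm(input_size=len(seq[0]), hidden_size=4)
probs = lstm_classify(seq, gates, out_w)
labels = [1 if p >= 0.5 else 0 for p in probs]  # 1 = improved, 0 = not improved
```

Because the weights here are random, the labels carry no meaning; in practice the weights would be fitted against labelled submissions, and the 0.5 threshold is likewise an assumed choice.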

First claim

Opening claim text (preview).

What is claimed is:

1. A processor-implemented method for quality evaluation of collaborative text input, comprising: receiving, via one or more hardware processors, an input data associated with a task to be accomplished collaboratively and sequentially by a plurality of contributors, the input data comprising task-wise data sequence of contributor's post-edit submissions; extracting a plurality of features from the input data, via the one or more hardware processors; constructing, via the one or more hardware processors, a plurality of input sequences based on the plurality of features, an input sequence of the plurality of input sequences comprising a plurality of concatenated feature vectors, each of the concatenated feature vectors comprising a post-edit feature vector and a contributor representation feature vector; and modelling, via the one or more hardware processors, the plurality of input sequences as a Long Short Term Memory (LSTM) network, wherein the LSTM network is utilized to train a first binary classifier for quality evaluation of the post-edit submission.

2. The method of claim 1, wherein the post-edit submission is determined to be improved over prior post-edit submissions if the first binary classifier determines a binary class as 1, and post-edit submission is determined not to be improved if the first binary classifier determines the binary class as 0.

3. The method of claim 1, wherein the task comprises a translation of text, for each post-edit submission, the post-edit feature vector comprises an edit vector of a current submission, and wherein the post-edit feature vector comprises of ratios of word-wise inserts, deletions and substitutions made by a contributor to a selected parent submission with total number of words, word length of a source sentence associated with the task, and quantized and normalized time taken by the contributor to perform the task.

4. The method of claim 1, wherein for every post-edit submission, the contributor representation feature vector comprises one of one-hot vector encoding of dimensions of total number of contributors and learned contributor embedding by feeding the one-hot-vectors associated with the contributors to a neural network (NN) with a single linear hidden layer.

5. The method of claim 4, further comprising learning the contributor embedding during the training of the LSTM network.

6. The method of claim 1, further comprising training a second binary classifier, by the LSTM network to predict a probability of a contributor improving upon prior submissions.

7. The method of claim 6, wherein predicting the probability comprises: mapping a history of prior post-edit submissions into an intermediate embedding, the history of prior post-edit submissions obtained from the input data; applying output of the LSTM networks and current contributor embedding to a neural network (NN), wherein the output of the LSTM networks comprises the history embedding at the last time stamp; and predicting the probability of a contributor improving upon the prior submissions based on the output of the NN.

8. The method of claim 1, further comprising training an encoder-decoder architecture to predict sequence of ranks for each of the plurality of sequence of submissions based on the plurality of sequences of the concatenated feature vectors.

9. The method of claim 8, wherein training the encoder-decoder architecture comprises: applying the sequence of concatenated feature vectors into the encoder-decoder architecture, the encoder-decoder architecture predicts a sequence of binary labels generated from selected expert provided ranks of the post-edit submissions as ordinal labels; obtaining, from the encoder, the embedding of the last hidden state which represents the history of post-edit; feeding the embedding to the decoder which learns the sequence of binary labels for each time step to provide a binary sequence; and calculating ranks based on the binary sequence obtained from the encoder-decoder.

10. A system for quality evaluation of collaborative text input, the system comprising: at least one memory storing instructions; and one or more hardware processors coupled to said at least one memory, wherein said one or more hardware processors are configured by said instructions to: receive an input data associated with a task to be accomplished collaboratively and sequentially by a plurality of contributors, the input data comprising task-wise data sequence of contributor's post-edit submissions; extract a plurality of features from the input data; construct a plurality of input sequences based on the plurality of features, an input sequence of the plurality of input sequences comprising a plurality of concatenated feature vectors, each of the concatenated feature vectors comprising a post-edit feature vector and a contributor representation feature vector; and model the plurality of input sequences as a Long Short Term Memory (LSTM) network, wherein the LSTM network is utilized to train a first binary classifier for quality evaluation of the post-edit submission.

11. The system of claim 10, wherein the one or more hardware processors are configured by the instructions to determine if the post-edit submission is improved over prior post-edit submissions when the first binary classifier determines a binary class as 1, and determines the post-edit submission not to be improved when the first binary classifier determines the binary class as 0.

12. The system of claim 10, wherein the task comprises a translation of text, for every post-edit submission, the post-edit feature vector comprises an edit vector of a current submission, and wherein the post-edit feature vector comprises of ratios of word-wise inserts, deletions and substitutions made by a contributor to a selected parent submission with total number of words; word length of a source sentence, associated with the task, and quantized and normalized time taken by the contributor to perform the task.

13. The system of claim 10, wherein for every post-edit submission, the contributor representation feature vector comprises one of one-hot vector encoding of dimensions of total number of contributors and learned contributor embedding by feeding the one-hot-vectors associated with the contributors to a neural network (NN) with a single linear hidden layer.

14. The system of claim 13, wherein the one or more hardware processors are configured by the instructions to learn the contributor embedding during the training of the LSTM network.

15. The system of claim 10, wherein the one or more hardware processors are configured by the instructions to train a second binary classifier, by the LSTM network to predict a probability of a contributor improving upon prior submissions.

16. The system of claim 15, wherein to predict the probability, the one or more hardware processors are configured by the instructions to: map a history of prior post-edit submissions into an intermediate embedding, the history of prior post-edit submissions obtained from the input data; apply output of the LSTM networks and current contributor embedding to a neural network (NN), wherein the output of the LSTM networks comprises the history embedding at the last time stamp; and predict the probability of a contributor improving upon the prior submissions based on the output of the NN.

17. The system of claim 10, wherein the one or more hardware processors are configured by the instructions to train an encoder-decoder architecture to predict sequence of ranks for each of the plurality of
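
The post-edit feature vector described in claims 3 and 12 can be sketched as follows. This is one possible reading, not the patent's implementation. It assumes that the ratios are taken against the word count of the parent submission, that edits are counted with a word-level Levenshtein alignment, and that time is quantized into fixed-width bins and scaled to [0, 1]. The names `edit_counts` and `post_edit_features` and the parameters `bin_seconds` and `num_bins` are hypothetical.

```python
def edit_counts(parent_words, child_words):
    """Count word-wise inserts, deletions and substitutions turning parent into child."""
    n, m = len(parent_words), len(child_words)
    # dist[i][j] = minimum edits between the first i parent words and first j child words
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if parent_words[i - 1] == child_words[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match or substitution
    # Walk back through the table to label each edit on one optimal alignment.
    inserts = deletes = subs = 0
    i, j = n, m
    while i > 0 or j > 0:
        differs = i > 0 and j > 0 and parent_words[i - 1] != child_words[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + int(differs):
            subs += int(differs)
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            deletes += 1
            i -= 1
        else:
            inserts += 1
            j -= 1
    return inserts, deletes, subs

def post_edit_features(parent, child, source, seconds, bin_seconds=30, num_bins=10):
    """Build [insert ratio, delete ratio, substitution ratio, source length, time].

    Assumptions: ratios use the parent's word count; time is bucketed into
    num_bins bins of bin_seconds each and scaled to the range [0, 1].
    """
    parent_words, child_words = parent.split(), child.split()
    inserts, deletes, subs = edit_counts(parent_words, child_words)
    total = max(len(parent_words), 1)  # avoid division by zero for an empty parent
    time_bin = min(int(seconds // bin_seconds), num_bins - 1)
    return [inserts / total, deletes / total, subs / total,
            float(len(source.split())), time_bin / (num_bins - 1)]

features = post_edit_features(
    parent="the cat sat on the mat",
    child="the black cat sat on a mat",
    source="le chat est assis sur le tapis",
    seconds=95.0,
)
```

In this example the child differs from the parent by one inserted word and one substituted word, so the first three entries are 1/6, 0 and 1/6. Other alignments with the same total edit distance can split the counts differently, which is one reason the counting rule is flagged as an assumption.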

Assignees

Tata Consultancy Services Ltd

Inventors

Classifications

  • Quality analysis or management · CPC title

  • G06F40/289 · Primary

    Phrasal analysis, e.g. finite state techniques or chunking · CPC title

  • Editing, e.g. inserting or deleting · CPC title

  • for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10482176B2 cover?
This disclosure relates generally to quality evaluation of collaborative text input, and more particularly to system and method for quality evaluation of collaborative text inputs using Long Short Term Memory (LSTM) networks. In one embodiment, the method includes receiving an input data associated with a task to be accomplished collaboratively and sequentially by a plurality of contributors. T…
Who is the assignee on this patent?
Tata Consultancy Services Ltd
What technology area does this patent fall under?
Primary CPC classification G06Q10/06395. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 19, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).