Composite machine-learning system for label prediction and training data collection
US-2018285773-A1 · Oct 4, 2018 · US
US10997672B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10997672-B2 |
| Application number | US-201715610596-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 31, 2017 |
| Priority date | May 31, 2017 |
| Publication date | May 4, 2021 |
| Grant date | May 4, 2021 |
Official abstract text for this publication.
A method includes obtaining data related to a plurality of historical transactions, where each historical transaction is associated with a label based on a click stream created by the first user, generating a vector of features from the data related to each historical transaction, training, using the vectors and labels, a multinomial classifier to generate a probability that a specific transaction belongs to a specific classification with respect to income, obtaining data related to a new transaction from a financial stream for a second financial account of a second user of the financial service, generating a new vector of features from the data related to the new transaction, determining a classification with respect to income for the new transaction, and presenting the classification to the second user for review in a view of a graphical user interface.
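The flow in the abstract (label historical transactions, build feature vectors, train a multinomial classifier, classify a new transaction) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the text here does not name the model, so multinomial naive Bayes is swapped in as one possible multinomial classifier, and the stop-word set, the `featurize` helper, the toy transactions, and the `income` / `not_income` labels are all hypothetical. The unigram split and the weekday feature come from the claims rather than the abstract.

```python
import math
from collections import Counter, defaultdict
from datetime import date

# Hypothetical stop words; the source does not list the dictionary's entries.
STOP_WORDS = {"the", "payment", "transfer"}

def featurize(description, when):
    """Split a description into unigrams, drop stop words, add a weekday token."""
    tokens = [t for t in description.lower().split() if t not in STOP_WORDS]
    tokens.append("weekday=" + str(when.weekday()))
    return Counter(tokens)  # sparse vector: token -> count

class MultinomialNB:
    """Small multinomial naive Bayes with Laplace smoothing (illustrative only)."""

    def fit(self, vectors, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for vec, label in zip(vectors, labels):
            self.token_counts[label].update(vec)
            self.vocab.update(vec)
        return self

    def predict_proba(self, vec):
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n in self.class_counts.items():
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for token, count in vec.items():
                if token in self.vocab:  # skip tokens never seen in training
                    smoothed = (self.token_counts[label][token] + 1) / denom
                    score += count * math.log(smoothed)
            log_scores[label] = score
        # Turn log scores into probabilities that sum to 1.
        peak = max(log_scores.values())
        exp = {k: math.exp(v - peak) for k, v in log_scores.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}

# Toy labeled history (hypothetical data standing in for click-stream labels).
history = [
    ("uber driver payout", date(2017, 3, 3), "income"),
    ("lyft driver payout", date(2017, 3, 10), "income"),
    ("coffee shop purchase", date(2017, 3, 4), "not_income"),
    ("grocery store purchase", date(2017, 3, 5), "not_income"),
]
vectors = [featurize(text, when) for text, when, _ in history]
labels = [label for _, _, label in history]
model = MultinomialNB().fit(vectors, labels)

# Classify a new transaction and keep the per-class probabilities.
new_vector = featurize("the uber payout", date(2017, 3, 17))
probs = model.predict_proba(new_vector)
classification = max(probs, key=probs.get)
```

The per-class probabilities are kept alongside the chosen label so that a later step, such as a review screen or a probability threshold, can use them.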
Opening claim text (preview).
What is claimed is:

1. A computer implemented method, comprising: generating a vector of features from data related to a historical transaction; generating a probability that the historical transaction belongs to a specific classification with respect to income; training a multinomial classifier using the vector, a label associated with the historical transaction, and the probability; assigning weights to a sparse matrix made up of a plurality of vectors that include the vector to train the multinomial classifier; obtaining data related to a new transaction from a data stream for an account of an online service; splitting the data related to the new transaction into a set of unigrams; generating a new vector of features from the data related to the new transaction, the new vector including a set of values that correspond and are assigned to the set of unigrams; determining a classification with respect to income for the new transaction by applying the multinomial classifier to the new vector; labeling the new transaction with the classification; presenting the classification to a view of a graphical user interface; and populating, using the classification, one or more fields of a form that is maintained by the online service.

2. The method of claim 1, further comprising: receiving a reclassification of the new transaction; populating, using the reclassification instead of the classification, one or more fields of the form that is maintained by the online service; and updating, using the reclassification, the multinomial classifier.

3. The method of claim 1, wherein at least one of the features has been filtered using a custom stop-word dictionary developed through empirical testing of the multinomial classifier, and wherein the classification is based on a probability generated by applying the multinomial classifier to the new vector.

4. The method of claim 1, wherein one of the features identifies a week day on which the historical transaction occurred.

5. The method of claim 1, further comprising: applying a threshold based on a precision-recall curve to the probability when determining the classification with respect to income for the new transaction.

6. The method of claim 1, further comprising the operation of: retraining the multinomial classifier at an end of a predetermined period using transactions which occurred during the predetermined period.

7. The method of claim 1, wherein the online service is a massively multi-user online service.

8. The computer implemented method of claim 1, further comprising: obtaining data related to a plurality of historical transactions, wherein each historical transaction is associated with a second account provided by the online service and with a label based on a click stream of graphical user interface interactions, and wherein the label identifies the historical transaction as belonging to a specific classification with respect to income.

9. The method of claim 8, further comprising: obtaining data related to tax filing for a plurality of accounts of the online service; mining the data related to tax filing and the historical transactions using a clustering technique to identify potential sources of income; and training the multinomial classifier to generate a probability that a specific transaction is one of the potential sources of income.

10. A non-transitory computer-readable storage medium storing instructions, which when executed, perform operations as follows: generate a vector of features from the data related to each historical transaction; generate a probability that the historical transaction belongs to a specific classification with respect to income; train a multinomial classifier using the vector, a label associated with the historical transaction, and the probability; assign weights to a sparse matrix made up of a plurality of vectors that include the vector to train the multinomial classifier; obtain data related to a new transaction from a data stream for an account of an online service; split the data related to the new transaction into a set of unigrams; generate a new vector of features from the data related to the new transaction, the new vector including a set of values that correspond and are assigned to the set of unigrams; determine a classification with respect to income for the new transaction by applying the multinomial classifier to the new vector; label the new transaction with the classification; present the classification to a graphical user interface; and populate, using the classification, one or more fields of a form that is maintained by the online service.

11. The non-transitory computer-readable storage medium of claim 10, further comprising instructions to: receive a reclassification of the new transaction; populate, using the reclassification instead of the classification, one or more fields of the form that is maintained by the online service; and update, using the reclassification, the multinomial classifier.

12. The non-transitory computer-readable storage medium of claim 10, wherein at least one of the features has been filtered using a custom stop-word dictionary developed through empirical testing of the multinomial classifier, and wherein the classification is based on a probability generated by applying the multinomial classifier to the new vector.

13. The non-transitory computer-readable storage medium of claim 10, wherein one of the features identifies a week day on which the historical transaction occurred.

14. The non-transitory computer-readable storage medium of claim 10, further comprising instructions to: apply a threshold based on a precision-recall curve to a probability when determining the classification with respect to income for the new transaction.

15. The non-transitory computer-readable storage medium of claim 10, further comprising instructions to: retrain the multinomial classifier at an end of a predetermined period using transactions which occurred during the predetermined period.

16. The non-transitory computer-readable storage medium of claim 10, wherein the online service is a massively multi-user online service.

17. The non-transitory computer-readable storage medium of claim 10, further comprising instructions to: obtain data related to a plurality of historical transactions, wherein each historical transaction is associated with a second account provided by the online service and with a label based on a click stream of graphical user interface interactions, and wherein the label identifies the historical transaction as belonging to a specific classification with respect to income.

18. The non-transitory computer-readable storage medium of claim 17, further comprising instructions to: obtain data related to tax filing for a plurality of accounts of the online service; mine the data related to tax filing and the historical transactions using a clustering technique to identify potential sources of income; and train the multinomial classifier to generate the probability that a specific transaction is one of the potential sources of income.

19. A system comprising: a processor; a storage storing instructions which, when executed by the processor, perform operations as follows: generate a vector of features from the data related to each historical transaction; generate a probability that the historical transaction belongs to a specific classification with respect to income; train a multinomial classifier using the vecto
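Claims 5 and 14 recite applying a threshold based on a precision-recall curve to the classifier's probability. The sketch below shows one way such a threshold could be chosen; it is not taken from the patent. The selection rule (highest recall among thresholds that meet a target precision), the function names, the validation scores, and the `needs_review` fallback label are all assumptions made for illustration.

```python
def precision_recall_points(scores, labels):
    """Return (threshold, precision, recall) for each distinct score.

    scores: predicted probability of the positive class ("income").
    labels: 1 for a true positive-class example, 0 otherwise.
    Assumes at least one positive label is present.
    """
    positives = sum(labels)
    points = []
    for threshold in sorted(set(scores)):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y)
        fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
        # tp + fp >= 1 because the threshold is itself one of the scores.
        points.append((threshold, tp / (tp + fp), tp / positives))
    return points

def pick_threshold(points, min_precision):
    """Pick the eligible threshold with the highest recall, or None."""
    eligible = [p for p in points if p[1] >= min_precision]
    if not eligible:
        return None
    # Points are sorted by threshold, so ties in recall keep the lowest one.
    return max(eligible, key=lambda p: p[2])[0]

def classify(probability, threshold):
    """Label as income only when the probability clears the threshold."""
    return "income" if probability >= threshold else "needs_review"

# Hypothetical held-out scores and their true labels.
validation_scores = [0.95, 0.80, 0.60, 0.40, 0.20]
validation_labels = [1, 1, 0, 1, 0]

points = precision_recall_points(validation_scores, validation_labels)
strict = pick_threshold(points, min_precision=0.9)
lenient = pick_threshold(points, min_precision=0.7)
```

Raising the precision target moves the threshold up, which trades recall for fewer false income labels reaching the pre-populated form fields.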
Machine learning · CPC title
Tax preparation or submission · CPC title
Query processing support for facilitating data mining operations in structured databases · CPC title