Method for predicting business income from user transaction data

US10997672B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10997672-B2
Application number: US-201715610596-A
Country: US
Kind code: B2
Filing date: May 31, 2017
Priority date: May 31, 2017
Publication date: May 4, 2021
Grant date: May 4, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes obtaining data related to a plurality of historical transactions, where each historical transaction is associated with a label based on a click stream created by the first user, generating a vector of features from the data related to each historical transaction, training, using the vectors and labels, a multinomial classifier to generate a probability that a specific transaction belongs to a specific classification with respect to income, obtaining data related to a new transaction from a financial stream for a second financial account of a second user of the financial service, generating a new vector of features from the data related to the new transaction, determining a classification with respect to income for the new transaction, and presenting the classification to the second user for review in a view of a graphical user interface.
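The flow in the abstract (tokenize a transaction, build a feature vector, train a multinomial classifier, score a new transaction) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the classifier family, so a small multinomial naive Bayes stands in for it, and the stop-word list, labels, and transaction strings are all made up for the example.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the patent's custom dictionary is not given here.
STOP_WORDS = {"pos", "debit", "ref"}


def unigrams(description):
    """Split a transaction description into lowercase single-word tokens."""
    return [t for t in description.lower().split() if t not in STOP_WORDS]


class MultinomialNaiveBayes:
    """Tiny multinomial classifier over unigram counts (Laplace smoothing).

    Assumption: naive Bayes is a stand-in, not the patent's stated model.
    """

    def fit(self, token_lists, labels):
        self.class_counts = Counter(labels)
        self.token_counts = {c: Counter() for c in self.class_counts}
        for tokens, label in zip(token_lists, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for tokens in token_lists for t in tokens}
        return self

    def predict_proba(self, tokens):
        """Return {class: probability} for one tokenized transaction."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for c, n_docs in self.class_counts.items():
            n_tokens = sum(self.token_counts[c].values())
            score = math.log(n_docs / total)
            for t in tokens:
                if t in self.vocab:  # ignore tokens never seen in training
                    score += math.log(
                        (self.token_counts[c][t] + 1) / (n_tokens + len(self.vocab))
                    )
            log_scores[c] = score
        # Normalize in log space to avoid underflow.
        peak = max(log_scores.values())
        exp_scores = {c: math.exp(s - peak) for c, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {c: v / norm for c, v in exp_scores.items()}


# Made-up labeled history; in the abstract, labels come from user click streams.
history = [
    ("STRIPE TRANSFER PAYOUT", "business_income"),
    ("SQUARE DEPOSIT PAYOUT", "business_income"),
    ("STARBUCKS COFFEE POS", "not_income"),
    ("SHELL GAS POS DEBIT", "not_income"),
]
model = MultinomialNaiveBayes().fit(
    [unigrams(text) for text, _ in history],
    [label for _, label in history],
)

# Score a new, unseen transaction and pick the most probable class.
probs = model.predict_proba(unigrams("STRIPE PAYOUT REF 123"))
classification = max(probs, key=probs.get)
print(classification, round(probs[classification], 3))
```

In this toy run the new transaction shares the tokens "stripe" and "payout" with the income examples, so the income class receives the higher probability. A production system would more likely use a sparse matrix library and a tested classifier implementation than hand-rolled counts.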

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method, comprising: generating a vector of features from data related to a historical transaction; generating a probability that the historical transaction belongs to a specific classification with respect to income; training a multinomial classifier using the vector, a label associated with the historical transaction, and the probability; assigning weights to a sparse matrix made up of a plurality of vectors that include the vector to train the multinomial classifier; obtaining data related to a new transaction from a data stream for an account of an online service; splitting the data related to the new transaction into a set of unigrams; generating a new vector of features from the data related to the new transaction, the new vector including a set of values that correspond and are assigned to the set of unigrams; determining a classification with respect to income for the new transaction by applying the multinomial classifier to the new vector; labeling the new transaction with the classification; presenting the classification to a view of a graphical user interface; and populating, using the classification, one or more fields of a form that is maintained by the online service.

2. The method of claim 1, further comprising: receiving a reclassification of the new transaction; populating, using the reclassification instead of the classification, one or more fields of the form that is maintained by the online service; and updating, using the reclassification, the multinomial classifier.

3. The method of claim 1, wherein at least one of the features has been filtered using a custom stop-word dictionary developed through empirical testing of the multinomial classifier, and wherein the classification is based on a probability generated by applying the multinomial classifier to the new vector.

4. The method of claim 1, wherein one of the features identifies a week day on which the historical transaction occurred.

5. The method of claim 1, further comprising: applying a threshold based on a precision-recall curve to the probability when determining the classification with respect to income for the new transaction.

6. The method of claim 1, further comprising the operation of: retraining the multinomial classifier at an end of a predetermined period using transactions which occurred during the predetermined period.

7. The method of claim 1, wherein the online service is a massively multi-user online service.

8. The computer implemented method of claim 1, further comprising: obtaining data related to a plurality of historical transactions, wherein each historical transaction is associated with a second account provided by the online service and with a label based on a click stream of graphical user interface interactions, and wherein the label identifies the historical transaction as belonging to a specific classification with respect to income.

9. The method of claim 8, further comprising: obtaining data related to tax filing for a plurality of accounts of the online service; mining the data related to tax filing and the historical transactions using a clustering technique to identify potential sources of income; and training the multinomial classifier to generate a probability that a specific transaction is one of the potential sources of income.

10. A non-transitory computer-readable storage medium storing instructions, which when executed, perform operations as follows: generate a vector of features from the data related to each historical transaction; generate a probability that the historical transaction belongs to a specific classification with respect to income; train a multinomial classifier using the vector, a label associated with the historical transaction, and the probability; assign weights to a sparse matrix made up of a plurality of vectors that include the vector to train the multinomial classifier; obtain data related to a new transaction from a data stream for an account of an online service; split the data related to the new transaction into a set of unigrams; generate a new vector of features from the data related to the new transaction, the new vector including a set of values that correspond and are assigned to the set of unigrams; determine a classification with respect to income for the new transaction by applying the multinomial classifier to the new vector; label the new transaction with the classification; present the classification to a graphical user interface; and populate, using the classification, one or more fields of a form that is maintained by the online service.

11. The non-transitory computer-readable storage medium of claim 10, further comprising instructions to: receive a reclassification of the new transaction; populate, using the reclassification instead of the classification, one or more fields of the form that is maintained by the online service; and update, using the reclassification, the multinomial classifier.

12. The non-transitory computer-readable storage medium of claim 10, wherein at least one of the features has been filtered using a custom stop-word dictionary developed through empirical testing of the multinomial classifier, and wherein the classification is based on a probability generated by applying the multinomial classifier to the new vector.

13. The non-transitory computer-readable storage medium of claim 10, wherein one of the features identifies a week day on which the historical transaction occurred.

14. The non-transitory computer-readable storage medium of claim 10, further comprising instructions to: apply a threshold based on a precision-recall curve to a probability when determining the classification with respect to income for the new transaction.

15. The non-transitory computer-readable storage medium of claim 10, further comprising instructions to: retrain the multinomial classifier at an end of a predetermined period using transactions which occurred during the predetermined period.

16. The non-transitory computer-readable storage medium of claim 10, wherein the online service is a massively multi-user online service.

17. The non-transitory computer-readable storage medium of claim 10, further comprising instructions to: obtain data related to a plurality of historical transactions, wherein each historical transaction is associated with a second account provided by the online service and with a label based on a click stream of graphical user interface interactions, and wherein the label identifies the historical transaction as belonging to a specific classification with respect to income.

18. The non-transitory computer-readable storage medium of claim 17, further comprising instructions to: obtain data related to tax filing for a plurality of accounts of the online service; mine the data related to tax filing and the historical transactions using a clustering technique to identify potential sources of income; and train the multinomial classifier to generate the probability that a specific transaction is one of the potential sources of income.

19. A system comprising: a processor; a storage storing instructions which, when executed by the processor, perform operations as follows: generate a vector of features from the data related to each historical transaction; generate a probability that the historical transaction belongs to a specific classification with respect to income; train a multinomial classifier using the vecto
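Claims 5 and 14 recite applying a threshold based on a precision-recall curve to the classifier's probability. The claims do not say how the threshold is chosen, so the sketch below shows one plausible reading: compute precision and recall at each candidate cutoff on held-out data, then take the cutoff with the highest recall among those meeting a precision target. The function names, the precision target, and the scores are all hypothetical.

```python
def precision_recall_points(scores, labels):
    """Return (threshold, precision, recall) for each distinct score as a cutoff.

    scores: predicted income probabilities; labels: 1 for income, 0 otherwise.
    Assumes at least one positive label so that recall is defined.
    """
    positives = sum(labels)
    points = []
    for threshold in sorted(set(scores)):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y)
        fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
        # tp + fp >= 1 because the threshold is itself one of the scores.
        points.append((threshold, tp / (tp + fp), tp / positives))
    return points


def pick_threshold(points, min_precision):
    """Among cutoffs meeting the precision target, return the one with highest recall.

    Returns None when no cutoff reaches the target.
    """
    eligible = [p for p in points if p[1] >= min_precision]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p[2])[0]


# Made-up validation scores and true labels.
scores = [0.95, 0.9, 0.8, 0.6, 0.55, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0]

points = precision_recall_points(scores, labels)
threshold = pick_threshold(points, min_precision=0.8)


def classify(probability, threshold):
    """Label a new transaction as income only when its probability clears the cutoff."""
    return "income" if probability >= threshold else "not_income"


print(threshold, classify(0.7, threshold), classify(0.4, threshold))
```

Raising the precision target pushes the cutoff higher, which means fewer transactions are auto-labeled as income but fewer of those labels are wrong. That trade-off matters when the label is used to populate form fields, as in claim 1.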

Assignees

Inventors

Classifications

  • Machine learning · CPC title

  • G06Q40/123 · Primary

    Tax preparation or submission · CPC title

  • Query processing support for facilitating data mining operations in structured databases · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10997672B2 cover?
A method includes obtaining data related to a plurality of historical transactions, where each historical transaction is associated with a label based on a click stream created by the first user, generating a vector of features from the data related to each historical transaction, training, using the vectors and labels, a multinomial classifier to generate a probability that a specific transact…
Who is the assignee on this patent?
Chen Meng, Pei Lei, Jennings Zachary Grove, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06Q40/123. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 4, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).