System and method for a cloud based solution to track notes against business records
US-9817991-B2 · Nov 14, 2017 · US
US11210300B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11210300-B2 |
| Application number | US-201615147052-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 5, 2016 |
| Priority date | May 14, 2015 |
| Publication date | Dec 28, 2021 |
| Grant date | Dec 28, 2021 |
Official abstract text for this publication.
Systems and methods to infer or predict the proper placement of unstructured data (such as text, phrases, segments of phrases, alphanumeric characters) into a more structured format (such as a specific data field). In some embodiments, this is based on a user's prior assignment of similar unstructured data into a specific structure. In some embodiments, this may be based on other users' prior assignment of similar unstructured data into the specific structure. In yet other embodiments, this may be based on information obtained from business data used by a data processing platform to assist in operating the business (i.e., either business data or the output of a business application that processes the business data, such as an ERP, CRM, or eCommerce application).
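The idea in the abstract, placing a new free-form string into a field based on how similar strings were assigned before, can be illustrated with a minimal sketch. The code below is not the patented implementation. It assumes character trigrams as the n-grams, cosine similarity as the distance measure, and a plain k-nearest-neighbor vote. The names `char_ngrams`, `cosine`, `predict_field`, and the `HISTORY` sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Represent a string as a bag (sparse vector) of character n-grams."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_field(text, history, k=3):
    """Vote among the k most similar historical entries (k-nearest neighbors).

    `history` is a list of (unstructured_text, field_name) pairs recording
    earlier assignments of text to fields.
    """
    query = char_ngrams(text)
    ranked = sorted(
        history,
        key=lambda pair: cosine(query, char_ngrams(pair[0])),
        reverse=True,
    )
    votes = Counter(field for _, field in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical prior assignments made by a user (illustrative data only).
HISTORY = [
    ("jane.doe@example.com", "email"),
    ("bob@example.org", "email"),
    ("sales@example.net", "email"),
    ("555-867-5309", "phone"),
    ("555-123-4567", "phone"),
    ("555-222-0199", "phone"),
]

if __name__ == "__main__":
    print(predict_field("alice@example.com", HISTORY))
    print(predict_field("555-010-9999", HISTORY))
```

In this sketch the history stands in for a single user's prior assignments. The same function could be fed assignments from other users or from business application data, which are the other sources the abstract mentions.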
Opening claim text (preview).
What is claimed is:
1. A method of determining an assignment of one or more elements of data to a specific data field or to a set of data fields, comprising: training a machine learning algorithm to optimize identification of one or more structured data fields as destinations for elements of unstructured data based on historical entry of unstructured data into the structured data fields; accessing one or more sources of data to be processed for assignment to the specific data field or to the set of data fields; determining a relationship, association or correlation between samples of unstructured text and data fields that represent general text elements arranged in free-form strings using a natural language processing (NLP) technique that includes determining n-grams to represent each sample of unstructured text characters as a vector, determining, for each n-gram of the n-grams, an associated weight greater than zero and less than one based at least in part on an amount of time since a document containing the n-gram was cited by another document, with the weight reduced as the amount of time increases, and adding the highest weighted n-gram to a list of most likely candidates for placement into the specified data field or the set of data fields; identifying a most likely candidate text or string for placement into the specified data field or the set of data fields by applying the trained machine learning algorithm to the vector; adding the most likely candidate text or string to the list of most likely candidates for placement into the specified data field or the set of data fields; receiving a selection of one candidate from the list for placement into the specified data field or the set of data fields; in response to receiving the selection of the one candidate, using the one candidate as data values for the specified data field or the set of data fields; and storing the data values in a format or record associated with the specific data field or the set of data fields.
2. The method of claim 1, wherein at least one of the one or more sources of data is data associated with a specific task.
3. The method of claim 1, wherein at least one of the one or more sources of data is data associated with a specific data processing application or business area.
4. The method of claim 1, wherein at least one of the one or more sources of data is data associated with a specific time interval covering a lifetime of a product architecture or a time since a product architecting event.
5. The method of claim 4, wherein the data associated with the specific time interval is data that was generated within that time interval.
6. The method of claim 1, wherein at least one of the one or more sources of data is data associated with a specific set of users.
7. The method of claim 1, wherein the weights are at least in part a function of how recently a document containing the accessed data was entered into a system.
8. The method of claim 1, wherein the weights are at least in part a function of the amount of citation or incorporation by other documents of elements of the accessed data.
9. The method of claim 1, wherein the sources of data include data resident on a multi-tenant business data processing platform, the platform including tenant-specific data generated or utilized by one or more of a tenant-specific enterprise resource planning (ERP), customer relationship management (CRM), eCommerce, human resources (HR), or financial application.
10. The method of claim 1, wherein the machine learning technique includes application of a k-nearest neighbor approach to identifying the most likely candidate text or string, wherein the k-nearest neighbor approach is uncombined with a support vector machine approach.
11. The method of claim 1, wherein the amount of time since the document containing the n-gram was cited by another document is the minimum amount of time among several amounts of time since the document containing the n-gram was cited.
12. The method of claim 1, wherein the weights are calculated at least in part by dividing the minimum time among all times since the document containing the n-gram was cited by the total time since the document was entered into the system and subtracting a resulting quotient from one.
13. The method of claim 1, wherein: at least one of the one or more sources of data is data associated with a specific task, a specific data processing application or business area, and a specific set of users, and the data is generated within specific time interval covering a lifetime of a product architecture; the one or more sources of data include data resident on a multi-tenant business data processing platform, the platform including tenant-specific data generated or utilized by one or more of a tenant-specific eCommerce application; the weights are at least in part a function of how recently a document containing the accessed data was entered into a system; the machine learning technique includes application of a k-nearest neighbor approach to identifying the most likely candidate text or string that is uncombined with a support vector machine approach; the amount of time since the document containing the n-gram was cited by another document is the minimum amount of time among several amounts of time since the document containing the n-gram was cited.
14. A system for determining an assignment of one or more elements of data to a specific data field, comprising a database or data store containing a plurality of data records; one or more business related data processing applications installed in the system; a hardware processor programmed with a set of instructions, wherein, when executed by the hardware processor, the instructions cause the system to train a machine learning algorithm to optimize identification of one or more structured data fields as destinations for elements of unstructured data based on historical entry of unstructured data into the structured data fields; access one or more sources of data from the database or data store to be processed for assignment to the specific data field; determine a relationship, association or correlation between samples of unstructured text and data fields that represent general text elements arranged in free-form strings using a natural language processing (NLP) technique that includes determining n-grams to represent each sample of unstructured text characters as a vector, determine, for each n-gram of the n-grams, an associated weight greater than zero and less than one based at least in part on an amount of time since a document containing the n-gram was cited by another document, with the weight reduced as the amount of time increases, and adding the highest weighted n-gram to a list of most likely candidates for placement into the specified data field or the set of data fields; identify a most likely candidate text or string for placement into the specified data field by applying a machine learning technique to the vector; add the most likely candidate text or string to the list of most likely candidates for placement into the specified data field or the set of data fields; receive a selection of one candidate from the list for placement into the specified data field or set of data fields; in response to receiving the selection of the one candidate, use the one candidate as data values for the specified data field; and store the data values in a format or record associated with the specific data field.
15. The system of claim 14, wherein the one or more business related data processing applications include one
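The weighting rule in claims 11 and 12 reduces to simple arithmetic: take the shortest time since any citation of the document, divide it by the total time since the document was entered, and subtract the quotient from one. The sketch below works that through and then picks the highest weighted n-gram, as recited in claim 1. It is an illustration under stated assumptions, not the patentee's code. Times are assumed to be in days, and the function names and sample figures are hypothetical. The claims do not say how boundary cases are handled, so this sketch simply notes that the formula yields exactly 0 or 1 at the extremes, which fall outside the open interval required by claim 1.

```python
def citation_recency_weight(days_since_each_citation, days_since_entry):
    """Weight = 1 - (minimum time since a citation / total time since entry).

    `days_since_each_citation` holds one elapsed time per citation of the
    document containing the n-gram; the most recent citation (the minimum)
    is the one that counts, following claim 11.
    """
    if days_since_entry <= 0:
        raise ValueError("days_since_entry must be positive")
    if not days_since_each_citation:
        raise ValueError("at least one citation time is required")
    most_recent = min(days_since_each_citation)
    return 1.0 - most_recent / days_since_entry


def highest_weighted_ngram(ngram_stats):
    """Return (best_ngram, weights) for a mapping of
    n-gram -> (citation times, time since entry of its document)."""
    weights = {
        gram: citation_recency_weight(cited, entered)
        for gram, (cited, entered) in ngram_stats.items()
    }
    return max(weights, key=weights.get), weights


# Hypothetical figures: "net 30" sits in a document entered 100 days ago and
# last cited 25 days ago; "po box" sits in one entered 120 days ago and last
# cited 60 days ago.
STATS = {
    "net 30": ([25, 80], 100),
    "po box": ([60], 120),
}

if __name__ == "__main__":
    best, weights = highest_weighted_ngram(STATS)
    print(best, weights)
```

With these figures the weights are 1 - 25/100 = 0.75 and 1 - 60/120 = 0.5, so "net 30" would be added to the candidate list first. A longer gap since the last citation lowers the weight, which matches the requirement in claim 1 that the weight fall as the elapsed time grows.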
Data format conversion from or to a database · CPC title
Form filling; Merging · CPC title
Lexical analysis, e.g. tokenisation or collocates · CPC title
Machine learning · CPC title
Administration; Management · CPC title