Classifying structural features of a digital document by feature type using machine learning
US-11003862-B2 · May 11, 2021 · US
US11423219B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11423219-B2 |
| Application number | US-202016823777-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 19, 2020 |
| Priority date | Mar 19, 2020 |
| Publication date | Aug 23, 2022 |
| Grant date | Aug 23, 2022 |
Official abstract text for this publication.
One embodiment provides a method, including: obtaining a plurality of previously submitted application documents, wherein each of the previously submitted application documents comprises information provided by a user who initiated a given previously submitted application document; clustering the plurality of previously submitted application documents into clusters of application documents based upon topics of the previously submitted application documents; selecting a representative application document; identifying entities contained within a given representative application document, wherein each of the entities corresponds to information to be entered into a new application document created from the given representative application document; and engaging in a dialogue with a user to create the new application document utilizing a similar representative application document to request information from the user, wherein the similar representative application document comprises a representative application document of a cluster having a topic similar to a topic of the new application document.
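The abstract does not name a clustering algorithm or say how the representative document is chosen. As a minimal sketch of the first two steps, assuming bag-of-words cosine similarity as a stand-in for topic similarity, a greedy threshold clustering, and the cluster medoid as the representative (all illustrative choices; `cluster_by_topic`, `representative`, the stop-word list, and the threshold are hypothetical and not taken from the patent):

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {"a", "for", "from", "the", "to", "of"}


def vectorize(text):
    """Lowercased bag-of-words counts; a crude stand-in for topic features."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_by_topic(docs, threshold=0.3):
    """Greedy single pass: a document joins the first cluster whose seed
    (first member) is at least `threshold` similar, else starts a new cluster."""
    vecs = [vectorize(d) for d in docs]
    clusters = []  # each cluster is a list of indices into docs
    for i, v in enumerate(vecs):
        for members in clusters:
            if cosine(v, vecs[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters, vecs


def representative(members, vecs):
    """Medoid: the member with the highest total similarity to the other members."""
    return max(
        members,
        key=lambda i: sum(cosine(vecs[i], vecs[j]) for j in members if j != i),
    )


docs = [
    "Request for leave from work for medical reasons",
    "Application for medical leave from work",
    "Application for a bank loan to buy a house",
    "Request for a home loan from the bank",
    "Medical leave request",
]
clusters, vecs = cluster_by_topic(docs)
reps = [representative(members, vecs) for members in clusters]
```

A production system would more plausibly use a topic model or learned embeddings; the sketch only shows how "cluster, then pick one representative per cluster" fits together.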
Opening claim text (preview).
What is claimed is:
1. A method, comprising: obtaining a plurality of previously submitted application documents, wherein each of the previously submitted application documents has a corresponding topic and comprises information provided by a user who initiated a given previously submitted application document, wherein at least a portion of the information comprises unstructured information; clustering the plurality of previously submitted application documents into clusters of application documents based upon the topics of the previously submitted application documents, wherein the previously submitted application documents within a given cluster have similar topics among the previously submitted application documents included in the given cluster; selecting, for each cluster, a representative application document; identifying, for each representative application document of each of the clusters, entities contained within a given representative application document, wherein each of the entities in the representative application document corresponds to information to be entered into a new application document created from the given representative application document; and engaging in a dialogue with a user to create the new application document utilizing a similar representative application document to request information from the user, wherein the engaging comprises identifying a topic of the new application document, wherein the similar representative application document comprises one of the representative application documents from one of the clusters, the one of the clusters having a topic similar to the topic of the new application document, wherein the engaging comprises populating information received from the user via the dialogue into the new application document by identifying one of the entities in the one of the representative application documents corresponding to the information received from the user.
2. The method of claim 1, comprising generating the new application document using the representative application document as a template for the new application document; and populating the new application document with the information provided by the user.
3. The method of claim 2, comprising populating the new application document with information identified from a context of the user, the context being determined using a secondary source.
4. The method of claim 3, comprising refining information included within the populated application document such that the included information is both (i) grammatically and (ii) semantically accurate.
5. The method of claim 1, comprising segmenting the representative application document into different sections utilizing a pre-trained classifier.
6. The method of claim 1, comprising determining at least one document to be attached to the new application document.
7. The method of claim 1, comprising: determining that a representative application document having a topic similar to the topic of the new application document is unavailable; requesting additional information from the user regarding additional details of the new application document; and identifying, utilizing the additional information, at least one of the plurality of previously submitted application documents having a similarity to the new application document.
8. The method of claim 7, comprising utilizing, during the dialogue with the user, the at least one of the plurality of previously submitted application documents having a similarity as the similar representative application document.
9. The method of claim 1, wherein the identifying entities comprises utilizing an entity model that is trained for entity recognition.
10. The method of claim 1, wherein the topic for the new application document is identified by the user.
11. An apparatus, comprising: at least one processor; and a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising: computer readable program code configured to obtain a plurality of previously submitted application documents, wherein each of the previously submitted application documents has a corresponding topic and comprises information provided by a user who initiated a given previously submitted application document, wherein at least a portion of the information comprises unstructured information; computer readable program code configured to cluster the plurality of previously submitted application documents into clusters of application documents based upon the topics of the previously submitted application documents, wherein the previously submitted application documents within a given cluster have similar topics among the previously submitted application documents included in the given cluster; computer readable program code configured to select, for each cluster, a representative application document; computer readable program code configured to identify, for each representative application document of each of the clusters, entities contained within a given representative application document, wherein each of the entities in the representative application document corresponds to information to be entered into a new application document created from the given representative application document; and computer readable program code configured to engage in a dialogue with a user to create the new application document utilizing a similar representative application document to request information from the user, wherein the engaging comprises identifying a topic of the new application document, wherein the similar representative application document comprises one of the representative application documents from one of the clusters, the one of the clusters having a topic similar to a topic of the new application document, wherein the engaging comprises populating information received from the user via the dialogue into the new application document by identifying one of the entities in the one of the representative application documents corresponding to the information received from the user.
12. A computer program product, comprising: a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code executable by a processor and comprising: computer readable program code configured to obtain a plurality of previously submitted application documents, wherein each of the previously submitted application documents has a corresponding topic and comprises information provided by a user who initiated a given previously submitted application document, wherein at least a portion of the information comprises unstructured information; computer readable program code configured to cluster the plurality of previously submitted application documents into clusters of application documents based upon the topics of the previously submitted application documents, wherein the previously submitted application documents within a given cluster have similar topics among the previously submitted application documents included in the given cluster; computer readable program code configured to select, for each cluster, a representative application document; computer readable program code configured to identify, for each representative application document of each of the clusters, entities contained within a given representative application document, wherein each of the entities in the representative application document corresponds to information to be entered into a new application document created from the given representative application document; and com
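The dialogue step of claim 1 and the "no similar representative" branch of claim 7 can be illustrated with a small sketch. It assumes that entities are already available as plain field names and that topic similarity is simple word overlap; `Representative`, `NewApplication`, `find_representative`, `run_dialogue`, and the `ask` callback are hypothetical names introduced here, not terms from the claims.

```python
from dataclasses import dataclass, field


@dataclass
class Representative:
    topic: str
    entities: list  # field names identified in the representative document


@dataclass
class NewApplication:
    topic: str
    fields: dict = field(default_factory=dict)


def find_representative(topic, representatives):
    """Return the representative sharing the most topic words, or None if none overlap."""
    words = set(topic.lower().split())
    best, best_overlap = None, 0
    for rep in representatives:
        overlap = len(words & set(rep.topic.lower().split()))
        if overlap > best_overlap:
            best, best_overlap = rep, overlap
    return best


def run_dialogue(topic, representatives, ask):
    """Ask the user for each entity of the matched representative and
    populate the answers into a new application document."""
    rep = find_representative(topic, representatives)
    if rep is None:
        # Claim 7 branch: no similar representative is available, so the
        # caller would request additional details from the user.
        return None
    doc = NewApplication(topic=topic)
    for entity in rep.entities:
        doc.fields[entity] = ask(f"Please provide: {entity}")
    return doc


representatives = [
    Representative("medical leave", ["employee name", "start date"]),
    Representative("home loan", ["applicant name", "amount"]),
]
# Scripted answers stand in for a live user in this sketch.
scripted = {
    "Please provide: applicant name": "A. User",
    "Please provide: amount": "1000",
}
new_doc = run_dialogue("loan for a home", representatives, scripted.__getitem__)
```

Passing `ask` as a callback keeps the slot-filling loop independent of the chat interface, so the same logic can be driven by a console prompt, a chatbot, or scripted answers as above.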
Form filling; Merging · CPC title
Templates · CPC title
Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title
Grammatical analysis; Style critique · CPC title
Document management systems · CPC title