Generating vector representations of documents
US-2015220833-A1 · Aug 6, 2015 · US
US10817781B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10817781-B2 |
| Application number | US-201715582534-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 28, 2017 |
| Priority date | Apr 28, 2017 |
| Publication date | Oct 27, 2020 |
| Grant date | Oct 27, 2020 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method includes receiving, via a graphical user interface including a plurality of document elements and a plurality of class elements, user input associating a first document element of the plurality of document elements with a first class element of the plurality of class elements. Each document element represents a corresponding document of a plurality of documents, and each class element represents a corresponding class of a plurality of classes. The method also includes generating a document classifier using supervised training data, where the supervised training data indicates, based on the user input, that a first document represented by the first document element is assigned to a first class associated with the first class element.
Opening claim text (preview).
What is claimed is: 1. A method comprising: receiving, via a graphical user interface including a plurality of document elements and a plurality of class elements, user input associating a first document element of the plurality of document elements with a first class element of the plurality of class elements, wherein each document element represents a corresponding document of a plurality of documents and each class element represents a corresponding class of a plurality of classes; updating, based on the user input, supervised training data to indicate that a first document represented by the first document element is assigned to a first class associated with the first class element; receiving, via the graphical user interface, a selection of a particular option; and based on receiving the selection of the particular option: generating multiple preliminary document classifiers based on a first portion of the supervised training data, each preliminary document classifier of the multiple preliminary document classifiers generated based on respective classifier generation settings; evaluating performance of the multiple preliminary document classifiers based on a second portion of the supervised training data; and generating a document classifier based on classifier generation settings of a best performing preliminary document classifier of the multiple preliminary document classifier. 2. The method of claim 1 , further comprising, before receiving the user input, receiving second user input defining the plurality of classes, wherein the graphical user interface is generated based on the second user input. 3. The method of claim 2 , wherein receiving the second user input includes receiving a selection to create a new folder and receiving information designating a name of the folder, wherein the name of the folder is associated with an identifier of a new class of the plurality of classes. 4. The method of claim 3 , further comprising receiving a selection to rename the folder and receiving information designating a new name of the folder, wherein the new name of the folder is associated with the identifier of the class. 5. The method of claim 1 , further comprising, responsive to the user input, generating a pointer and a tag, the pointer indicating a storage location of the first document in a file system and the tag identifying the first class. 6. The method of claim 1 , wherein a first preliminary document classifier of the multiple preliminary document classifiers is generated using a first machine learning process and a second preliminary document classifier of the multiple preliminary document classifiers is generated using a second machine learning process, wherein the first machine learning process is different from the second machine learning process. 7. The method of claim 1 , further comprising: after generating the document classifier, detecting that a rebuild condition is satisfied; and based on detecting that the rebuild condition is satisfied, generating an updated document classifier. 8. The method of claim 7 , wherein the document classifier is generated based on the supervised training data, wherein the updated document classifier is generated based on second supervised training data, wherein the supervised training data includes first data indicating that each document of a first set of documents is assigned to one or more classes of a first set of classes, wherein the second supervised training data includes second data indicating that each document of a second set of documents is assigned to one or more classes of a second set of classes, and wherein the first set of classes is different from the second set of classes. 9. The method of claim 7 , wherein the document classifier is generated based on the supervised training data, wherein the updated document classifier is generated based on second supervised training data, wherein the supervised training data includes first data indicating that each document of a first set of documents is assigned to one or more classes of a first set of classes, wherein the second supervised training data includes second data indicating that each document of a second set of documents is assigned to one or more classes of a second set of classes, and wherein the first set of documents is different from the second set of documents. 10. The method of claim 7 , wherein the document classifier is generated based on the supervised training data, wherein the updated document classifier is generated based on second supervised training data, wherein the supervised training data includes first data indicating that each document of a first set of documents is assigned to one or more classes according to a first hierarchical arrangement, wherein the second supervised training data includes second data indicating that each document of a second set of documents is assigned to one or more classes according to a second hierarchical arrangement, and wherein the first hierarchical arrangement is different from the second hierarchical arrangement. 11. A computing device comprising: a memory storing document classifier generation instructions; and a processor configured to execute instructions from the memory, wherein the document classifier generation instructions are executable by the processor to perform operations comprising: receiving, via a graphical user interface including a plurality of document elements and a plurality of class elements, user input associating a first document element of the plurality of document elements with a first class element of the plurality of class elements, wherein each document element represents a corresponding document of a plurality of documents and each class element represents a corresponding class of a plurality of classes; updating, based on the user input, supervised training data to indicate that a first document represented by the first document element is assigned to a first class associated with the first class element; receiving, via the graphical user interface, a selection of a particular option; and based on receiving the selection of the particular option: generating multiple preliminary document classifiers based on a first portion of the supervised training data, each preliminary document classifier of the multiple preliminary document classifiers generated based on respective classifier generation settings; evaluating performance of the multiple preliminary document classifiers based on a second portion of the supervised training data; and generating a document classifier based on classifier generation settings of a best performing preliminary document classifier of the multiple preliminary document classifiers, the supervised training data used to generate the document classifier. 12. The computing device of claim 11 , wherein the user input associating the first document element with the first class element includes at least one of a drag-and-drop operation, a copy-and-paste operation, or a cut-and-paste operation. 13. The computing device of claim 11 , wherein the memory further stores file system data associated with the plurality of documents, and wherein the user input associating the first document element with the first class element does not change a portion of the file system data associated with the first document. 14. A computer readable storage device storing instructions that are executable by a processor to perform operations comprising: receiving, via a graphical user interface including a plurality of document elements and a plurality of class elements, user input associating a first document element of the plurality o
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title
using kernel methods, e.g. support vector machines [SVM] · CPC title
Selection of displayed objects or displayed text elements (G06F3/0482 takes precedence) · CPC title
Ensemble learning · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.