Generating labels for images associated with a user
US-2017185670-A1 · Jun 29, 2017 · US
US11907277B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11907277-B2 |
| Application number | US-202318165156-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 6, 2023 |
| Priority date | May 13, 2013 |
| Publication date | Feb 20, 2024 |
| Grant date | Feb 20, 2024 |
Official abstract text for this publication.
Provided herein are systems, methods and computer readable media for classification and tagging of textual data. An example method may include accessing a corpus comprising a plurality of documents, each document having one or more labels indicative of services offered by a merchant, generating a query based on extracted features and the documents, generating a precision score for at least a portion of the generated query and selecting a subset of the generated queries based on an assigned precision score satisfying a precision score threshold, the selected subset of the generated queries configured to provide an indication of one or more labels to be applied to machine readable text. A second example method, utilized for tagging machine readable text with unknown labels, may include assigning a label to textual portions of the machine readable text based on results of the application of the queries.
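
The query selection and labeling steps summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Document` class, the bag-of-words `matches` rule, the sample corpus, and reading "satisfying" the threshold as `>=` are all hypothetical. Precision follows the definition given in claim 3 (true positive documents divided by total documents returned). The claims say only that the normalization factor is "based on the precision score", so `label_score` assumes it is the sum of the precision scores of all selected queries for a label.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

Query = Tuple[str, ...]  # assumption: a query is a set of words that must all occur


@dataclass(frozen=True)
class Document:
    text: str
    labels: FrozenSet[str]  # service labels attached to the document


def matches(query: Query, text: str) -> bool:
    """Assumed matching rule: every query word appears as a token in the text."""
    tokens = set(text.lower().split())
    return all(word in tokens for word in query)


def precision_score(query: Query, label: str, corpus: List[Document]) -> float:
    """True positives returned by the query divided by all documents returned."""
    returned = [doc for doc in corpus if matches(query, doc.text)]
    if not returned:
        return 0.0
    true_positives = sum(1 for doc in returned if label in doc.labels)
    return true_positives / len(returned)


def select_queries(
    candidates: List[Tuple[Query, str]], corpus: List[Document], threshold: float
) -> Dict[Tuple[Query, str], float]:
    """Keep only the (query, label) pairs whose precision meets the threshold."""
    selected = {}
    for query, label in candidates:
        score = precision_score(query, label, corpus)
        if score >= threshold:  # assumption: "satisfying" means >=
            selected[(query, label)] = score
    return selected


def label_score(text: str, selected: Dict[Tuple[Query, str], float], label: str) -> float:
    """Precision of matching queries divided by an assumed normalization factor."""
    for_label = [(q, s) for (q, lab), s in selected.items() if lab == label]
    normalization = sum(s for _, s in for_label)  # assumption: sum of precisions
    if normalization == 0:
        return 0.0
    return sum(s for q, s in for_label if matches(q, text)) / normalization


# Hypothetical labeled corpus and candidate queries, for illustration only.
CORPUS = [
    Document("deep tissue massage and spa", frozenset({"massage"})),
    Document("hot stone massage", frozenset({"massage"})),
    Document("massage chair sale", frozenset({"furniture"})),
    Document("hair cut and color", frozenset({"salon"})),
]
CANDIDATES = [
    (("massage",), "massage"),
    (("tissue", "massage"), "massage"),
    (("sale",), "massage"),
    (("hair",), "salon"),
]
SELECTED = select_queries(CANDIDATES, CORPUS, threshold=0.6)
```

In this toy corpus the query `("sale",)` returns one document that does not carry the `massage` label, so its precision is 0 and it is dropped, while the other three candidates pass the 0.6 threshold.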
Opening claim text (preview).
That which is claimed:

1. A method for tagging machine readable text associated with a merchant, the machine readable text recovered from one or more electronic sources, the method comprising: applying one or more queries to the machine readable text, wherein the one or more queries are automatically generated from a corpus having one or more documents with one or more labels indicative of one or more services offered by one or more merchants, wherein each of the one or more queries has an associated weight and the one or more queries are based on an extracted feature set and a precision score; assigning, using a processor, a label to textual portions of the machine readable text based on results of the application of the queries to the machine readable text; and classifying the merchant based on the label.

2. The method according to claim 1, wherein each query comprises a score indicative of an ability to return relevant results, the method further comprising: accessing the corpus having one or more documents; generating at least one query based on one or more extracted features of the extracted feature set and the one or more documents; generating the precision score for at least a portion of the generated at least one query; and selecting a query subset from the generated at least one query based on an assigned precision score satisfying a precision score threshold, wherein the selected query subset is configured to provide an indication of one or more labels to be applied to machine readable text.

3. The method according to claim 2, wherein the precision score is calculated based on a number of true positive documents returned by a query of the generated at least one query divided by a total number of documents returned.

4. The method according to claim 2, wherein generating the at least one query further comprises: generating an array of feature index pairs, the array of feature index pairs comprising one or more features and a position of the one or more features in a sentence; generating the at least one query as a function of one or more combinations of feature index pairs based on the array of feature index pairs; and outputting the at least one query.

5. The method according to claim 4, wherein generating the at least one query further comprises: calculating a distance between a first feature in the at least one query and a second feature in the at least one query; and generating a distance measure for the at least one query.

6. The method according to claim 5, further comprising rounding the distance between the first feature and the second feature to a next highest multiple of a predetermined number.

7. The method according to claim 1, further comprising: receiving the corpus; causing a first subset of words to be ignored in the corpus, the first subset of words comprising at least one of rare words or stop words; scoring a second subset of words based on a relationship between a word of the second subset of words and the label; and extracting features, the features comprising one or more words from the second subset of words that satisfy a predetermined threshold.

8. The method according to claim 1, further comprising: calculating a normalization factor based on the precision score.

9. The method according to claim 8, wherein assigning the label to textual portions of the machine readable text based on results of the application of the queries to the machine readable text further comprises: generating a score for the machine readable text, the generated score being a function of the precision score of a query of a query subset divided by the normalization factor; and generating at least one label for the machine readable text.

10. The method according to claim 9, wherein the at least one label is a sub-dominant level in a hierarchical structure of service categories.

11. An apparatus for tagging machine readable text associated with a merchant, the machine readable text recovered from one or more electronic sources, the apparatus comprising at least one processor and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, causing the apparatus to at least: apply one or more queries to the machine readable text, wherein the one or more queries are automatically generated from a corpus having one or more documents with one or more labels indicative of one or more services offered by one or more merchants, wherein each of the one or more queries has an associated weight and the one or more queries are based on an extracted feature set and a precision score; assign, using the at least one processor, a label to textual portions of the machine readable text based on results of the application of the queries to the machine readable text; and classifying the merchant based on the label.

12. The apparatus according to claim 11, wherein each query comprises a score indicative of an ability to return relevant results, and wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus to: access the corpus having one or more documents; generate at least one query based on one or more extracted features of the extracted feature set and the one or more documents; generate the precision score for at least a portion of the generated at least one query; and select a query subset from the generated at least one query based on an assigned precision score satisfying a precision score threshold, wherein the selected query subset is configured to provide an indication of one or more labels to be applied to machine readable text.

13. The apparatus according to claim 12, wherein the precision score is calculated based on a number of true positive documents returned by a query of the generated at least one query divided by a total number of documents returned.

14. The apparatus according to claim 11, wherein generating the at least one query further comprises: generating an array of feature index pairs, the array of feature index pairs comprising one or more features and a position of the one or more features in a sentence; generating the at least one query as a function of one or more combinations of feature index pairs based on the array of feature index pairs; and outputting the at least one query.

15. The apparatus according to claim 14, wherein generating the at least one query further comprises: calculating a distance between a first feature in the at least one query and a second feature in the at least one query; and generating a distance measure for the at least one query.

16. The apparatus according to claim 15, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus to: round the distance between the first feature and the second feature to a next highest multiple of a predetermined number.

17. The apparatus according to claim 11, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus to: receive the corpus; cause a first subset of words to be ignored in the corpus, the first subset of words comprising at least one of rare words or stop words; score a second subset of words based on a relationship between a word of the second subset of words and the label; and extract features, the features comprising one or more words from the second subset of words that satisfy a pre
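
The query generation recited in claims 4 to 6 (and mirrored in claims 14 to 16) can be illustrated with a short sketch. Everything here is an assumption made for illustration: whitespace tokenization, zero-based word positions, queries built from pairs of features only, the function names, and the reading of "next highest multiple" as a ceiling (a distance that is already a multiple of the predetermined number is left unchanged). The claims do not fix any of these details.

```python
from itertools import combinations
from typing import List, Set, Tuple

FeatureIndexPair = Tuple[str, int]       # (feature word, position in the sentence)
ProximityQuery = Tuple[str, str, int]    # (first feature, second feature, rounded distance)


def feature_index_pairs(sentence: str, features: Set[str]) -> List[FeatureIndexPair]:
    """Build the array of (feature, position) pairs for one sentence."""
    tokens = sentence.lower().split()  # assumption: whitespace tokenization
    return [(tok, i) for i, tok in enumerate(tokens) if tok in features]


def round_up_to_multiple(distance: int, step: int) -> int:
    """Round a distance up to a multiple of `step` (assumed ceiling behavior)."""
    return ((distance + step - 1) // step) * step


def generate_queries(sentence: str, features: Set[str], step: int) -> List[ProximityQuery]:
    """Combine feature index pairs two at a time and attach a rounded distance."""
    pairs = feature_index_pairs(sentence, features)
    queries = []
    for (first, i), (second, j) in combinations(pairs, 2):
        distance = abs(j - i)  # number of word positions between the two features
        queries.append((first, second, round_up_to_multiple(distance, step)))
    return queries


# Hypothetical sentence and feature set, for illustration only.
SENTENCE = "we offer deep tissue massage and relaxing facial treatments"
FEATURES = {"tissue", "massage", "facial"}
QUERIES = generate_queries(SENTENCE, FEATURES, step=2)
```

Rounding the distance upward groups nearby variants of the same phrase under one query: with a step of 2, features that are 3 or 4 words apart both yield a distance measure of 4.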
Clustering; Classification · CPC title
Management therefor · CPC title
Presentation of query results · CPC title
Creation or modification of classes or clusters · CPC title
Document management systems · CPC title