Machine learning collaboration techniques
US-2024420212-A1 · Dec 19, 2024 · US
US2021209353A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2021209353-A1 |
| Application number | US-202016736020-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 7, 2020 |
| Priority date | Jan 7, 2020 |
| Publication date | Jul 8, 2021 |
| Grant date | — |
Official abstract text for this publication.
Aspects of the present invention disclose a method for extracting information of an unlabeled image within a document and aligning the information to text of the document. The method includes one or more processors identifying an image that is not associated with a corresponding label in a document that includes text. The method further includes determining a feature of an object of the image. The method further includes identifying an alignment candidate of the text of the document based at least in part on the feature of the object, wherein the alignment candidate is a segment of the text of the document identified as corresponding to the feature of the object. The method further includes aligning the feature with the alignment candidate of the text of the document.
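The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the image features have already been detected and classified by some upstream model, and the names `Feature`, `Segment`, `split_segments`, `find_candidates`, and `align` are hypothetical. Exact keyword matching stands in here for whatever correspondence test an actual system would use.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Feature:
    """A feature of an object in an unlabeled image (hypothetical structure)."""
    name: str
    keywords: List[str]


@dataclass
class Segment:
    """A segment of document text plus any annotations attached to it."""
    text: str
    annotations: List[str] = field(default_factory=list)


def split_segments(text: str) -> List[Segment]:
    # Assumption: one sentence is one segment; split naively on end punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [Segment(p) for p in parts if p]


def find_candidates(feature: Feature, segments: List[Segment]) -> List[Segment]:
    # A segment is an alignment candidate if it contains any feature keyword.
    hits = []
    for seg in segments:
        words = set(re.findall(r"[a-z]+", seg.text.lower()))
        if any(k.lower() in words for k in feature.keywords):
            hits.append(seg)
    return hits


def align(feature: Feature, segments: List[Segment]) -> List[Segment]:
    # Align by annotating each candidate segment with the feature name.
    candidates = find_candidates(feature, segments)
    for seg in candidates:
        seg.annotations.append(feature.name)
    return candidates
```

Calling `align(Feature("femur_fracture", ["femur", "fracture"]), split_segments(document_text))` would annotate only the sentences that mention one of the keywords and leave the rest of the text untouched.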
Opening claim text (preview).
What is claimed is:
1. A method comprising: identifying, by one or more processors, an image that is not associated with a corresponding label in a document that includes text; determining, by one or more processors, a feature of an object of the image; identifying, by one or more processors, an alignment candidate of the text of the document based at least in part on the feature of the object, wherein the alignment candidate is a segment of the text of the document identified as corresponding to the feature of the object; and aligning, by one or more processors, the feature with the alignment candidate of the text of the document, wherein aligning the feature with the alignment candidate further comprises: annotating, by one or more processors, the alignment candidate of the text to include metadata corresponding to one or more classifications of the feature of the object of the image.
2. The method of claim 1, further comprising: determining, by one or more processors, that a score of the alignment candidate is above a defined threshold score; and ranking, by one or more processors, the score of the alignment candidate with a plurality of alignment candidates of the text of the document.
3. The method of claim 1, wherein determining the feature of the object of the image, further comprises: detecting, by one or more processors, the feature in the object of the image included in the document; and identifying, by one or more processors, an attribute of the feature of the object of the image, wherein the attribute is a characteristic of the object.
4. The method of claim 3, further comprising: determining, by one or more processors, a classification of the feature of the object; and extracting, by one or more processors, metadata of the classification of the feature, wherein the metadata includes textual data that represents classes and keywords of the classification.
5. The method of claim 4, wherein identifying the alignment candidate of the text of the document based at least in part on the feature of the object, further comprises: extracting, by one or more processors, the textual data of the metadata of the classification; performing, by one or more processors, a fuzzy text search based at least in part on the extracted textual data of metadata of the classification; and identifying, by one or more processors, one or more alignment candidates in text of the document that correspond with the extracted textual data of the metadata of the classification of the feature.
6. The method of claim 5, wherein aligning the feature with the alignment candidate of the text of the document, further comprises: modifying, by one or more processors, the alignment candidate of the text of the document to include a link, wherein the link includes the feature and the metadata of the classification.
7. The method of claim 1, wherein determining the score of the alignment candidate, further comprises: determining, by one or more processors, a semantic similarity of the alignment candidate and the feature of the object; and assigning, by one or more processors, the score to the alignment candidate, wherein the assigned score is based at least in part on a defined distance between concepts.
8. A computer program product comprising: one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions comprising: program instructions to identify an image that is not associated with a corresponding label in a document that includes text; program instructions to determine a feature of an object of the image; program instructions to identify an alignment candidate of the text of the document based at least in part on the feature of the object, wherein the alignment candidate is a segment of the text of the document identified as corresponding to the feature of the object; and program instructions to align the feature with the alignment candidate of the text of the document, wherein aligning the feature with the alignment candidate further comprises program instructions to: annotate the alignment candidate of the text to include metadata corresponding to one or more classifications of the feature of the object of the image.
9. The computer program product of claim 8, further comprising program instructions, stored on the one or more computer readable storage media, to: determine that a score of the alignment candidate is above a defined threshold score; and rank the score of the alignment candidate with a plurality of alignment candidates of the text of the document.
10. The computer program product of claim 8, wherein program instructions to determine the feature of the object of the image, further comprise program instructions to: detect the feature in the object of the image included in the document; and identify an attribute of the feature of the object of the image, wherein the attribute is a characteristic of the object.
11. The computer program product of claim 10, further comprising program instructions, stored on the one or more computer readable storage media, to: determine a classification of the feature of the object; and extract metadata of the classification of the feature, wherein the metadata includes textual data that represents classes and keywords of the classification.
12. The computer program product of claim 11, wherein program instructions to identify the alignment candidate of the text of the document based at least in part on the feature of the object, further comprise program instructions to: extract the textual data of the metadata of the classification; perform a fuzzy text search based at least in part on the extracted textual data of metadata of the classification; and identify one or more alignment candidates in text of the document that correspond with the extracted textual data of the metadata of the classification of the feature.
13. The computer program product of claim 12, wherein program instructions to align the feature with the alignment candidate of the text of the document, further comprise program instructions to: modify the alignment candidate of the text of the document to include a link, wherein the link includes the feature and the metadata of the classification.
14. The computer program product of claim 8, wherein program instructions to determine the score of the alignment candidate, further comprise program instructions to: determine a semantic similarity of the alignment candidate and the feature of the object; and assign the score to the alignment candidate, wherein the assigned score is based at least in part on a defined distance between concepts.
15. A computer system comprising: one or more computer processors; one or more computer readable storage media; and program instructions stored on the computer readable storage media for execution by at least one of the one or more processors, the program instructions comprising: program instructions to identify an image that is not associated with a corresponding label in a document that includes text; program instructions to determine a feature of an object of the image; program instructions to identify an alignment candidate of the text of the document based at least in part on the feature of the object, wherein the alignment candidate is a segment of the text of the document identified as corresponding to the feature of the object; and program instructions to align the feature with the alignment candidate of the text of the document, where
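The fuzzy text search, threshold test, and ranking recited in the dependent claims can be illustrated with a short sketch. The claims do not specify a matching algorithm or a threshold value, so this example assumes the Python standard library's `difflib.SequenceMatcher` as the fuzzy matcher and an arbitrary threshold of 0.8; `fuzzy_score` and `rank_candidates` are hypothetical names, and the keywords stand in for the textual metadata of a feature's classification.

```python
import re
from difflib import SequenceMatcher
from typing import List, Tuple


def fuzzy_score(keyword: str, segment: str) -> float:
    """Best similarity ratio between the keyword and any single word of the segment."""
    words = re.findall(r"\w+", segment.lower())
    return max(
        (SequenceMatcher(None, keyword.lower(), w).ratio() for w in words),
        default=0.0,
    )


def rank_candidates(
    keywords: List[str], segments: List[str], threshold: float = 0.8
) -> List[Tuple[float, str]]:
    """Keep segments scoring above the threshold, ordered best first."""
    scored = []
    for seg in segments:
        # A segment's score is its best match over all classification keywords.
        score = max((fuzzy_score(k, seg) for k in keywords), default=0.0)
        if score > threshold:  # assumed threshold; the claims only say "defined"
            scored.append((score, seg))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

With the keyword "fracture", a segment containing "fractures" would still pass the threshold despite the inflected form, which is the practical reason to prefer a fuzzy match over exact string comparison. A system following the semantic-similarity claims would replace `fuzzy_score` with a measure based on a distance between concepts.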
Classification techniques · CPC title
Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Feature selection, e.g. selecting representative features from a multi-dimensional feature space · CPC title
Proximity, similarity or dissimilarity measures · CPC title
Classification of content, e.g. text, photographs or tables · CPC title