Large-scale image tagging using image-to-topic embedding
US-2018267997-A1 · Sep 20, 2018 · US
US11657223B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11657223-B2 |
| Application number | US-202117552742-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 16, 2021 |
| Priority date | Jul 2, 2019 |
| Publication date | May 23, 2023 |
| Grant date | May 23, 2023 |
Official abstract text for this publication.
A system for extracting a key phrase from a document includes a neural key phrase extraction model (“BLING-KPE”) having a first layer to extract a word sequence from the document, a second layer to represent each word in the word sequence by ELMo embedding, position embedding, and visual features, and a third layer to concatenate the ELMo embedding, the position embedding, and the visual features to produce hybrid word embeddings. A convolutional transformer models the hybrid word embeddings to n-gram embeddings, and a feedforward layer converts the n-gram embeddings into a probability distribution over a set of n-grams and calculates a key phrase score of each n-gram. The neural key phrase extraction model is trained on annotated data based on a labeled loss function to compute cross entropy loss of the key phrase score of each n-gram as compared with a label from the annotated dataset.
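The data flow in the abstract can be illustrated with a minimal sketch. It is not the patented model. The ELMo embedding is replaced by an arbitrary semantic vector supplied by the caller, the convolutional transformer is reduced to a single sliding-window linear projection with no transformer layers, and the feedforward layer is one linear unit. All function names, dimensions, and weights below are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence

Vector = List[float]


def concat_hybrid(semantic: Vector, position: Vector, visual: Vector) -> Vector:
    """Concatenate the per-word feature vectors into one hybrid embedding."""
    return list(semantic) + list(position) + list(visual)


def ngram_embeddings(words: Sequence[Vector], n: int,
                     kernel: Sequence[Sequence[float]]) -> List[Vector]:
    """Slide a width-n window over the words (a plain 1-D convolution).

    Each window is flattened and projected by `kernel`, which holds one
    row of weights per output dimension. ASSUMPTION: this stands in for
    the convolutional transformer named in the abstract.
    """
    out = []
    for start in range(len(words) - n + 1):
        window = [x for word in words[start:start + n] for x in word]
        out.append([sum(k * x for k, x in zip(row, window)) for row in kernel])
    return out


def score_ngrams(grams: Sequence[Vector], weights: Vector,
                 bias: float = 0.0) -> List[float]:
    """One linear unit per n-gram, standing in for the feedforward layer."""
    return [sum(w * x for w, x in zip(weights, g)) + bias for g in grams]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into a probability distribution over the n-grams."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: Sequence[float], labels: Sequence[float]) -> float:
    """Cross-entropy against labels (1.0 for an annotated key phrase, else 0.0)."""
    eps = 1e-12  # guards against log(0)
    return -sum(y * math.log(p + eps) for p, y in zip(probs, labels))


if __name__ == "__main__":
    # Three toy words: 2-d semantic, 1-d position, 2-d visual features.
    words = [concat_hybrid([0.1, 0.2], [i / 3], [1.0 if i == 0 else 0.0, 0.5])
             for i in range(3)]
    kernel = [[0.1] * 10, [-0.1] * 10]          # bigram window: 2 words x 5 dims
    grams = ngram_embeddings(words, 2, kernel)  # two bigram embeddings
    probs = softmax(score_ngrams(grams, [1.0, 0.5]))
    print(probs, cross_entropy(probs, [1.0, 0.0]))
```

Training would adjust the kernel and feedforward weights to reduce this loss over the annotated data; the sketch only evaluates the loss for fixed weights.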
Opening claim text (preview).
The invention claimed is:
1. A system comprising: a processor; and memory storing instructions that, when executed by the processor, cause the processor to perform acts comprising: obtaining a webpage, wherein a topical authority score with respect to a topic is to be computed for the webpage, and further wherein the topical authority score is representative of authoritativeness of the webpage with respect to the topic; computing hybrid embeddings for words in the webpage, where the hybrid embeddings are based upon semantic embeddings for the words in the webpage and visual embeddings for the words in the webpage, wherein the visual embeddings are based upon visual features of the words; computing a key phrase score for a sequence of words in the words, wherein the key phrase score is computed based upon the hybrid embeddings computed for the words; and assigning the topical authority score to the webpage based upon the key phrase score computed for the sequence of words, wherein the webpage is ranked in a ranked list of search results returned to a user based upon: a query received from the user; and the topical authority score assigned to the webpage.
2. The system of claim 1, wherein the hybrid embeddings are based further upon positional embeddings for the words in the webpage, the positional embeddings being representative of positions of the words in the webpage.
3. The system of claim 2, wherein the word embeddings, the visual embeddings, and the positional embeddings are concatenated together to form the hybrid embeddings.
4. The system of claim 1, wherein the visual features of the words comprise at least one of location, size, font, and HTML structure of the words.
5. The system of claim 1, the acts further comprising computing key phrase scores for several sequences of words in the words, wherein the topical authority score assigned to the webpage is based upon the key phrase scores computed for the several sequences of words.
6. The system of claim 5, wherein the several sequences of words have different lengths.
7. The system of claim 1, wherein the acts are performed by a search engine.
8. A method performed by a computing system, the method comprising: obtaining a webpage, wherein a topical authority score with respect to a topic is to be computed for the webpage, and further wherein the topical authority score is representative of authoritativeness of the webpage with respect to the topic; computing hybrid embeddings for words in the webpage, where the hybrid embeddings are based upon semantic embeddings for the words in the webpage and visual embeddings for the words in the webpage, wherein the visual embeddings are based upon visual features of the words; computing a key phrase score for a sequence of words in the words, wherein the key phrase score is computed based upon the hybrid embeddings computed for the words; and assigning the topical authority score to the webpage based upon the key phrase score computed for the sequence of words, wherein the webpage is ranked in a ranked list of search results returned to a user based upon: a query received from the user; and the topical authority score assigned to the webpage.
9. The method of claim 8, wherein the hybrid embeddings are based further upon positional embeddings for the words in the webpage, the positional embeddings being representative of positions of the words in the webpage.
10. The method of claim 9, wherein the word embeddings, the visual embeddings, and the positional embeddings are concatenated together to form the hybrid embeddings.
11. The method of claim 8, wherein the visual features of the words comprise at least one of location, size, font, and HTML structure of the words.
12. The method of claim 8, further comprising computing key phrase scores for several sequences of words in the words, wherein the topical authority score assigned to the webpage is based upon the key phrase scores computed for the several sequences of words.
13. The method of claim 12, wherein the several sequences of words have different lengths.
14. The method of claim 8, wherein the acts are performed by a search engine.
15. A non-transitory computer-readable medium that stores instructions that, when executed by a processor, cause the processor to perform acts comprising: obtaining a webpage, wherein a topical authority score with respect to a topic is to be computed for the webpage, and further wherein the topical authority score is representative of authoritativeness of the webpage with respect to the topic; computing hybrid embeddings for words in the webpage, where the hybrid embeddings are based upon semantic embeddings for the words in the webpage and visual embeddings for the words in the webpage, wherein the visual embeddings are based upon visual features of the words; computing a key phrase score for a sequence of words in the words, wherein the key phrase score is computed based upon the hybrid embeddings computed for the words; and assigning the topical authority score to the webpage based upon the key phrase score computed for the sequence of words, wherein the webpage is ranked in a ranked list of search results returned to a user based upon: a query received from the user; and the topical authority score assigned to the webpage.
16. The non-transitory computer-readable medium of claim 15, wherein the hybrid embeddings are based further upon positional embeddings for the words in the webpage, the positional embeddings being representative of positions of the words in the webpage.
17. The non-transitory computer-readable medium of claim 16, wherein the word embeddings, the visual embeddings, and the positional embeddings are concatenated together to form the hybrid embeddings.
18. The non-transitory computer-readable medium of claim 15, wherein the visual features of the words comprise at least one of location, size, font, and HTML structure of the words.
19. The non-transitory computer-readable medium of claim 15, the acts further comprising computing key phrase scores for several sequences of words in the words, wherein the topical authority score assigned to the webpage is based upon the key phrase scores computed for the several sequences of words.
20. The non-transitory computer-readable medium of claim 19, wherein the several sequences of words have different lengths.
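The claims leave open how key phrase scores become a topical authority score and how that score is combined with the query when ranking. The sketch below shows one possible reading under explicit assumptions: the authority score is the highest key phrase score among word sequences that contain every topic word, and the final ranking is a weighted blend of a query relevance value and the authority score. The names `candidate_ngrams`, `topical_authority`, `Page`, and `rank`, the `query_relevance` field, and the blend weight are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Phrase = Tuple[str, ...]


def candidate_ngrams(words: Sequence[str], max_len: int = 3) -> List[Phrase]:
    """Enumerate word sequences of lengths 1..max_len (several sequences
    of different lengths, as in the dependent claims)."""
    return [tuple(words[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(words) - n + 1)]


def topical_authority(phrase_scores: Dict[Phrase, float],
                      topic: Sequence[str]) -> float:
    """ASSUMPTION: authority is the best key phrase score among sequences
    that contain every topic word; 0.0 when no sequence matches."""
    wanted = set(topic)
    matches = [score for phrase, score in phrase_scores.items()
               if wanted <= set(phrase)]
    return max(matches, default=0.0)


@dataclass
class Page:
    url: str
    query_relevance: float  # hypothetical output of a query-matching step
    authority: float        # topical authority score assigned to the page


def rank(pages: Sequence[Page], weight: float = 0.5) -> List[Page]:
    """ASSUMPTION: rank by a linear blend of query relevance and authority."""
    def blended(page: Page) -> float:
        return (1.0 - weight) * page.query_relevance + weight * page.authority
    return sorted(pages, key=blended, reverse=True)


if __name__ == "__main__":
    scores = {("deep", "learning"): 0.8, ("learning",): 0.3, ("cats",): 0.9}
    authority = topical_authority(scores, ["learning"])
    pages = [Page("a.example", 0.9, 0.1), Page("b.example", 0.6, authority)]
    print([page.url for page in rank(pages)])
```

Taking the maximum favors a page with one strongly matching phrase; a sum or mean would instead reward pages where the topic recurs across many sequences. The claims do not specify either choice.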
Convolutional networks [CNN, ConvNet] · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Supervised learning · CPC title
Semantic analysis · CPC title
based on feedback of a supervisor · CPC title