Directly identifying items from an item catalog satisfying a received query using a model determining measures of similarity between items in the item catalog and the query
US-2023146336-A1 · May 11, 2023 · US
US12008065B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12008065-B2 |
| Application number | US-202318153960-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 12, 2023 |
| Priority date | Dec 22, 2020 |
| Publication date | Jun 11, 2024 |
| Grant date | Jun 11, 2024 |
Official abstract text for this publication.
The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize machine learning models to generate identifier embeddings from digital content identifiers and then leverage these identifier embeddings to determine digital connections between digital content items. In particular, the disclosed systems can utilize an embedding machine-learning model that comprises a character-level embedding machine-learning model and a word-level embedding machine-learning model. For example, the disclosed systems can combine a character embedding from the character-level embedding machine-learning model and a token embedding from the word-level embedding machine-learning model. The disclosed systems can determine digital connections between the plurality of digital content items by processing these identifier embeddings for a plurality of digital content items utilizing a content management model. Based on the digital connections, the disclosed systems can surface one or more digital content suggestions to a user interface of a client device.
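The flow described in the abstract (a word-level embedding and a character-level embedding combined into one identifier embedding, then compared to find connected items) can be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: hashed bag-of-tokens and bag-of-trigram vectors stand in for the learned embedding machine-learning models, cosine similarity with a fixed threshold stands in for the content management model, and every name here (`DIM`, `identifier_embedding`, `connections`, the 0.5 threshold) is hypothetical.

```python
import hashlib
import math
import re

DIM = 32  # hypothetical width of each sub-embedding


def _hashed_vector(pieces, dim=DIM):
    """Count each piece into a hashed bucket (stand-in for a learned embedding)."""
    vec = [0.0] * dim
    for piece in pieces:
        digest = hashlib.md5(piece.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return vec


def _normalize(vec):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def tokenize(identifier):
    """Split an identifier (e.g. a filename) into lowercase word tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", identifier.lower()) if t]


def char_trigrams(identifier):
    """Return overlapping three-character windows of the identifier."""
    text = identifier.lower()
    return [text[i:i + 3] for i in range(max(len(text) - 2, 1))]


def identifier_embedding(identifier):
    """Concatenate the word-level vector and the character-level vector."""
    word_vec = _normalize(_hashed_vector(tokenize(identifier)))
    char_vec = _normalize(_hashed_vector(char_trigrams(identifier)))
    return word_vec + char_vec


def similarity(a, b):
    """Cosine similarity between two embeddings (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def connections(identifiers, threshold=0.5):
    """Return (first, second, score) for every pair scoring at or above the threshold."""
    names = list(identifiers)
    embeddings = {name: identifier_embedding(name) for name in names}
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = similarity(embeddings[first], embeddings[second])
            if score >= threshold:
                pairs.append((first, second, score))
    return pairs
```

Calling `connections(["q3_budget_v1.xlsx", "q3_budget_v2.xlsx", "holiday.jpg"])` returns the pairs whose score clears the threshold; a system like the one described could then use such pairs to drive suggestions. Concatenation is just one simple way to combine the two vectors, and the abstract does not say which combination the disclosed models use.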
Opening claim text (preview).
What is claimed is:
1. A system comprising: at least one processor; and at least one non-transitory computer-readable storage medium storing instructions that, when executed by the at least one processor, cause the system to: identify a plurality of identifiers corresponding to a plurality of digital content items; generate a plurality of identifier embeddings corresponding to the plurality of identifiers by utilizing one or more embedding machine-learning models; generate digital similarity predictions between the plurality of digital content items by processing the plurality of identifier embeddings utilizing a content management model; and determine a digital connection between a subset of digital content items of the plurality of digital content items based on the digital similarity predictions.
2. The system of claim 1, wherein generating the plurality of identifier embeddings comprises generating at least one token representing a subset of individual characters within a given identifier of the plurality of identifiers.
3. The system of claim 2, wherein the subset of individual characters represents a word within the given identifier.
4. The system of claim 1, further comprising instructions that, when executed by the at least one processor, cause the system to: determine, utilizing the content management model, a digital connection between a first identifier embedding from the plurality of identifier embeddings associated with a first digital content item and a second identifier embedding from the plurality of identifier embeddings associated with a second digital content item; and based on the digital connection, generate a suggestion related to at least one of the first digital content item or the second digital content item.
5. The system of claim 1, further comprising instructions that, when executed by the at least one processor, cause the system to generate, based on the digital connection between the subset of digital content items of the plurality of digital content items, a suggestion comprising at least one of a suggested team workspace, a suggested digital content item, or a suggested access privilege.
6. The system of claim 1, further comprising instructions that, when executed by the at least one processor, cause the system to generate the plurality of identifier embeddings by utilizing a word-level embedding machine-learning model and a character-level embedding machine-learning model.
7. The system of claim 1, further comprising instructions that, when executed by the at least one processor, cause the system to generate the plurality of identifier embeddings by: generating a word-level embedding utilizing a first embedding layer of a first neural network; and generating a character-level embedding utilizing a second embedding layer of a second neural network.
8. The system of claim 1, further comprising instructions that, when executed by the at least one processor, cause the system to generate a storage organization relationship suggestion for one or more digital content items from the subset of digital content items based on the digital connection between the subset of digital content items.
9. A non-transitory computer readable medium comprising instructions that, when executed by at least one processor, cause a computing device to: identify a plurality of identifiers corresponding to a plurality of digital content items, wherein the plurality of identifiers comprise a plurality of filenames associated with the plurality of digital content items; generate a plurality of identifier embeddings corresponding to the plurality of identifiers by utilizing one or more embedding machine-learning models; generate digital similarity predictions between the plurality of digital content items by processing the plurality of identifier embeddings utilizing a content management model; and determine a digital connection between a first digital content item and a second digital content item from the plurality of digital content items based on the digital similarity predictions.
10. The non-transitory computer readable medium as recited in claim 9, wherein generating the plurality of identifier embeddings comprises generating a first token representing a word within a first filename associated with the first digital content item.
11. The non-transitory computer readable medium as recited in claim 9, further comprising instructions that, when executed by the at least one processor, cause the computing device to: generate a suggestion related to the first digital content item or the second digital content item based on the digital connection between the first digital content item and the second digital content item; and provide the suggestion to a client device having access to the plurality of digital content items.
12. The non-transitory computer readable medium as recited in claim 9, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate, based on the digital connection between the first digital content item and the second digital content item, a suggestion comprising at least one of a suggested team workspace, a suggested digital content item, or a suggested access privilege.
13. The non-transitory computer readable medium as recited in claim 9, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the plurality of identifier embeddings by utilizing a word-level embedding machine-learning model to process words within the plurality of filenames and a character-level embedding machine-learning model to process characters within the plurality of filenames.
14. The non-transitory computer readable medium as recited in claim 9, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate a suggestion for the first digital content item based on at least one of: a file extension embedding corresponding to the first digital content item; or a user activity embedding corresponding to user activity with respect to the first digital content item.
15. A computer-implemented method comprising: identifying a plurality of identifiers corresponding to a plurality of digital content items; generating a plurality of identifier embeddings corresponding to the plurality of identifiers by utilizing one or more embedding machine-learning models; generating digital similarity predictions between the plurality of digital content items by processing the plurality of identifier embeddings utilizing a content management model; and determining a digital connection between a subset of digital content items of the plurality of digital content items based on the digital similarity predictions.
16. The computer-implemented method of claim 15, wherein generating the plurality of identifier embeddings comprises generating a token representing a subset of individual characters within a given identifier of the plurality of identifiers.
17. The computer-implemented method of claim 15, further comprising providing, for display on a client device and based on the digital connection between the subset of digital content items of the plurality of digital content items, a suggestion comprising at least one of a suggested team workspace, a suggested digital content item, or a suggested access privilege.
18. The computer-implemented method of claim 15, further comprising generating the plurality of identifier embeddings by utilizing a word-level embedding machine
Supervised learning · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Details of searching files based on file metadata · CPC title
Backpropagation, e.g. using gradient descent · CPC title