Search engine use of neural network regressor for multi-modal item recommendations based on visual semantic embeddings
US-2020311798-A1 · Oct 1, 2020 · US
US11748570B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11748570-B2 |
| Application number | US-202016842155-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 7, 2020 |
| Priority date | Apr 7, 2020 |
| Publication date | Sep 5, 2023 |
| Grant date | Sep 5, 2023 |
Official abstract text for this publication.
One embodiment provides a method, including: accessing, at an information handling device, a dynamic visual media corpus, wherein the dynamic visual media corpus comprises a plurality of dynamic visual media scripts; segmenting each of the plurality of dynamic visual media scripts into scenes; generating, for each of the plurality of dynamic visual media scripts, a character fingerprint identifying topics corresponding to each character within a corresponding dynamic visual media script, wherein the generating comprises (i) extracting both characters and topics from the dynamic visual media script and (ii) associating each of the topics with a corresponding character, wherein the character fingerprint identifies costumes of a given character and a topic corresponding to each costume; and producing, for each scene within each dynamic visual media script, a scene vector identifying (iii) the topics included within a corresponding scene and (iv) a character fingerprint for each character occurring within the scene.
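The abstract's two data structures, the character fingerprint and the scene vector, can be illustrated with a minimal sketch. The patent does not specify an implementation. The toy script, the dictionary layout, and the function names `character_fingerprints` and `scene_vectors` are assumptions made for illustration, and the sketch assumes characters, costumes, and topics have already been extracted from each scene.

```python
from collections import defaultdict

# Hypothetical toy script: each scene lists (character, costume, topics) entries.
# Extraction of these entries from raw script text is assumed to have happened already.
script = [
    {"scene": 1, "entries": [("Ana", "lab coat", ["science"]),
                             ("Ben", "suit", ["business"])]},
    {"scene": 2, "entries": [("Ana", "ball gown", ["party"])]},
]

def character_fingerprints(script):
    """Map each character to {costume: sorted topics} across the whole script."""
    fp = defaultdict(lambda: defaultdict(set))
    for scene in script:
        for character, costume, topics in scene["entries"]:
            fp[character][costume].update(topics)
    return {c: {costume: sorted(ts) for costume, ts in costumes.items()}
            for c, costumes in fp.items()}

def scene_vectors(script, fingerprints):
    """For each scene, record its topics and the fingerprint of each character present."""
    vectors = []
    for scene in script:
        topics = sorted({t for _, _, ts in scene["entries"] for t in ts})
        characters = {c: fingerprints[c] for c, _, _ in scene["entries"]}
        vectors.append({"scene": scene["scene"],
                        "topics": topics,
                        "characters": characters})
    return vectors

fingerprints = character_fingerprints(script)
vectors = scene_vectors(script, fingerprints)
```

Here a "vector" is kept as a plain dictionary for readability; a trainable model would need a numeric encoding, which this sketch does not attempt.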
Opening claim text (preview).
What is claimed is:

1. A method for training a machine-learning model used to provide recommendations for costumes for characters within a dynamic visual media, the method comprising:
receiving a dynamic visual media script for recommendation of at least one costume;
accessing, at an information handling device, a dynamic visual media corpus, wherein the dynamic visual media corpus comprises a plurality of dynamic visual media scripts;
segmenting each of the plurality of dynamic visual media scripts into scenes, wherein the segmenting comprises identifying, utilizing at least one scene segmentation technique to enclose semantic boundaries, a segment by identifying a portion of the dynamic visual media script having a costume and corresponding context that are consistent throughout the portion;
generating, for each of the plurality of dynamic visual media scripts, a character fingerprint identifying topics corresponding to each character within a corresponding dynamic visual media script, wherein the generating comprises (i) extracting both characters and topics from the dynamic visual media script and (ii) associating each of the topics with a corresponding character and comprises generating a topic-relationship graph across the dynamic visual media corpus, wherein the topic-relationship graph represents topics occurring within the plurality of dynamic visual media scripts as nodes and relationships between the topics as edges, wherein the edges are weighted with an occurrence frequency, wherein the character fingerprint identifies costumes of a given character and a topic corresponding to each costume;
producing, for each scene within each dynamic visual media script, a scene vector identifying (iii) the topics included within a corresponding scene and (iv) a character fingerprint for each character occurring within the scene and training the machine-learning model with the scene vectors;
encoding, within each scene within each dynamic visual media script, a topic encoding vector using the topic-relationship graph, the character fingerprint, and the scene vector, wherein the topic encoding vector reflects an overall story of the dynamic visual media script and ensures consistency of costume generation of a character throughout the dynamic visual media script; and
providing, based upon the topic encoding vectors, at least one recommendation for a costume within the received dynamical visual script, wherein the providing at least one recommendation comprises applying a generative adversarial network to each scene vector of the received dynamic visual script against the scene vectors of the machine-learning model, thereby generating one or more costumes for the dynamic visual media scene of the received dynamic visual media script.

2. The method of claim 1, wherein the producing a scene vector comprises utilizing the topic-relationship graph.

3. The method of claim 1, comprising generating a scene-level character fingerprint by applying a time window corresponding to a scene to the dynamic visual media script, wherein the generating is carried out for the applied time window.

4. The method of claim 1, wherein the providing at least one recommendation comprises: segmenting the received dynamic visual script; generating a character fingerprint for each character within the received dynamic visual script; and producing a scene vector for each scene within the received dynamic visual script.

5. The method of claim 1, comprising generating a textual script for each scene corresponding to a dialogue included in a corresponding scene; and wherein the generating and the producing are based upon the textual script.

6. An apparatus for training a machine-learning model used to provide recommendations for costumes for characters within a dynamic visual media, the apparatus comprising:
at least one processor; and
a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising:
computer readable program code configured to receive a dynamic visual media script for recommendation of at least one costume;
computer readable program code configured to access, at an information handling device, a dynamic visual media corpus, wherein the dynamic visual media corpus comprises a plurality of dynamic visual media scripts;
computer readable program code configured to segment each of a plurality of dynamic visual scripts into scenes, wherein the computer readable code configured to segment comprises identifying, utilizing at least one scene segmentation technique to enclose semantic boundaries, a segment by identifying a portion of the dynamic visual media script having a costume and corresponding context that are consistent throughout the portion;
computer readable program code configured to generate, for each of the plurality of dynamic visual media scripts, a character fingerprint identifying topics corresponding to each character within a corresponding dynamic visual media script, wherein the generating comprises (i) extracting both characters and topics from the dynamic visual media script and (ii) associating each of the topics with a corresponding character, and comprises generating a topic-relationship graph across the dynamic visual media corpus, wherein the topic-relationship graph represents topics occurring within the plurality of dynamic visual media scripts as nodes and relationships between the topics as edges, wherein the edges are weighted with an occurrence frequency, wherein the character fingerprint identifies costumes of a given character and a topic corresponding to each costume;
computer readable program code configured to produce, for each scene within each dynamic visual media script, a scene vector identifying (iii) the topics included within a corresponding scene and (iv) a character fingerprint for each character occurring within the scene and training the machine-learning model with the scene vectors;
computer readable program code configured to encode, within each scene within each dynamic visual media script, a topic encoding vector using the topic-relationship graph, the character fingerprint, and the scene vector, wherein the topic encoding vector reflects an overall story of the dynamic visual media script and ensures consistency of costume generation of a character throughout the dynamic visual media script; and
computer readable program code configured to provide, based upon the topic encoding vectors, at least one recommendation for a costume within the received dynamical visual script, wherein the providing at least one recommendation comprises applying a generative adversarial network to each scene vector of the received dynamic visual script against the scene vectors of the machine-learning model, thereby generating one or more costumes for the dynamic visual media scene of the received dynamic visual media script.

7. A computer program product for training a machine-learning model used to provide recommendations for costumes for characters within a dynamic visual media, the computer program product comprising:
a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code executable by a processor comprising:
computer readable program code configured to receive a dynamic visual media script for recommendation of at least one costume;
computer readable program code configured to access, at an information handling device, a dynamic visual media corpus, wherein the dynamic visual media corpus comprises a plurality of dynamic visual media scripts;
computer readable program code configured to segment each of a plurality of dynamic visual scripts into scenes, wherein th
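The topic-relationship graph recited in the claims (topics as nodes, relationships as edges weighted with an occurrence frequency) can be sketched as follows. The claims do not define what counts as a relationship between topics. This sketch assumes a relationship means two topics appearing in the same scene, and it counts how many scenes each pair shares. The function name `topic_relationship_graph` and the sample topic lists are hypothetical.

```python
from collections import Counter
from itertools import combinations

def topic_relationship_graph(scene_topics):
    """Build (nodes, edges) from per-scene topic lists.

    Assumption: an edge links two topics that co-occur in a scene, and its
    weight is the number of scenes in which the pair co-occurs.
    """
    nodes = set()
    edges = Counter()
    for topics in scene_topics:
        unique = sorted(set(topics))  # sort so each pair has one canonical key
        nodes.update(unique)
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return nodes, edges

# Hypothetical topics for three scenes drawn from a corpus of scripts.
corpus_scene_topics = [
    ["wedding", "family"],
    ["wedding", "family", "rain"],
    ["rain", "chase"],
]

nodes, edges = topic_relationship_graph(corpus_scene_topics)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
```

Storing each undirected edge under a sorted pair keeps the counts from being split between `(a, b)` and `(b, a)`.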
Supervised learning · CPC title
Generative networks · CPC title
Adversarial learning · CPC title
Semantic analysis · CPC title
using statistical methods · CPC title