Content access and storage

US11599581B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11599581-B2
Application number  US-201816616921-A
Country             US
Kind code           B2
Filing date         May 25, 2018
Priority date       May 25, 2017
Publication date    Mar 7, 2023
Grant date          Mar 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of generating matching metadata vectors for identifying content items in a store searchable by input vectors, the method comprising: receiving multiple training inputs, each training input comprising a content identifier indicative of a content item, and at least one natural language description of the content item; for each training input: converting the natural language description into at least one text component; generating at least one vector, each vector corresponding to one text component; generating a set of component parts for each vector, each component part corresponding to a coordinate initialized with a random value; adjusting each random coordinate based on the relationship of each component part to other vectors; determining a weighting for each vector with respect to the item; and defining a metadata vector for each item comprising the vectors containing the adjusted coordinates for that item and the weighting for each vector.
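
The steps in the abstract can be illustrated with a short Python sketch. It is not the patented implementation: the abstract does not say how coordinates are adjusted or how weightings are determined, so the co-occurrence averaging rule and the relative-frequency weighting below are assumptions chosen only to make the flow concrete. The names `tokenize`, `build_metadata_vectors`, and `DIM` are hypothetical.

```python
import random
from collections import Counter

DIM = 8  # hypothetical number of coordinates per vector


def tokenize(description):
    """Convert a natural language description into text components (lowercase words)."""
    stripped = (w.strip(".,;:!?").lower() for w in description.split())
    return [w for w in stripped if w]


def build_metadata_vectors(training_inputs, dim=DIM, epochs=20, rate=0.1, seed=0):
    """training_inputs: list of (content_id, description) pairs.

    Returns {content_id: {text_component: (vector, weight)}}.
    """
    rng = random.Random(seed)
    tokenized = [(cid, tokenize(desc)) for cid, desc in training_inputs]

    # One vector per text component; each coordinate starts at a random value.
    vectors = {}
    for _, comps in tokenized:
        for c in comps:
            if c not in vectors:
                vectors[c] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

    # ASSUMED adjustment rule: move each coordinate toward the mean of the same
    # coordinate in the other components of the same description.
    for _ in range(epochs):
        for _, comps in tokenized:
            unique = sorted(set(comps))
            if len(unique) < 2:
                continue
            for c in unique:
                others = [vectors[o] for o in unique if o != c]
                for i in range(dim):
                    mean_i = sum(v[i] for v in others) / len(others)
                    vectors[c][i] += rate * (mean_i - vectors[c][i])

    # ASSUMED weighting: relative frequency of the component in the item's descriptions.
    counts = {}
    for cid, comps in tokenized:
        counts.setdefault(cid, Counter()).update(comps)

    metadata = {}
    for cid, counter in counts.items():
        total = sum(counter.values())
        metadata[cid] = {c: (list(vectors[c]), n / total) for c, n in counter.items()}
    return metadata


# Usage with two made-up training inputs.
meta = build_metadata_vectors([
    ("film-1", "A space opera with rebels"),
    ("film-2", "A space western"),
])
```

Each item's metadata vector here is a mapping from text component to an (adjusted vector, weighting) pair, which mirrors the last step of the abstract; a real system would likely use a learned adjustment rather than simple averaging.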

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of generating matching metadata vectors for identifying content items in a store searchable by input vectors, the method comprising: receiving multiple training inputs, each training input comprising a content identifier indicative of a content item, and at least one natural language description of the content item; for each training input: converting the natural language description into at least one text component; generating at least one vector, each vector corresponding to one text component; generating a set of component parts for each vector, each component part corresponding to a coordinate initialized with a random value; adjusting each random coordinate based on the relationship of each component part to other vectors by assigning an association strength to each component part of each vector, the association strength being indicative of the association of that component part of the vector with the same component part of other vectors; determining a weighting for each vector with respect to the item; and defining a metadata vector for each item comprising the vectors containing the adjusted coordinates for that item and the weighting for each vector.

2. The method according to claim 1 wherein at least one training input is associated with a plurality of descriptions.

3. The method of claim 1 wherein each content item comprises a category, with at least one text component corresponding to that category.

4. The method of claim 3 wherein each content item corresponds to one of: a setting; a place; a name; a title; or a definer of at least one of a setting, a place, a name or a title.

5. The method according to claim 1 wherein each training input comprises multiple natural language descriptions of different semantic levels, the method further comprising: deriving one or more metadata vectors for each semantic level.

6. The method according to claim 5 further comprising: storing a tabular structure comprising the metadata vectors for multiple semantic levels associated with each content identifier.

7. The method according to claim 1 further comprising the steps of: receiving a search input comprising a natural language description of an unknown content item, which lacks a content identifier; vectorising the natural language description into a search vector comprising a set of text components and assigning a weight to each text component; comparing the search vector to each metadata vector derived from the training inputs to generate a list of possible matches; computing a score for each possible match based on vector similarities; and filtering the list of possible matches based on the similarity score to determine that the search input has a match within the training inputs.

8. The method of claim 7 wherein the step of vectoring the natural language description is based on a frequency of occurrence of the text component in the search input.

9. The method of claim 7 wherein the determined weightings are based on a frequency of occurrence of the text component in the search input and wherein the determined weightings are indicative of predictive power of a text component to associate with a search for a content item.

10. The method according to claim 7 further comprising the steps of: presenting match results to a user; receiving confirmation from the user that a match is correct; and associating the content identifier with the search input to create a further training input.

11. The method according to claim 10 further comprising the steps of: receiving confirmation from the user that the match is incorrect; receiving a correct content identifier from the user; and associating the correct content identifier with the search input to create a further training input.

12. A content access and storage system comprising: a store holding a plurality of content identifiers, each content identifier indicative of a content item; and a computer configured to execute a computer program to carry out the method of claim 1.

13. A content access and storage system according to claim 12, wherein the store holds a graph structure comprising a plurality of nodes, some nodes representing content items and some nodes representing text components, wherein the nodes are connected by links according to adjusted weightings.

14. A content access storage system according to claim 12 further comprising: a user interface configured to receive a search input from a user and to receive feedback from the user identifying a content identifier to be associated with the search input.

15. A method of accessing a content store comprising a plurality of metadata vectors generated according to the method of claim 1, the method comprising: receiving a search input comprising a natural language description of an unknown content item, which lacks a content identifier; vectorising the natural language description into a search vector comprising a set of text components and assigning a weight to each text component; comparing the search vector to each metadata vector to generate a list of possible matches; computing a score for each possible match based on vector similarities; and filtering the list of possible matches based on the similarity score to determine if the search input has a match within the training inputs.

16. A content access and storage system comprising a store of metadata vectors in accordance with the method of claim 15, and a computer program which when executed by a processor performs the method steps of claim 15.

17. The method according to claim 1 further comprising: storing a graph structure comprising a plurality of nodes, some nodes representing content items and some nodes representing text components, wherein the nodes are connected by links according to the adjusted weightings.

18. A method of generating matching metadata vectors for identifying content items in a store searchable by input vectors, the method comprising: receiving multiple training inputs, each training input comprising a content identifier indicative of a content item, and at least one natural language description of the content item; for each training input: converting the natural language description into at least one text component; generating at least one vector, each vector corresponding to one text component; generating a set of component parts for each vector, each component part corresponding to a coordinate initialized with a random value; adjusting each random coordinate based on the relationship of each component part to other vectors; determining a weighting for each vector with respect to the item by determining that a vector does not provide a correct identification of content for a given set of weightings, and adjusting the weightings; and defining a metadata vector for each item comprising the vectors containing the adjusted coordinates for that item and the weighting for each vector.

19. The method according to claim 18 further comprising: storing a graph structure comprising a plurality of nodes, some nodes representing content items and some nodes representing text components, wherein the nodes are connected by links according to the adjusted weightings.

20. A method of generating matching metadata vectors for identifying content items in a store searchable by input vectors, the method comprising: receiving multiple training inputs, each training input comprising a content identifier indicative of a content item, and at lea
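
The search steps recited in claims 7 and 15 (vectorise the search input, compare it with each metadata vector, score, filter) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the claimed implementation: the claims require only "a score ... based on vector similarities", so the use of cosine similarity over weighted sums, the 0.5 threshold, the two-dimensional example vectors, and the names `vectorise`, `pooled`, `cosine`, `search`, and `store` are all hypothetical. The frequency-based weights follow the idea in claim 8.

```python
import math
from collections import Counter


def vectorise(text):
    """Split a search input into text components with frequency-based weights."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    counts = Counter(w for w in words if w)
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}


def pooled(weighted_vectors):
    """Weighted sum of (vector, weight) pairs, coordinate by coordinate."""
    dim = len(weighted_vectors[0][0])
    return [sum(v[i] * w for v, w in weighted_vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search(query, store, threshold=0.5):
    """Return (content_id, score) pairs at or above the threshold, best first.

    store: {content_id: {text_component: (vector, weight)}}
    """
    # Component vectors known from training; unknown query words are ignored.
    known = {c: vec for item in store.values() for c, (vec, _) in item.items()}
    query_parts = [(known[c], w) for c, w in vectorise(query).items() if c in known]
    if not query_parts:
        return []
    query_vector = pooled(query_parts)
    scored = [
        (cid, cosine(query_vector, pooled(list(item.values()))))
        for cid, item in store.items()
    ]
    matches = [m for m in scored if m[1] >= threshold]  # ASSUMED filter rule
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Hand-made store with 2-D vectors, purely for illustration.
store = {
    "film-1": {"space": ([1.0, 0.0], 0.5), "rebels": ([0.8, 0.2], 0.5)},
    "film-2": {"cowboy": ([0.0, 1.0], 0.5), "ranch": ([0.1, 0.9], 0.5)},
}
results = search("space rebels", store)
```

In this toy store the query matches only `film-1`. The feedback steps of claims 10 and 11 would correspond to appending the confirmed or corrected `(content_id, query)` pair to the training inputs and regenerating the metadata vectors.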

Assignees

Piksel Inc, Prj Holding Company Llc

Inventors

Classifications

  • G06F16/907 (Primary)

    Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

  • Presentation of query results

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting

  • Natural language query formulation or dialogue systems

  • Semantic analysis

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11599581B2 cover?
It covers a method of generating metadata vectors that identify content items in a store searchable by input vectors. Each training input pairs a content identifier with at least one natural language description; the description is converted into text components, each component gets a vector whose coordinates start at random values and are then adjusted, and each vector is weighted with respect to the item. See the Abstract above for the full text.
Who is the assignee on this patent?
Piksel Inc, Prj Holding Company Llc
What technology area does this patent fall under?
Primary CPC classification G06F16/907. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 7, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).