Relationship graph interlinkage system
US-2016110476-A1 · Apr 21, 2016 · US
US9734166B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9734166-B2 |
| Application number | US-201313975497-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 26, 2013 |
| Priority date | Aug 26, 2013 |
| Publication date | Aug 15, 2017 |
| Grant date | Aug 15, 2017 |
Official abstract text for this publication.
A first set of contextual dimensions is generated from one or more textual descriptions associated with a given event, which includes one or more examples. A second set of contextual dimensions is generated from one or more visual features associated with the given event, which includes one or more visual example recordings. A similarity structure is constructed from the first set of contextual dimensions and the second set of contextual dimensions. One or more of the textual descriptions is matched with one or more of the visual features based on the similarity structure.
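The four steps in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not in the patent: each set of contextual dimensions is represented as a plain set of concept labels, the similarity structure is a table of Jaccard overlaps between those sets, and matching picks the best-scoring description for each recording. All function names, identifiers, and sample data are hypothetical.

```python
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two concept sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_similarity_structure(
    text_dims: Dict[str, Set[str]],
    visual_dims: Dict[str, Set[str]],
) -> Dict[Tuple[str, str], float]:
    """Score every (description, recording) pair (assumed stand-in for the similarity structure)."""
    return {
        (t, v): jaccard(t_concepts, v_concepts)
        for t, t_concepts in text_dims.items()
        for v, v_concepts in visual_dims.items()
    }


def match(similarity: Dict[Tuple[str, str], float]) -> Dict[str, str]:
    """For each recording, keep the description with the highest non-zero score."""
    best: Dict[str, Tuple[str, float]] = {}
    for (t, v), score in similarity.items():
        if score > 0.0 and (v not in best or score > best[v][1]):
            best[v] = (t, score)
    return {v: t for v, (t, _) in best.items()}


# Hypothetical contextual dimensions for one event (a birthday party).
text_dims = {
    "desc_cake": {"indoor", "cake", "candles", "blowing"},
    "desc_parade": {"outdoor", "street", "marching"},
}
visual_dims = {
    "clip_001": {"indoor", "cake", "candles"},
    "clip_002": {"outdoor", "street", "crowd"},
}

similarity = build_similarity_structure(text_dims, visual_dims)
annotations = match(similarity)
print(annotations)
```

In this sketch the matched description acts as the annotation for the recording. A real system would likely replace the set overlap with a learned or ontology-based relatedness measure.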
Opening claim text (preview).
What is claimed is:
1. A method, comprising: generating a first set of contextual dimensions from one or more textual descriptions associated with a given event, wherein the one or more textual descriptions comprise a corpus of text describing one or more aspects of the given event, and the first set of contextual dimensions results in a first taxonomy for the one or more textual descriptions; generating a second set of contextual dimensions from one or more audio-visual features associated with the given event, wherein the one or more audio-visual features comprise at least one of a video content and an image content that visually depicts the one or more aspects of the given event together with an audio content component, and the second set of contextual dimensions results in a second taxonomy for the one or more audio-visual features; constructing a similarity structure from the first set of contextual dimensions and the second set of contextual dimensions, wherein the similarity structure comprises a visual and textual concept relationship network that links the first taxonomy and the second taxonomy based on relatedness between elements of the first taxonomy and the second taxonomy; and matching one or more of the textual descriptions with one or more of the audio-visual features based on the similarity structure such that the one or more textual descriptions that match the one or more audio-visual features serve to annotate the one or more audio-visual features; wherein the generating, constructing and matching steps are performed via one or more processing devices.
2. The method of claim 1, wherein the step of generating a first set of contextual dimensions for one or more textual descriptions associated with a given event further comprises parsing the one or more textual descriptions associated with the given event by identifying one or more terms or one or more sets of terms appearing in one or more taxonomies or one or more ontologies.
3. The method of claim 2, wherein the step of generating a first set of contextual dimensions for one or more textual descriptions associated with a given event further comprises mapping the one or more identified terms or one or more identified sets of terms to one or more textual objects in the one or more taxonomies or the one or more ontologies.
4. The method of claim 3, wherein the step of generating a first set of contextual dimensions for one or more textual descriptions associated with a given event further comprises classifying the one or more textual objects into one or more classes.
5. The method of claim 4, wherein the step of generating a first set of contextual dimensions for one or more textual descriptions associated with a given event further comprises arranging the one or more classified textual objects in a time sequence describing the given event in one or more event taxonomy graphs.
6. The method of claim 5, wherein the step of generating a second set of contextual dimensions for one or more audio-visual features associated with the given event further comprises extracting the one or more audio-visual features associated with the given event from one or more images or one or more objects from a video frame from one or more videos.
7. The method of claim 6, wherein the step of generating a second set of contextual dimensions for one or more audio-visual features associated with the given event further comprises classifying the one or more audio-visual features into one or more visual concepts associated with one or more taxonomies or one or more ontologies.
8. The method of claim 1, wherein the step of constructing a similarity structure from the first set of contextual dimensions and the second set of contextual dimensions further comprises forming the relationship network by associating each of the one or more visual concepts to the one or more event taxonomy graphs.
9. The method of claim 8, wherein the step of matching one or more of the textual descriptions with one or more of the audio-visual features based on the similarity structure further comprises assigning a relevant one of the one or more textual descriptions to one of the one or more images or the one or more videos based on the formed relationship network.
10. The method of claim 9, wherein the step of classifying the one or more textual objects and the step of classifying the one or more audio-visual features further comprise selecting from context classes, object classes and activity classes.
11. A computer program product comprising a processor-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by the one or more processing devices implement steps of: generating a first set of contextual dimensions from one or more textual descriptions associated with a given event, wherein the one or more textual descriptions comprise a corpus of text describing one or more aspects of the given event, and the first set of contextual dimensions results in a first taxonomy for the one or more textual descriptions; generating a second set of contextual dimensions from one or more audio-visual features associated with the given event, wherein the one or more audio-visual features comprise at least one of a video content and an image content that visually depicts the one or more aspects of the given event together with an audio content component, and the second set of contextual dimensions results in a second taxonomy for the one or more visual features; constructing a similarity structure from the first set of contextual dimensions and the second set of contextual dimensions, wherein the similarity structure comprises a visual and textual concept relationship network that links the first taxonomy and the second taxonomy based on relatedness between elements of the first taxonomy and the second taxonomy; and matching one or more of the textual descriptions with one or more of the audio-visual features based on the similarity structure such that the one or more textual descriptions that match the one or more audio-visual features serve to annotate the one or more audio-visual features.
12. An apparatus, comprising: a memory; and a processor operatively coupled to the memory and configured to: generate a first set of contextual dimensions from one or more textual descriptions associated with a given event, wherein the one or more textual descriptions comprise a corpus of text describing one or more aspects of the given event, and the first set of contextual dimensions results in a first taxonomy for the one or more textual descriptions; generate a second set of contextual dimensions from one or more audio-visual features associated with the given event, wherein the one or more audio-visual features comprise at least one of a video content and an image content that visually depicts the one or more aspects of the given event together with an audio content component, and the second set of contextual dimensions results in a second taxonomy for the one or more audio-visual features; construct a similarity structure from the first set of contextual dimensions and the second set of contextual dimensions, wherein the similarity structure comprises a visual and textual concept relationship network that links the first taxonomy and the second taxonomy based on relatedness between elements of the first taxonomy and the second taxonomy; and match one or more of the textual descriptions with one or more of the audio-visual features based on the similarity structure such that the one or more textual descriptions that match the one or more visual audio-visual features serve to annotate the
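As a rough illustration of the text-side steps recited in claims 2 through 5, with the class choices of claim 10, the sketch below identifies taxonomy terms in a description, maps them to textual objects, classifies those objects as context, object, or activity, and keeps them in order. The toy taxonomy, the regular-expression tokenizer, and the use of order of first mention as a proxy for the time sequence are all assumptions made for illustration, not details taken from the patent.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy taxonomy: surface term -> (textual object, class).
TAXONOMY: Dict[str, Tuple[str, str]] = {
    "kitchen": ("Kitchen", "context"),
    "cake": ("Cake", "object"),
    "candle": ("Candle", "object"),
    "candles": ("Candle", "object"),
    "lighting": ("LightCandles", "activity"),
    "blowing": ("BlowOutCandles", "activity"),
}


def parse_description(text: str) -> List[Tuple[str, str]]:
    """Return (textual object, class) pairs in order of first mention."""
    seen = set()
    sequence: List[Tuple[str, str]] = []
    for token in re.findall(r"[a-z]+", text.lower()):
        entry = TAXONOMY.get(token)
        if entry is not None and entry[0] not in seen:
            seen.add(entry[0])
            sequence.append(entry)
    return sequence


def group_by_class(sequence: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Bucket textual objects into the context, object, and activity classes."""
    groups: Dict[str, List[str]] = {"context": [], "object": [], "activity": []}
    for obj, cls in sequence:
        groups[cls].append(obj)
    return groups


description = (
    "In the kitchen, someone is lighting the candles on a cake, "
    "then blowing them out."
)
sequence = parse_description(description)
groups = group_by_class(sequence)
print(sequence)
print(groups)
```

The ordered `sequence` list is a simplified stand-in for an event taxonomy graph; a fuller version would store explicit edges between consecutive activities.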
Physics · mapped topic
using information manually generated, e.g. tags, keywords, comments, manually generated location and time information · CPC title