Geolocating social media

US10191945B2 · US · B2

Patent metadata
Field                Value
Publication number   US-10191945-B2
Application number   US-201414184399-A
Country              US
Kind code            B2
Filing date          Feb 19, 2014
Priority date        Feb 20, 2013
Publication date     Jan 29, 2019
Grant date           Jan 29, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for geolocating social media are described. According to an embodiment, information from textual content of a non-geolocated social media data item stored in a database is extracted. A knowledge database is then searched for a cluster of geo-located social media data items to which the information most closely relates, and an estimated location is assigned to the non-geolocated social media data item according to the cluster to which the information most closely relates. Each cluster comprises one or more representative tags for a spatio-temporal region. The knowledge database is created from geolocated social media data by grouping data according to location and extracting representative tags from the location's grouping of data according to textual content as well as information related to reliability and truthfulness of the textual content.
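The lookup step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the `Cluster` type, the tokenizer, the example tags and weights, and the sum-of-weights score are all hypothetical choices made for the example.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Cluster:
    """Hypothetical knowledge-database entry: a location plus weighted tags."""
    lat: float
    lon: float
    tag_weights: Dict[str, float]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def estimate_location(text: str, clusters: List[Cluster]) -> Optional[Tuple[float, float]]:
    """Return the (lat, lon) of the cluster whose tags best match the text.

    The score is the sum of the weights of the tags found in the text
    (an assumed scoring rule). Returns None when no tag matches.
    """
    tokens = tokenize(text)
    best, best_score = None, 0.0
    for cluster in clusters:
        score = sum(w for tag, w in cluster.tag_weights.items() if tag in tokens)
        if score > best_score:
            best, best_score = cluster, score
    return (best.lat, best.lon) if best else None


# Made-up example data, for illustration only.
clusters = [
    Cluster(25.76, -80.19, {"miami": 0.9, "beach": 0.4, "wynwood": 0.7}),
    Cluster(40.71, -74.01, {"manhattan": 0.9, "subway": 0.5, "brooklyn": 0.6}),
]
print(estimate_location("Sunset at the beach in Miami", clusters))
```

Returning `None` when nothing matches keeps the sketch from assigning a location on no evidence; a real system would more likely apply a minimum score threshold.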

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of encoding geolocation metadata onto a social media data item, the method comprising: providing a remote knowledge database comprising M number of clusters of social media data items, each respective cluster representing a geographic area and establishing representative information for the geographic area associated with the respective cluster, each respective cluster being represented by a plurality of representative tags and comprising a plurality of social media data items that each respectively contain metadata that indicate origination from a common spatial-temporal location, each representative tag of the plurality of representative tags being an image, a keyword, or a phrase that is associated with the common spatial-temporal location; providing a non-transitory computer-readable medium comprising stored instructions, that when executed cause at least one processor to: receive, by a receiving device, a social media data item that is not encoded with geolocation metadata; detect textual information contained in the social media data item not encoded with geolocation metadata, the textual information comprising visual images and text; access the remote knowledge database via a network; traverse, in parallel, each respective cluster in the remote knowledge database to detect a correlation between the textual information contained in the social media data item not encoded with geolocation metadata and each representative tag the plurality of representative tags of each respective cluster in the remote knowledge database; rank each representative tag of the plurality of representative tags of each respective cluster by determining a strength of correlation between the textual information in the social media data item and each representative tag the plurality of representative tags of each respective cluster in the remote knowledge database based upon weighted values of each representative tag of the plurality of representative tags of each respective cluster; receive, from the remote knowledge database, the geolocation of the respective cluster having a representative tag that has the strongest respective correlation with the textual information contained in the social media data item not encoded with geolocation metadata; and append, the social media data item not encoded with geolocation metadata with metadata containing the geolocation of the respective cluster whose representative tag has the strongest correlation with the textual information contained in the social media data item not encoded with geolocation metadata.

2. The method according to claim 1, wherein traversing each respective cluster, in parallel, in the remote knowledge database comprises: comparing, in parallel, the textual information of the social media data item not encoded with geolocation metadata to each representative tag of the plurality of representative tags of each respective cluster in the remote knowledge database to determine a highest probable matching respective cluster.

3. The method according to claim 2, wherein determining a strength of correlation between the textual information in the social media data item not encoded with geolocation metadata and each representative tag of the plurality of representative tags of each respective cluster in the remote knowledge database comprises: detecting a geolocation of a respective cluster having a representative tag that has a probability value or correlating to the textual information containing in the social media item non encoded with geolocation metadata that is above a threshold probability value.

4. The method according to claim 1, wherein the textual information from the social media data item not encoded with geolocation metadata comprises tags, image descriptions, and reader comments.

5. The method according to claim 1, further comprising creating the remote knowledge database of social media data items of known geolocations.

6. The method according to claim 5, wherein creating the remote knowledge database comprises: providing a plurality of social media data items to a database, each social media data item encoded with metadata indicating the geolocation of the origination of the social media data item; grouping the plurality of social media data items into M number of clusters, each respective cluster comprising social media data items originated within a predetermined spatial and temporal range of a known spatial-temporal location; extracting a plurality of representative tags from each social media data item in each respective cluster, each representative tag of the plurality of representative tags being an image, a keyword, or a phrase associated with the known common spatial-temporal location; assigning each representative tag of the plurality of representative tags a weighted value for each respective cluster indicating a strength of correlation between each representative tag of the plurality of representative tags and the known common spatial-temporal location of each respective cluster; and storing, in the computer memory, the respective clusters, the plurality of representative tags, and associated weighted values of each tag of the plurality of representative tags.

7. The method according to claim 6, further comprising: extracting data, from each respective social media data item, related to reliability and truthfulness of the textual information contained in each respective social media data item.

8. The method according to claim 7, the data related to reliability and truthfulness of the textual information comprising at least one of frequency of use, ownership, and number of views.

9. A system for data-mining social media data items of unknown geolocation origin, the system comprising: one or more processors; one or more non-transitory computer readable media; a remote knowledge database creation and updating application stored on at least one of the one or more non-transitory computer readable media that, when executed by at least one of the one or more processors, directs the at least one of the one or more processors to: receive, by a receiving device, a plurality of social media data items, each social media data item respectively encoded with metadata indicating the geolocation of the origination of the social media data item; group the plurality social media data items into M number of clusters, each respective cluster being comprised of social media data items originated within a predetermined range of a known spatial-temporal location; extract a plurality of representative tags from each social media data item in each respective cluster, a representative tag comprising at least one image, keyword, or phrase associated with the known common spatial-temporal location; assign each representative tag of the plurality of representative tags with a weighted value indicating a respective strength of correlation between each representative tag of the plurality of representative tags and the known common spatial-temporal location; store, in computer memory, the respective clusters, the plurality of representative tag of each respective cluster, and the respective associated weighted values of the representative tags of each respective cluster; and determine an outlier of the social media data item based on location and time; and receive a social media data item that is not encoded with geolocation metadata; detect, by text recognition software, textual information contained in the social media data item; detect, via image recognition software, visual information contained in the social media data item; access a remote knowledge database, stored in the computer memory, of clustered social media
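
The database-building steps recited in claim 6 (group geolocated items by place and time, extract representative tags, assign each tag a weight) might be sketched as below. Everything specific here is an assumption for illustration: the fixed grid of `cell_deg` degrees and `hours_per_bin` hours, the keyword-only tags, and the TF-IDF-style weight are not taken from the patent, and the reliability signals of claims 7 and 8 are not modelled.

```python
import math
import re
from collections import Counter, defaultdict


def cell_key(lat, lon, hour, cell_deg=0.1, hours_per_bin=6):
    """Map a (lat, lon, hour-of-day) triple to a coarse spatio-temporal bin."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg), hour // hours_per_bin)


def build_knowledge_db(items, top_k=3):
    """Group geolocated items into bins and keep up to top_k weighted tags per bin.

    `items` is a list of (lat, lon, hour, text) tuples. A tag's weight is the
    share of the bin's items that contain it, multiplied by
    log(number of bins / number of bins containing the tag), so words that
    occur in every bin score zero and are dropped (an assumed weighting).
    """
    bins = defaultdict(list)
    for lat, lon, hour, text in items:
        tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
        bins[cell_key(lat, lon, hour)].append(tokens)

    # Count in how many bins each token appears at least once.
    bin_freq = Counter()
    for docs in bins.values():
        bin_freq.update(set().union(*docs))

    db = {}
    for key, docs in bins.items():
        counts = Counter(tok for doc in docs for tok in doc)
        weights = {
            tok: (n / len(docs)) * math.log(len(bins) / bin_freq[tok])
            for tok, n in counts.items()
        }
        # Sort by descending weight, breaking ties alphabetically.
        top = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        db[key] = {tok: w for tok, w in top if w > 0}
    return db


# Made-up example data, for illustration only.
items = [
    (25.76, -80.19, 18, "Sunset at the beach in Miami"),
    (25.77, -80.19, 19, "Miami beach party"),
    (40.71, -74.01, 8, "Morning subway in Manhattan"),
    (40.72, -74.01, 9, "Manhattan subway delays"),
]
db = build_knowledge_db(items, top_k=2)
for key, tags in db.items():
    print(key, tags)
```

A fixed grid is the simplest way to enforce a "predetermined spatial and temporal range"; a density-based clustering algorithm would be a plausible alternative, at the cost of extra parameters.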

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10191945B2 cover?
It covers techniques for geolocating social media. Information is extracted from the textual content of a non-geolocated social media data item, a knowledge database is searched for the cluster of geolocated items to which that information most closely relates, and the item is assigned an estimated location according to that cluster.
Who is the assignee on this patent?
Rishe Naphtali David, The Florida International Univ Board Of Trustees
What technology area does this patent fall under?
Primary CPC classification G06F16/2455. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 29, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).