Dynamic modeling of geospatial words in social media

US9405743B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9405743-B1
Application number: US-201514710915-A
Country: US
Kind code: B1
Filing date: May 13, 2015
Priority date: May 13, 2015
Publication date: Aug 2, 2016
Grant date: Aug 2, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Dynamically modelling geospatial words in social media, in one aspect, generates a word set based on frequencies of words occurring in GPS annotated text data generated by a GPS-enabled device containing latitude and longitude coordinates. Locations are partitioned by mapping GPS coordinates in the GPS annotated text data to a set of discrete non-overlapped locations. A text stream contained in the GPS annotated text data is segmented into time windows. Footprints of locations in time windows are generated. Geospatial weights associated with words in the word set are generated based on localness of words determined based on the footprints. Words in a text message are extracted and scores are determined for the set of discrete non-overlapped locations associated with the words.
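The first three steps in the abstract (building a word set from frequencies, partitioning locations, and segmenting the stream into time windows) can be sketched as follows. This is a minimal illustration, not the patented implementation: the fixed-size latitude/longitude grid, the cell size CELL_DEG, the window length WINDOW_SECONDS, the min_count threshold, whitespace tokenization, and all function names are assumptions introduced here. The abstract only requires that locations be discrete and non-overlapping.

```python
import math
from collections import Counter, defaultdict

# Hypothetical parameters; the abstract does not specify any of these values.
CELL_DEG = 0.5          # side length of a grid cell, in degrees
WINDOW_SECONDS = 3600   # length of one time window, in seconds


def build_word_set(messages, min_count=2):
    """Keep words that occur at least min_count times (assumed frequency rule)."""
    counts = Counter(
        word for _, _, _, text in messages for word in text.lower().split()
    )
    return {word for word, n in counts.items() if n >= min_count}


def to_cell(lat, lon, cell_deg=CELL_DEG):
    """Map a GPS coordinate to one discrete, non-overlapping grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def to_window(timestamp, window_seconds=WINDOW_SECONDS):
    """Map a timestamp (seconds) to the index of its time window."""
    return int(timestamp // window_seconds)


def segment(messages):
    """Group (timestamp, lat, lon, text) records by time window.

    Each stored record is (cell, list_of_words).
    """
    windows = defaultdict(list)
    for ts, lat, lon, text in messages:
        windows[to_window(ts)].append((to_cell(lat, lon), text.lower().split()))
    return dict(windows)


# Made-up sample data for illustration only.
messages = [
    (10, 40.71, -74.00, "brooklyn bridge walk"),
    (20, 40.72, -74.01, "bridge traffic again"),
    (4000, 34.05, -118.24, "hollywood sign hike"),
]
word_set = build_word_set(messages)
windows = segment(messages)
```

A uniform grid is used here only because it is the simplest partition that guarantees non-overlapping cells; any other disjoint partition (for example, administrative regions) would fit the same description.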

First claim

Opening claim text (preview).

We claim:

1. A system for dynamically modeling geospatial words in social media, comprising: a processor; a data collector operable to execute on the processor and further operable to receive GPS annotated text data generated by a GPS-enabled device containing latitude and longitude coordinates; a model trainer operable to execute on the processor and further operable to generate a word set based on frequencies of words occurring in the GPS annotated text data, the model trainer further operable to partition locations by mapping GPS coordinates in the GPS annotated text data to a set of discrete non-overlapped locations, the model trainer further operable to segment a text stream contained in the GPS annotated text data into time windows, the model trainer further operable to generate footprints of locations in time windows, the model trainer further operable to determine geospatial weights associated with words in the word set based on localness of words determined based on the footprints, the model trainer further operable to dynamically integrate geotagging by extracting words in a text message and determining scores associated with the set of discrete non-overlapped locations; a storage device coupled to the processor and operable to store the footprints and GPS labeled data, the GPS labeled data generated based on mapping the words in the word set to a respective location in the set of discrete non-overlapped locations.

2. The system of claim 1, further comprising a model deployer operable to execute on the processor and further operable to predict location information for a new text message based on words in the new text message and the geospatial weights.

3. The system of claim 1, wherein the model trainer samples a fixed number of the time windows in segmenting the text stream into time windows and generating the footprints.

4. The system of claim 1, wherein the model trainer generates footprints of locations in time windows by, for each GPS annotated text data in a time window, constructing a bipartite graph between a word type of the GPS annotated text data and a mapped location.

5. The system of claim 4, wherein the model trainer generates footprints by further determining an association strength between the word type and the mapped location.

6. The system of claim 5, the model trainer further selects a number of locations based on the association strength as the footprints, the footprints parameterized by associated word type, time window, and the number.

7. A non-transitory computer readable storage medium storing a program of instructions executable by a machine to perform a method of dynamically modeling geospatial words in social media, the method comprising: receiving GPS annotated text data generated by a GPS-enabled device containing latitude and longitude coordinates; generating a word set based on frequencies of words occurring in the GPS annotated text data; partitioning locations by mapping GPS coordinates in the GPS annotated text data to a set of discrete non-overlapped locations; segmenting a text stream contained in the GPS annotated text data into time windows; generating footprints of locations in time windows; determining geospatial weights associated with words in the word set based on localness of words determined based on the footprints; dynamically integrating in geotagging by extracting words in a text message and determining scores associated with the set of discrete non-overlapped locations.

8. The non-transitory computer readable storage medium of claim 7, further comprising: sampling a fixed number of the time windows in the segmenting and the generating steps.

9. The non-transitory computer readable storage medium of claim 7, wherein the generating footprints of locations in time windows comprises, for each GPS annotated text data in a time window, constructing a bipartite graph between a word type of the GPS annotated text data and a mapped location.

10. The non-transitory computer readable storage medium of claim 9, wherein the generating footprints of locations further comprises determining an association strength between the word type and the mapped location.

11. The non-transitory computer readable storage medium of claim 10, further comprising selecting a number of locations based on the association strength as the footprints, the footprints parameterized by associated word type, time window, and the number.

12. The non-transitory computer readable storage medium of claim 7, further comprising predicting location information for a new text message based on words in the new text message and the geospatial weights.
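The footprint, weighting, and scoring steps recited in claims 4 to 6 and 9 to 12 can be sketched as below. The claims do not define how association strength, localness, or location scores are computed, so every formula here is an assumption chosen for illustration: association strength is taken as a raw co-occurrence count on the word-to-location edges of the bipartite graph, localness as the inverse of the number of distinct footprint locations, and a location's score as the sum of the weights of the message words whose footprints include it. All function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def footprints(window_records, k=2):
    """Build a word-to-location bipartite graph for one time window.

    window_records is a list of (location, words). Edge weight (the assumed
    association strength) is the number of records in which the word and the
    location co-occur. The footprint of a word is its top-k locations.
    """
    edges = defaultdict(Counter)  # word -> Counter(location -> count)
    for location, words in window_records:
        for word in set(words):
            edges[word][location] += 1
    return {
        word: [loc for loc, _ in locs.most_common(k)]
        for word, locs in edges.items()
    }


def geospatial_weights(footprints_by_window):
    """Assumed localness: 1 / (distinct footprint locations over all windows)."""
    locations_per_word = defaultdict(set)
    for fp in footprints_by_window.values():
        for word, locs in fp.items():
            locations_per_word[word].update(locs)
    return {word: 1.0 / len(locs) for word, locs in locations_per_word.items()}


def score_locations(text, footprints_by_window, weights):
    """Score each location by summing the weights of message words found there."""
    scores = Counter()
    for word in set(text.lower().split()):
        if word not in weights:
            continue
        seen = set()
        for fp in footprints_by_window.values():
            seen.update(fp.get(word, []))
        for loc in seen:
            scores[loc] += weights[word]
    return scores


# Made-up sample data: two time windows, two locations "A" and "B".
windows = {
    0: [("A", ["pizza", "subway"]), ("A", ["subway", "coffee"]),
        ("B", ["coffee", "beach"])],
    1: [("B", ["beach", "coffee"]), ("A", ["subway"])],
}
fps = {t: footprints(records) for t, records in windows.items()}
weights = geospatial_weights(fps)
scores = score_locations("subway coffee", fps, weights)
best_location = scores.most_common(1)[0][0]
```

In this toy data, "coffee" appears in both locations and so receives a lower weight than "subway", which appears in only one; the message is therefore attributed to location "A".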

Assignees

Inventors

Classifications

  • Spatial or temporal dependent retrieval, e.g. spatiotemporal queries · CPC title

  • Annotation, e.g. comment data or footnotes · CPC title

  • Language identification · CPC title

  • Clustering; Classification · CPC title

  • Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9405743B1 cover?
Dynamically modelling geospatial words in social media, in one aspect, generates a word set based on frequencies of words occurring in GPS annotated text data generated by a GPS-enabled device containing latitude and longitude coordinates. Locations are partitioned by mapping GPS coordinates in the GPS annotated text data to a set of discrete non-overlapped locations. A text stream contained in…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/9537. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 2, 2016 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).