Multi-feature balancing for natural language processors
US-2024419910-A1 · Dec 19, 2024 · US
US2016335235A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016335235-A1 |
| Application number | US-201514748754-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 24, 2015 |
| Priority date | May 13, 2015 |
| Publication date | Nov 17, 2016 |
| Grant date | — |
Official abstract text for this publication.
Dynamically modelling geospatial words in social media, in one aspect, generates a word set based on frequencies of words occurring in GPS annotated text data generated by a GPS-enabled device containing latitude and longitude coordinates. Locations are partitioned by mapping GPS coordinates in the GPS annotated text data to a set of discrete non-overlapped locations. A text stream contained in the GPS annotated text data is segmented into time windows. Footprints of locations in time windows are generated. Geospatial weights associated with words in the word set are generated based on localness of words determined based on the footprints. Words in a text message are extracted and scores are determined for the set of discrete non-overlapped locations associated with the words.
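The first three steps of the abstract (frequency-based word set, location partitioning, time-window segmentation) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify a grid shape, a window length, a tokenizer, or a frequency threshold, so the fixed-degree grid, the one-hour window, whitespace tokenization, and the names `grid_cell`, `time_window`, `build_word_set`, `cell_deg`, `window_seconds`, and `min_count` are all hypothetical.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.5):
    """Map a GPS coordinate to one cell of a fixed-size grid.

    Assumption: a regular latitude/longitude grid is one simple way to get
    'discrete non-overlapped locations'; the abstract does not say which
    partitioning is used.
    """
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    return (row, col)


def time_window(timestamp, window_seconds=3600):
    """Assign a Unix timestamp to a fixed-length time window index (assumed hourly)."""
    return int(timestamp // window_seconds)


def build_word_set(messages, min_count=2):
    """Keep words that occur at least `min_count` times across all messages.

    Assumption: lowercased whitespace tokens and a simple count threshold
    stand in for the frequency criterion.
    """
    counts = Counter(
        word for message in messages for word in message.lower().split()
    )
    return {word for word, count in counts.items() if count >= min_count}


if __name__ == "__main__":
    print(grid_cell(40.7, -74.0))
    print(time_window(7200))
    print(build_word_set(["Brooklyn bridge", "brooklyn pizza"]))
```

Each geotagged message would then be keyed by its `(time_window, grid_cell)` pair, which is the input the later footprint step needs.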
Opening claim text (preview).
We claim:

1. A method of dynamically modeling geospatial words in social media, comprising:
receiving GPS annotated text data generated by a GPS-enabled device containing latitude and longitude coordinates;
generating a word set based on frequencies of words occurring in the GPS annotated text data;
partitioning locations by mapping GPS coordinates in the GPS annotated text data to a set of discrete non-overlapped locations;
segmenting a text stream contained in the GPS annotated text data into time windows;
generating footprints of locations in time windows;
determining geospatial weights associated with words in the word set based on localness of words determined based on the footprints;
dynamically integrating in geotagging by extracting words in a text message and determining scores associated with the set of discrete non-overlapped locations.

2. The method of claim 1, further comprising: sampling a fixed number of the time windows in the segmenting and the generating steps.

3. The method of claim 1, wherein the generating footprints of locations in time windows comprises, for each GPS annotated text data in a time window, constructing a bipartite graph between a word type of the GPS annotated text data and a mapped location.

4. The method of claim 3, wherein the generating footprints of locations further comprises determining an association strength between the word type and the mapped location.

5. The method of claim 4, further comprising selecting a predetermined number of locations based on the association strength as the footprints, the footprints parameterized by associated word type, time window, and the predetermined number.

6. The method of claim 1, further comprising predicting location information for a new text message based on words in the new text message and the geospatial weights.
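The footprint, weighting, and scoring steps of claims 1 and 3 to 6 can be illustrated with a small sketch. The claim text does not define the association strength, the localness measure, or the scoring rule, so everything numeric here is an assumption: association strength is taken to be a co-occurrence count on the word-to-location bipartite graph, localness is taken to be the overlap (intersection over union) of a word's footprints across time windows, and a location's score is the sum of the weights of the message words whose footprints contain it. The function names and the parameter `k` (standing in for the "predetermined number") are hypothetical.

```python
from collections import Counter, defaultdict


def footprints(records, k=2):
    """Build footprints from (window, location, words) records.

    The bipartite graph is stored as edge counts between a word type and a
    location within one time window. The footprint of (word, window) is the
    set of the k locations with the highest count.
    """
    edges = defaultdict(Counter)
    for window, location, words in records:
        for word in set(words):
            edges[(word, window)][location] += 1
    return {
        key: frozenset(loc for loc, _ in counts.most_common(k))
        for key, counts in edges.items()
    }


def localness_weights(fp):
    """Assumed localness: footprint overlap across windows, in [0, 1].

    A word whose footprint stays on the same locations in every window gets
    a weight near 1; a word whose footprint moves around gets a weight near 0.
    """
    by_word = defaultdict(list)
    for (word, _window), locations in fp.items():
        by_word[word].append(locations)
    weights = {}
    for word, sets in by_word.items():
        union = set().union(*sets)
        common = set(sets[0]).intersection(*sets[1:])
        weights[word] = len(common) / len(union)
    return weights


def score_locations(text, fp, weights):
    """Score each location by the summed weights of message words tied to it."""
    scores = Counter()
    for word in set(text.lower().split()):
        locations = set()
        for (fp_word, _window), locs in fp.items():
            if fp_word == word:
                locations |= locs
        for location in locations:
            scores[location] += weights.get(word, 0.0)
    return scores


if __name__ == "__main__":
    records = [
        (0, "A", ["bridge", "the"]), (0, "A", ["bridge"]),
        (0, "B", ["the"]), (0, "B", ["the"]),
        (1, "A", ["bridge"]), (1, "A", ["the"]),
        (1, "C", ["the"]), (1, "C", ["the"]),
    ]
    fp = footprints(records, k=1)
    weights = localness_weights(fp)
    print(score_locations("the bridge", fp, weights).most_common(1))
```

In this toy data, "bridge" keeps the same footprint in both windows and so dominates the score, while "the" shifts between locations and contributes nothing. A real system would need a more robust localness measure than this overlap ratio.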
Language identification · CPC title
Recognition of textual entities · CPC title
Clustering; Classification · CPC title
Annotation, e.g. comment data or footnotes · CPC title
Spatial or temporal dependent retrieval, e.g. spatiotemporal queries · CPC title