Geotagging unstructured text

US9262438B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9262438-B2
Application number: US-201313960119-A
Country: US
Kind code: B2
Filing date: Aug 6, 2013
Priority date: Aug 6, 2013
Publication date: Feb 16, 2016
Grant date: Feb 16, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Mechanisms are described to extract location information from unstructured text, comprising: building a language model from geo-tagged text; building a classifier for differentiating referred and physical location; given unstructured text, identifying referred location using the language model (that is, the location to which the unstructured text refers); given the unstructured text, identifying if referred location is also the physical location using the classifier; and predicting (that is, performing calculation(s) and/or estimation(s) of degree of confidence) of referred and physical location.
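As a rough illustration of the first half of the abstract (building language models from geo-tagged text, then identifying the referred location with a degree of confidence), the following is a minimal sketch and not the patent's actual method. It assumes one add-one-smoothed unigram model per location and a uniform prior over locations; the function names, the tokenizer, and the sample data are all hypothetical.

```python
import math
from collections import Counter, defaultdict

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    stripped = (t.strip(PUNCT) for t in text.lower().split())
    return [t for t in stripped if t]

def build_language_models(geotagged):
    """Build one unigram count table per location from (text, location) pairs."""
    models = defaultdict(Counter)
    for text, location in geotagged:
        models[location].update(tokenize(text))
    return dict(models)

def identify_referred_location(text, models):
    """Return (location, confidence) for the most likely referred location.

    Assumption: add-one smoothing, with one extra slot reserved for
    unseen words, and a uniform prior over locations. The confidence is
    the likelihood of the best location normalized over all locations.
    """
    vocab = set().union(*models.values())
    tokens = tokenize(text)
    log_scores = {}
    for location, counts in models.items():
        total = sum(counts.values()) + len(vocab) + 1
        log_scores[location] = sum(
            math.log((counts[t] + 1) / total) for t in tokens
        )
    # Normalize in log space to avoid underflow on long texts.
    top = max(log_scores.values())
    weights = {loc: math.exp(s - top) for loc, s in log_scores.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())

# Hypothetical geo-tagged training data, distinct from the query text.
GEOTAGGED = [
    ("Stuck in traffic on the Brooklyn Bridge again", "New York"),
    ("Best bagel near Times Square", "New York"),
    ("Fog rolling over the Golden Gate Bridge", "San Francisco"),
    ("Cable car ride down to Fisherman's Wharf", "San Francisco"),
]

if __name__ == "__main__":
    models = build_language_models(GEOTAGGED)
    print(identify_referred_location(
        "Walking across the Golden Gate at sunset", models))
```

The training pairs are kept separate from the query text, mirroring the requirement that the language model be based on a source of data distinct from the unstructured text being analyzed.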

First claim

Opening claim text (preview).

What is claimed is:
1. A method implemented in a computer system for extracting location information from unstructured text by utilizing a language model and a classifier, the method comprising: obtaining, by a computer, the unstructured text; identifying by the computer, via use of the language model and based upon the received unstructured text, a location referred to by the received unstructured text; and determining by the computer, via use of the classifier, whether the location referred to by the received unstructured text is also a physical location from where the received unstructured text was sent; wherein the language model is based upon a source of data that is distinct from the unstructured text.
2. The method of claim 1, further comprising building, by the computer, the language model.
3. The method of claim 2, wherein the language model is built based upon geo-tagged text.
4. The method of claim 2, further comprising building, by the computer, a plurality of language models, each of the language models corresponding to a respective location.
5. The method of claim 1, further comprising building, by the computer, the classifier.
6. The method of claim 5, wherein the classifier is built based upon a training set of data.
7. The method of claim 1, further comprising determining, by the computer, if the received unstructured text is location-neutral.
8. The method of claim 7, wherein, if it is determined that the received unstructured text is location-neutral then the identifying the location referred to by the received unstructured text and the determining, via use of the classifier, whether the location referred to by the received unstructured text is also a physical location from where the received unstructured text was sent are not performed.
9. The method of claim 1, wherein the identifying the location referred to by the received unstructured text comprises calculating, by the computer, a degree of confidence that the location referred to is correct.
10. The method of claim 1, wherein the determining whether the location referred to by the received unstructured text is also the physical location from where the received unstructured text was sent comprises calculating, with the computer, a degree of confidence that the location referred to by the received unstructured text is also the physical location from where the received unstructured text was sent.
11. The method of claim 1, further comprising outputting, by the computer, at least one of: (a) the location referred to by the received unstructured text; (b) the physical location from where the received unstructured text was sent; and (c) any combination thereof.
12. A computer readable storage medium, tangibly embodying a program of instructions executable by the computer for extracting location information from unstructured text by utilizing a language model and a classifier, the program of instructions, when executing, performing the following steps: obtaining the unstructured text; identifying, via use of the language model and based upon the received unstructured text, a location referred to by the received unstructured text; and determining, via use of the classifier, whether the location referred to by the received unstructured text is also a physical location from where the received unstructured text was sent; wherein the language model is based upon a source of data that is distinct from the unstructured text.
13. The computer readable storage medium of claim 12, wherein the program of instructions, when executing, further performs building the language model.
14. The computer readable storage medium of claim 13, wherein the language model is built based upon geo-tagged text.
15. The computer readable storage medium of claim 13, wherein the program of instructions, when executing, further performs building a plurality of language models, each of the language models corresponding to a respective location.
16. The computer readable storage medium of claim 12, wherein the program of instructions, when executing, further performs building the classifier.
17. The computer readable storage medium of claim 12, wherein the program of instructions, when executing, further performs outputting at least one of: (a) the location referred to by the received unstructured text; (b) the physical location from where the received unstructured text was sent; and (c) any combination thereof.
18. A computer-implemented system for extracting location information from unstructured text by utilizing a language model and a classifier, the system comprising: an input element configured to receive the unstructured text; an identifying element configured to identify, via use of the language model and based upon the received unstructured text, a location referred to by the received unstructured text; a determining element configured to determine, via use of the classifier, whether the location referred to by the received unstructured text is also a physical location from where the received unstructured text was sent; and an output element configured to output the determination of whether the location referred to by the received unstructured text is also the physical location from where the received unstructured text was sent; wherein the language model is based upon a source of data that is distinct from the unstructured text.
19. The system of claim 18, further comprising a first building element configured to build the language model.
20. The system of claim 19, wherein the language model is built based upon geo-tagged text.
21. The system of claim 19, wherein the first building element is configured to build a plurality of language models, each of the language models corresponding to a respective location.
22. The system of claim 18, further comprising a second building element configured to build the classifier.
23. The system of claim 18, wherein the output element is further configured to output at least one of: (a) the location referred to by the received unstructured text; (b) the physical location from where the received unstructured text was sent; and (c) any combination thereof.
24. A method implemented in a computer system for extracting location information from unstructured text by utilizing a language model and a classifier, the method comprising: building, by a computer, the language model; building, by the computer, the classifier; obtaining, by the computer, the unstructured text; identifying by the computer, via use of the language model and based upon the received unstructured text, a location referred to by the received unstructured text; determining by the computer, via use of the classifier, whether the location referred to by the received unstructured text is also a physical location from where the received unstructured text was sent; and outputting, by the computer, at least one of: (a) the location referred to by the received unstructured text; (b) the physical location from where the received unstructured text was sent; and (c) any combination thereof; wherein the language model is based upon a source of data that is distinct from the unstructured text.
25. The method of claim 24, wherein the language model is built based upon geo-tagged text.
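The control flow implied by claims 1 and 7 through 11 can be sketched as follows. This is an illustrative reading under stated assumptions, not the claimed implementation: the three callables stand in for the location-neutral check, the language-model step, and the classifier, and the toy versions below are hypothetical keyword rules. The 0.5 decision threshold is also an assumption; the claims only require that a degree of confidence be calculated.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass
class GeotagResult:
    referred_location: Optional[str] = None
    referred_confidence: Optional[float] = None
    is_physical_location: Optional[bool] = None
    physical_confidence: Optional[float] = None

def geotag(
    text: str,
    is_location_neutral: Callable[[str], bool],
    identify_referred: Callable[[str], Tuple[str, float]],
    classify_physical: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the patent
) -> GeotagResult:
    """Run the claimed steps in order, skipping both when text is neutral."""
    # Claims 7-8: location-neutral text bypasses both later steps.
    if is_location_neutral(text):
        return GeotagResult()
    # Claims 1 and 9: referred location plus a degree of confidence.
    location, loc_conf = identify_referred(text)
    # Claims 1 and 10: confidence that the referred location is also
    # the physical location from where the text was sent.
    phys_conf = classify_physical(text, location)
    # Claim 11: output the referred and/or physical location information.
    return GeotagResult(location, loc_conf, phys_conf >= threshold, phys_conf)

# Hypothetical stand-ins so the sketch runs end to end.
def toy_is_neutral(text: str) -> bool:
    return not any(city in text.lower() for city in ("paris", "tokyo"))

def toy_identify(text: str) -> Tuple[str, float]:
    return ("Paris", 0.9) if "paris" in text.lower() else ("Tokyo", 0.9)

def toy_classify(text: str, location: str) -> float:
    return 0.8 if "here in" in text.lower() else 0.2

if __name__ == "__main__":
    for msg in ("Good morning everyone", "Finally here in Paris!", "I miss Tokyo"):
        print(msg, "->", geotag(msg, toy_is_neutral, toy_identify, toy_classify))
```

Passing the three steps in as callables keeps the sketch independent of how the language model and classifier are actually built, which the claims leave open beyond "geo-tagged text" and "a training set of data".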

Assignees

Inventors

Classifications

  • Physics · mapped topic

  • G06F16/29 · Primary

    Geographical information databases · CPC title

  • Spatial or temporal dependent retrieval, e.g. spatiotemporal queries · CPC title

  • Clustering; Classification · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9262438B2 cover?
Mechanisms are described to extract location information from unstructured text, comprising: building a language model from geo-tagged text; building a classifier for differentiating referred and physical location; given unstructured text, identifying referred location using the language model (that is, the location to which the unstructured text refers); given the unstructured text, identifyin…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/29 (Geographical information databases), formerly coded G06F17/30241. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 16, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).