Discovery of new business openings using web content analysis
US-9773252-B1 · Sep 26, 2017 · US
US11756059B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11756059-B2 |
| Application number | US-202117563847-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 28, 2021 |
| Priority date | Mar 12, 2013 |
| Publication date | Sep 12, 2023 |
| Grant date | Sep 12, 2023 |
Official abstract text for this publication.
In general, embodiments of the present invention provide systems, methods and computer readable media for identifying a new business based on programmatically analyzing content received from online sources and, as a result, discovering one or more references to the business. In embodiments, the system stores historical data representing previously identified new businesses and then uses attributes of those businesses in search queries to receive related content. Additionally or alternatively, the system stores data representing online sources that historically provided content containing references to new businesses and then continues to access those sources for additional content. In embodiments, the system performs content analysis on structured and/or unstructured content. In some embodiments, analysis of content received from a particular online source includes a source-specific algorithm that takes a source-specific representation of the content as input and produces a result indicating the likelihood that the content includes a new business reference.
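The source-specific analysis mentioned at the end of the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the source names "news" and "licenses", the cue phrases, the scoring weights, and the functions themselves. The sketch only shows the general shape of dispatching content to a per-source scorer that returns a likelihood between 0 and 1.

```python
import re
from typing import Callable, Dict

# Hypothetical cue phrases; the patent does not specify any particular features.
OPENING_CUES = re.compile(
    r"\b(grand opening|now open|opening soon|coming soon)\b", re.IGNORECASE
)


def score_news_article(content: dict) -> float:
    """Scorer for a hypothetical unstructured news source."""
    text = content.get("body", "")
    hits = len(OPENING_CUES.findall(text))
    # Assumed weighting: each cue phrase adds 0.4, capped at 1.0.
    return min(1.0, 0.4 * hits)


def score_license_record(content: dict) -> float:
    """Scorer for a hypothetical structured business-license source."""
    # Assumed rule: a newly issued license strongly suggests a new business.
    return 0.9 if content.get("status") == "issued" else 0.1


# Registry mapping each source to its own scoring algorithm.
SCORERS: Dict[str, Callable[[dict], float]] = {
    "news": score_news_article,
    "licenses": score_license_record,
}


def new_business_likelihood(source: str, content: dict) -> float:
    """Return the likelihood that content from `source` references a new business."""
    scorer = SCORERS.get(source)
    if scorer is None:
        raise KeyError(f"no scorer registered for source {source!r}")
    return scorer(content)
```

Keeping one scorer per source lets structured and unstructured content use different input representations while producing a comparable result.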
Opening claim text (preview).
That which is claimed:
1. An apparatus comprising at least one processor and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to: receive content feeds comprising content data from an online source; extract, using a pattern recognition algorithm, references to providers that are included in the content data; generate one or more verification results by verifying each provider represented by a reference to a provider of the references to providers, wherein generating the one or more verification results comprises one or more of determining whether data representing the provider is stored in a repository or causing determining whether an indication of the provider is present in a different online source; determine, based at least in part on the content feeds, an initial confidence rating associated with the online source; adjust, by increasing or decreasing, the initial confidence rating associated with the online source based at least in part on the one or more verification results; and modify the pattern recognition algorithm based at least in part on the adjusted initial confidence rating.
2. The apparatus of claim 1, wherein a provider is one of a newly opened business or a business that is about to open.
3. The apparatus of claim 1, wherein the content data is received based at least in part on a source search index.
4. The apparatus of claim 1, further caused to: update a source search index based on source data quality signals calculated based upon verification of the content data.
5. The apparatus of claim 1, further caused to: prune a source search index based at least in part on source data quality signals calculated based upon verification of the content data.
6. The apparatus of claim 1, wherein the pattern recognition algorithm is based at least in part on statistical inference.
7. The apparatus of claim 1, further caused to: adjust the initial confidence rating associated with the online source based at least in part on a number of verified providers associated with the online source relative to other online sources within a predetermined period.
8. A computer-implemented method comprising: receiving content feeds comprising content data from an online source; extracting, using a pattern recognition algorithm, references to providers that are included in the content data; generating one or more verification results by verifying each provider represented by a reference to a provider of the references to providers, wherein generating the one or more verification results comprises one or more of determining whether data representing the provider is stored in a repository or causing determining whether an indication of the provider is present in a different online source; determining, based at least in part on the content feeds, an initial confidence rating associated with the online source; adjusting, by increasing or decreasing, the initial confidence rating associated with the online source based at least in part on the one or more verification results; and modifying the pattern recognition algorithm based at least in part on the adjusted initial confidence rating.
9. The method of claim 8, wherein a provider is one of a newly opened business or a business that is about to open.
10. The method of claim 8, wherein the content data is received based at least in part on a source search index.
11. The method of claim 8, further comprising: updating a source search index based on source data quality signals calculated based upon verification of the content data.
12. The method of claim 8, further comprising: pruning a source search index based at least in part on source data quality signals calculated based upon verification of the content data.
13. The method of claim 8, wherein the pattern recognition algorithm is based at least in part on statistical inference.
14. The method of claim 8, further comprising: adjusting the initial confidence rating associated with the online source based at least in part on a number of verified providers associated with the online source relative to other online sources within a predetermined period.
15. A computer program product comprising at least one non-transitory storage medium for storing computer program code, that, when executed by an apparatus, cause the apparatus to: receive content feeds comprising content data from an online source; extract, using a pattern recognition algorithm, references to providers that are included in the content data; generate one or more verification results by verifying each provider represented by a reference to a provider of the references to providers, wherein generating the one or more verification results comprises one or more of determining whether data representing the provider is stored in a repository or causing determining whether an indication of the provider is present in a different online source; determine, based at least in part on the content feeds, an initial confidence rating associated with the online source; adjust, by increasing or decreasing, the initial confidence rating associated with the online source based at least in part on the one or more verification results; and modify the pattern recognition algorithm based at least in part on the adjusted initial confidence rating.
16. The computer program product of claim 15, wherein a provider is one of a newly opened business or a business that is about to open.
17. The computer program product of claim 15, wherein the content data is received based at least in part on a source search index.
18. The computer program product of claim 15, wherein the apparatus is further caused to: update a source search index based on source data quality signals calculated based upon verification of the content data.
19. The computer program product of claim 15, wherein the apparatus is further caused to: prune a source search index based at least in part on source data quality signals calculated based upon verification of the content data.
20. The computer program product of claim 15, wherein the pattern recognition algorithm is based at least in part on statistical inference.
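The steps recited in the independent claims (extract references, verify each provider, adjust the source's confidence rating, modify the pattern recognition algorithm) can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the regular expressions, the 0.5 initial rating, the fixed 0.1 step, the 0.5 threshold, and the rule that a provider counts as verified when either check succeeds are all hypothetical choices the claims do not specify. A regular expression also stands in for whatever pattern recognition algorithm an embodiment would actually use.

```python
import re

# Hypothetical patterns: a capitalized name followed by an opening cue.
LENIENT = re.compile(r"([A-Z][\w']*(?: [A-Z][\w']*)*) (?:is now open|opens soon)")
STRICT = re.compile(r"([A-Z][\w']*(?: [A-Z][\w']*)*) is now open")


def extract_references(text, pattern):
    """Extract provider names from content data using the given pattern."""
    return pattern.findall(text)


def verify(provider, repository, other_source_text):
    """Assumed rule: verified if found in the repository or in a different source."""
    return provider in repository or provider in other_source_text


def adjust_confidence(initial, results, step=0.1):
    """Raise the rating per verified provider, lower it per unverified one."""
    rating = initial + step * sum(1 if ok else -1 for ok in results)
    return max(0.0, min(1.0, rating))  # clamp to [0, 1]


def select_pattern(confidence, threshold=0.5):
    """Modify the extraction step: low-confidence sources get the stricter pattern."""
    return LENIENT if confidence >= threshold else STRICT


# Example run with made-up data.
feed = "Luna Cafe is now open on Main Street. Bob's Burgers opens soon."
repository = {"Luna Cafe"}
other_source = "Local listings: Luna Cafe, Main Street."

initial = 0.5  # assumed initial confidence rating for this source
refs = extract_references(feed, LENIENT)
results = [verify(p, repository, other_source) for p in refs]
adjusted = adjust_confidence(initial, results)
pattern = select_pattern(adjusted)
```

In this run one provider is verified and one is not, so the rating stays at its initial value and the lenient pattern remains in use; a source whose references repeatedly fail verification would drop below the threshold and be switched to the stricter pattern.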
Market modelling; Market analysis; Collecting market data · CPC title
Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals · CPC title
Indexing; Web crawling techniques · CPC title