US9736212B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9736212-B2 |
| Application number | US-201414531080-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 3, 2014 |
| Priority date | Jun 26, 2014 |
| Publication date | Aug 15, 2017 |
| Grant date | Aug 15, 2017 |
Official abstract text for this publication.
Implementations optimize a browser render process by identifying content neutral embedded items and rendering a web page without fetching the content neutral items. An example method includes identifying a URL pattern common to a plurality of URLs stored in fetch records and selecting a sample of URLs from the plurality. The method also includes, for each URL in the sample, determining whether the URL is optional by generating a first rendering result using content for the URL and a second rendering result without using the content for the URL and calculating a similarity score for the URL by comparing the first rendering result and the second rendering result, the URL being optional when the similarity score is greater than a similarity threshold. The method may also include storing the URL pattern in a data store of optional resource patterns when a majority of the URLs in the sample are optional.
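The offline classification described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the default threshold of 0.9, the sample size of 5, and the `similarity` callback (standing in for rendering the embedder page with and without the resource) are all assumptions made for the example.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional
from urllib.parse import urlsplit, urlunsplit


def group_url(url: str) -> str:
    """Drop the query string and fragment so related URLs share one key."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def cluster_by_group_url(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Cluster fetch-record URLs by their group URL (the candidate pattern)."""
    clusters: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        clusters[group_url(url)].append(url)
    return dict(clusters)


def pattern_is_optional(
    urls: List[str],
    similarity: Callable[[str], float],  # hypothetical: renders twice, returns a score
    threshold: float = 0.9,              # assumed value, not from the patent
    sample_size: int = 5,                # assumed value, not from the patent
    rng: Optional[random.Random] = None,
) -> bool:
    """Return True when a majority of sampled URLs score above the threshold."""
    rng = rng or random.Random(0)
    sample = rng.sample(urls, min(sample_size, len(urls)))
    optional = sum(1 for url in sample if similarity(url) > threshold)
    return optional > len(sample) / 2
```

A pattern for which `pattern_is_optional` returns True would then be written to the data store of optional resource patterns. Passing the similarity computation in as a callback keeps the sketch independent of any particular rendering engine.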
Opening claim text (preview).
What is claimed is:

1. A computer system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, cause the system to: identify a URL pattern common to a plurality of URLs stored in fetch records; select a sample of URLs from the plurality of URLs; for each URL in the sample, determine whether the URL is optional by: generating a first rendering result for an embedder of the URL using content for the URL, generating a second rendering result for the embedder without using the content for the URL, calculating a similarity score for the URL by comparing the first rendering result to the second rendering result, and determining that the URL is optional when the similarity score is greater than a similarity threshold; and responsive to determining a predetermined quantity of the URLs in the sample are optional, store the URL pattern in a data store of optional resource patterns.
2. The system of claim 1, the instructions further including instructions that, when executed by the at least one processor, cause the system to: receive a request for content of a requested embedded resource; determine whether the requested embedded resource matches the pattern in the data store; and responsive to determining the requested embedded resource matches the pattern, return an indication that the requested embedded resource is optional.
3. The system of claim 2, wherein the indication is a URL not found error.
4. The system of claim 1, wherein the quantity of the URLs is equal to a quantity of URLs in the sample.
5. The system of claim 1, wherein identifying the URL pattern common to the plurality of URLs stored in the fetch records includes: generating a group URL for respective URLs stored in the fetch records by removing at least a portion of a query string from the URL; and clustering the URLs by group URL.
6. The system of claim 1, wherein identifying the URL pattern common to the plurality of URLs stored in the fetch records includes: generating a group URL for respective URLs stored in the fetch records by removing at least a portion of a query string from the URL; clustering the URLs by group URL; and selecting the group URL of a cluster with a highest number of members as the URL pattern.
7. The system of claim 1, wherein calculating the similarity score includes: determining a longest common sequence for a DOM tree of the first rendering result and a DOM tree of the second rendering result; and using the longest common sequence to determine the similarity score.
8. A method comprising: identifying, using at least one processor, a URL pattern common to a plurality of URLs stored in fetch records; selecting, using the at least one processor, a sample of URLs from the plurality of URLs; for each URL in the sample, determining whether the URL is optional by: generating a first rendering result for an embedder of the URL using content for the URL and a second rendering result for the embedder without using the content for the URL, and calculating a similarity score for the URL by comparing the first rendering result and the second rendering result, the URL being optional responsive to determining the similarity score is greater than a similarity threshold; and responsive to determining a majority of the URLs in the sample are optional, storing the URL pattern in a data store of optional resource patterns.
9. The method of claim 8, further comprising: receiving a request for content of a requested embedded resource; determining whether the requested embedded resource matches the pattern in the data store; and responsive to determining the requested embedded resource matches the pattern, returning an indication that the requested embedded resource is optional.
10. The method of claim 8, further comprising: determining that a quantity represented by the plurality of URLs exceeds a size threshold prior to determining whether URLs in the sample are optional.
11. The method of claim 8, further comprising: responsive to determining all the URLs in the sample are optional, storing the URL pattern in the data store of optional resource patterns.
12. The method of claim 8, wherein identifying the URL pattern common to the plurality of URLs in the fetch records includes: for respective URLs in the fetch records, generating a group URL for the URL in the fetch record by removing at least a portion of a query string from the URL; and clustering by group URL.
13. The method of claim 8, wherein identifying the URL pattern common to the plurality of URLs in the fetch records includes: for respective URLs in the fetch records, generating a group URL for the URL in the fetch record by removing at least a portion of a query string from the URL; clustering by group URL; and selecting the group URL of a cluster responsive to determining a quantity of members in the cluster meets a threshold.
14. The method of claim 8, wherein calculating the similarity score includes: determining a longest common sequence for a DOM tree of the first rendering result and a DOM tree of the second rendering result; and using the longest common sequence to determine the similarity score.
15. A method comprising: receiving a request to render a web page; identifying, using at least one processor, at least one embedded resource in the web page that requires a fetch; determining that the embedded resource is an optional resource by determining that a URL for the embedded resource matches a pattern in a data store identifying optional resources; and rendering, using the at least one processor, the web page as if the embedded resource is unavailable without fetching content for the embedded resource.
16. The method of claim 15, further comprising: receiving the data store of patterns identifying optional resources from a service.
17. The method of claim 15, wherein the embedded resource is optional responsive to determining a rendering result of an embedder web page rendered with the embedded resource is similar to a rendering result of the embedder web page rendered without the embedded resource.
18. The method of claim 15, wherein determining that the embedded resource is an optional resource includes: rewriting the URL for the embedded resource by removing a query string from the URL; and matching the rewritten URL to the pattern in the data store identifying optional resources, wherein the embedded resource is optional when the rewritten URL matches the pattern.
19. The method of claim 15, wherein determining that the embedded resource is an optional resource includes: rewriting the URL for the embedded resource by removing at least a portion of a query string from the URL; and matching the rewritten URL to the pattern in the data store identifying optional resources, wherein the embedded resource is optional when the rewritten URL matches the pattern.
20. The method of claim 15, the data store identifying optional resources being populated based on comparison of rendering results.
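Two of the claimed steps lend themselves to a short illustration: the similarity score of claims 7 and 14, and the render-time check of claims 15 and 18. The sketch below makes several assumptions that the claims do not state: each DOM tree is flattened to a list of tag names, the "longest common sequence" is interpreted as a longest common subsequence, the score is normalized by the longer sequence, and a pattern matches when it equals the query-stripped URL exactly. All function names are hypothetical.

```python
from typing import List, Sequence, Set
from urllib.parse import urlsplit, urlunsplit


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(with_resource: Sequence[str], without_resource: Sequence[str]) -> float:
    """Score in [0, 1]; 1.0 means the two flattened DOM sequences are identical."""
    longest = max(len(with_resource), len(without_resource))
    if longest == 0:
        return 1.0  # two empty renderings are trivially the same
    return lcs_length(with_resource, without_resource) / longest


def strip_query(url: str) -> str:
    """Rewrite a URL by removing its query string (and fragment)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def resources_to_fetch(embedded_urls: List[str], optional_patterns: Set[str]) -> List[str]:
    """Keep only embedded resources whose rewritten URL is not a known optional pattern."""
    return [u for u in embedded_urls if strip_query(u) not in optional_patterns]
```

In this sketch a renderer would call `resources_to_fetch` before issuing requests and treat every filtered-out resource as unavailable. A real pattern store might use prefix or wildcard matching instead of exact equality.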
based on web technology, e.g. hypertext transfer protocol [HTTP] · CPC title
for accessing web services by means of a binding identification of the management service or element · CPC title
Physics · mapped topic
Browsing optimisation, e.g. caching or content distillation · CPC title
Digital output to display device {; Cooperation and interconnection of the display device with other functional units} · CPC title