US2015381699A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2015381699-A1 |
| Application number | US-201414531080-A |
| Country | US |
| Kind code | A1 |
| Filing date | Nov 3, 2014 |
| Priority date | Jun 26, 2014 |
| Publication date | Dec 31, 2015 |
| Grant date | — |
Official abstract text for this publication.
Implementations optimize a browser render process by identifying content neutral embedded items and rendering a web page without fetching the content neutral items. An example method includes identifying a URL pattern common to a plurality of URLs stored in fetch records and selecting a sample of URLs from the plurality. The method also includes, for each URL in the sample, determining whether the URL is optional by generating a first rendering result using content for the URL and a second rendering result without using the content for the URL and calculating a similarity score for the URL by comparing the first rendering result and the second rendering result, the URL being optional when the similarity score is greater than a similarity threshold. The method may also include storing the URL pattern in a data store of optional resource patterns when a majority of the URLs in the sample are optional.
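The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `render` callback, the representation of a rendering result as a flat list of tokens, the 0.9 threshold, the sample size of 5, and the `fake_render` stand-in are all hypothetical. The similarity measure used here is a longest-common-subsequence ratio over the token lists, chosen only because it is short to write down.

```python
import random
from typing import Callable, List, Optional, Sequence

# Hypothetical type: a renderer takes a URL and a flag saying whether the
# URL's content may be used, and returns a flat list of rendered tokens.
Renderer = Callable[[str, bool], List[str]]


def lcs_length(a: Sequence, b: Sequence) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(first: Sequence, second: Sequence) -> float:
    """Score in [0, 1]; 1.0 means the two rendering results are identical."""
    if not first and not second:
        return 1.0
    return 2 * lcs_length(first, second) / (len(first) + len(second))


def is_optional(url: str, render: Renderer, threshold: float = 0.9) -> bool:
    """A URL is optional when rendering with and without it looks alike."""
    with_content = render(url, True)
    without_content = render(url, False)
    return similarity(with_content, without_content) > threshold


def pattern_is_optional(urls: Sequence[str], render: Renderer,
                        sample_size: int = 5, threshold: float = 0.9,
                        rng: Optional[random.Random] = None) -> bool:
    """True when a majority of a random sample of the URLs are optional."""
    rng = rng or random.Random(0)
    sample = rng.sample(list(urls), min(sample_size, len(urls)))
    optional = sum(is_optional(u, render, threshold) for u in sample)
    return optional > len(sample) / 2


def fake_render(url: str, use_content: bool) -> List[str]:
    """Stand-in renderer: 'pixel' URLs never change the rendered page."""
    base = ["html", "body"] + ["p"] * 8
    if "pixel" in url or not use_content:
        return base
    return base + ["img"] * 10


pixel_urls = [f"http://ads.example/pixel.gif?id={i}" for i in range(20)]
image_urls = [f"http://cdn.example/photo.jpg?id={i}" for i in range(20)]
pixel_result = pattern_is_optional(pixel_urls, fake_render)
image_result = pattern_is_optional(image_urls, fake_render)
```

In this sketch a pattern that passes the majority test would then be written to the data store of optional resource patterns; the storage step is omitted because the abstract does not specify its form.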
Opening claim text (preview).
What is claimed is:
1. A computer system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, cause the system to: identify a URL pattern common to a plurality of URLs stored in fetch records; select a sample of URLs from the plurality of URLs; for each URL in the sample, determine whether the URL is optional by: generating a first rendering result for an embedder of the URL using content for the URL; generating a second rendering result for the embedder without using the content for the URL, and calculating a similarity score for the URL by comparing the first rendering result to the second rendering result; determining that the URL is optional when the similarity score is greater than a similarity threshold; and when a predetermined quantity of the URLs in the sample are optional, store the URL pattern in a data store of optional resource patterns.
2. The system of claim 1, the instructions further including instructions that, when executed by the at least one processor, cause the system to: receive a request for content of a requested embedded resource; determine whether the requested embedded resource matches the pattern in the data store; and when the requested embedded resource matches the pattern, return an indication that the requested embedded resource is optional.
3. The system of claim 2, wherein the indication is a URL not found error.
4. The system of claim 1, wherein the quantity of the URLs is equal to a quantity of URLs in the sample.
5. The system of claim 1, wherein identifying the URL pattern common to the plurality of URLs stored in the fetch records includes: generating a group URL for respective URLs stored in the fetch records by removing at least a portion of a query string from the URL; and clustering the URLs by group URL.
6. The system of claim 1, wherein identifying the URL pattern common to the plurality of URLs stored in the fetch records includes: generating a group URL for respective URLs stored in the fetch records by removing at least a portion of a query string from the URL; clustering the URLs by group URL; selecting the group URL of a cluster with a highest number of members as the URL pattern.
7. The system of claim 1, wherein calculating the similarity score includes: determining a longest common sequence for a DOM tree of the first rendering result and a DOM tree of the second rendering result; and using the longest common sequence to determine the similarity score.
8. A method comprising: identifying, using at least one processor, a URL pattern common to a plurality of URLs stored in fetch records; selecting, using the at least one processor, a sample of URLs from the plurality of URLs; for each URL in the sample, determining whether the URL is optional by: generating a first rendering result for an embedder of the URL using content for the URL and a second rendering result for the embedder without using the content for the URL, and calculating a similarity score for the URL by comparing the first rendering result and the second rendering result, the URL being optional when the similarity score is greater than a similarity threshold; and when a majority of the URLs in the sample are optional, storing the URL pattern in a data store of optional resource patterns.
9. The method of claim 8, further comprising: receiving a request for content of a requested embedded resource; determining whether the requested embedded resource matches the pattern in the data store; and when the requested embedded resource matches the pattern, returning an indication that the requested embedded resource is optional.
10. The method of claim 8, further comprising: determining that a quantity represented by the plurality of URLs exceeds a size threshold prior to determining whether URLs in the sample are optional.
11. The method of claim 8, further comprising: when all the URLs in the sample are optional, storing the URL pattern in the data store of optional resource patterns.
12. The method of claim 8, wherein identifying the URL pattern common to the plurality of URLs in the fetch records includes: for respective URLs in the fetch records, generating a group URL for the URL in the fetch record by removing at least a portion of a query string from the URL; and clustering by group URL.
13. The method of claim 8, wherein identifying the URL pattern common to the plurality of URLs in the fetch records includes: for respective URLs in the fetch records, generating a group URL for the URL in the fetch record by removing at least a portion of a query string from the URL; clustering by group URL; and selecting the group URL of a cluster when a quantity of members in the cluster meets a threshold.
14. The method of claim 8, wherein calculating the similarity score includes: determining a longest common sequence for a DOM tree of the first rendering result and a DOM tree of the second rendering result; and using the longest common sequence to determine the similarity score.
15. A method comprising: receiving a request to render a web page; identifying, using at least one processor, at least one embedded resource in the web page that requires a fetch; determining that the embedded resource is an optional resource; and rendering, using the at least one processor, the web page as if the embedded resource is unavailable without fetching content for the embedded resource.
16. The method of claim 15, wherein determining that the embedded resource is an optional resource includes determining that a URL for the embedded resource matches a pattern in a data store of optional resources.
17. The method of claim 15, further comprising: receiving a data store of patterns for optional resources from a service, wherein determining that the embedded resource is an optional resource includes determining that a URL for the embedded resource matches a pattern in the data store.
18. The method of claim 15, wherein the embedded resource is optional when a rendering result of an embedder web page rendered with the embedded resource is similar to a rendering result of the embedder web page rendered without the embedded resource.
19. The method of claim 15, wherein determining that the embedded resource is an optional resource includes: rewriting a URL for the embedded resource by removing a query string from the URL; and matching the rewritten URL to a pattern in a data store of optional resources, wherein the embedded resource is optional when the rewritten URL matches the pattern.
20. The method of claim 15, wherein determining that the embedded resource is an optional resource includes: rewriting a URL for the embedded resource by removing at least a portion of a query string from the URL; and matching the rewritten URL to a pattern in a data store of optional resources, wherein the embedded resource is optional when the rewritten URL matches the pattern.
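The group-URL clustering and render-time matching recited in the claims could look roughly like the sketch below. It is an illustration only: the function names, the `min_members` threshold, and the choice of an in-memory set as the "data store" are assumptions. The claims allow removing "at least a portion" of the query string; for simplicity this sketch removes the whole query string (and any fragment).

```python
from collections import Counter
from typing import Iterable, List, Set
from urllib.parse import urlsplit, urlunsplit


def group_url(url: str) -> str:
    """Rewrite a URL by dropping its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def candidate_patterns(fetched_urls: Iterable[str],
                       min_members: int = 2) -> List[str]:
    """Cluster fetched URLs by group URL and keep the large clusters.

    Returns group URLs ordered from the largest cluster to the smallest,
    keeping only clusters with at least min_members URLs.
    """
    counts = Counter(group_url(u) for u in fetched_urls)
    return [g for g, n in counts.most_common() if n >= min_members]


def is_optional_resource(url: str, optional_patterns: Set[str]) -> bool:
    """Render-time check: does the rewritten URL match a stored pattern?"""
    return group_url(url) in optional_patterns


fetch_records = [
    "http://a.example/t.gif?id=1",
    "http://a.example/t.gif?id=2",
    "http://a.example/t.gif?id=3",
    "http://a.example/app.js?v=3",
]
patterns = candidate_patterns(fetch_records)
```

A renderer using such a store would skip the fetch for any embedded resource where `is_optional_resource` returns true and render the page as if that resource were unavailable, which is the behavior claim 15 describes.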
based on web technology, e.g. hypertext transfer protocol [HTTP] · CPC title
for accessing web services by means of a binding identification of the management service or element · CPC title
Optimising the visualization of content, e.g. distillation of HTML documents · CPC title
Browsing optimisation, e.g. caching or content distillation · CPC title
Digital output to display device {; Cooperation and interconnection of the display device with other functional units} · CPC title