Methods for improving web scanner accuracy and devices thereof
US-11895138-B1 · Feb 6, 2024 · US
US12450305B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12450305-B2 |
| Application number | US-202318224875-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 21, 2023 |
| Priority date | Jan 31, 2022 |
| Publication date | Oct 21, 2025 |
| Grant date | Oct 21, 2025 |
Official abstract text for this publication.
A system comprising one or more processors and one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform operations: classifying one or more webpages of a website into one or more classifications using interaction data, a content score, and a link equity score, each for the one or more webpages and removing, based on the one or more classifications, the one or more webpages from the website and from a sitemap of the website. Other embodiments are disclosed herein.
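The sitemap-removal step named in the abstract can be illustrated with a minimal sketch. It assumes the sitemap follows the standard sitemaps.org XML schema and that an earlier classification step has already produced the set of URLs to drop; the function name `prune_sitemap` and the example URLs are hypothetical and do not come from the patent.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

# Namespace used by the sitemaps.org XML schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def prune_sitemap(sitemap_xml: str, urls_to_remove: Iterable[str]) -> str:
    """Return the sitemap XML with every <url> entry in urls_to_remove deleted.

    Hypothetical helper: the patent only states that pages are removed from
    the sitemap, not how the removal is implemented.
    """
    remove = set(urls_to_remove)
    # Serialize with the sitemap namespace as the default (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    # Copy the list first so removal does not disturb iteration.
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and (loc.text or "").strip() in remove:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/a</loc></url>"
        "<url><loc>https://example.com/b</loc></url>"
        "</urlset>"
    )
    print(prune_sitemap(example, ["https://example.com/b"]))
```

Removing the page from the website itself (for example, unpublishing it or returning a 404/410 status) is a separate step that depends on the site's content system and is not sketched here.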
Opening claim text (preview).
What is claimed is:
1. A system comprising: one or more processors; and one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform operations comprising: classifying one or more webpages of a website into one or more classifications using interaction data, a content score, and a link equity score, each for the one or more webpages; and removing, based on the one or more classifications, the one or more webpages from the website and from a sitemap of the website.
2. The system of claim 1, wherein the one or more classifications comprise a ranking range of one or more predicted ranking ranges for the one or more webpages of the website.
3. The system of claim 2, wherein the one or more predicted ranking ranges comprise: a first ranking range comprising predicted rankings from a first predicted search result position to a fifth predicted search result position; a second ranking range comprising predicted rankings after the fifth predicted search result position; and a third ranking range comprising no predicted rankings.
4. The system of claim 3, wherein removing the one or more webpages from the sitemap of the website comprises: removing the one or more webpages from the sitemap when the one or more webpages are classified in the third ranking range.
5. The system of claim 1, wherein classifying the one or more webpages comprises: feeding the interaction data, the content score, and the link equity score into a logistic regression model; and classifying the one or more webpages of the website into the one or more classifications using the logistic regression model.
6. The system of claim 5, wherein the logistic regression model comprises a multinomial logistic regression model.
7. The system of claim 1, wherein classifying the one or more webpages comprises: classifying the one or more webpages of the website into the one or more classifications using the interaction data, the content score, the link equity score, an age of the one or more webpages, and pricing information for an item displayed on the one or more webpages.
8. The system of claim 1, wherein the computing instructions, when executed on the one or more processors, cause the one or more processors to further perform an operation comprising: determining the content score using a text analysis and an image analysis performed on the one or more webpages of the website.
9. The system of claim 8, wherein the text analysis comprises a natural language processing algorithm.
10. The system of claim 8, wherein the image analysis comprises: checking a resolution of one or more images displayed on the one or more webpages against one or more resolution thresholds; and checking a quantity of the one or more images displayed on the one or more webpages against one or more image quantity thresholds.
11. A method being implemented via execution of computing instructions configured to run at one or more processors and configured to be stored at non-transitory computer-readable media, the method comprising: classifying one or more webpages of a website into one or more classifications using interaction data, a content score, and a link equity score, each for the one or more webpages; and removing, based on the one or more classifications, the one or more webpages from the website and from a sitemap of the website.
12. The method of claim 11, wherein the one or more classifications comprise a ranking range of one or more predicted ranking ranges for the one or more webpages of the website.
13. The method of claim 12, wherein the one or more predicted ranking ranges comprise: a first ranking range comprising predicted rankings from a first predicted search result position to a fifth predicted search result position; a second ranking range comprising predicted rankings after the fifth predicted search result position; and a third ranking range comprising no predicted rankings.
14. The method of claim 13, wherein removing the one or more webpages from the sitemap of the website comprises: removing the one or more webpages from the sitemap when the one or more webpages are classified in the third ranking range.
15. The method of claim 11, wherein classifying the one or more webpages comprises: feeding the interaction data, the content score, and the link equity score into a logistic regression model; and classifying the one or more webpages of the website into the one or more classifications using the logistic regression model.
16. The method of claim 15, wherein the logistic regression model comprises a multinomial logistic regression model.
17. The method of claim 11, wherein classifying the one or more webpages comprises: classifying the one or more webpages of the website into the one or more classifications using the interaction data, the content score, the link equity score, an age of the one or more webpages, and pricing information for an item displayed on the one or more webpages.
18. The method of claim 11 further comprising: determining the content score using a text analysis and an image analysis performed on the one or more webpages of the website; and wherein at least one of: (a) the text analysis comprises a natural language processing algorithm; or (b) the image analysis comprises: checking a resolution of one or more images displayed on the one or more webpages against one or more resolution thresholds; and checking a quantity of the one or more images displayed on the one or more webpages against one or more image quantity thresholds.
19. A non-transitory computer-readable medium storing instructions, wherein the instructions, upon execution by a processor, cause the processor to perform operations comprising: classifying one or more webpages of a website into one or more classifications using interaction data, a content score, and a link equity score, each for the one or more webpages; and removing, based on the one or more classifications, the one or more webpages from the website and from a sitemap of the website.
20. The non-transitory computer-readable medium of claim 19, wherein the operations further comprise: determining the content score using a text analysis and an image analysis performed on the one or more webpages of the website.
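The classification recited in claims 2 through 6 (a multinomial logistic regression that assigns each page to one of three predicted ranking ranges) can be sketched as follows. This is only an illustration under stated assumptions: the weights are made up rather than trained, the three features are assumed to be pre-normalized to the range 0 to 1, and the names `RANGES`, `WEIGHTS`, `classify_page`, and `pages_to_remove` are hypothetical and not taken from the patent.

```python
import math
from typing import Dict, List, Sequence, Tuple

# The three predicted ranking ranges described in claim 3.
RANGES = ["top_5", "after_5", "unranked"]

# ASSUMPTION: illustrative, untrained weights. One row per class, in the
# order [bias, interaction, content_score, link_equity].
WEIGHTS: Dict[str, List[float]] = {
    "top_5": [-2.0, 2.0, 2.0, 2.0],
    "after_5": [0.0, 0.5, 0.5, 0.5],
    "unranked": [2.0, -2.0, -2.0, -2.0],
}


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_page(
    interaction: float, content_score: float, link_equity: float
) -> Tuple[str, Dict[str, float]]:
    """Return the most probable ranking range and the per-range probabilities."""
    features = [1.0, interaction, content_score, link_equity]  # 1.0 is the bias term
    logits = [sum(w * x for w, x in zip(WEIGHTS[r], features)) for r in RANGES]
    probs = softmax(logits)
    best = max(range(len(RANGES)), key=probs.__getitem__)
    return RANGES[best], dict(zip(RANGES, probs))


def pages_to_remove(pages: Dict[str, Tuple[float, float, float]]) -> List[str]:
    """Select pages classified in the third range (no predicted ranking), as in claim 4."""
    return [url for url, feats in pages.items() if classify_page(*feats)[0] == "unranked"]


if __name__ == "__main__":
    pages = {
        "https://example.com/strong": (1.0, 1.0, 1.0),
        "https://example.com/middling": (0.3, 0.3, 0.3),
        "https://example.com/weak": (0.0, 0.0, 0.0),
    }
    for url, feats in pages.items():
        print(url, classify_page(*feats)[0])
    print("remove:", pages_to_remove(pages))
```

In practice the weights would be fitted to labeled ranking data, and claims 7 and 17 add page age and pricing information as further features; both are omitted here to keep the sketch short.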
Clustering; Classification · CPC title
Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking · CPC title