Bot detection with page fingerprints and behavioral analysis

US12554845B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12554845-B2
Application number  US-202418679101-A
Country             US
Kind code           B2
Filing date         May 30, 2024
Priority date       May 30, 2024
Publication date    Feb 17, 2026
Grant date          Feb 17, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Improved bot detection systems and methods are disclosed. A page fingerprinting algorithm can be used to categorize web pages. The categorization of web pages enables improved insights in profiling the way a human interacts with a website as opposed to a bot. Learning the patterns of humans and bots for a given web page category (or navigation across categories), a heuristic ruleset and/or machine learning system can differentiate between humans and bots. In this way, a human website visitor can be distinguished from a bot in an improved manner. The teachings hereof include systems and methods for deriving page fingerprints from markup language files, for categorizing pages based on their structure and associated fingerprints, as well as heuristics and machine learning techniques to characterize website visitor behavior and detect bots, based on such web page categorization and fingerprinting.
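
To make the fingerprinting idea concrete, here is a minimal sketch in Python. It assumes the fingerprint is an exact SHA-256 hash over the sequence of tag names, with text and attribute values filtered out, so that pages built from the same template land in the same category. The names StructureCollector, page_fingerprint and categorize are hypothetical, and the patent does not state that this particular hash or grouping scheme is used.

```python
import hashlib
from html.parser import HTMLParser


class StructureCollector(HTMLParser):
    """Records only tag structure; text and attribute values are ignored."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Keep the tag name only, so content differences do not affect the hash.
        self.tokens.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.tokens.append(f"</{tag}>")


def page_fingerprint(markup: str) -> str:
    """Hash the structural skeleton of a markup file (assumed scheme)."""
    collector = StructureCollector()
    collector.feed(markup)
    collector.close()
    skeleton = "".join(collector.tokens)
    return hashlib.sha256(skeleton.encode("utf-8")).hexdigest()


def categorize(pages: dict) -> dict:
    """Assign each URL a category label; equal fingerprints share a label."""
    labels = {}       # fingerprint -> category label
    assignment = {}   # URL -> category label
    for url, markup in pages.items():
        fp = page_fingerprint(markup)
        label = labels.setdefault(fp, f"category-{len(labels)}")
        assignment[url] = label
    return assignment


product_a = "<html><body><div><h1>Shoes</h1><p>$40</p></div></body></html>"
product_b = "<html><body><div><h1>Hats</h1><p>$15</p></div></body></html>"
login = "<html><body><form><input name='u'><input name='p'></form></body></html>"
categories = categorize({"/p/1": product_a, "/p/2": product_b, "/login": login})
```

In this sketch the two product pages share a category because only their text differs, while the login page gets its own. An exact hash is brittle: a single extra element changes the category, so a real system would likely need a more tolerant similarity measure.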

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for detecting bots, comprising: A. asynchronous to a request from a client, automatically assigning each of a plurality of URLs to one of a plurality of web page categories, where the automatic assignment for a given one or the plurality of URLs is based on a fingerprint computed from a markup language file associated with the given one of the plurality of URLs; and, B. responsive to intercepting the request from the client: determining a request URL, to which the request is directed; determining the assigned web page category for the request URL by comparison to the results of the automatic assignment process set forth in A; sending, with other data related to the request, the assigned web page category to a bot detection service; receiving from the bot detection service an indication as to whether the client is a bot; and, based on the indication, forwarding the request from the client for handling.

2. The method of claim 1, where said automatic assignment comprises, for a given URL of the plurality of URLs: processing a markup language file associated with the given URL to create an associated DOM tree; computing a fingerprint for the given URL from at least a portion of the associated DOM tree; and, assigning the given URL to one of the plurality of web page categories based on the fingerprint for the given URL.

3. The method of claim 2, wherein computing the fingerprint for the given URL from at least a portion of the associated DOM tree comprises at least one of: (i) filtering content from the associated DOM tree such that the fingerprint is computed from structure of the associated DOM tree, and, (ii) applying a hash function to code forming at least a portion of the associated DOM tree.

4. The method of claim 1, wherein determining the assigned web page category for the request URL comprises at least one of: (i) processing a markup language file associated with the request URL to create an associated DOM tree, and computing a fingerprint for the request URL from at least a portion of the associated DOM tree, and, (ii) looking up the request URL in a table to find the assigned web page category.

5. The method of claim 1, wherein the other data related to the request comprises any of: (i) sensor data reflecting one or more interactions at the client, and (ii) a cookie value received from the client.

6. The method of claim 1, further comprising, with the bot detection service, one or more of the following: (i) applying a set of rules that identify differences between human and bot behavior when visiting web pages in the assigned web page category, and, (ii) applying, in an inferencing step, a machine learning model trained with data reflecting interactions on web pages in web page categories, so as to differentiate a human from a bot.

7. The method of claim 1, further comprising, with the bot detection service, one or more of the following: (i) applying a set of rules that identify differences between humans and bot behavior in navigating across web pages different web page categories, and, (ii) applying, in an inferencing step, a machine learning model trained with data reflecting navigation on web pages across web page categories, so as to differentiate a human from a bot.

8. The method of claim 1, further comprising, with the bot detection service, applying, in an inferencing step, a machine learning model trained to differentiate a human from a bot.

9. The method of claim 1, wherein the handling comprises any of alerting or blocking the request.

10. A system having one or more computers, each with circuitry forming at least one processor and memory storing computer program instructions for execution on the at least one processor to operate the respective computer, the one or more computers collectively operable to: A. asynchronous to a request from a client, automatically assign each of a plurality of URLs to one of a plurality of web page categories, where the automatic assignment for a given one or the plurality of URLs is based on a fingerprint computed from a markup language file associated with the given one of the plurality of URLs; and, B. responsive to intercepting the request from the client: determine a request URL, to which the request is directed; determine the assigned web page category for the request URL by comparison to the results of the automatic assignment process set forth in A; send, with other data related to the request, the assigned web page category to a bot detection service; receive from the bot detection service an indication as to whether the client is a bot; and, based on the indication, forward the request from the client for handling.

11. The system of claim 10, where said automatic assignment comprises, for a given URL of the plurality of URLs: processing a markup language file associated with the given URL to create an associated DOM tree; computing a fingerprint for the given URL from at least a portion of the associated DOM tree; and, assigning the given URL to one of the plurality of web page categories based on the fingerprint for the given URL.

12. The system of claim 11, wherein computing the fingerprint for the given URL from at least a portion of the associated DOM tree comprises at least one of: (i) filtering content from the associated DOM tree such that the fingerprint is computed from structure of the associated DOM tree, and, (ii) applying a hash function to code forming at least a portion of the associated DOM tree.

13. The system of claim 10, wherein determining the assigned web page category for the request URL comprises at least one of: (i) processing a markup language file associated with the request URL to create an associated DOM tree, and computing a fingerprint for the request URL from at least a portion of the associated DOM tree, and, (ii) looking up the request URL in a table to find the assigned web page category.

14. The system of claim 10, wherein the other data related to the request comprises any of: (i) sensor data reflecting one or more interactions at the client, and (ii) a cookie value received from the client.

15. The system of claim 10, the bot detection service operable to perform one or more of the following: (i) applying a set of rules that identify differences between human and bot behavior when visiting web pages in the assigned web page category, and, (ii) applying, in an inferencing step, a machine learning model trained with data reflecting interactions on web pages in web page categories, so as to differentiate a human from a bot.

16. The system of claim 10, further comprising, with the bot detection service, one or more of the following: (i) applying a set of rules that identify differences between humans and bot behavior in navigating across web pages different web page categories, and, (ii) applying, in an inferencing step, a machine learning model trained with data reflecting navigation on web pages across web page categories, so as to differentiate a human from a bot.

17. The system of claim 10, the bot detection service operable to apply, in an inferencing step, a machine learning model trained to differentiate a human from a bot.

18. The system of claim 10, wherein the handling comprises any of alerting or blocking the request.

19. A non-transitory computer readable medium storing computer program instructions for execution on one or more hardware processors of one or m
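
The request-time steps of claim 1 (part B), using the table lookup option from claim 4, can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: CATEGORY_TABLE, RequestContext, bot_detection_service and handle_request are hypothetical names, the category labels are invented, and the single rule shown (a login-form request with no recorded interaction events is treated as a bot) is a made-up example of a per-category heuristic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RequestContext:
    """Data available when a request is intercepted (hypothetical shape)."""
    url: str
    cookie: Optional[str] = None
    sensor_events: List[str] = field(default_factory=list)


# Hypothetical table produced ahead of time by the asynchronous
# categorization step (part A of claim 1).
CATEGORY_TABLE = {
    "/login": "login-form",
    "/p/1": "product-page",
}


def bot_detection_service(category: str, ctx: RequestContext) -> bool:
    """Stand-in for the bot detection service; True means 'looks like a bot'.

    Illustrative rule only: humans filling in a login form normally produce
    some interaction events, so an empty event list is treated as suspicious.
    """
    if category == "login-form" and not ctx.sensor_events:
        return True
    return False


def handle_request(ctx: RequestContext) -> str:
    """Look up the category, ask the service, and choose how to handle."""
    category = CATEGORY_TABLE.get(ctx.url, "uncategorized")
    is_bot = bot_detection_service(category, ctx)
    return "block" if is_bot else "forward"


silent_login = handle_request(RequestContext(url="/login"))
active_login = handle_request(
    RequestContext(url="/login", sensor_events=["mousemove", "keydown"])
)
```

Here silent_login is handled as "block" and active_login as "forward". A fuller version might also weigh the cookie value, or the sequence of categories a visitor has navigated, as claims 5 and 7 suggest.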

Assignees

Inventors

Classifications

  • Test or assess a computer or a system · CPC title

  • G06F21/554 · Primary

    involving event detection and direct action · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12554845B2 cover?
Improved bot detection systems and methods are disclosed. A page fingerprinting algorithm can be used to categorize web pages. The categorization of web pages enables improved insights in profiling the way a human interacts with a website as opposed to a bot. Learning the patterns of humans and bots for a given web page category (or navigation across categories), a heuristic ruleset and/or mach…
Who is the assignee on this patent?
Akamai Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/554. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 17, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).