Architecture Overview
A modern cloaking system is a multi-layer visitor classification engine that sits between the ad click and the destination page. It processes incoming requests in under 50 milliseconds and makes a binary decision: serve the safe page or the money page.
The architecture is almost always implemented at the edge (reverse proxy, CDN worker, or a dedicated server) so latency stays minimal. Layers 1 and 2 are synchronous — they happen before the HTML response is sent. Layer 3 is asynchronous and client-side — it runs after the initial page load.
Layer 1 — IP Intelligence
The first layer checks the visitor's IP address against several databases simultaneously:
ASN (Autonomous System Number) Matching
Every publicly routed IP address is announced by an autonomous system: a network run by a single organization and identified by an Autonomous System Number (ASN). Ad platforms such as Meta operate their own ASNs (AS32934, AS63293). When an IP from one of these ASNs hits the cloaking server, it's immediately flagged as a potential bot. This check is fast and reliable for crawlers that run from the platform's own network.
Datacenter IP Detection
Cloud providers (AWS, Google Cloud, Azure, DigitalOcean, Hetzner, OVH) maintain well-documented IP ranges. Real users almost never access websites from datacenter IPs — these are used by servers, bots, and VPN users. Any IP classified as datacenter is scored heavily toward "bot."
Known Proxy / VPN Detection
Commercial VPN and proxy services (NordVPN, ExpressVPN, residential proxy networks) maintain IP pools that are tracked by IP intelligence services. A visitor coming from a known proxy exit node gets a high bot score, since ad platform reviewers use these to mask their identity.
IP Reputation Scoring
Beyond the above, quality IP intelligence services like MaxMind, IPinfo, or proprietary databases maintained by cloaking providers assign each IP a fraud/bot probability score based on historical behavior patterns. This provides a granular signal rather than a binary flag.
Layer 2 — User-Agent & Header Analysis
After the IP check, the system examines the full HTTP request headers:
User-Agent String Parsing
The User-Agent header identifies the browser and operating system. Known crawler UAs (facebookexternalhit/1.1, Googlebot/2.1, TikTokBot, LinkedInBot) are directly flagged. Headless Chrome (HeadlessChrome substring) and Puppeteer/Playwright-generated UAs are also detectable.
Header Consistency Analysis
Real browsers send a consistent set of headers with their requests. For example, Chrome on Windows 11 sends Sec-Ch-Ua, Sec-Ch-Ua-Platform, and Sec-Fetch-Dest headers with specific expected values on HTTPS requests. Bots that spoof user-agent strings often fail to spoof all accompanying headers consistently. Detecting these inconsistencies is highly reliable.
Accept-Language and Accept-Encoding
The Accept-Language header indicates the user's preferred language, which should match their geo. An IP geolocated to France with Accept-Language: zh-CN is anomalous. Similarly, bots often send minimal Accept-Encoding values compared to real browsers.
Layer 3 — JavaScript Behavioral Fingerprinting
This is the most sophisticated — and most important — layer. It runs client-side via a JavaScript snippet loaded on the page, and collects signals that only real human interaction can generate.
Mouse Movement Entropy
Real users move their mouse in curved, irregular paths influenced by their hand's physical inertia. Bots move in straight lines or jump directly between coordinates. The fingerprinting script records mouse trajectory data and computes the entropy (randomness) of the movement. Low entropy = likely bot.
Touch Event Analysis (Mobile)
On mobile, real users swipe and tap with natural variation in velocity, pressure, and coordinates. Device farm operators tapping rapidly through pages produce rhythmically consistent touch events. The script measures touch event timing variance and flags overly regular patterns.
Scroll Behavior
Real users scroll to read content — they pause at sections, scroll at human speed, and show acceleration/deceleration patterns. Bots either don't scroll at all, or scroll at a constant uniform velocity. Scroll depth combined with time-on-page is a strong combined signal.
Browser API Consistency
The script probes browser APIs to detect headless environments:
- navigator.webdriver: set to true in Selenium/WebDriver environments
- WebGL renderer: headless Chrome returns SwiftShader instead of a real GPU
- Audio context fingerprint: differs between headless and real browsers
- Canvas fingerprint: rendering artifacts differ between virtualized and native environments
- screen.width / window.outerWidth consistency: bots often fail to properly simulate viewport dimensions
Session Velocity
If the same IP (or IP subnet) generates 50 page visits in 10 minutes with near-identical behavioral patterns, this is a review sweep, not organic traffic. Session velocity analysis identifies coordinated bot traffic even when individual sessions look clean.
The Scoring Engine
Each signal from all three layers produces a sub-score. The scoring engine aggregates these using a weighted model:
| Signal | Weight | Why It Matters |
|---|---|---|
| ASN = known platform | Very High | Direct indicator of ad platform crawler |
| Datacenter IP | High | Real users rarely use datacenter IPs |
| Known proxy/VPN | High | Reviewer masking technique |
| navigator.webdriver = true | Very High | Direct headless browser signal |
| WebGL = SwiftShader | High | Headless Chrome rendering engine |
| Mouse entropy low | Medium | Automated interaction pattern |
| No scroll / instant scroll | Medium | Non-human reading behavior |
| Session velocity spike | High | Coordinated review sweep |
| Header inconsistency | Medium-High | Spoofed UA without matching headers |
The final score is a number between 0 (definitely human) and 100 (definitely bot). The classification threshold is typically configurable. A higher, more conservative threshold (e.g., 60) misclassifies fewer humans as bots (lower false positive rate) but lets more reviewers through (higher false negative rate). A lower, stricter threshold (e.g., 40) does the opposite.
Page Routing Decision
Once the classification is made, the routing is straightforward:
- Bot classified → Serve the safe page (compliant, policy-friendly landing page) with HTTP 200. No redirect, no suspicious behavior in logs.
- Human classified → Either serve the money page directly, or issue a transparent redirect (302 or JavaScript redirect) to the real offer URL.
The best practice is to serve the safe page in-place (not redirect to a different URL), because a redirect adds a visible URL change that a human reviewer or automated monitoring tool could flag as suspicious.
Performance Considerations
Cloaking has to be fast. A cloaking layer that adds 2 seconds to your page load will destroy your Quality Score on Google Ads and hurt conversions everywhere. Performance requirements:
- Layers 1 and 2 (server-side): < 10ms total, ideally < 5ms
- Layer 3 (client-side JS): non-blocking, loaded asynchronously after content
- Safe page: full Core Web Vitals compliance (LCP < 2.5s, INP < 200ms; INP replaced FID as a Core Web Vital in 2024)
- Money page redirect: if used, 302 rather than 301, because browsers cache 301 responses
How Platforms Try to Evade Cloakers
Ad platforms are constantly evolving their detection evasion techniques. Here's what they currently deploy against cloaking:
- Residential proxy networks — crawlers routed through real residential IPs (Comcast, Verizon, BT, Deutsche Telekom) that look identical to organic users at the IP level
- Real device farms (primarily TikTok) — actual iPhones/Android devices on mobile carrier connections
- Delayed crawls — re-crawling pages 24–96 hours after approval, hoping the advertiser has lowered their guard
- Slow loading simulation — some crawlers simulate a slow connection to trigger different page states
- JavaScript execution — modern review bots execute JavaScript fully, so JS-only cloaking without behavioral checks can be bypassed
This is why the behavioral fingerprinting layer (Layer 3) has become the most important layer. A residential IP running a full JavaScript-capable browser can bypass Layers 1 and 2, so only behavioral signals can catch it.
Multi-Layer Cloaking, Built for 2026
CloakTrack combines IP intelligence, header analysis, and JavaScript behavioral fingerprinting into a single platform with real-time dashboards and analytics.