FAQ

Frequently Asked Questions

Client-side tools (like Google Analytics) rely on JavaScript executing in a user's browser. Most bots and AI scrapers don't execute JavaScript at all, which makes them invisible to legacy trackers. Because we analyze requests at the server level, there is no "opt-out." If a bot hits your site, we log it. Period. You get the 100% truth, not just the "Client-Side Fluff."
They focus primarily on blocking AI or on estimating fake data. But not all AI is bad; some AIs respect rules and can eventually drive visibility. Honeylog is a deterministic analytics tool first. We give you the visibility to understand your traffic so you can make informed decisions, rather than blindly blocking everything.
Absolutely not. Blindly blocking is a strategy from 2022. Not all AI is predatory. Some agents (like OpenAI's GPTBot or Perplexity) can drive future visibility if managed correctly. Honeylog is an intelligence platform first. We give you the data to distinguish between a competitor stealing your pricing and an LLM indexing you for an "Answer." You decide who stays and who goes.
Legacy tools use approximations and are built for SEO audits, not real-time bot monitoring. Honeylog is explicitly designed to identify modern LLM scrapers, AI agents, and competitor bots continuously and automatically.
Google Analytics, Chartbeat and Parse.ly rely on a JavaScript tag that fires inside the visitor's browser. Most AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider) fetch your HTML directly without running JavaScript, so the tag never fires. The visit happens, your server serves the content, your analytics knows nothing about it. Honeylog reads requests at the server level, so it catches every bot hit the moment it lands.
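To make the server-level idea concrete, here is a minimal sketch of reading one access-log line and spotting an AI crawler by its user agent. It assumes the common "combined" log format used by default Nginx and Apache setups; the function name, the token list and the sample line are illustrative assumptions, not Honeylog's actual implementation.

```python
import re

# Combined Log Format, as written by default Nginx/Apache configurations.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative subset of crawler tokens that appear in user-agent strings.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Bytespider", "CCBot")


def classify_line(line):
    """Return ip, path and matched crawler token for one log line, or None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # line is not in the assumed format
    agent = match.group("agent").lower()
    crawler = next((t for t in AI_CRAWLER_TOKENS if t.lower() in agent), None)
    return {"ip": match.group("ip"), "path": match.group("path"), "crawler": crawler}


# Made-up sample line; the IP comes from a documentation-only range.
sample = (
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /news/story HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"'
)
print(classify_line(sample))
```

No JavaScript runs anywhere in this path: the request is visible because the server wrote it to the log, which is why a browser tag never sees it.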
All the major ones, by name and user agent: GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, Google-Extended, Bytespider (ByteDance), CCBot (Common Crawl), Meta-ExternalAgent, Applebot-Extended, Amazonbot and others. We update the list when new crawlers appear and show growth trends per crawler.
Datadog and Splunk are observability platforms built for engineers debugging systems. Honeylog is built for audience and content teams. We focus only on classifying who is visiting (human, AI bot, search bot, spoofer) at the article and section level, with trends over time. You don't write queries. The dashboard is shaped for publishers from the first login.
User-agent strings are easy to fake. Anyone can claim to be GPTBot. Honeylog cross-references the user agent against the official IP ranges published by OpenAI, Anthropic, Perplexity, Google and others. If a request says it's GPTBot but comes from an IP outside OpenAI's published range, we flag it as spoofed. That same check exposes scrapers hiding behind familiar bot identities.
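The cross-check described above can be sketched with Python's standard ipaddress module. The CIDR blocks below are placeholders taken from documentation-only ranges, not any vendor's real addresses, and the function name and return labels are assumptions made for illustration.

```python
import ipaddress

# Placeholder CIDR blocks (RFC 5737 documentation ranges).
# Real values would come from each vendor's published IP list.
PUBLISHED_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "ClaudeBot": ["198.51.100.0/24"],
}


def check_claim(user_agent, remote_ip):
    """Return 'verified', 'spoofed' or 'unknown' for one request."""
    address = ipaddress.ip_address(remote_ip)
    for token, cidrs in PUBLISHED_RANGES.items():
        if token.lower() in user_agent.lower():
            networks = (ipaddress.ip_network(cidr) for cidr in cidrs)
            # The claim holds only if the source IP sits inside a published block.
            return "verified" if any(address in net for net in networks) else "spoofed"
    return "unknown"  # the user agent does not claim a known crawler identity


print(check_claim("Mozilla/5.0 (compatible; GPTBot/1.0)", "192.0.2.10"))
print(check_claim("Mozilla/5.0 (compatible; GPTBot/1.0)", "203.0.113.5"))
```

The first call claims GPTBot from inside the placeholder range, the second makes the same claim from outside it, so only the second would be flagged.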
Honeylog needs no JavaScript tag, no SDK and no code changes. It reads your existing server-side logs. Most setups take under 30 minutes whether you run Nginx, Apache, Cloudflare, Fastly, AWS CloudFront or a custom edge.
Two ways. For licensing: Honeylog gives you exportable reports showing how many requests each AI vendor made against your archive, on which articles, over what time period. That is defensible data for negotiations with OpenAI, Anthropic or Google. For editorial: Honeylog shows which sections and articles attract the most AI attention. That tells you where your content is becoming AI knowledge and where you hold leverage.
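As a rough illustration of the kind of count such a report rests on, the sketch below tallies requests per crawler and article within a date window. The hit records, field order and function name are hypothetical; a real report would be built from classified log data rather than a hand-written list.

```python
from collections import Counter

# Hypothetical pre-classified hits: (ISO date, crawler, article path).
hits = [
    ("2024-10-01", "GPTBot", "/news/a"),
    ("2024-10-02", "GPTBot", "/news/a"),
    ("2024-10-02", "ClaudeBot", "/news/b"),
    ("2024-11-15", "GPTBot", "/news/a"),
]


def requests_per_crawler_and_article(records, start, end):
    """Count hits per (crawler, path) with start <= date <= end."""
    counts = Counter()
    for day, crawler, path in records:
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if start <= day <= end:
            counts[(crawler, path)] += 1
    return counts


october = requests_per_crawler_and_article(hits, "2024-10-01", "2024-10-31")
for (crawler, path), total in sorted(october.items()):
    print(crawler, path, total)
```

Grouping by crawler and path gives the licensing view (who took how much), while summing the same counts by section gives the editorial view.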
Yes. Honeylog reads logs from all major CDNs (Cloudflare, Fastly, AWS CloudFront, Akamai) and direct server logs from Nginx, Apache and Caddy. CDN integration is often the best option, because you catch traffic before any caching layer hides it.
Profound and Similarweb track what AI answer engines say about your brand. They query ChatGPT, Claude, Perplexity and others, logging when your pages appear in answers. Honeylog tracks what AI crawlers do on your site: which bots hit which pages, how often, and whether they respect your robots.txt. The two tools complement each other. One watches the front-end answer. The other watches the back-end source.
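The robots.txt part of that check can be sketched with Python's standard urllib.robotparser. The robots.txt content and the observed hits below are invented for the example, and the function name is an assumption; this shows the general technique, not how Honeylog itself evaluates compliance.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt: GPTBot is kept out of /premium/, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /premium/

User-agent: *
Allow: /
"""


def find_violations(robots_txt, observed_hits):
    """Return the (crawler, path) hits that robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        (crawler, path)
        for crawler, path in observed_hits
        if not parser.can_fetch(crawler, path)
    ]


# Hypothetical crawler hits taken from server logs.
observed = [
    ("GPTBot", "/premium/report"),
    ("GPTBot", "/news/a"),
    ("ClaudeBot", "/premium/report"),
]
print(find_violations(ROBOTS_TXT, observed))
```

Only the GPTBot request under /premium/ is reported, since the other two hits are permitted by the rules as written.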