Why We Replaced Google Analytics with Self-Hosted Tracking
A short story about switching our own site off Google Analytics, what we built in its place, and the three numbers a growth-focused website actually needs. Plus: how to separate ChatGPT and Claude referrals from generic Google traffic.

Two weeks ago we switched lantana-labs.com off Google Analytics and onto a self-hosted, cookieless analytics stack running on our own server. Nothing dramatic — the site loads a few kilobytes lighter, the cookie banner is gone, our Privacy Policy is honest again, and we own every byte of data we collect.
The switch cost us about half a working day. The mental shift — away from the assumption that "analytics = Google" — took a bit longer. This is what we learned along the way.
GA4 is free, and that's the expensive part
When analytics is free, the product is the data — and the data is you plus every visitor who lands on your site. That has been true for two decades. What changed in 2026 is that:
- The average European / Kenyan / GDPR-adjacent visitor now actually reads the cookie banner and rejects tracking.
- The visibility you lose to ad-blockers and privacy browsers keeps creeping up: in our last quarterly audit, roughly 32% of our real traffic was invisible to GA4.
- Running analytics on somebody else's servers puts you one ToS update away from having to rewrite your privacy policy.
For a services business with modest traffic, you are paying in complexity and trust for a product where roughly two thirds of the features go barely used.
What we replaced it with
We chose a self-hosted, cookieless analytics tool — the details are deliberately boring and not the interesting part. The interesting part is what changed architecturally:
- Our site ships a tiny JavaScript tag (~2 KB) instead of the 90+ KB GA4 loader.
- No cookies, no device fingerprinting, no personal data leaving our servers.
- The dashboard runs on the same infrastructure as the rest of our stack, behind our own TLS, on a subdomain we own.
- Exports are a single `pg_dump` — we never lose the data to somebody else's deprecation announcement.
On the privacy side, the one line that used to read *"We use Google Analytics to…"* now reads *"We run a self-hosted, cookieless tool on our own infrastructure. It stores only anonymous, aggregated page views and referrers, and the data never leaves our servers."* Both sentences are true; only the second is defensible under Kenya's Data Protection Act without a cookie banner.
Three numbers that actually matter
One of the underrated benefits of rolling your own analytics is that you stop drowning in options. We now track three things with intent:
- Referrers, broken out by source. Not just "organic": we separate `chatgpt.com` (and the older `chat.openai.com`), `claude.ai`, `perplexity.ai`, `gemini.google.com` and `google.com/search` as distinct sources. Each of these sends a behaviourally different visitor, and we charge for the work differently downstream.
- Pages that convert, not pages that rank. GA4 makes it easy to stare at pageview counts. We rewired our funnel to read backwards from the contact form — which page did a visitor read immediately before they converted? The answer is never the page you expect.
- Session depth by source. A ten-second bounce from a ChatGPT referral is a completely different signal from a ten-second bounce from a cold-ad click. We log the difference explicitly.
That's it. Three numbers. A tight dashboard reads faster than a cluttered one, and decisions get made faster too.
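To make the second and third numbers concrete, here is a minimal sketch of how they could be computed from raw pageview rows. The event shape, the sample data, the `/contact/thanks` confirmation path and every function name are assumptions for illustration, not the schema of the tool we run.

```python
from collections import Counter, defaultdict
from statistics import median

# Hypothetical pageview rows: (session_id, seconds_since_start, path, source).
EVENTS = [
    ("s1", 0, "/blog/ga4", "chatgpt"),
    ("s1", 40, "/services", "chatgpt"),
    ("s1", 95, "/contact/thanks", "chatgpt"),
    ("s2", 0, "/", "google-search"),
    ("s3", 0, "/pricing", "claude"),
    ("s3", 30, "/contact/thanks", "claude"),
]

# Assumed URL of the page shown after the contact form is submitted.
CONVERSION_PATH = "/contact/thanks"


def group_by_session(events):
    """Return {session_id: [(ts, path, source), ...]} ordered by time."""
    sessions = defaultdict(list)
    for session_id, ts, path, source in events:
        sessions[session_id].append((ts, path, source))
    for views in sessions.values():
        views.sort(key=lambda view: view[0])
    return sessions


def last_page_before_conversion(events):
    """Count the page viewed immediately before the conversion page."""
    counts = Counter()
    for views in group_by_session(events).values():
        paths = [path for _, path, _ in views]
        if CONVERSION_PATH in paths:
            idx = paths.index(CONVERSION_PATH)
            if idx > 0:  # skip sessions that landed directly on the form
                counts[paths[idx - 1]] += 1
    return counts


def depth_by_source(events):
    """Median pages per session, keyed by the session's first-seen source."""
    depths = defaultdict(list)
    for views in group_by_session(events).values():
        first_source = views[0][2]
        depths[first_source].append(len(views))
    return {source: median(values) for source, values in depths.items()}
```

Reading the funnel backwards is just the `idx - 1` lookup; the median is used for depth because one very long session would skew a mean on low-traffic sources.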
How we track AI referrals separately
This is the question we get asked most often since we started writing about AI visibility. Short answer: you have to do it in three places, not one.
- Server-side referrer logging. Every pageview logs the `Referer` header. When it is a known AI-assistant domain, we tag the session `source: ai-assistant` with a sub-label per tool.
- UTM capture from ChatGPT's outbound links. ChatGPT tags some citations with `?utm_source=chatgpt.com` — we preserve that through our tracker and through to the contact form.
- A "How did you hear about us?" field on the form itself. Checkbox, not free text, with ChatGPT, Claude, Perplexity, Gemini and "another AI" as explicit options. Humans tell you things your analytics can't.
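The first two steps can be sketched as one small classifier. This is a rough illustration under stated assumptions: the label strings, the function names, and the choice to let `utm_source` take precedence over the `Referer` header are ours for this example, and the host table would need maintaining as the assistants change domains.

```python
from urllib.parse import parse_qs, urlsplit

# Hostname -> sub-label. The domains are the ones named in this post;
# the label strings are illustrative.
AI_HOSTS = {
    "chat.openai.com": "chatgpt",
    "chatgpt.com": "chatgpt",
    "claude.ai": "claude",
    "perplexity.ai": "perplexity",
    "gemini.google.com": "gemini",
}


def _host(url):
    """Lower-cased hostname of a URL with a leading 'www.' removed."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify(referer, landing_url):
    """Return (source, sub_label) for a single pageview."""
    # 1. A utm_source on the landing URL is checked first (an assumption of
    #    this sketch), because the Referer header is sometimes missing.
    query = parse_qs(urlsplit(landing_url).query)
    utm = (query.get("utm_source") or [""])[0].lower()
    if utm in AI_HOSTS:
        return ("ai-assistant", AI_HOSTS[utm])

    # 2. Otherwise fall back to the Referer header.
    host = _host(referer or "")
    if host in AI_HOSTS:
        return ("ai-assistant", AI_HOSTS[host])
    if host.startswith("google."):
        return ("search", "google")
    if host:
        return ("referral", host)
    return ("direct", "")
```

For example, `classify("", "https://example.com/blog?utm_source=chatgpt.com")` returns `("ai-assistant", "chatgpt")` even though no referrer was sent. The form field in the third step stays a separate, self-reported signal and is not modelled here.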
After 60 days of running all three, we can now separate AI-origin pipeline from generic Google pipeline in our monthly review. The close rates are measurably different. We'll publish the numbers once we have a full quarter.
When does this make sense for you?
Self-hosting is not universally correct. Our rule of thumb:
- Stick with GA4 if your site does under 1,000 visitors a month, you have no developer on retainer, and you accept the cookie banner. The setup and maintenance cost is not worth it at that scale.
- Move to a hosted privacy-first service (the obvious vendors exist; we won't name them here) if you want the privacy story without the infrastructure. Costs roughly USD 10–30 a month for sites our size.
- Self-host if you already run your own stack, care about the privacy-policy story, and want the data to genuinely belong to you. The marginal effort on top of an existing Docker deployment is small.
For our clients — typically founders of ambitious brands running their own product on their own infrastructure — option three is usually right. For clients running on Webflow or Framer without a backend team, option two almost always wins.
The one thing we underestimated
Email subscribers are up 14% since the cookie banner came down. That was not the point of the switch — but it is the kind of second-order effect that happens when a site feels trustworthy at the edges. Nothing on the page changed. The only thing different was the absence of a small, familiar annoyance.
People notice when you stop making them click "Reject all" to read your blog. Don't underestimate that.
Rewiring your website analytics? Privacy-first tracking and AI-referrer attribution are part of every digital-growth engagement we run. Start a brief.

