A little Node.js service that watches for LaLiga's Cloudflare blocking and flips DNS off the proxy so my sites stay up during the matches.

If you live in Spain you already know the drill: a football match starts and half the internet falls over. LaLiga got the courts to let them block Cloudflare IP ranges to fight piracy, the ISPs comply with a sledgehammer, and every legit site sitting behind Cloudflare goes down as collateral. Mine included. So I built a little service that notices the block and gets out of the way.

The problem

Two of my sites live behind Cloudflare: tardigram.com (yes, the mbin instance I ported to k8s) and elpapeo.com. Normally that’s lovely, the orange cloud does its thing and I sleep well.

Then it’s matchday. The Spanish ISPs (DIGI, Movistar, Orange, Vodafone, Masmovil…) start null-routing big chunks of Cloudflare’s anycast ranges, and from inside Spain my sites are simply gone. Not slow. Gone. SYNs into the void. And there’s nothing wrong with my servers at all: the edge I pay to protect me is the thing being blocked.

But Nico, can’t you just complain to your ISP? Sure, let me know how that goes.

So the only lever I actually control is DNS. If Cloudflare is the thing being blocked, then during the match I just… stop going through Cloudflare.

The idea

The trick is dead simple once you say it out loud:

  • Normal time: the record is a proxied CNAME to my Argo tunnel. Client → Cloudflare → tunnel → backend. The works.
  • Matchtime (CF blocked): flip the record to a plain A record, proxy off, pointing straight at a VPS sitting outside the blast radius. Client → VPS → origin → backend. No Cloudflare between the visitor and the VPS, so nothing for the ISPs to block.

And when the match is over, flip it back.

That’s it. The whole service is just “watch for the block, toggle a DNS record, toggle it back, and don’t make a mess in between”. I called it cloudflare-switchover and it’s a small Node.js thing running in my cluster.
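To make the toggle concrete, here is a minimal sketch of the two record bodies and the API call that would apply them. It is not the service's actual code: `TUNNEL_TARGET`, `buildRecordBody`, `applyRecord`, the 60-second TTL and the choice of `PUT` are all assumptions for illustration, and looking up the record ID is left out. It relies on the global `fetch` that ships with Node 18+.

```javascript
// Sketch only: names and values below are illustrative assumptions.
const TUNNEL_TARGET = 'your-tunnel-id.cfargotunnel.com'; // placeholder tunnel hostname

// Build the DNS record body for either mode.
function buildRecordBody(record, mode) {
  if (mode === 'fallback') {
    return {
      type: record.fallback_type,       // "A"
      name: record.record_name,
      content: record.fallback_content, // the VPS IP
      proxied: false,                   // proxy off, traffic skips Cloudflare
      ttl: 60,                          // assumed: short TTL so the flip spreads quickly
    };
  }
  return {
    type: 'CNAME',
    name: record.record_name,
    content: TUNNEL_TARGET,
    proxied: true,                      // orange cloud back on
    ttl: 1,                             // 1 means "automatic" in the Cloudflare API
  };
}

// Overwrite one DNS record. recordId must be looked up beforehand (not shown).
async function applyRecord(token, record, recordId, mode) {
  const url = `https://api.cloudflare.com/client/v4/zones/${record.zone_id}/dns_records/${recordId}`;
  const res = await fetch(url, {
    method: 'PUT',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(buildRecordBody(record, mode)),
  });
  if (!res.ok) throw new Error(`Cloudflare API returned ${res.status}`);
  return res.json();
}
```

Keeping the body builder separate from the network call means the "what would I send" part can be checked without touching a real zone.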

How it actually works

I didn’t want a dumb on/off switch, because a false positive means I’m voluntarily yanking my sites off Cloudflare for no reason. So it’s a little state machine with four states: normal, watching, fallback, restoring.

normal ──(footy detected)──→ watching ──(trace fails)──→ fallback
  ↑                             │                           │
  │                        (footy ends)                (footy ends)
  └────────────────────────── normal ←── restoring ─────────┘
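The diagram can be read as a small transition function. What follows is a sketch under assumptions, not the service's real code: `nextState` and the signal names `footballOn`, `traceOk` and `edgeReachable` are made up here, and the three signals are treated as plain booleans computed elsewhere.

```javascript
// Hypothetical transition function for the four states in the diagram.
// footballOn:    is a match (and therefore the block) currently reported?
// traceOk:       did the Cloudflare trace request succeed?
// edgeReachable: does the Cloudflare-only hostname answer again?
function nextState(state, { footballOn, traceOk, edgeReachable }) {
  switch (state) {
    case 'normal':
      return footballOn ? 'watching' : 'normal';
    case 'watching':
      if (!footballOn) return 'normal';         // match over, nothing was switched
      return traceOk ? 'watching' : 'fallback'; // only flip once the trace fails
    case 'fallback':
      return footballOn ? 'fallback' : 'restoring';
    case 'restoring':
      return edgeReachable ? 'normal' : 'restoring'; // wait for the edge to answer
    default:
      throw new Error(`unknown state: ${state}`);
  }
}
```

One choice in this sketch: once in `restoring` it ignores the football signal and simply waits for the edge, so a new match would only be picked up after passing through `normal` again. A real implementation may well handle that case differently.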

The signals it leans on:

  1. Is there football right now? I piggyback on hayahora.futbol, which tracks the block per ISP. I do a majority vote across the ISPs, with a staleness check so I don’t act on an out-of-date feed.
  2. Is Cloudflare actually blocked? Footy being on isn’t enough, so I confirm by hitting /cdn-cgi/trace on each domain. If the trace times out, CF really is unreachable from where I’m standing, and only then do I flip to fallback.
  3. Is it safe to come back? When the match ends I don’t restore blindly. I check origin.{domain} (which always goes through Cloudflare) and only flip back once the edge answers again.
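The first signal, a majority vote with a staleness check, could look roughly like this. The shape of the per-ISP reports (`{ isp, blocked, updatedAt }`) is invented for the example and is not the real hayahora.futbol format; `footballDetected`, `maxAgeMs` and `threshold` are hypothetical names, and the 15-minute freshness window is an arbitrary pick.

```javascript
// Sketch: majority vote across ISPs, ignoring stale reports.
// reports: [{ isp: string, blocked: boolean, updatedAt: epoch millis }] (assumed shape)
function footballDetected(reports, options = {}) {
  const {
    now = Date.now(),
    maxAgeMs = 15 * 60 * 1000, // assumed freshness window
    threshold = 0.5,           // simple majority
  } = options;

  // Drop anything older than the freshness window.
  const fresh = reports.filter((r) => now - r.updatedAt <= maxAgeMs);
  if (fresh.length === 0) return false; // no fresh data: don't act

  const blocked = fresh.filter((r) => r.blocked).length;
  return blocked / fresh.length > threshold; // strictly more than half
}
```

Returning `false` when every report is stale errs on the side of staying behind Cloudflare, which matches the "a false positive is expensive" reasoning above.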

Every transition pings me on Telegram and Slack with the trace result and a health check, so I can watch the whole thing happen from my phone while pretending to enjoy the actual football.

On the receiving end, the VPS is just nginx reverse-proxying to the origin over an frp tunnel, with Anubis in front doing a proof-of-work challenge to keep the scrapers out. HTTPS end to end: Let’s Encrypt on the VPS, and no Cloudflare Origin certs needed, because nginx talks to the CF edge, which already has a perfectly good public cert.

Configuring it

It’s all env vars. The only two that matter are your Cloudflare token and the records you want it to babysit:

CLOUDFLARE_API_TOKEN=your_cf_api_token

# DOMAIN_RECORDS is JSON. Export it as a shell var (compose chokes on quotes in .env)
export DOMAIN_RECORDS='[
  {
    "zone_id": "your_zone_id",
    "record_name": "tardigram.com",
    "fallback_type": "A",
    "fallback_content": "5.161.x.x",
    "health_check_string": "Tardigram"
  }
]'

fallback_content is the VPS IP, and health_check_string is just a bit of text the service expects to find in the page after a switch, so a 200 that’s secretly an error page doesn’t fool it. The rest have sane defaults: poll every 5 minutes, football threshold at a simple majority, optional Telegram/Slack webhooks.
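Since everything hangs off one JSON env var, it pays to fail loudly at startup if it is malformed. A rough sketch of how that parsing and the health-string check might be written: `parseDomainRecords` and `looksHealthy` are hypothetical helpers, and treating all five fields as required is an assumption made for this example.

```javascript
// Sketch: validate DOMAIN_RECORDS at startup. All names here are illustrative.
const REQUIRED_FIELDS = [
  'zone_id',
  'record_name',
  'fallback_type',
  'fallback_content',
  'health_check_string',
];

function parseDomainRecords(raw) {
  let records;
  try {
    records = JSON.parse(raw);
  } catch (err) {
    throw new Error(`DOMAIN_RECORDS is not valid JSON: ${err.message}`);
  }
  if (!Array.isArray(records) || records.length === 0) {
    throw new Error('DOMAIN_RECORDS must be a non-empty JSON array');
  }
  records.forEach((rec, i) => {
    if (rec === null || typeof rec !== 'object') {
      throw new Error(`DOMAIN_RECORDS[${i}] must be an object`);
    }
    for (const key of REQUIRED_FIELDS) {
      if (typeof rec[key] !== 'string' || rec[key] === '') {
        throw new Error(`DOMAIN_RECORDS[${i}] is missing "${key}"`);
      }
    }
  });
  return records;
}

// A 200 alone is not enough: the body must contain the expected marker text.
function looksHealthy(status, body, record) {
  return status === 200 && body.includes(record.health_check_string);
}
```

Typical use would be `const records = parseDomainRecords(process.env.DOMAIN_RECORDS)` once at boot, so a quoting mistake shows up immediately instead of mid-match.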

One prerequisite on the Cloudflare side: for each domain create an origin.{domain} record pointing at the tunnel and never let the switcher touch it. The VPS uses it as its upstream, and the service uses it to check whether CF is reachable again before restoring. That subdomain is your back door to your own site.

Deploy-wise it’s a tiny container, one replica (you really don’t want two things racing to edit the same DNS record), running in k8s via ArgoCD. There’s a /status endpoint and a little dashboard with Force ON / Force OFF buttons for when you want to test the whole dance without waiting for Real Madrid to kick off.

The bit that bit me

Here’s the gotcha that cost me an evening, and it’s a fun one.

After every switch the service does a health check to confirm the site is actually up. Original code, totally reasonable looking:

const res = await fetch(`https://${domain}`);

And in production it always timed out. Ten seconds, every time, content=FAIL. But the site was up. I could curl it fine.

So what gives? DNS caching, that’s what. The health check runs milliseconds after I change the record. The service’s resolver still has the old Cloudflare IP cached, so fetch('https://tardigram.com') happily connects to… the exact Cloudflare edge that’s blocked right now. It hangs until the timeout and never once touches the new fallback path. The health check was faithfully testing the thing I’d just spent all this effort routing around.

The fix is to stop trusting DNS for that one check and pin the connection straight to the fallback IP, while keeping the SNI and Host header as the real domain, basically curl --resolve but in code:

https.request({
  host: fallbackIp,          // connect here
  servername: domain,        // ...but SNI says the real domain
  headers: { Host: domain },
  family: 4,                 // and skip IPv6, the cluster has no v6 route
});

Now it tests the path the switch actually creates instead of the one it’s trying to escape. Obvious in hindsight. Aren’t they all.
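For anyone wanting to try the pinning trick, here is a fuller sketch that wraps it in a promise with a timeout and the content check. It is an illustration, not the code from the repo: `pinnedOptions`, `pinnedHealthCheck` and the 10-second default are assumptions. Note that the `timeout` option in Node is an idle-socket timeout, not a hard deadline on the whole request.

```javascript
const https = require('node:https');

// Build request options that connect to the fallback IP but present the real domain.
function pinnedOptions(domain, fallbackIp, timeoutMs = 10000) {
  return {
    host: fallbackIp,          // TCP connection goes here, no DNS lookup involved
    servername: domain,        // SNI (and certificate check) use the real domain
    path: '/',
    method: 'GET',
    headers: { Host: domain }, // virtual host routing on the VPS
    family: 4,                 // IPv4 only
    timeout: timeoutMs,        // idle timeout on the socket
  };
}

// Resolves true only for a 200 response whose body contains the expected text.
function pinnedHealthCheck(domain, fallbackIp, expected, timeoutMs = 10000) {
  return new Promise((resolve) => {
    const req = https.request(pinnedOptions(domain, fallbackIp, timeoutMs), (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(res.statusCode === 200 && body.includes(expected)));
      res.on('error', () => resolve(false));
    });
    // 'timeout' only signals idleness; the request has to be torn down by hand.
    req.on('timeout', () => req.destroy(new Error('health check timed out')));
    req.on('error', () => resolve(false));
    req.end();
  });
}
```

With the config from earlier this would be called as `await pinnedHealthCheck(record.record_name, record.fallback_content, record.health_check_string)`. Because it resolves to a boolean rather than rejecting, the caller can treat a timeout, a TLS error and a wrong page the same way.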

Wrapping up

It’s still rough around the edges and very much scratch-my-own-itch software, but it’s been quietly keeping my sites reachable through the matches, which is exactly what I wanted. If LaLiga is going to break the internet every weekend, the least I can do is duck.

If you want to poke at it, it’s cloudflare-switchover. Happy dodging.
