A Practical Lesson in SD-WAN Implementation

On the morning of March 12, 2024, I was reminded again that the difference between “resilient” and “robust” isn’t just semantics: it’s the margin between inconvenience and outright failure. This post breaks down another real outage we experienced and the mistakes I made along the way, in hopes that you won’t repeat them.

The Incident

At 6:24 AM, our network monitoring service alerted us that our SFTP server had lost internet access. Initially, it looked isolated: no other alerts, just this one box offline. I dove into OS and network troubleshooting on the server. But as minutes passed and additional reports rolled in, the scope started to expand. It wasn’t just the SFTP server; it was our entire primary internet connection at the data center, suffering from 20–80% packet loss. ...

May 29, 2025 · 5 min · Charlie Weeks