2016-10-24

Friday was supposed to be a quiet day, but it turned out to be THE real Black Friday for me. From an internet perspective, this was the worst outage I have ever seen at this scale.

The chart below shows the DNS resolution time and availability of twitter.com from around the world. There were three clear waves of outages:

7:10 EST to 9:10 EST

11:52 EST to 16:33 EST

19:13 EST to 20:38 EST



The DNS failures were the result of slow response times from the Dyn servers (>4500 ms).
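To make the measurement concrete, here is a minimal sketch of timing a single DNS lookup and flagging it against the 4500 ms figure quoted above. It is not how Catchpoint measures resolution time; it goes through the operating system resolver, so local caching can hide a slow authoritative server. The names `time_dns_lookup` and `classify` are made up for this illustration.

```python
import socket
import time

# Threshold taken from the response times quoted in the post.
SLOW_THRESHOLD_MS = 4500


def time_dns_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (elapsed_ms, ok) for one lookup through the given resolver.

    The default resolver is the system one; a fake can be passed in for testing.
    """
    start = time.monotonic()
    try:
        resolver(hostname, None)
        ok = True
    except socket.gaierror:
        # Resolution failed (for example, the name servers did not answer).
        ok = False
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return elapsed_ms, ok


def classify(elapsed_ms, ok, threshold_ms=SLOW_THRESHOLD_MS):
    """Label a lookup as 'failed', 'slow', or 'ok'."""
    if not ok:
        return "failed"
    return "slow" if elapsed_ms > threshold_ms else "ok"
```

Calling `classify(*time_dns_lookup("twitter.com"))` from several locations at regular intervals would give a rough version of the chart above.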

We, at Catchpoint, were impacted in 3 ways:

Our domain Catchpoint.com was not reachable for a solid 30 minutes until we introduced our secondary DNS solution. We also introduced a backup domain into the mix that was never on Dyn, so our customers could log in to our portal and keep an eye on their online services. All of these were on standby, so we did not have to sign new contracts.

Our nodes could not talk reliably to our globally distributed command and control systems until we switched to IP-only mode, bypassing DNS (a nice feature we built a year ago).

Many of our own third-party services stopped working: support solutions, CRM, door badging system, SSO, two-factor authentication services, CDN, and the list goes on and on.
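The IP-only mode mentioned in the second point can be sketched roughly as follows. This is only an illustration of the idea, not our actual implementation: the hostname, the address (from a documentation range), and the `resolve_with_fallback` helper are all hypothetical.

```python
import socket

# Hypothetical static map of critical hostnames to known-good addresses.
# Both the hostname and the address are placeholders.
FALLBACK_IPS = {"control.example.com": "192.0.2.10"}


def resolve_with_fallback(hostname, fallback=FALLBACK_IPS,
                          resolver=socket.gethostbyname):
    """Try DNS first; if it fails, return a pre-configured IP when one exists.

    Returns (address, source) where source is 'dns' or 'static'.
    """
    try:
        return resolver(hostname), "dns"
    except OSError:
        # socket.gaierror is a subclass of OSError.
        if hostname in fallback:
            return fallback[hostname], "static"
        raise
```

The trade-off is that the static list has to be kept current, and anything that depends on the hostname itself (TLS certificates, virtual hosts) still needs the name passed along separately.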

This blog post is not about finger pointing. The folks at Dyn had a horrible day putting up with an attack of this scale. They did an amazing job, from notifications to extinguishing the fire. I really hope they got some well-deserved rest this weekend.

A lot has been written about who / what / why… so I will not get into any of these.

What we learned that day, and the solutions I think we must have:

DNS is the weakest link of our Internet economy. We have written about 20-30 articles about DNS.

A single DNS provider is not an option anymore for anyone. After this Black DNS Friday, no company, small or large, can rely on a single DNS provider.

DNS vendors must publish knowledge base articles about how to introduce secondary DNS providers, and they must be easy to find and follow!

DNS vendors must make the setup of auto-transfer easier to find. I should not have to open a ticket in the middle of a crisis to find out the IP of the xtransfer name servers!

DNS vendors must not make themselves authoritative in the additional record sections for another day, or sometimes two days! Sorry Dyn, but you are the only one that has this nasty feature.



Introducing another DNS vendor will not achieve 100% of the result until you go into your Dyn config and add that other solution to the mix:



The community must work together to come up with commercial or open source solutions to make DNS configurations compatible between vendors. This is a must-have!

We must have a way to push registrar configurations faster. We need an emergency reload button at the registrar level. Waiting two days is not going to work after this catastrophic event.
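The single-provider lesson above is easy to check for your own domain once you have its list of name servers. Here is a rough sketch: it groups name server hostnames by their last two labels and counts the groups. That grouping is a crude assumption (it breaks on suffixes like `co.uk`, and a vendor may use several domains), and the hostnames below are placeholders, not real providers.

```python
def provider_of(ns_hostname):
    """Crude provider key: the last two labels of the name server hostname."""
    labels = ns_hostname.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def provider_count(ns_hostnames):
    """Count how many distinct providers (by the crude key) serve a zone."""
    return len({provider_of(ns) for ns in ns_hostnames})


# Placeholder NS sets for illustration only.
single = ["ns1.provider-a.example.", "ns2.provider-a.example."]
mixed = single + ["ns1.provider-b.example."]

if provider_count(single) < 2:
    print("Only one DNS provider: a single point of failure")
```

With these placeholder lists, `provider_count(single)` is 1 and `provider_count(mixed)` is 2. Remember that the NS set at the registrar and the NS set inside the zone both have to list the second vendor.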

I am not an expert on security, but if we are going to have billions of connected devices, we need to change a few things:

In our day to day we interact with many things (cars, cell phones, planes, hair dryers) that have some sort of certification! I urge whoever is responsible, from the UN to national bodies, to consider:

A ban on any internet-connected device that does not force a change of the default credentials upon first start. No more Admin/Admin for ANYTHING: cameras, fridges, access points, routers...

A ban on any device that cannot fulfill this basic mandate. The same way we do not allow a Samsung phone on a plane, we should not allow certain devices on the internet. Consumers should also put pressure by not buying “things” that are not safe, no matter how cheap!

A must-have feature on every home and SMB router, access point, and so on, to detect abnormal traffic or activity and turn it off or slow it down. Sending 10,000,000,000 DNS requests in 1 minute is not normal! Learn from Microsoft and what they did from Windows XP onward to limit botnets.

Local ISPs must have capabilities to detect and stop rogue traffic. I have always been baffled by this: some can block objectionable content, but not Gbps of UDP traffic?
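The kind of detection I have in mind for routers and ISPs does not need to be sophisticated. As a sketch only, here is a sliding-window counter that flags a device once it sends more than a set number of DNS queries within a time window. The class name and the limits are invented for illustration; a real router would keep one such counter per device and decide what "slow it down" means.

```python
from collections import deque


class QueryRateMonitor:
    """Flag a source that exceeds max_queries within window_seconds."""

    def __init__(self, max_queries, window_seconds):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._times = deque()  # timestamps of recent queries, oldest first

    def record(self, now):
        """Record one query at time `now` (in seconds).

        Returns True when the number of queries in the window is abnormal.
        """
        self._times.append(now)
        cutoff = now - self.window_seconds
        # Drop timestamps that have fallen out of the window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) > self.max_queries
```

For example, with `QueryRateMonitor(max_queries=3, window_seconds=60)`, the fourth query inside the same minute makes `record` return `True`, which is the point at which the router could throttle or block the device.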

Cybersecurity, or the lack of it, is no longer just a nuisance. I do not want to sound alarmist or dramatic, but I hope this is going to be a huge wake-up call for the community, governments, major hardware manufacturers (Cisco, Juniper...), and everyone else. What happened Friday is a Pearl Harbor type of event. I hope no one got physically hurt, but we rely on the Internet for everything, and as citizens we must protect it.

Dear Members of the House of Representatives, instead of wasting time investigating silly stuff, I urge you to work on this instead! And I am not talking about hearings for the next 5 years, but real actions!

Thank you to Dyn for the prompt responses to our support ticket, to Verisign, to our customers, who were very patient and understanding, to our entire support organization, and to some special friends at various organizations who lent a hand by providing some amazing advice.

Mehdi

The post The Internet’s Pearl Harbor – 10/21/2016 appeared first on Catchpoint's Blog.
