As a community, and as a society at large, we miss a great opportunity to learn from security incidents and to improve the security of our digital infrastructure. Given our reliance on those systems and the immense impact a disturbance may have, we need to do everything within our reach to eradicate vulnerabilities. The mechanisms around airplane safety may serve as an example.
Even though there were a record 32 million departures of large aircraft in 2013, there were “only” 137 fatalities due to accidents. In the ten-year period before, there were 0.6 fatal accidents per one million flights. One of the reasons for this surprisingly low number of fatal incidents is the ability of the aviation industry to learn from mistakes. We should start doing the same with incidents in the digital world.
Whenever a plane has crashed, the first priority is to provide assistance to any survivors. Right after that, all efforts are put into the recovery of the two black boxes – which are often orange to increase the chance of finding them. One of the boxes records all the instructions sent to the electronic systems of the aircraft. The other records conversations in the cockpit, radio communications between the cockpit crew and others, as well as ambient sounds. These recorders are designed to capture all important data and can withstand great external impact. Black boxes are key tools in determining what has caused the crash.
The purpose of these investigations is plain and simple: finding out what we can do to prevent a similar accident from happening. Did the pilot misread the altitude on the cockpit panel? Was there some mechanical failure that caused the suspension to break upon touchdown? Was the collision of two planes caused by confusing taxiing procedures? In other words: Is there some mechanical part that needs preventive replacement in planes of the same make and model? Do we need to redesign the construction of the landing gear? Or do we need to revise the procedures when taxiing from the runway to the gate? These recommendations are then systematically shared in the industry, making traveling by plane safer for all of us.
Surprisingly, none of that happens when there is an incident in our digital environment. Of course, incidents are being investigated. However, those investigations are commissioned or carried out by the affected organisation and aim to limit the damage to its own interests – especially the damage to the image of the organisation. The results are kept secret or shared haphazardly with the most demanding customers. At best, the results are used to improve the internal monitoring system. Of course, this is somewhat exaggerated. The fact remains: as a community, and as a society at large, we miss a great opportunity to learn and to improve the security of our digital infrastructure.
In the Netherlands, just like everywhere else, many agencies and institutions have a role to play in the aftermath of a plane crash. In the wake of the emergency services you’ll see the public prosecutor, salvage businesses, insurance companies and many other organisations. They clean the site, pursue those who are responsible for compensation, and prosecute the guilty. Perhaps the most important work is done by the Dutch Safety Board (DSB). It documents every known detail that led up to the crash and makes recommendations for preventing a similar accident.
Accidents in the transportation sector have been examined on a fairly structured basis since the early 20th century, albeit never by a fully independent body. This only changed at the end of the century, together with the most pivotal improvement: blame was explicitly excluded from the investigation process in order to maximise its effectiveness. The DSB was established in early 2005 (after the fireworks disaster in Enschede and a devastating fire in a bar in Volendam). It is allowed to initiate investigations into accidents in the various transport sectors, as well as in the fields of defence, industry and commerce, health, nature and environment, crisis and emergency.
Strangely enough, the DSB focuses on situations where people are dependent on others for their safety, but doesn’t investigate incidents in the digital domain, even when vulnerabilities in our industry can have an immense impact on the safety of large groups of people. And because of the architecture of these systems, people can’t defend themselves against that impact.
These systems are entrusted with the data about millions of people. This can be sensitive data where people rely on the protective measures of companies and governments. That applies not just to the data users more or less knowingly share; the same goes for the data that is generated by all the devices in our homes. Soon we will all have dozens of connected devices in our homes, continuously sharing our private lives with the outside world. Vulnerabilities in those systems make that sensitive data accessible to criminals. Vulnerable IT systems lead to vulnerable societies.
We have become highly dependent on the availability of our IT systems. As a result, incidents may affect the supply chain to supermarkets, the correct functioning of our cars, access to emergency services, financial transactions or the navigation of planes. If any of those systems is prevented from operating for more than a couple of days, or maybe even just a few hours, it could derail our society considerably. When these kinds of attacks cannot be contained, they will undermine the trust of citizens and may trigger societal unrest.
Given our reliance on those systems and the immense impact a disturbance of these systems may have, we need to do everything within our reach to eradicate vulnerabilities. We need to make sure that the software created is based on solid security practices and is audited frequently. We need to make sure that developers of closed-source software are liable for the products they put on the market. We need to make sure that the government doesn’t introduce any new vulnerabilities in the form of backdoors. We need to make sure that all vulnerabilities we come across are swiftly reported to those who are accountable and responsibly disclosed to the general public as soon as possible.
The easiest way to improve the security of key IT systems for our society is to start learning from our mistakes. It would be fairly stupid to handle an incident and then keep making the same mistake that led to it. The only way to prevent this from happening is by investigating security incidents transparently and thoroughly and creating recommendations for preventive measures. Therefore, I propose the establishment of a new organisation with the sole purpose of investigating security incidents in the digital realm in order to prevent similar incidents from recurring. In other words, a cyber-incarnation of the Dutch Safety Board.
There are a number of vitally important requirements for such an investigative body. First and foremost, it needs to be fully independent. If it’s not independent, it won’t be trusted. Without that trust, investigations will be obstructed and the reports not taken seriously. As the primary objective is to learn from mistakes, an environment of trust is necessary. Secondly, these investigations must not be part of an investigation into guilt or liability. That should be explicitly ruled out. Only when investigators have full access to all relevant data will they be able to make a factual, accurate and explanatory analysis. If the owner of a system who has caused or discovered an incident can’t be sure that the information they share will not be used against them, they will be reluctant to cooperate. Information handed over to the investigative body can’t be shared with insurance companies or law enforcement. If there is a need from law enforcement or others, those institutions should seek access to that information based on their own powers.
Because we need to learn from these incidents, this new organisation needs to be transparent and should extensively publish its findings. The reports should detail all the facts that led up to the incident, including the impact of the security breach. They need to present readily applicable and relevant recommendations to enable others to prevent similar incidents. They should also provide suggestions for detecting early warnings and for reducing the impact in case something goes wrong. Of course, this may mean that confidential business information becomes public. The risk of disclosure can be decreased by reporting just facts in a timely manner. In any case, the public interest should outweigh the interests of the individual company or government institution.
That does not mean the government has no role to play whatsoever. The opposite is true: the government needs to fund such an organisation as well as create the legal environment in which it can operate. When investigating security incidents in the digital realm, the investigators will need to have access to the systems in which a vulnerability has been abused. That requires investigatory powers to obtain access to documents and systems that are otherwise out of reach of an independent organisation that is not a law enforcement body. It needs to have the power to make copies of data carriers, audits and everything else that it deems relevant to the investigation. All of this requires a new bill that needs to be proposed and approved by parliament.
The organisation should consist of a team of experts. While many of the incidents will have a root cause in a non-technical area, technical expertise will be required for a thorough examination in many other investigations. The organisation needs a number of forensic and security experts for all kinds of digital systems. Without losing its independence, the organisation may exchange experience and knowledge with experts from CSIRTs.
Finally, there will be far more incidents than any organisation is able to investigate. With the right set of priorities, the organisation should be able to make a selection that allows for a thorough investigation of a limited number of incidents while maximising the effectiveness of the outcomes. Without a doubt, the organisation should investigate high-impact incidents in systems that are vitally important to our society. At the same time it should look into the low-hanging fruit: incidents with small impact, hitting large groups of people. The latter could be done, for example, based on the notifications of data breaches that are made to the Dutch Data Protection Authority (DPA). The organisation could investigate every 50th incident and publish a yearly summary of quick wins.
One unanswered question remains: What is the digital infrastructure equivalent of the black box?