2014-05-01

Heartbleed is not a country and western song, but many wish it were. It’s a programming glitch with the potential to cause disastrous and widespread compromises on seemingly secure data.

By some estimates, the flaw in the Heartbeat code allowed hackers to collect personal data, including passwords, undetected, for as long as two years. Exactly how much data was breached, and what the total damage will be, is still being assessed, but the media coverage suggests it is substantial. Moreover, one has to wonder whether this glitch is connected to the recent data compromises at Target and other organizations. Fortunately, the fix is out, but it may take a while for everyone to apply it to their systems.

What makes this “bug,” for lack of a better term, so dangerous is that it is not some super-complex, self-morphing, Mensa-level mega virus. In fact, it is not really a virus or bug at all. It simply exploits an overlooked programming mistake in the Heartbeat extension of certain versions of OpenSSL. (“Heartbleed” is the name given to the flaw itself, a play on “Heartbeat.”)

In this case the vulnerability allows anyone on the Internet to read the memory of systems running vulnerable versions of the OpenSSL software. The fix, according to Dmitry Bestuzhev, head of Kaspersky Lab’s research center for Latin America, is quite simple and is included in OpenSSL version 1.0.1g.

“However,” says Bestuzhev, “if some enterprises cannot upgrade to the patched version of OpenSSL, system administrators can recompile libraries and binaries of compromised versions of OpenSSL using the key -DOPENSSL_NO_HEARTBEATS. Either method will fix the problem for now.”

Extrapolating this to future intelligent objects, which will use the same Internet protocols and platforms as today’s hardware, means the same vulnerabilities will exist for them as well. Because the IoT will have orders of magnitude more objects, with vastly varying levels of intelligence, coding mistakes that expose memory locations or permit alteration of read/write memory are particularly dangerous.

Unfortunately, because no programming is perfect, such flaws are always likely to be there, in one form or another, according to Bestuzhev. And there is really no specific neutralization strategy for them, such as the ones that can be developed against viruses, for example. Generally, these types of coding anomalies show up in the field after the code, program, or OS has gone public, because this is not the type of error that raises a flag during compiling or testing.

There are methods to mitigate these during development, however. Asked how such coding errors can be minimized, especially as countless new objects inherit some level of intelligence, Bestuzhev replied, “The key is in pentesting and auditing. However, these processes take time and human resources, and they always have a cost associated with them, and that plays into the bottom line…It’s not enough to count on good will. Resources need to be allocated to fund the necessary staff to get the job done. When this is realized, and implemented, we may have fewer problems like Heartbleed. However, it is important to note ‘fewer’ doesn’t mean ‘not at all.’”

Drilling Down a Bit – OpenSSL 101

SSL stands for Secure Sockets Layer, the protocol layer that handles encryption and authentication for connections between clients and servers. The “open” part refers to the freely available, unrestricted-access source code of the OpenSSL implementation, which the majority of servers around the world use. Open code is very common in the UNIX world, where almost all code and projects are freely available for anyone to see — programmers and hackers alike. Therefore, it is easy for someone with an understanding of these systems to see exactly what the code does.

OpenSSL is an enormously popular means of keeping personal information private on the Internet. Millions of Web sites use OpenSSL to protect your username, password, credit card information, and other private data. However, tests have shown that the flaw lets one access this data completely anonymously, with no sign it was ever accessed. Somewhere along the line that should have been a wakeup call, but obviously, it just slipped by, under the radar, until it was exploited.

The name Heartbleed is a play on Heartbeat, the OpenSSL extension it exploits. Heartbeat is the keep-alive process that two computers use, once a secure connection is established, to make sure they are still online with each other. The process for verification goes something like this:

Every time data is exchanged between the host and client, a heartbeat routine is set up. Prior to the transmission, a verification check is made to make sure the server is listening and the client is valid. If the verification is returned, the data is sent. This process repeats until all of the transaction data is sent, the transaction is complete, and the connection is terminated. However, if during a transaction one of the computers gets shut down, blows up, an earthquake happens, or some other crisis interrupts the transmission, the heartbeat goes out of sync, and the other computer terminates the transaction. This prevents open connections from staying online, vulnerable, after a transaction fails. In reality, the process is quite simple and has been accepted practice for years, carried out millions of times a day across millions of computers worldwide.

Code Talk

The actual C code that starts the heartbeat, pulls the data, transmits it and verifies the transaction is not necessary for this quick discussion, but if you are interested in it, see: http://blog.existentialize.com/diagnosis-of-the-openssl-heartbleed-bug.html/

As it turns out, the code at the root of all of this is short and simple. It is: memcpy(bp, pl, payload); The line of code is legitimate and, if left unmolested, does exactly what it is supposed to do, and accurately. The problem is that the code can be “fooled” by altering the payload value, because the memory surrounding pl can hold sensitive data left over from earlier activity.

Defining the variables

memcpy is an instruction that tells the computer to copy data from one memory location to another.

bp is the location on the host computer to which the client data is going to be copied.

pl is the data from the client computer that is sent as part of the heartbeat transaction.

payload is a number, supplied by the client, that claims the size of pl.

The general assumption is that unused computer memory is empty. In reality, it generally isn’t. Once the computer is up and running, memory read/write is an ongoing event, and memory is almost always full of data. It may be old data, such as personal data from a previous transaction. It may be partial, unintelligible data, because some of it has been overwritten, or just random data that is totally disconnected. But it is data, nonetheless. It works this way because it is not efficient for computers to constantly wipe memory when they are done with it. Rather, they simply set a flag telling the OS that what is currently in a memory location is old data and okay to overwrite. Until the computer reuses that memory, whatever is in there stays. And, in this case, it becomes the target data of hackers.

That being said, once a heartbeat transaction is set up, the data taken from the client side, pl, is copied into the host location bp. The payload number says the data block is XX bytes, the size of the pl data block, and a memory block of exactly that size is reserved on the host as bp. As the code executes, it copies the data from the client’s pl location to the host’s bp location, then returns the data from the host to the client as part of the heartbeat transaction.

This is where the hacker gets access to sensitive data. The hacker initiates a transaction with a host computer. It may well be a legitimate login, or it may be a hacked login. Either way, the results are the same. The hacker sends a tiny pl, perhaps a single byte, but sets the payload number to, say, 64 KB (it can be any value within limits). In a legitimate transaction, payload matches the size of pl, so the copy returns exactly what the client sent. Here, however, the memcpy copies 64 KB starting at pl, running far past the end of the client’s actual data and sweeping up whatever happened to be in the host’s adjacent memory. All of it is returned to the client as the heartbeat reply.

In some cases, as discussed earlier, it may just be garbage. In other cases it might be a previous user’s data, including things like passwords or credit card numbers. So if an organized attack is devised, the transgressor could mine millions of bits of leftover data, and even though much of it might be gibberish, some of it will be valuable.

Therefore, by extrapolation, these and similar flaws can be passed into IoT object code as well. To avert this, as the Internet evolves, the next generation of Internet objects will need both much tighter coding discipline and a higher level of autonomous firewalls.

On to the Internet of Things

Taking a look at how coding can affect intelligent objects of the emerging IoT presents some interesting challenges. The main difference between objects on the Internet of information vs. the Internet of things is that most objects today are human-interactive devices. Managing them, in whatever fashion, is done via human control; some is constant, some is periodic, but the point is that today, most devices are monitored by humans most of the time. We make them do what we want, and if there is a security breach, we deal with it using human intelligence. That is not to say that these security breaches can’t get away from us, but sooner or later we are going to find them.

The Internet of things is envisioned as a network of interconnected objects. Everything from office supplies to private jets will have an online presence. Some will simply report and respond on small cell networks (picocells in the home, for example). Others will have complex, two-way reciprocal communications via the Internet.

The level of sophistication of these objects will vary widely. The simpler ones, such as door and window NC/NO alarm contacts, may simply report a state and require nothing more than a simple low-bit controller. On the other hand, as the level of device sophistication increases, complex objects such as automobiles will have sophisticated MCUs or MPUs that rival those of powerful multicore smart devices and computer processors. With this extremely wide range of objects and an equally wide range of applications, managing their security will present an almost insurmountable set of challenges.

Conclusion

Bestuzhev shares the perspective on future directions that many Internet security experts hold. Going forward, he said, “You can’t trust anything, even on trusted connections, since everything is potentially vulnerable.” That speaks volumes about the challenges the industry faces as the Internet morphs.

He goes on to say that “all code, even open source, must be audited.”

“Sometimes the cost of an attack may be relatively low, yet the impact very high, as was the case with Heartbleed. Even though the end-points are the weakest stage, one has to address all of the layers that have the potential to be exploited, and data compromised, on any platform,” Bestuzhev said.

Eventually, as more objects begin to integrate intelligence, and the future IoT starts to take shape, this and the countless other existing and potential vulnerabilities will create a security model that is vastly more complex than that of the existing Internet of information. And this is just the beginning.
