Meet the Codethings: Safety-critical systems and the benefits of STPA with Shaun Mooney

Over the last decade open source software has become mainstream in the software industry and, at Codethink, we've gladly contributed to this growing trend. Unfortunately, not all software areas have grown equally, and some of them, such as safety, are behind the curve.

Our Division Manager, Shaun Mooney, is an advocate of the benefits open source can bring to safety-critical software and supports the safety analysis technique STPA as an alternative to bottom-up approaches.

At Codethink, Shaun has demonstrated the relevance and effectiveness of safety analysis for open-source software in automotive and medical devices. Some of the most successful projects he's been part of are AV-STPA, an analysis of an autonomous vehicle carried out with MIT, and the current project with ELISA, looking at OpenAPS.

Over the last couple of months, Shaun Mooney, Beth White and many other Codethings have been organising their first devroom ever at FOSDEM: The Safety and Open Source Devroom, a space to discuss the relevance of safety in FOSS. The Devroom will take place on the 6th of February, and we'll be able to learn, discuss and attend talks about safety-critical systems. Shaun will kick-start the day talking about 'Why we should use Free and Open Source Software for safety applications', more details here.

In this episode of 'Meet the Codethings', we've asked Shaun about FOSDEM, the future of safety-critical software in automotive, and STPA. Keep up-to-date with our news about safety by just subscribing to our Safety newsletter at the end of the article.

Interviewer: FOSDEM takes place on the 6th and 7th of February, and some Codethings have coordinated a devroom. Which talks in the Safety and Open Source devroom are you most excited about?

Shaun Mooney: "We have really interesting talks lined up for the Safety and Open Source devroom. There is one about heap-manipulating programs with SPARK that sounds very interesting. The thing I'm most excited about is the community discussion. We are trying to get as many industry experts, open-source contributors and users of code as possible to join an open discussion about the issues we are facing with open-source software in safety-critical applications, and we are trying to build a community around this."

I: Can you give us an example of safety-critical software?

SM: "Safety-critical software is everywhere around us. If we define it as any instance where software can cause harm, then when you think about it that way, it's all around us. In a car, for example, there is the software which runs your brakes and your steering. These used to be physical, mechanical components, but now they're all done by software. A lot of modern cars don't have any physical connection between the pedal and the brakes. It's all done by sensors and software. In planes, the autopilot and the extra tools that help the pilots fly the plane are all run by software. In the medical industry, the machines keeping people alive all have software in them. It's everywhere. Any place where there is software that could potentially cause harm to a person is what we define as safety-critical."

I: What are the challenges for open source in safety-critical applications?

SM: "There are some really interesting challenges using open source software for safety-critical applications. I guess there are two types of challenges:

One is the technical challenge: 'how do we prove that the software will do what we want it to do?'. With traditional safety methods, software is often purpose-built for safety in a traditional way.

For example, you come up with a product idea, let's say 'I want to build a car'. You then work out the specifications for that project, and your software is designed specifically to meet them. With open source, the software might already exist, but it might have been written for a different application. Still, because you think it does what you want it to do [for your car], you want to use it for your safety application.

(...) We've been looking at interesting techniques to overcome these problems. We use STPA, a system analysis technique, which is a good way of defining what software should do. It's also a good way of analysing what a piece of software will do and how it will interact with your system. [With STPA we can] identify the extra bits of software that are needed, add them and check that everything is doing what you expected.

We also have some clever testing techniques we can use, such as CI/CD, which is great for testing the software in the cloud. We can come up with our own test suites, even if someone else has written the software. Every time the software changes, we can build a version of it, test it against our criteria, and make sure that it's doing what we want it to do. (...) We can do exciting things to overcome technical problems and prove the software is doing what we want.

I'd say that the second big challenge is cultural. Safety standards bodies and the people at the companies currently making safety-critical software have been doing safety in a certain way for so long (...). They have very rigorous ways of defining how the software should have been developed, and it has to fit several processes which open source just doesn't meet. I don't think that means open source software isn't applicable; it means we need to rethink how we use software."
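
As an illustration of the testing approach Shaun describes, here is a minimal Python sketch of running our own acceptance criteria against code someone else wrote. Everything named here is an assumption made for the example: `upstream_clamp` is a hypothetical stand-in for a third-party component, and the criteria values are invented. It is not taken from any Codethink project.

```python
# Stand-in for a third-party component (hypothetical). In practice this
# would be imported from the upstream project at the version under test.
def upstream_clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


# Our own criteria, kept separate from upstream's tests (illustrative values).
CRITERIA = [
    # (description, input arguments, expected output)
    ("in-range value passes through", (0.5, 0.0, 1.0), 0.5),
    ("value above range is limited", (1.7, 0.0, 1.0), 1.0),
    ("value below range is limited", (-0.2, 0.0, 1.0), 0.0),
]


def run_criteria(component) -> list[str]:
    """Return the description of every criterion the component fails."""
    failures = []
    for description, args, expected in CRITERIA:
        if component(*args) != expected:
            failures.append(description)
    return failures
```

A CI job could call `run_criteria` on every upstream change and fail the pipeline whenever the returned list is not empty, so that the same criteria are re-checked each time the software changes.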

I: What is the future of safety-critical systems in automotive?

SM: "In automotive there is a big push for safety-critical systems for driverless cars. It's one of the big focuses for most automotive companies. Everyone wants to be the first to have a safe driverless car, and that's where a lot of the focus is going: 'how do we design safe driverless cars?', 'how do we use things as complicated as machine learning models?' (which are so complicated that no one really understands how they work), 'how do we safely use things that aren't fully deterministic?'.

There is also a drive to use more open source in automotive applications. A lot of automotive companies are trying to use Linux as an operating system and to put safety applications on top of it. I think the future of safety-critical systems is really about answering these big questions: 'how can we use huge, complicated software projects like Linux in safety-critical applications?'"

I: What is System-Theoretic Process Analysis (STPA)?

SM: "STPA is a safety analysis technique developed at MIT. It's a way to analyse systems and derive safety requirements. It's top-down, as opposed to bottom-up techniques like FMEA, which means we can abstract complexity. When we have huge bits of code, like a Linux system (...), it allows us to abstract that complexity out and derive requirements. Then, with the right requirements for the big individual units, we can zoom in and drill down into the bits that are important.

We can also analyse how different components in the system interact with each other, (...) rather than focusing on just failures. We look at what happens in situations where everything is working as designed, but the way we put the components together just doesn't quite work. It's a great technique to use for software."
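
To make the idea of analysing interactions rather than failures more concrete, here is a small Python sketch of one STPA step: taking each control action in a control structure and asking, for each of the four standard ways a control action can be unsafe, whether it could lead to a hazard. The control structure below (a driver, brake controller software and a brake actuator) is a hypothetical example invented for illustration, not one of the analyses discussed in this interview.

```python
from dataclasses import dataclass
from itertools import product

# The four standard ways a control action can be unsafe in STPA.
UCA_TYPES = (
    "not provided when needed",
    "provided when it causes a hazard",
    "provided too early, too late, or out of order",
    "stopped too soon or applied too long",
)


@dataclass(frozen=True)
class ControlAction:
    controller: str  # who issues the action
    action: str      # what is commanded
    target: str      # the controlled process that receives it


# Hypothetical control structure, for illustration only.
CONTROL_ACTIONS = [
    ControlAction("brake controller software", "apply brakes", "brake actuator"),
    ControlAction("driver", "press brake pedal", "brake controller software"),
]


def uca_prompts(actions):
    """Yield one question per (control action, unsafe type) pair for analysts."""
    for ca, uca_type in product(actions, UCA_TYPES):
        yield f"{ca.controller} -> {ca.target}: '{ca.action}' {uca_type}?"


prompts = list(uca_prompts(CONTROL_ACTIONS))
```

Each generated prompt is a question for the analysts to answer in context. None of them assumes that a component has broken, which is why the approach can find problems in systems where everything works as designed.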

I: Can you give us some examples of STPA in your work?

SM: "We've had a couple of really interesting projects at Codethink. One of the first projects with STPA was working with MIT, called AV-STPA. That was an analysis of an autonomous vehicle based on the open-source autonomous vehicle platform Apollo, and we used version 2. This was really our first venture into STPA. It was really interesting seeing how the autonomous vehicle software interacts with the vehicle, with the vehicle's safety driver, and with the other components there. It was fascinating to see problems emerge from using STPA that we hadn't anticipated.

The other interesting project, which we are running with ELISA, is looking at OpenAPS, an open-source artificial pancreas system that helps diabetic patients manage their administration of insulin. It runs on hardware like a Raspberry Pi, it's all open source, and it interacts with insulin pumps and glucose monitors and helps regulate the insulin dosages. That's really interesting too."

I: How does STPA apply to software-intensive systems?

SM: "STPA is great for software-intensive systems because it doesn't focus on failures. Other analysis techniques, like FMEA, look at failures. But software doesn't fail in the same way. There's a theory that software doesn't fail; it does exactly what we tell it to do, it's just that sometimes we tell it to do the wrong thing.

This is really hard to address with traditional safety techniques. STPA allows us to model the interactions of different software components and how they work together. When we have something complicated like Linux, where we have different applications fighting for resources, (...) we can really model how they interact with each other. We can also model how the software interacts with humans, (...) and a lot of the time humans are the most unpredictable part of the system.

[STPA] lets us abstract complexity. If we have a hugely complex piece of code, we can say 'right, what does this unit have to do?' and then, once we've defined how the unit interacts with other units, we can zoom in further and further. It's a great way of deriving safety requirements for your software."

I: What are the benefits of STPA in open source?

SM: "The benefits of STPA overall are really huge. As I've mentioned before: abstracting complexity, modelling failures and interactions, and the top-down approach.

(...) [STPA] plays a big part in using open source for safety applications. It's a way to take an existing piece of code and analyse exactly how it interacts with our system. It also helps us derive requirements for extra pieces of the system that we'd need to build in, such as safety monitors. [STPA] is the only analysis technique I've come across that helps us do this and supports us with a body of evidence. It could convince the standards bodies that a piece of open-source software is applicable for our safety application."

I: What has been Codethink's most successful project applying STPA?

SM: "I think that the autonomous vehicle project that we had was really eye-opening, not just for us; I know it also had a big impact on the work of some of the people at MIT. Even at quite a high level of abstraction, without drilling down into the lines of code, we were able to make some really groundbreaking discoveries that had a huge impact on the project. I think that was our most successful project."

Keep up-to-date about Safety

The conversation about safety-critical software has just started. Complete the form to receive our latest updates about Safety and Open Source in your inbox.