
Verification of Safety

Unlike the fairy tale Rumplestiltskin, do not think that by having named the devil that you have destroyed him. Positive verification of his demise is required.

System Safety Handbook for the
Acquisition Manager, U.S. Air Force

 

Testing for safety starts from the system safety design constraints. The goal is to show that the software will not do anything that leads to a violation of those constraints. At a minimum, the tests must probe for hazardous outputs, and the effectiveness of every specific safety design feature must be verified. The tests must focus on what the software might do that it is not supposed to do, which is much more difficult than verifying that the software performs a particular function. Like any tests, safety tests should be specified early in the development cycle and should evolve as the design and the hazard analysis evolve.
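
As a concrete illustration, the sketch below (Python, with hypothetical names and an assumed pressure limit) shows what a constraint-derived test might look like: instead of confirming that a venting function works, it searches the input space for any case in which the hazardous output is produced.

# Safety constraint (assumed for illustration): the vent valve must never be
# commanded open while tank pressure exceeds MAX_SAFE_PRESSURE.
MAX_SAFE_PRESSURE = 150.0  # hypothetical limit, in psi

def command_vent_valve(pressure: float, operator_request: bool) -> str:
    """Hypothetical control logic under test."""
    if operator_request and pressure <= MAX_SAFE_PRESSURE:
        return "OPEN"
    return "CLOSED"

def test_constraint_never_violated():
    # Sweep pressures at and beyond the limit, including boundary values, and
    # assert that the hazardous output (OPEN above the limit) never occurs.
    for pressure in (MAX_SAFE_PRESSURE, MAX_SAFE_PRESSURE + 0.1, 500.0, float("inf")):
        for request in (True, False):
            hazardous = (command_vent_valve(pressure, request) == "OPEN"
                         and pressure > MAX_SAFE_PRESSURE)
            assert not hazardous, f"constraint violated at pressure={pressure}"

if __name__ == "__main__":
    test_constraint_never_violated()
    print("safety constraint held for all sampled inputs")

The point is that the test is derived from the hazard, not from the functional specification: it passes only if the hazardous output cannot be provoked by the sampled inputs.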

Safety testing needs to address:

•	Critical functions, for hazardous behavior
•	Boundary conditions
•	Special features (such as firewalls) upon which protection of critical functions is based
•	Incorrect and unexpected inputs, input sequences, and timing (see the sketch after this list)
•	Reaction of the software to system faults and failures (environmental stress testing)
•	Go/No-Go and fail-safe modes
•	Operator interface
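
To make the incorrect-and-unexpected-inputs item concrete, here is a hedged sketch (hypothetical routine, invented value ranges) that feeds malformed, out-of-range, and boundary readings to an input-handling routine and checks that the software fails safe instead of producing the hazardous output.

def select_heater_command(sensor_reading) -> str:
    """Hypothetical routine under test: choose a heater state from a temperature reading."""
    try:
        temperature = float(sensor_reading)
    except (TypeError, ValueError):
        return "SHUTDOWN"                  # unparseable input: fail safe
    if not (-40.0 <= temperature <= 125.0):
        return "SHUTDOWN"                  # outside the assumed sensor range: treat as a fault
    return "ON" if temperature < 20.0 else "OFF"

def test_unexpected_inputs_fail_safe():
    hazardous = "ON"  # in this sketch, heating on a faulted reading is the hazard
    unexpected = [None, "", "not a number", float("nan"), -273.0, 1e6, -40.0001, 125.0001]
    for reading in unexpected:
        assert select_heater_command(reading) != hazardous, f"hazard on input {reading!r}"

if __name__ == "__main__":
    test_unexpected_inputs_fail_safe()
    print("fail-safe behavior held for all sampled unexpected inputs")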

The safety engineer must review the test plans and recommend tests based on the hazard analyses, safety standards, checklists, previous accidents and incidents, and interface analyses. The safety effort should specify the conditions under which tests are to be conducted. The safety engineer should also review test results for any safety-related problems missed in analysis or other testing, and should monitor tests for unexpected failure modes and hazardous states.

There are limitations to dynamic analysis, however. Exhaustive testing for safety is essentially intractable: demonstrating that software will not do something under any conditions is much harder than demonstrating that a particular function works as intended. (The situation is analogous to testing for security, which is intractable for the same reason.) Accidents occur only infrequently and usually in ways the designer or tester did not anticipate. Finally, testing does not find requirements errors, which are the cause of most accidents.

The limitations above do not imply that one should not test, only that there is a limit to the confidence that can be gained through dynamic analysis. Static analysis has limits as well: tests are still needed to verify the fidelity of the model to the constructed system, the satisfaction of the assumptions underlying the mathematical techniques, and properties not covered by static analysis, such as performance. Test designers can use the results of static analysis as a guide to test planning.

Formal verification will not make complex systems simple either; there is no magic here. Among its limitations: the formal proof may be larger than the program being verified, and it is often very difficult to construct, which makes the proof itself likely to contain errors. Only limited aspects of system behavior can be proven, and the errors such proofs catch may well be ones that would have been caught in testing anyway, and may not be the ones likely to cause accidents. Formal verification probably has little effect on safety unless it is aimed at the safety constraints.

There are a variety of informal code analysis methods. Many of those on the market, such as software sneak analysis, seem worthless. One method with some applicability is software fault tree analysis (SFTA). It applies the reasoning process of formal methods without the mathematical complication of the proofs; the underlying theory is proof by contradiction.

SFTA is practical only for relatively small pieces of software. Again, there is no magic bullet here, but the technique can be useful. The idea is to work backward through the program logic from a hazardous output, asking at each step what conditions could have allowed it to occur.
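
The hypothetical fragment below (Python, invented names and limits) makes that backward reasoning concrete; the numbered comments trace how an analyst would argue from the hazardous output back through the conditions that permit it, ending at a scenario that would appear as a leaf of the fault tree.

def open_vent_valve() -> None:
    print("vent valve opened")              # stand-in for the actual actuator command

def control_step(pressure: float, sensor_ok: bool, operator_cmd: str) -> None:
    if not sensor_ok:
        pressure = 0.0                       # (3) ...unless the sensor is flagged bad,
                                             #     in which case 0.0 is substituted
    if operator_cmd == "VENT" and pressure < 100.0:
        open_vent_valve()                    # (1) hazardous output: venting while the
                                             #     actual tank pressure is high

# Backward reasoning from the hazardous output (1):
# (1) open_vent_valve() executes only if operator_cmd == "VENT" and pressure < 100.0.
# (2) pressure < 100.0 can hold while the actual pressure is high only if the program
#     variable does not reflect the true pressure...
# (3) ...which happens exactly when sensor_ok is False and 0.0 is substituted.
# Conclusion: a failed pressure sensor combined with a VENT command reaches the hazard.
# The flaw exposed is a design assumption (defaulting to 0.0), not a coding slip.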

 

SFTA was used to analyze the flight telemetry program for a UC Berkeley scientific satellite called Firewheel, whose mission was to sample electric fields in the earth's magnetotail. The critical failure event was ripping the wire booms off the satellite. The analysis took about two days and required examining about 12% of the program's 300 lines of Pascal code. The method detected a critical scenario that had gone undiscovered during thorough test and evaluation by an independent group.

 

 

Ontario Hydro analyzed the shutdown software for their Darlington Nuclear Power Generating Station. The hazard was failure to trip when required. The software comprised about 3,000 lines of Pascal and FORTRAN code, and the analysis took about two person-months. The analysts were unfamiliar with fault tree techniques but familiar with software. They identified over 40 potential hazardous failure modes (but no software errors), and minor changes were made to the software to guard against the identified failure modes.

SFTA is being applied successfully on real projects. It forces a different viewpoint: what the software is NOT supposed to do. Beneficially, it starts from the system specification rather than the software specification. SFTA identifies critical assumptions about the environment, and it can be used to identify assertions and run-time checks for potentially hazardous states. It is, however, practical only for small pieces of software; it is a nontrivial undertaking, and it is not a magic bullet.
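
To illustrate the run-time checks mentioned above, a guard suggested by the backward analysis sketched earlier might look like the following (hypothetical names again); it sits directly in front of the hazardous output so the identified scenario is blocked even if the surrounding control logic is wrong.

def open_vent_valve() -> None:
    print("vent valve opened")               # stand-in for the actual actuator command

def guarded_vent(pressure: float, sensor_ok: bool) -> None:
    # Run-time check derived from the fault tree: refuse the hazardous output
    # whenever the identified scenario (venting on a failed sensor) holds.
    if not sensor_ok:
        raise RuntimeError("hazardous state: vent requested with failed pressure sensor")
    if pressure < 100.0:
        open_vent_valve()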


Copyright 2003 Safeware Engineering Corporation. All rights reserved