Safety Principles and Tools Overview
Reminder: RAMS in the product development cycle
Integrating RAMS into the design from the outset of the development cycle is essential to creating products that are inherently Reliable, Available, Maintainable, and Safe. In the product development cycle, commonly known as the V-cycle, RAMS plays a crucial role at every stage of the lifecycle process.
The key principles of safety
This initial chapter lays the foundation for understanding the safety process through simplified examples. These concepts are elaborated in a dedicated chapter, and the later parts of the learning journey go into their finer details.
The imperative for safety has grown in tandem with the occurrence of accidents amid the evolution of new technologies. Initially, it manifested as best practices and specific safety systems designed to prevent recurring accidents.
For instance, the dead man’s switch was introduced, particularly in the railway industry, in response to incidents in which the driver died or lost consciousness at the controls. This pattern can be generalized: the industrial revolution brought new technologies that were not yet mastered, with hazards that were not easily foreseeable. Each accident paved the way for the development of a corresponding safety barrier.
In certain cases, especially for uncomplicated or low-impact products, adherence to established norms and best practices accumulated over decades may suffice to mitigate risks. Nevertheless, for complex products, where accidents could result in multiple casualties, explicit safety demonstrations must be conducted.
Exemplary sectors highlighting the significance of safety include:
- Nuclear: A pioneer in implementing dedicated, system-oriented safety analyses. Despite having experienced the most serious accidents, it boasts the lowest failure rates and the most stringent practices of all sectors.
- Aeronautics: Reliability in this sector significantly impacts safety, and an accident can result in the loss of all passengers on board.
- Aerospace: While it benefited from the space race, it remains the most challenging sector for safety because of its hostile environment and the large amount of energy carried on board.
- Railway: As the first large-scale transportation means, it has garnered extensive experience and is currently considered one of the safest modes of transport in certain regions.
- Automotive: Increasingly challenging with new energy storage and the proliferation of embedded electronics with safety functions.
- Medical Equipment: Precision in measurements poses the most significant challenge for patient safety.
The definition of a risk
Risk, sometimes called danger or hazard, is the most important notion in a RAMS analysis. All risks must be identified, rated and, where needed, eliminated. In practice, total elimination is impossible, so risks are instead mitigated to an acceptable level. What does acceptable mean?
Below is a frequently used example of hazard categorization:
First parameter: the probability of occurrence of a failure.
Second parameter: the gravity (also called severity) of a failure.
Third parameter: the criticality, which combines the probability and the gravity.
In simple terms, a mitigated risk is considered at least tolerable. The higher the gravity of a risk, the lower its probability must be to align with a low criticality.
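The relationship between probability, gravity and criticality can be sketched as a simple lookup matrix. The scale labels and the content of each cell below are illustrative assumptions, not values taken from this course or from a standard; each project or sector defines its own matrix.

```python
from enum import IntEnum


class Probability(IntEnum):
    # Hypothetical four-level probability scale
    IMPROBABLE = 1
    REMOTE = 2
    OCCASIONAL = 3
    FREQUENT = 4


class Gravity(IntEnum):
    # Hypothetical four-level gravity scale
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3
    CATASTROPHIC = 4


# Illustrative criticality matrix: one row per probability level,
# one column per gravity level (MINOR .. CATASTROPHIC).
MATRIX = {
    Probability.FREQUENT:   ["undesirable", "intolerable", "intolerable", "intolerable"],
    Probability.OCCASIONAL: ["tolerable", "undesirable", "intolerable", "intolerable"],
    Probability.REMOTE:     ["negligible", "tolerable", "undesirable", "undesirable"],
    Probability.IMPROBABLE: ["negligible", "negligible", "tolerable", "tolerable"],
}


def criticality(probability: Probability, gravity: Gravity) -> str:
    """Look up the criticality class for a (probability, gravity) pair."""
    return MATRIX[probability][gravity - 1]


def is_acceptable(probability: Probability, gravity: Gravity) -> bool:
    """A risk is accepted here only if it is at most 'tolerable'."""
    return criticality(probability, gravity) in ("negligible", "tolerable")
```

With this assumed matrix, a catastrophic event is acceptable only at the lowest probability level, while a minor event remains acceptable up to the occasional level.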
The rating of a risk
The rating of a risk in preliminary design
How to make sure all the risks are handled? There are so many…
During the preliminary design, the design department provides the list of functions, the mission profile and more rarely the bill of materials.
These are the inputs for the first Safety Analysis of the project: the Preliminary Risk Analysis, or PRA (its name varies from one sector to another). Depending on whether the input is the list of functions or the list of elements, it will be a function PRA or an element PRA.
For each function for a function PRA or for each element for an element PRA, all the potential failure modes are listed, and their effects are analyzed. Here is a simplified example:
Example for a function PRA:
Example for an element PRA:
We now have the list of failure modes and their effects. The next goal is to determine a risk mitigation measure for each of them.
The mitigation measures are then exported into a synthesis document and are treated as requirements. Compliance with them must be proven later in the project, once the design provides sufficient information.
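As a minimal sketch of this workflow, a PRA can be represented as a list of lines, each holding an item, a failure mode, an effect and a mitigation measure, from which the requirements are exported. The class name, the `SAF-` identifier scheme and the braking and door examples are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class PraLine:
    item: str          # a function (function PRA) or an element (element PRA)
    failure_mode: str
    effect: str
    mitigation: str


def export_requirements(lines, prefix="SAF"):
    """Turn each distinct mitigation measure into a numbered requirement."""
    requirements = {}
    for line in lines:
        # The same measure may mitigate several failure modes: export it once.
        if line.mitigation not in requirements.values():
            req_id = f"{prefix}-{len(requirements) + 1:03d}"
            requirements[req_id] = line.mitigation
    return requirements


# Hypothetical function PRA for a vehicle
pra = [
    PraLine("Braking", "No braking on demand", "Collision",
            "Provide an independent emergency brake"),
    PraLine("Braking", "Untimely braking", "Passenger fall",
            "Limit the deceleration"),
    PraLine("Door control", "Door opens while moving", "Passenger fall",
            "Interlock door opening with speed"),
]

requirements = export_requirements(pra)
for req_id, text in requirements.items():
    print(req_id, text)
```

Each exported identifier can then be traced through the project until its compliance is demonstrated.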
Risk rating in detailed design
Once the preliminary architecture has been provided, its compliance with the PRA must be checked. Moreover, the technological solutions for all components are known, so the associated failure causes, failure modes and failure rates are also known. The most commonly used tools for checking safety requirements are the Failure Modes, Effects, and Criticality Analysis (FMECA) and the Fault Tree Analysis (FTA).
The FMECA is a PRA to which the probability of occurrence and the criticality are added. It exists in various forms: it can be a function FMECA, a component FMECA, or a hybrid of the two. Many other forms also exist, but they are not suited to the present case.
The FMECA can supplement the PRA thanks to its complementary approach, and can generate new safety requirements from new mitigation measures. The mitigation measures reduce the failure rate, and therefore the criticality. No criticality shall be undesirable or intolerable.
Note that the failure rate is taken from trials, standard databases or even an attached FTA.
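To illustrate how a failure rate feeds the probability of occurrence, the sketch below converts a rate into a probability over an exposure time and compares it with a target that depends on the gravity. It assumes a constant failure rate (exponential law), and the target values, function names and example rates are hypothetical.

```python
import math

# Hypothetical maximum acceptable probability per mission, by gravity level
MAX_PROBABILITY = {
    "minor": 1e-2,
    "major": 1e-3,
    "critical": 1e-5,
    "catastrophic": 1e-7,
}


def probability_of_failure(failure_rate_per_hour: float, exposure_hours: float) -> float:
    """P = 1 - exp(-lambda * t), assuming a constant failure rate."""
    return -math.expm1(-failure_rate_per_hour * exposure_hours)


def meets_target(failure_rate_per_hour: float, exposure_hours: float, gravity: str) -> bool:
    """Check one FMECA line against the assumed target for its gravity."""
    p = probability_of_failure(failure_rate_per_hour, exposure_hours)
    return p <= MAX_PROBABILITY[gravity]


# One hypothetical FMECA line with critical gravity, over a 10-hour mission
before = meets_target(1e-4, 10.0, "critical")  # before mitigation
after = meets_target(1e-7, 10.0, "critical")   # after mitigation
print(before, after)
```

In this example the mitigation measure lowers the failure rate by three orders of magnitude, which is what brings the line under the assumed target.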
The FTA is a tool that makes it possible to take safety redundancies into account. In a FMECA, one line represents one failure that leads to an accident.
With an FTA, it is also possible to represent safety redundancies, where the failure of one redundant element does not lead to the accident. This is the case, for example, on an airliner: if one of the two engines fails, the aircraft can safely continue its flight on the other. Redundancy drastically reduces the rate of the feared event, because the danger arises only if the remaining engine also fails while the redundancy is unavailable. The time during which redundancy is not assured is called the “latency time”.
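The effect of redundancy can be sketched with the two basic fault tree gates, assuming independent failures and constant failure rates. The engine failure rate, flight duration and latency time below are hypothetical values, and the model is deliberately simplified: the feared event is "one engine fails during the flight AND the remaining engine fails during the latency time".

```python
import math


def and_gate(*probabilities: float) -> float:
    """All inputs must fail (independent events)."""
    return math.prod(probabilities)


def or_gate(*probabilities: float) -> float:
    """At least one input fails (independent events)."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


def failure_probability(rate_per_hour: float, hours: float) -> float:
    """P = 1 - exp(-lambda * t), assuming a constant failure rate."""
    return -math.expm1(-rate_per_hour * hours)


ENGINE_FAILURE_RATE = 1e-5  # per hour, hypothetical
FLIGHT_HOURS = 10.0         # hypothetical
LATENCY_HOURS = 2.0         # hypothetical time flown on a single engine

# Without redundancy: a single engine failure is the feared event.
p_single_engine = failure_probability(ENGINE_FAILURE_RATE, FLIGHT_HOURS)

# With redundancy: either engine fails first, then the other one
# fails during the latency time.
p_first = or_gate(p_single_engine, p_single_engine)
p_second = failure_probability(ENGINE_FAILURE_RATE, LATENCY_HOURS)
p_loss_of_both = and_gate(p_first, p_second)

print(p_single_engine, p_loss_of_both)
```

Shortening the latency time directly lowers `p_second`, which is why the duration of a flight continued on one engine matters as much as the reliability of the engines themselves.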
The FTA is the most precise and convenient tool for quantifying a feared event in a complex, redundant system.
Both the FMECA and the FTA are safety demonstration tools, used to make sure that all risks are mitigated below acceptable thresholds.