Startup and shutdown (SU/SD) processes are critical to safe operations in the chemical and refining industries. The transitions to and from steady-state operation rely on controls and monitoring procedures that are used infrequently and may not be well known by facility personnel, which is why these periods carry elevated risk. While commonly associated with planned shutdowns and turnarounds, SU/SD operations also encompass routine operational cycles and processes that occur independently of these events. These operations introduce risks that must be carefully managed to ensure the safety, reliability, and efficiency of industrial facilities.

 

Understanding SU/SD service beyond shutdowns or turnarounds

SU/SD service in the chemical and refining industries encompasses several routine operational cycles and processes that occur separately from shutdowns or turnarounds. These process transients include batch processing, routine maintenance and inspections, catalyst regeneration and replacement, product changeovers, routine sampling and analysis, and process optimization. These SU/SD operations contribute to the safe, efficient, and reliable operation of facilities and help maintain product quality and process performance.

 

Risks associated with SU/SD service:

Safety Hazards: SU/SD operations pose safety risks by exposing workers to hazardous materials, high pressures and temperatures, and complex equipment.

Process Upsets: Rapid changes in process conditions during startup and shutdown phases can lead to equipment failures, leaks, pressure imbalances, or other deviations from normal operating conditions.

Environmental Impact: Improper handling, storage, or disposal of hazardous chemicals and waste products during SU/SD operations can result in environmental contamination, spills, leaks, and air emissions.

Equipment Integrity: Frequent startup and shutdown cycles can stress equipment, increasing the likelihood of failures or malfunctions that compromise equipment integrity and reliability.

Energy Efficiency: Startups and shutdowns require additional energy input, which increases overall energy consumption and costs.

Cost Considerations: SU/SD operations involve expenses related to shutdown planning, labor, equipment, materials, and potential production losses, which need to be carefully managed to avoid cost overruns.

Human Factor: SU/SD service requires operating personnel to have extensive knowledge of the process, its operation, and the risks of many different scenarios. When personnel change positions or new personnel are introduced, the facility takes on an additional layer of risk that depends entirely on human factors.

Operational Flexibility: Frequent startups and shutdowns can limit the operational flexibility of a facility, impacting the ability to adjust production rates, respond to market demands, or accommodate process changes.

 

Examples of failures due to SU/SD service

Buncefield Oil Depot Explosion[1] 

On the evening of Saturday, December 10, 2005, a tank at the Hertfordshire Oil Storage Limited (HOSL) section of the Buncefield oil depot in the UK was being filled with gasoline during a startup operation after a shutdown. The tank had two forms of level control: a gauge that enabled employees to monitor the filling operation and an independent high-level switch (IHLS), which was meant to shut down operations automatically if the tank was overfilled. The gauge stuck, and the IHLS was inoperable, so nothing alerted the control room staff that the tank was filling to dangerous levels. Eventually, large quantities of gasoline overflowed from the top of the tank. A vapor cloud formed and ignited, causing a massive explosion and a fire that lasted five days.
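One way to picture the stuck-gauge failure is as a consistency check between two signals: if product is still flowing into a tank, the gauge reading should be rising. The sketch below is illustrative only and is not taken from the Buncefield investigation. The data structure, the 30-minute window, and the 0.05 m minimum rise are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class LevelSample:
    minutes: float          # time since filling started
    gauge_level_m: float    # level reported by the tank gauge
    inflow_m3_per_h: float  # metered flow into the tank


def gauge_looks_stuck(samples, window_min=30.0, min_rise_m=0.05):
    """Return True if the tank has been receiving product for the whole
    window but the gauge reading has risen by less than min_rise_m.
    Thresholds are hypothetical and would need site-specific tuning."""
    if not samples:
        return False
    latest = samples[-1]
    window = [s for s in samples if latest.minutes - s.minutes <= window_min]
    # Do not judge until the window covers (almost) the full period.
    if latest.minutes - window[0].minutes < window_min * 0.9:
        return False
    still_filling = all(s.inflow_m3_per_h > 0 for s in window)
    rise = latest.gauge_level_m - window[0].gauge_level_m
    return still_filling and rise < min_rise_m


# Flat gauge reading while flow continues: flagged as suspicious.
flat = [LevelSample(t, 12.0, 550.0) for t in (0, 10, 20, 30, 40)]
# Gauge rising with the flow: not flagged.
rising = [LevelSample(t, 12.0 + 0.01 * t, 550.0) for t in (0, 10, 20, 30, 40)]
print(gauge_looks_stuck(flat), gauge_looks_stuck(rising))
```

A check like this does not replace an independent high-level switch; it only shows how a disagreement between flow and level can be surfaced to the control room.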

Failures of design and maintenance in both overfill protection systems and liquid containment systems were the technical causes of the initial explosion and the seepage of pollutants into the environment in its aftermath. However, underlying these immediate failings lay root causes based on broader human factor management failings:

  • The management systems in place at HOSL relating to startup and tank filling were both deficient and not properly followed, even though the systems were independently audited.
  • Pressures on staff had been increasing before the incident. The site was fed by three pipelines, two of which control room staff had little control over in terms of flow rates and timing of receipt. This meant that staff did not have sufficient information easily available to them to precisely manage the storage of incoming fuel. 
  • Throughput had increased at the site. This put more pressure on site management and staff and further degraded their ability to monitor the receipt and storage of fuel. The pressure on staff was made worse by a lack of engineering support from the head office.

 

BP Texas City Refinery Explosion[2]

On the morning of March 23, 2005, the raffinate splitter tower in the refinery’s ISOM unit was restarted after a maintenance outage. During the startup, operations personnel pumped flammable liquid hydrocarbons into the tower for over three hours without any liquid being removed, which was contrary to startup procedure instructions. Critical alarms and control instrumentation provided false indications that failed to alert the operators of the high level in the tower. 

Consequently, unknown to the operations crew, the 170-foot (52-m) tall tower was overfilled, and liquid overflowed into the overhead pipe at the top of the tower. The overhead pipe ran down the side of the tower to pressure relief valves located 148 feet (45 m) below. As the pipe filled with liquid, the pressure at the bottom rose rapidly from about 21 pounds per square inch (psi) to about 64 psi. The three pressure relief valves opened for six minutes, discharging a large quantity of flammable liquid to a blowdown drum with a vent stack open to the atmosphere.
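The pressure rise described above is consistent with simple hydrostatic head, ΔP = ρgh. The sketch below back-calculates the liquid density implied by the reported figures (a rise from about 21 psi to about 64 psi over a 45 m elevation difference). It is a simplified check, not the investigation's own calculation: it assumes the overhead pipe was completely liquid-filled over the full 45 m and ignores friction and vapor effects. The function names are made up for this example.

```python
PSI_TO_PA = 6894.757  # pascals per psi
G = 9.80665           # standard gravity, m/s^2


def implied_liquid_density(delta_p_psi, column_height_m):
    """Density (kg/m^3) of a static liquid column that would produce
    the given pressure rise, from delta_P = rho * g * h."""
    return delta_p_psi * PSI_TO_PA / (G * column_height_m)


def hydrostatic_head_psi(density_kg_m3, column_height_m):
    """Pressure (psi) exerted at the bottom of a static liquid column."""
    return density_kg_m3 * G * column_height_m / PSI_TO_PA


delta_p = 64 - 21  # psi, pressure rise reported in the text
height = 45        # m, relief valves located 45 m below the top of the tower
rho = implied_liquid_density(delta_p, height)
print(f"Implied liquid density: {rho:.0f} kg/m^3")
```

The result is roughly 670 kg/m^3, which is in the range expected for light liquid hydrocarbons, so the reported pressures fit a pipe filled with liquid rather than vapor.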

The blowdown drum and stack were overfilled with flammable liquid, which led to a geyser-like release. This blowdown system had never been connected to a flare system to safely contain liquids and combust flammable vapors released from the process. 

The released volatile liquid evaporated as it fell to the ground and formed a flammable vapor cloud. Ignition of the vapor cloud led to an explosion that killed 15 contractors working in and around temporary office trailers that BP had previously sited close to the blowdown drum. A shelter-in-place order was issued that required 43,000 people to remain indoors. Houses were damaged as far away as three-quarters of a mile from the refinery.

The investigation provided the following summary of key technical findings: 

  • The ISOM startup procedure required that the level control valve on the raffinate splitter tower be used to send liquid from the tower to storage. However, this valve was closed by an operator, and the tower was filled for over three hours without any liquid being removed. 
  • The process unit was started despite previously reported malfunctions of the tower level indicator, level sight glass, and a pressure control valve.
  • The size of the blowdown drum was insufficient to contain the liquid sent to it by the pressure relief valves. 
  • Neither Amoco nor BP replaced the blowdown drums and atmospheric stacks, even though a series of incidents warned that this equipment was unsafe.
  • BP Texas City managers did not effectively implement their pre-startup safety review policy to ensure that nonessential personnel were removed from areas in and around process units during startups, an especially hazardous time in operations.

These examples illustrate the importance of proper planning, safety measures, risk assessment, and meticulous execution during SU/SD service activities at industrial facilities. Adhering to strict safety protocols, conducting thorough inspections, and implementing effective maintenance practices are crucial to preventing such failures and their devastating consequences.

 

The role of reliability programs in decreasing risk

Reliability programs play a crucial role in decreasing the risk of failure during SU/SD service in the refining and chemical industries. Here are some ways in which reliability programs can help mitigate such risks:

Equipment reliability: Reliability programs focus on ensuring the reliability and performance of critical equipment used in refining and chemical processes. By implementing preventive maintenance, condition monitoring, and equipment improvement strategies, reliability programs help identify and address potential equipment failures before they occur. This reduces the risk of equipment breakdowns during SU/SD service.

Predictive maintenance: Reliability programs often incorporate predictive maintenance techniques, such as vibration analysis, thermal imaging, and oil analysis, to detect early signs of equipment degradation. By monitoring equipment health in real time, potential issues can be identified and addressed proactively, minimizing the chances of failures during SU/SD service.

Quantitative Reliability Optimization (QRO): Reliability programs utilize risk-based methodologies that combine traditional fixed and non-fixed equipment methods with data science and subject matter expertise to prioritize and optimize inspection activities through a holistic analysis during SU/SD service. By considering the equipment's criticality, failure consequences, and inspection history, QRO helps determine optimized inspection and maintenance activities. 

Risk-Based Inspection (RBI): Reliability programs utilize RBI methodologies to prioritize inspection activities during SU/SD service. By considering the equipment's failure consequences, probability of failure, and inspection history, RBI helps determine the appropriate inspection intervals, techniques, and coverage. This ensures that inspections are focused on areas with higher risk, reducing the likelihood of failures.
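The core idea of RBI, ranking equipment by the combination of failure probability and failure consequence, can be sketched in a few lines. This is a deliberately simplified illustration: the 1-to-5 scoring scales, the equipment tags, and the interval thresholds are hypothetical, and real RBI programs (for example, those following API RP 580/581) use far more detailed models.

```python
from dataclasses import dataclass


@dataclass
class Equipment:
    tag: str
    probability: int  # 1 (unlikely) to 5 (likely), hypothetical scale
    consequence: int  # 1 (minor) to 5 (severe), hypothetical scale

    @property
    def risk(self):
        # Simple risk score: probability of failure times consequence.
        return self.probability * self.consequence


def inspection_interval_months(risk):
    """Map a risk score to an inspection interval.
    Thresholds are illustrative only, not from any standard."""
    if risk >= 15:
        return 12
    if risk >= 8:
        return 36
    return 60


def rank_by_risk(items):
    """Return equipment sorted from highest to lowest risk score."""
    return sorted(items, key=lambda e: e.risk, reverse=True)


# Hypothetical equipment list for illustration.
fleet = [
    Equipment("V-101", probability=2, consequence=5),
    Equipment("E-205", probability=4, consequence=4),
    Equipment("P-310", probability=1, consequence=2),
]
for item in rank_by_risk(fleet):
    print(item.tag, item.risk, inspection_interval_months(item.risk))
```

Higher-risk items rise to the top of the list and receive shorter intervals, which is the prioritization effect described above.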

Spare parts and inventory management: Effective reliability programs include robust spare parts and inventory management practices. Maintaining an adequate inventory of critical spare parts and ensuring their availability during SU/SD service minimizes the risk of delays and extended downtime. Inventory optimization techniques help ensure that the required parts are on hand when needed.

Training and competency development: Reliability programs emphasize training and competency development of maintenance personnel. Well-trained employees are better equipped to perform maintenance tasks effectively, follow proper procedures, and identify potential issues during SU/SD service. This helps minimize errors, improve safety, and reduce the risk of failures related to human factors. Adequate staffing of well-trained operations personnel is also required for SU/SD operations.

Root cause analysis: Reliability programs employ root cause analysis (RCA) methodologies to investigate and understand the underlying causes of failures. By identifying and addressing the root causes, rather than just treating symptoms, reliability programs help prevent recurring failures during SU/SD service. This involves implementing corrective actions to eliminate or mitigate the identified root causes, improving the reliability of the equipment and processes.

Continuous improvement: Reliability programs foster a culture of continuous improvement by promoting the use of data and performance metrics. By collecting and analyzing data related to failures, maintenance activities, and process parameters, organizations can identify trends, patterns, and areas for improvement. It is critical that facilities not only collect data but also collect the right data from the right locations, through programs such as data-driven CML optimization. This allows for proactive decision-making, such as implementing process modifications, equipment upgrades, or maintenance optimization, to minimize the risk of failures during SU/SD service.

 

Conclusion

SU/SD service in the chemical and refining industries presents inherent risks that can compromise safety, environmental compliance, equipment integrity, energy efficiency, cost, and operational flexibility. The Buncefield Oil Depot explosion and the BP Texas City Refinery explosion serve as stark reminders of the devastating consequences that can result from failures during SU/SD.

Implementing reliability programs can significantly mitigate these risks. These programs focus on ensuring equipment reliability through preventive maintenance, condition monitoring, and improvement strategies. By embracing these strategies and investing in reliability programs, the chemical and refining industries can proactively manage the hidden risks associated with SU/SD service. This not only ensures the safety and reliability of their operations but also contributes to improved efficiency, reduced costs, and enhanced operational flexibility. Ultimately, a commitment to mitigating risks during SU/SD paves the way for sustainable and successful industrial operations.

 

References

[1] MacDonald, G. (2011, Spring). Buncefield: Why did it happen? UK Government Web Archive. https://www.nationalarchives.gov.uk/webarchive/

[2] U.S. Chemical Safety and Hazard Investigation Board. (2007). Refinery explosion and fire: Report No. 2005-04-I-TX. Retrieved July 3, 2023, from https://www.csb.gov/bp-america-texas-city-refinery-explosion/