2016-07-05

Disaster by Design/Safety by Intent #39

Disaster by Design

The best aspect of the defense-in-depth approach to nuclear power safety is that one thing, even one very bad thing, is unlikely to trigger an accident. If offsite power is lost, onsite power from emergency diesel generators will automatically take over. If a pipe ruptures and drains cooling water from the reactor vessel, emergency pumps will automatically start up to supply makeup cooling water. If a pump fails, at least one other pump is ready to step in. And so on. It takes a lot of things going wrong to defeat defense-in-depth.

Defense-in-depth’s downside is that none of the protective layers is 100% reliable. If any single layer provided absolute reliability, the other layers would not be needed. Instead, defense-in-depth banks on the collective reliability from multiple layers.
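
The arithmetic behind "collective reliability" can be sketched in a few lines of Python. This is a simplified illustration, not a real safety analysis: it assumes the layers fail independently of one another, and the per-layer failure probabilities are hypothetical numbers chosen only for the example. The function name all_layers_fail is likewise invented here.

```python
from math import prod


def all_layers_fail(failure_probs):
    """Probability that every layer fails on demand.

    Assumes the layers fail independently, which is an
    idealization made for this sketch.
    """
    return prod(failure_probs)


# Hypothetical per-layer failure probabilities (illustration only).
layers = [0.1, 0.05, 0.02]
combined = all_layers_fail(layers)
print(f"Chance that all layers fail together: {combined:.6f}")
```

Under these assumed numbers, the chance of every layer failing at once is about 1 in 10,000, far smaller than the failure chance of any single layer. The independence assumption is the weak point: a shared problem that degrades several layers at once makes the real figure worse than the simple product suggests.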

This commentary begins a series of posts about times when the multiple layers collectively failed to prevent nuclear plant accidents. The good news is that the number of nuclear plant accidents is relatively small. The bad news is that nuclear plant accidents are becoming more severe.

Sodium Reactor Experiment (Santa Susana, CA) – July 1959

Construction of the Sodium Reactor Experiment (SRE) facility began in April 1955. The operators achieved the initial criticality of the reactor core on April 25, 1957, and supplied electricity to the power grid for the first time on July 12, 1957. Unlike nuclear reactors currently operating in the United States, the SRE used liquid sodium instead of water to remove the thermal energy (heat) produced by splitting atoms.

Fig. 1 (Source: U.S. Department of Energy)

Because liquid sodium reacts “aggressively” when exposed to either water or air (i.e., it explodes or catches on fire), SRE had two sodium loops. The primary loop circulated liquid sodium from the reactor vessel—containing radioactive materials—through a heat exchanger before returning the cooled liquid sodium to the reactor vessel (Fig. 1).

Heat from the primary loop passed through the metal walls of tubes inside the heat exchanger to warm liquid sodium in the secondary loop. The secondary loop circulated liquid sodium—non-radioactive—through a steam generator before returning the cooled liquid sodium to the heat exchanger. If a steam generator tube leak allowed liquid sodium to contact water, the resulting fire or explosion would not release radioactivity.

Heat from the secondary loop passed through the metal walls of tubes inside the steam generator to boil water. Steam from the steam generator flowed through a turbine/generator to generate about 6 megawatts of electricity when SRE operated at full power.

Pumps in the primary and secondary loops circulated the liquid sodium. Tanks in the loops accommodated the expansion of liquid sodium as it heated up and contraction when it cooled down.

Signs of trouble

In late May 1959, troubling signs began emerging. The temperature of the liquid sodium in the primary loop, measured where it entered the reactor, steadily increased from 545°F to 580°F over three days. Other signs revealed that the heat exchanger was not removing as much heat as it had during the past two years. Consequently, the liquid sodium was not being cooled as much while flowing through the heat exchanger. And a thermocouple indicated that the temperature of a fuel element inside the reactor core had increased from 860°F to 945°F.

By June 2, 1959, workers concluded from the evidence that something was impairing the heat transfer characteristics of the system, causing fuel temperatures to rise and heat removal by the heat exchanger to decline. They attributed the impairment to tetralin leaking into the primary loop’s liquid sodium. Tetralin in-leakage topped the suspect list because tetralin had leaked into the primary loop in late 1958 and early 1959 to cause similar results, albeit of lesser severity.

Tetralin (tetrahydronaphthalene) is an organic compound used to seal the sodium pumps. The pumps’ electric motors spun long metal shafts connected to impellers that pushed the liquid sodium through the loops. Liquid sodium leaking through the tiny openings between the rotating shaft and the pump casing would cause problems upon contact with air. The tetralin filled these openings to prevent liquid sodium leaking from the pumps.

SRE was shut down on June 3 to repair the sodium pump in the primary loop. Workers replaced the tetralin seal on the pump with one of a different design. They bubbled nearly 400,000 cubic feet of nitrogen gas through the primary loop’s liquid sodium trying to remove the tetralin. Their efforts removed about three pints of tetralin and less than a half gallon of naphthalene crystals from the primary loop. The post-accident analysis estimated that four gallons of tetralin had leaked into the primary loop.

Restarting the reactor

Then they restarted SRE on July 12 at 6:50 am. At 3:30 pm, radiation levels in the reactor room increased significantly. Workers suspected that radioactivity was leaking from a penetration through the reactor’s top for an instrument monitoring the level of liquid sodium inside the reactor. They shut down the reactor and replaced the level probe with a solid plug.

Then they restarted SRE on July 13. This time they encountered power level anomalies. The operators began inserting control rods after SRE’s power level began increasing unintentionally. Despite their efforts to lower the power level, it more than doubled over the next 30 minutes. Matters worsened. At 6:24 pm, SRE’s power level began doubling every 8 seconds. The operators manually scrammed the reactor at 6:25 pm, causing the control rods to rapidly enter the reactor core and terminate the nuclear chain reaction.
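
To put a doubling time of 8 seconds in perspective, the short Python sketch below converts it into a growth multiple. It assumes, purely for illustration, that the doubling time stayed constant; the account above does not say that it did, and the helper names are invented for this example.

```python
import math

DOUBLING_TIME_S = 8.0  # doubling time reported for 6:24 pm on July 13


def power_multiple(elapsed_s, doubling_time_s=DOUBLING_TIME_S):
    """Factor by which power grows after elapsed_s seconds,
    assuming a constant doubling time (an idealization)."""
    return 2.0 ** (elapsed_s / doubling_time_s)


def e_folding_period(doubling_time_s=DOUBLING_TIME_S):
    """Time for power to grow by a factor of e, in seconds."""
    return doubling_time_s / math.log(2.0)


for seconds in (8, 24, 60):
    print(f"{seconds:>2} s -> power x {power_multiple(seconds):.1f}")
print(f"e-folding period: {e_folding_period():.1f} s")
```

At that rate, power would grow eightfold in 24 seconds, and a full minute of unchecked growth would correspond to roughly a 180-fold increase.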

Then they restarted SRE at 7:55 pm on July 13. At 1:00 pm on July 14, SRE automatically shut down when an electrical short turned off the sodium pump in the primary loop.

Then they restarted SRE at 1:11 pm on July 14. The impaired ability of the heat exchanger to transfer heat from the primary loop’s liquid sodium to the secondary loop’s liquid sodium meant that the steam generator could produce only small amounts of steam. The heat exchanger and steam generator issues combined to limit the maximum power level SRE could attain. So, workers shut down SRE on July 15. They closed valves to isolate the steam generator in the secondary loop and opened valves to route the liquid sodium through the airblast heat exchanger. The airblast heat exchanger functioned like a car’s radiator. Liquid sodium flowed inside tubes within the airblast heat exchanger. Air flowing past the outside of the tubes removed heat, cooling the liquid sodium inside. Workers also raised the maximum limit on the temperature of the primary loop’s liquid sodium to enable SRE to operate at a higher power level.

Then they restarted SRE at 7:04 am on July 16. The control rods had to be withdrawn further to achieve criticality than was required during other recent startups. Workers noted this difference, but could not explain it. So, they increased SRE’s power level. SRE automatically shut down at 2:10 am on July 21 due to an indication that the power level was increasing too rapidly. The instrument providing that signal was supplied electricity from an unstable power source, so workers assumed the signal was spurious.

Then they restarted SRE at 2:25 am on July 21. Four hours later, radioactivity levels in the reactor building increased significantly. The operators manually shut down the reactor at 9:45 am, not to investigate the reason for the high radiation readings but due to a mechanical problem in the secondary loop. Workers repaired the problem.

Then they restarted SRE at 11:30 am on July 21. The temperature of the liquid sodium was fluctuating more than normal. The temperature of liquid sodium flowing out of some reactor core locations rose to 900 to 1,000°F, more than 200°F higher than had been experienced. And the thermocouple measuring the temperature of the fuel element in channel 55 indicated an abnormally high temperature. Workers attributed the anomalous temperature indications to faulty instruments and increased SRE’s power level.

SRE automatically shut down at 9:50 pm on July 23 due to an indication that the power level was increasing too rapidly. Workers discounted the signal, assuming it was caused by an electrical transient in the instrument’s power supply.

Then they restarted SRE at 10:15 pm on July 23. Thinking that foreign material might be obstructing the inlets to the fuel elements and partially blocking the liquid sodium flow past them, workers “jiggled” fuel elements in the early morning hours of July 24 to dislodge any debris. They lowered fuel handling equipment into the reactor core to lift individual fuel elements a short distance before lowering them back into place. Four fuel elements were stuck in place and could not be “jiggled.” SRE automatically shut down at 12:50 pm on July 24 due to an indication that the power level was increasing too rapidly. Workers discounted the signal, assuming it was caused by an electrical transient in the instrument’s power supply.

Then they restarted SRE at 1:14 pm on July 24. Despite all the bubbling and jiggling, SRE continued to behave unexpectedly. Workers shut down SRE on July 26 to inspect each fuel element that had exhibited high outlet temperatures with a television camera and to try to remove any debris obstructing the inlets.

Fig. 2 (Source: U.S. Department of Energy)

Thirteen fuel elements had been damaged as the reactor operated for two weeks between July 12 and 26 (Fig. 2). For example, when workers tried to remove the fuel element in location 69 on July 27, it broke apart with the lower two-thirds remaining in the core. The reactor core contained only 43 fuel elements, so about one-fourth of the reactor core was damaged.

Missed Opportunities = Pre-Existing Problems = Reactor Accident

Numerous opportunities were missed at SRE to intervene at an early stage. Consequently, pre-existing problems were enabled rather than exorcised and they eventually conspired to damage nearly one quarter of the reactor core.

In late 1958, tetralin leaked into the primary loop’s liquid sodium due to a design flaw. Nothing was done about this design flaw, other than treating its symptoms, until it caused another tetralin leak in mid-1959. Then the design was modified to eliminate the potential for harmful tetralin in-leakage.

By June 1959, workers concluded that tetralin in-leakage caused fuel and primary loop liquid sodium temperatures to rise. This conclusion strongly suggested that the tetralin had coated fuel and heat exchanger surfaces, blanketing them to impede the transfer of heat out of the system. They took steps to remove tetralin from the liquid sodium, but did nothing to verify that the substance had not coated internal surfaces.

By the third week of July 1959, there were ample signs of problems. Suspecting that debris was partially blocking cooling flow, workers attempted to “jiggle” fuel elements to dislodge the debris. But workers were unable to “jiggle” some fuel elements, likely because the debris prevented their movement.

During repeated restarts of the reactor during that period, the control rods had to be withdrawn to unexpected positions to achieve criticality. Workers could not explain why this was happening, but opted not to stop and try to figure it out. The post-accident evaluation concluded the control rods had to be withdrawn farther than expected to compensate for the increasing damage to the reactor core.

When the decision was finally made to visually inspect the fuel elements and to remove the debris, it was too late to prevent damage to one quarter of the reactor core.

Safety by Intent

Workers glimpsed pieces of a jigsaw puzzle as SRE operated between July 12 and 26, 1959, but did not see the full picture until all the pieces were later put together to clearly show a reactor accident unfolding. One could argue that they saw enough of the pieces before July 26 to figure out that the reactor core was not being adequately cooled and at least lessen the extent of the damage.

But hindsight is nearly always 20/20. It is easy to grasp the significance of warning signs after they cause or contribute to an accident. It is considerably harder to acquire this same understanding before the post-accident investigations reveal what had been hiding in plain sight.

1959 was literally and figuratively a different time. One of the many differences between then and now is the formal assessments required before a reactor experiencing an unplanned shutdown can be restarted. Following two unplanned shutdowns at the Salem nuclear plant in New Jersey in February 1983, the NRC required owners to complete post-trip reviews before restarting reactors. These reviews are supposed to verify that the plant responded as expected. When the reviews identify “surprises,” workers are supposed to determine whether equipment, procedure, training, or design deficiencies factored in the “surprises.” If so, such factors must be corrected or compensated for prior to restarting the reactor.
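
The gating logic of a post-trip review, as described above, can be sketched in Python. This is only an illustration of the decision rule stated in the text, not the NRC's actual procedure: the class, field, and function names are hypothetical, and real reviews involve far more than a yes/no check.

```python
from dataclasses import dataclass
from typing import Optional

# The four deficiency categories named in the text.
DEFICIENCY_CATEGORIES = {"equipment", "procedure", "training", "design"}


@dataclass
class Surprise:
    """One unexpected plant response found by a post-trip review (hypothetical model)."""
    description: str
    deficiency: Optional[str] = None       # None: no deficiency was found to be a factor
    corrected_or_compensated: bool = False

    def __post_init__(self):
        if self.deficiency is not None and self.deficiency not in DEFICIENCY_CATEGORIES:
            raise ValueError(f"unknown deficiency category: {self.deficiency}")


def restart_allowed(surprises):
    """Restart only if every surprise tied to a deficiency has been dealt with."""
    return all(
        s.deficiency is None or s.corrected_or_compensated
        for s in surprises
    )


# Example: a surprise traced to an equipment deficiency blocks restart until fixed.
rod_anomaly = Surprise("power rose while control rods were being inserted",
                       deficiency="equipment")
print(restart_allowed([rod_anomaly]))   # blocked while unresolved
rod_anomaly.corrected_or_compensated = True
print(restart_allowed([rod_anomaly]))   # allowed once corrected
```

The point of the sketch is the default: restart is withheld until each identified deficiency is addressed, rather than permitted unless someone objects.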

Essentially, the post-trip reviews seek to gather all available puzzle pieces and put them together to see as much of the picture as possible. Had this process been in place at SRE in 1959, it might have revealed the inadequate reactor core cooling problem sooner. Perhaps not during the post-trip review of the shutdown on July 12, or of the shutdown on July 13, or even of the shutdown on July 14. But there were at least eight reactor shutdowns between July 12 and July 26, each one turning over more pieces of the puzzle.

The easiest way to miss the bigger picture emerging was to not even look for it. The NRC essentially removed the easy way by forcing owners to take a look. The number of accidents and near-misses avoided by the post-trip reviews cannot be counted. But that number is almost certainly greater than or equal to the number of valid reasons for not doing an adequate post-trip review.

—–

UCS’s Disaster by Design/Safety by Intent series of blog posts is intended to help readers understand how a seemingly unrelated assortment of minor problems can coalesce to cause disaster and how effective defense-in-depth can lessen both the number of pre-existing problems and the chances they team up.

Blog.ucsusa.org

Nuclear Plant Accidents: Sodium Reactor Experiment