Required Changes

A lot has been written about the recovery from the Columbia accident in terms of the changes we needed to make to get back to flying the Shuttle again. In general, the changes fell into two categories. One bucket contained changes to hardware; the other, changes to management practices.

In the early summer of 2003, we didn’t know how much time we’d eventually have to make these changes—just that we’d take whatever time was necessary to get them done, and with the confidence we did them right. But based on the recovery from the Challenger accident of 1986, we figured we wouldn’t be flying again for a couple of years. Could be longer; could be a little quicker. But the charge to all of us was to get the work done correctly, first and foremost. Sort of like resolving a problem in the final throes of launch countdown – solve the problem first, then look up at the clock and see if you have any time left in which to launch.

That’s not to say we were lackadaisical about it. Hardly. We were well aware of the need to get flying again to the ISS. But once again, it was ‘schedule awareness’ vs ‘schedule pressure’. There was a difference, of course, from the time-critical launch environment, where technical problems were solved based solely on data and bad decisions couldn’t be recalled. In the recovery period, lengthy philosophical debates were fairly common. But decisions needed to be made, and progress on the improvements needed to be real.

The foam loss problem on the external tank needed to be fixed. Adding the capabilities to inspect the Orbiter’s tiles and effect some level of repair prior to re-entry was also necessary. These were obviously the top two flight hardware upgrades undertaken. But each Project (Orbiter, ET, SRB, Ground Processing, etc.) was asked to essentially recertify its existing system as flight-worthy, or suggest upgrades aimed at improving safety margins. These suggestions would be debated at the Program-level change boards and either accepted for implementation (and funded) or not.

Changes weren’t too widespread for us at KSC and the Ground Processing directorate. For the most part, our work practices on the flight hardware were mature and adequate. Extra care was to be taken when working on the External Tank’s foam to avoid damage, but nothing too onerous.

One significant finding in the accident review that we were responsible for correcting was the inadequate ascent imagery. As you may recall, on Columbia’s final launch one ground tracking camera was inoperable, another was out of focus, and the sheer number of assets documenting the critical portion of ascent couldn’t guarantee the full suite of images necessary to help resolve issues. As a result, we undertook a complete review of the ‘imagery system’ composed of tracking video cameras, still photography, high-speed engineering film assets, and the Operational Television System (pad cameras). We needed to be sure we had enough visual documentation to address issues, and to have confidence on launch day that the assets were working and could ‘see’ the vehicle. The Columbia Accident Investigation Board (CAIB) even recommended we have Launch Commit Criteria (LCC) for the system. More on that in a moment.

In addition to improving the visible launch documentation, we needed some sort of long-range tracking system that could detect issues long after ground-based cameras effectively lost sight of the vehicle. That would later become the C-band radar system. Likewise, on-orbit imagery needed to be understood and policies firmed up to enlist help from the intelligence community if needed.

This 50-ft. C-band radar dish was installed near Haulover Canal north of the KSC launch complex, as one of three radar dishes used in the new Debris Radar System. The other two were on ships. (NASA photo)

For the sake of brevity: the final ground-based system we installed guaranteed adequate views at least through SRB separation, from three independent positions, and from both north and south of the pad. We needed close-in views, mid-length views (2-5 miles), and longer-range views from 10 miles or beyond. No exact distance requirement was set, just that we had these three ‘zones’ covered. Obviously, siting the individual assets would be case-dependent. At least two cameras at each location added to the certainty of coverage. The status of each camera would be reported to the responsible system engineer on the launch team and relayed to us. The cameras would be committed for launch during the hold at T-9 minutes.

What about the CAIB launch commit criteria requirement? What about clouds obstructing one or more views? What about night launches? Good questions.

The CAIB did not specify what type of LCC they wanted, although in informal talks they were going after specific camera views and operability. Given the uncertainty of guaranteeing views, I opted to enact an LCC based solely on the system operating properly. The issue of adequate views (cloud coverage, one or more specific cameras being down, etc.) was left to judgment on launch day. That decision would be made jointly by me (as Launch Director) and the Mission Management Team (MMT) chairperson. The CAIB accepted the idea, so we pressed on with buying and installing an elaborate collection of video and still cameras located north and south of the pad. We also installed a control system for the cameras close to or at the pad. It was that control system that had the LCC. On launch day, the pre-launch MMT chair and I would get information on the views we would get during ascent and would decide if we’d launch with anything less than the full complement.

We had a requirement to launch during the light of day for the Return to Flight mission. That mandate remained in place until we had confidence the foam loss issue was resolved, AND that the radar system could detect debris issues regardless of daylight. We relaxed the lighted-launch requirement starting with STS-116 in December 2006, the first night launch of a Shuttle since the Columbia accident.

The system proved to be a great addition to the safety of the astronauts and the vehicle. Never again would the vehicle be hidden from view during ascent. We had enough cameras to make up for one or two not working as designed, and we had all angles covered. Ground-based imagery never caused a scrub and always provided clear views of the vehicle, and plenty of them.

The Stafford-Covey Return-to-Flight Task Group

International treaty required the United States to complete the core assembly of the International Space Station, up through the installation of Node 2 (later called the Harmony module) as soon as possible. NASA had previously committed to the US Congress that Node 2—onto which the European Space Agency’s Columbus module and the Japanese Kibo module would be berthed—would be launched by February 2004.

While meeting that date was clearly impossible after the Columbia accident, NASA was still compelled to complete its share of the work on the ISS as soon as possible. There were still many flights needed to complete the ISS’s central truss and expand its solar power system before Node 2 could be installed. None of that work was possible without the shuttle. The modules were already built, but there was no other way to get them into space and support the spacewalks necessary to install them. NASA therefore had to get the shuttle flying again.

In May 2003, three months after the accident and before the Columbia Accident Investigation Board (CAIB) had completed its investigation, NASA expected to resume shuttle operations by the end of 2003 or early 2004. At the same time, NASA wanted to be sure that it was not letting schedule and political pressure force the agency into taking undue risks.

In early May, NASA Deputy Administrator Fred Gregory announced that former astronaut Lt. General Thomas Stafford had been asked to head a group to provide an independent assessment of NASA’s return-to-flight plans. On May 22, 2003, NASA named former shuttle astronaut Richard Covey to report to Stafford and lead a working group to oversee and test NASA’s compliance with the CAIB’s findings and recommendations. Members of the panel included former Secretary of the Navy Richard Danzig, Apollo 8 astronaut Bill Anders (who was also the retired CEO of General Dynamics), and former NASA Launch Director Bob Sieck, among a host of other government and industry executives and technical experts.

NASA Administrator Sean O’Keefe said that NASA would only decide that it was safe to fly the shuttle again when the Administrator had the Stafford-Covey Task Group’s independent confirmation that NASA had fully complied with the CAIB’s recommendations.

NASA’s Joy Huff shows a space shuttle leading edge subsystems panel to members of the Stafford-Covey Task Group in August 2003. From left: Dr. Amy Donahue, David Lengyel, Dr. Katherine Clark, Richard Covey, and William Wegner. (NASA photo)

The Task Group went into full operation once the CAIB’s report was issued in August 2003. The CAIB made 15 specific recommendations that NASA needed to address before the shuttle could return to flight. Many of those recommendations required extensive changes to hardware, procedures, and management practices.

NASA’s hopes of flying again in 2003 or 2004 were quickly overtaken by the realization that there was a long and difficult road ahead. By December 2003, the planned launch date had moved to September 2004. However, an interim report by the Stafford-Covey Task Group that month said that “progress on the many recommendations is uneven” and that it was too soon to say whether that new launch date was possible. The Task Group’s interim report also chided NASA for not being timely in responding to some requests for information.

It was not comfortable information for O’Keefe to hear. However, it meant that the Task Group was doing its job of being “an umpire calling balls and strikes in a zone defined by the CAIB recommendations.”

The Task Group issued additional interim reports in April 2004 and January 2005, noting progress as well as areas that still required attention.

By June 2005, NASA had closed out all but three of the CAIB’s recommendations. The Task Group believed that the three remaining recommendations were so challenging that NASA could not comply completely with the intent of the CAIB. For example, the most contentious open item was a vaguely worded recommendation that NASA have the ability to repair the “widest possible range of damage” when the shuttle was on orbit.

In July 2005, the Task Group was satisfied that NASA had done everything in its power to make the shuttle as safe as possible to fly again, and they told the Administrator that NASA had met the intent of the CAIB’s requirements for returning to flight. The Task Group’s final report made it clear, however, that it was up to the NASA Administrator and his staff—not the CAIB or the Task Group—to determine if the remaining risk was low enough to allow the shuttle to fly.

Shuttle Discovery launched on the STS-114 mission on July 26, 2005. Although the external tank unexpectedly, and alarmingly, shed foam again, the safety inspection and repair techniques that NASA developed in the wake of the CAIB report ensured that the crew was able to complete its mission and return safely to Earth.

Returning to a New Normal

As the work in the reconstruction hangar wound down and people gradually returned to their pre-accident jobs, we found ourselves being reintegrated into a sort of ‘new normal’.

The atmosphere was different, the work itself was different, and the Shuttle was likely on borrowed time. Combine this new normal with the still-present emotional response to the Columbia accident, and you get a workforce with more questions than we could answer, more concern for their futures than confidence—people more in need of direction than ever.

Those of us in leadership and management positions had lots to do dealing with the ongoing CAIB investigation. We were concerned about what it was going to take to get us ready to fly again, debating changes to the External Tank, Orbiter, and other systems. But by far, the most important thing we had to do was to lay out the future for the workforce. The difficulty was that the future was anything but clear for months to come.

Many months.

We needed to stay together as a team despite having no firm game plan. And while everyone understood the uncertainty, it was still an extremely unusual feeling. It would clear up after a couple more months. We would fly again to fulfill international agreements and finish the International Space Station (ISS). But when would we fly again? Would layoffs be coming in the interim? And then once we got back in business, how long would the Shuttle continue in operation? We had originally envisioned flying until 2020, but that was likely to be cut short once ISS assembly was completed.

Open and honest communication throughout all organizations and at all levels became even more important than usual. While we were short on answers, we acknowledged it—and the folks appreciated the candor.

Personally, I thought it was very important to begin to look forward as soon as practical. Not as soon as possible, but as soon as it made sense to do so. In May 2003, I asked a few close team members what they thought of getting back into launch countdown simulations soon. The responses were split about 50-50. I really wanted to do it to accomplish two main objectives. First, we needed to maintain our proficiency for the inevitable return to flight. Second, it would demonstrate to the launch team and to the rest of the processing team that we really were going to fly again. People knew when the team went into training for the day. It was obvious.

So I asked the simulation team to develop a series of training sessions to begin as soon as they could. And on June 1, exactly four months after the accident, the Shuttle Launch Team was back together, doing what we did best.

The feeling in the Firing Room that day was unusual to be sure. It was a mix of somberness and joy. Reflection and anticipation. But it felt right, too. The “rust” was virtually non-existent, and the team performed exceptionally well.

Firing Room 4 launch console, with an open countdown procedure manual from the STS-135 mission. (Photo by Jonathan Ward)

It turned out to be exactly the right thing to do and at the right time. We held sims approximately every six weeks thereafter.


As the return to flight plan firmed up, numerous other training sessions were held—Mission Management Team sims, NASA HQ contingency sims, launch sims, landing sims, etc. Everyone got to participate, and rightfully so—because we were going to fly the Shuttle again.