The bungled design of BART’s automatic train control system is now being taught as part of an Engineering Ethics class at Cornell University:
The 1970s Bay Area Rapid Transit System (BART) whistle-blower case is widely taught as an engineering ethics case, and was originally taught as such within the Engineering Communications Program. In this case, three BART engineers found technical problems in the automatic train control (ATC) of the new commuter rail system–problems that, they concluded, could risk public safety. When the engineers brought the ATC problems to the attention of their managers, no action (apparently) was taken; when they persisted by bringing the problems to the attention of BART’s Board of Directors, which led to the release of information in the local news media, the engineers were fired. The case led to a lawsuit in which the IEEE filed an amicus curiae brief on behalf of the three engineers and to a series of public hearings and reports.
The story of the BART whistleblowers is one I had not heard before. According to class materials, it went down like this:
The Automatic Train-Control (ATC) system was an innovative method for controlling train speed and access to stations. In most urban mass transit systems, this function is performed by human drivers reading trackside signals and receiving instructions via radio from dispatchers. Instead, BART relied on a series of onboard sensors that determined the train’s position and the location of other trains. Speeds on the track were automatically maintained by monitoring the location of the train and detecting allowed speed information.
One of the unique and problematic features of the system was that there were no fail-safe methods of train control [Friedlander, 1972]. Rather, all control was based on redundancy. This distinction is very important. “Fail safe” implies that if there is a failure, the system will revert to a safe state. In the case of BART, this would mean that a failure would cause the trains to stop. Redundancy, on the other hand, relies on switching failed components or systems to backups in order to keep the trains running.
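To make the fail-safe versus redundancy distinction concrete, here is a minimal sketch. It is an illustration only, not anything taken from the class materials or from BART's actual ATC; the names `Command`, `fail_safe_command`, and `redundant_command` are hypothetical, and each control channel is reduced to a single "is it working" flag.

```python
from enum import Enum


class Command(Enum):
    STOP = "stop"
    RUN = "run"


def fail_safe_command(primary_ok: bool) -> Command:
    """Fail-safe: any detected failure reverts to the safe state (stop)."""
    return Command.RUN if primary_ok else Command.STOP


def redundant_command(primary_ok: bool, backup_ok: bool) -> Command:
    """Redundancy: a failed primary is swapped for a backup so trains keep running."""
    if primary_ok or backup_ok:
        return Command.RUN
    # Assumption: with no working channel left, this sketch stops the train.
    return Command.STOP


if __name__ == "__main__":
    # Same fault (primary channel down), two different design responses.
    print(fail_safe_command(primary_ok=False))
    print(redundant_command(primary_ok=False, backup_ok=True))
```

With the primary channel down, the fail-safe version commands a stop while the redundant version keeps the train running on the backup. The redundant design is therefore only as safe as its ability to notice that a channel has failed, which this sketch simply takes for granted.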
There are two distinct phases of this type of engineering project, construction and operation, each requiring different skills. For this reason, early on, BART decided to keep its own staff relatively small and subcontract most of the design and construction work. This way, there wouldn’t be the need to lay off hundreds of workers during the transition from construction to operation [Anderson, 1980]. This system also encouraged the engineers who worked for BART not only to oversee the design and construction of the system, but also to learn the skills required to run and manage this complex transportation system. Contracts for design and construction of the railroad infrastructure were awarded to a consortium of large engineering firms known as Parsons Brinckerhoff, Tudor, and Bechtel (PBTB). PBTB began construction on the system in January of 1967. The transbay tube was started in November of that year. Also in 1967, a contract was awarded to Westinghouse to design and build the ATC. In 1969, Rohr Industries was awarded a contract to supply 250 railroad cars.
A little bit should be said about the management structure at BART. By design, BART was organized with a very open management structure. Employees were given great freedom to define what their jobs entailed and to work independently, and were encouraged to take any concerns that they had to management. Unfortunately, there was also a very diffuse and unclear chain of command that made it difficult for employees to take their concerns to the right person [Anderson, 1980].
The key players in this case were three BART engineers working on various aspects of the ATC: Roger Hjortsvang, Robert Bruder, and Max Blankenzee. The first to be employed by BART was Hjortsvang. As part of his duties for BART, Hjortsvang spent 10 months in 1969–70 in Pittsburgh at the Westinghouse plant working with the engineers who were designing the ATC. During this time, he became concerned about the lack of testing of some of the components of the ATC and also about the lack of oversight of Westinghouse by BART. After returning to San Francisco, Hjortsvang began raising some of these concerns with his management.
Soon after Hjortsvang returned from Pittsburgh, Bruder joined BART, working in a different group than Hjortsvang. He also became concerned about the Westinghouse test procedures and about the testing schedule, but was unable to get his concerns addressed by BART management. Both Hjortsvang and Bruder were told that BART management was satisfied with the test procedures Westinghouse was employing. Management felt that Westinghouse had been awarded the contract because of its experience and engineering skills and should be trusted to deliver what was promised.
Around this time, both engineers also became concerned about the documentation that Westinghouse was providing. Would the documentation be sufficient for BART engineers to understand how the system worked? Would they be able to repair it or modify it once the system was delivered and Westinghouse was out of the picture? Being unable to get satisfaction, Hjortsvang and Bruder dropped the matter. It is important to note that the concerns here were not just about testing, per se, but also about the effect that untested components might have on the safety and reliability of BART.
Blankenzee then joined BART and worked at the same location as Hjortsvang. Before joining BART, Blankenzee had worked for Westinghouse on the BART project, and so he knew about how Westinghouse was approaching its work. He too was concerned about the testing and documentation of the ATC. When Blankenzee joined BART, it rekindled Hjortsvang’s and Bruder’s interest in these problems. To attempt to resolve these concerns, Hjortsvang wrote an unsigned memo in November of 1971 to several levels of BART management that summarized the problems he perceived. Distribution of an anonymous memo was, of course, viewed with suspicion by management.
In January 1972, the three engineers contacted members of the BART board of directors, indicating that their concerns were not being taken seriously by lower management. This action was in direct conflict with the general manager of BART, whose policy was to allow only himself and a few others to deal directly with the board [Anderson, 1980]. As defined previously in this chapter, this action by the engineers constituted “internal whistleblowing.” The engineers also consulted with an outside engineering consultant, Edward Burfine, who evaluated the ATC on his own and came to conclusions similar to those of the three engineers.
One of the members of the board of directors, Dan Helix, spoke with the engineers and appeared to take them seriously. Helix took the engineers’ memos and the report of the consultant and distributed them to other members of the board. Unfortunately, he also released them to a local newspaper, a surprising act of external whistleblowing by a member of the board of directors. Naturally, BART management was upset by this action and tried to locate the source of this information. The three engineers initially lied about their involvement. They later agreed to take their concerns directly to the board, thus revealing themselves as the source of the leaks. The board was skeptical of the importance of their concerns. Once the matter was in the open, the engineers’ positions within BART became tenuous.
On March 2 and 3, 1972, all three engineers were offered the choice of resignation or firing. They all refused to resign and were dismissed on the grounds of insubordination, lying to their superiors (they had denied being the source of the leaks), and failing to follow organizational procedures. They all suffered as a result of their dismissal. None was able to find work for a number of months, and all suffered financial and emotional problems as a result. They sued BART for $875,000, but were forced to settle out of court, since it was likely that their lying to superiors would be very detrimental to the case. Each received just $25,000 [Anderson, 1980].
As the legal proceedings were taking place, the IEEE attempted to assist the three engineers by filing an amicus curiae (friend of the court) brief in their support. The IEEE asserted that each of the engineers had a professional duty to keep the safety of the public paramount and that their actions were therefore justified. Based on the IEEE code of ethics, the brief stated that engineers must “notify the proper authority of any observed conditions which endanger public safety and health.” The brief interpreted this statement to mean that in the case of public employment, the proper authority is the public itself [Anderson, 1980]. This was perhaps the first time that a national engineering professional society had intervened in a legal proceeding on behalf of engineers who had apparently been fulfilling their duties according to a professional code of ethics.
Safety concerns continued to mount as BART was put into operation. For example, on October 2, 1972, less than a month after BART was put into revenue service, a BART train overshot the station at Fremont, California and crashed into a sand embankment. There were no fatalities, but five persons were injured. The accident was attributed to a malfunction of a crystal oscillator, part of the ATC, which controlled the speed commands for the train. Subsequent to this accident, there were several investigations and reports on the operation of BART. These revealed that there had been other problems and malfunctions in the system. Trains had often been allowed too close to each other; sometimes a track was indicated to be occupied when it wasn’t and was indicated not to be occupied when it was. The safety concerns of the three engineers seemed to be borne out by the early operation of the system [Friedlander, 1972, 1973].
Ultimately, the ATC was improved and the bugs worked out. In the years since, BART has accumulated an excellent safety record and has served as the model for other high-tech mass transit systems around the country.
I have some doubts as to whether the problems have ever been completely fixed. On two occasions I’ve been on BART trains which overshot the station, though not in dramatic fashion like the “Fremont Flyer”.
There are a number of broken links from that Cornell class page (132.236.67.210/engrc350/… , instruct1.cit.cornell.edu/courses/engrc335/…), but I found them at
http://eng-web.engineering.cornell.edu/EngrWords/genres/analyses.cfm
The really horrible thing is that there ARE mostly-failsafe methods of train control, contrary to the statement of the Friedlander paper. There were even in the 1920s, when cab signalling and the PRR pulse-code system were introduced.
Nothing is 100% failsafe; there are always extreme corner cases where switches get stuck “on” when they’d normally be stuck off; but we’re not talking corner cases here. BART made *fundamental* errors in its original ATC design, errors which have not been made in other train control systems. BART has since added a bunch of kludges to try to make the system failsafe. Washington Metro, which copied BART’s system, never added the kludges, and the fairly recent Metrorail crash was the result.
In fact, designing ATC to be failsafe is perfectly possible. The first principle is that the trains only move when they are receiving continuous movement authority through the signalling system (so, most failures in the signalling system cause the trains to stop) — but judging by the cause of the Metrorail crash, BART and Washington Metro *violated this rule* in their original design! Which is appalling.
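As a rough illustration of that first principle, here is a minimal sketch of an onboard controller that only permits movement while it keeps receiving fresh authority. This is an assumption-laden toy, not how BART, Washington Metro, or any real cab-signalling system is implemented; the class `OnboardController`, its methods, and the `timeout_s` value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OnboardController:
    # Hypothetical: how long the train may go without hearing a fresh authority.
    timeout_s: float = 1.0
    last_authority_time: Optional[float] = None
    authorized_speed: float = 0.0

    def receive_authority(self, now: float, speed_limit: float) -> None:
        """Record a movement authority received from the signalling system."""
        self.last_authority_time = now
        self.authorized_speed = speed_limit

    def permitted_speed(self, now: float) -> float:
        """Return the speed the train may run at; 0.0 means stop."""
        if self.last_authority_time is None:
            return 0.0  # never authorized: stay stopped
        if now - self.last_authority_time > self.timeout_s:
            return 0.0  # signal lost or stale: default to stop
        return self.authorized_speed


if __name__ == "__main__":
    train = OnboardController(timeout_s=1.0)
    train.receive_authority(now=0.0, speed_limit=50.0)
    print(train.permitted_speed(now=0.5))  # authority still fresh
    print(train.permitted_speed(now=2.0))  # authority has gone stale
```

The point of the design is that silence is treated as "stop": if the signalling system dies, the track circuit breaks, or the receiver fails, the train loses its authority and halts rather than carrying on with the last command it happened to hear.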
I think I am giving the Friedlander paper insufficient credit, not having read it. The words quoted are ambiguous and probably mean simply that BART was not failsafe and that all other ATC systems were failsafe; presumably Friedlander cites the failsafe ATC systems in existence.
Photos of the so-called Fremont Flyer: http://www.flickr.com/photos/walkingsf/8143196966/.