Gartner Inc.

March 27, 2023 | Press release

Autonetics - The Approach to Intelligent Automation Design (Part 2: Learning from Tragedy)

"Automation surprises occur when operators of sophisticated automation, such as pilots of aircraft, hold a mental model of the behavior of the automation that does not reflect the actual behavior of the automation. This leads to increased workload, and reduced efficiency and safety." Formal Method for Identifying Two Types of Automation Surprises (Sherry, et., al.)

In Part 1 of this series, I took a bit of a history tour to discuss the Cybernetics contribution to automation and computer science in general. In Part 2, I look at how we can and should learn from automation going "sour" in real life.

On May 31, 2009, at approximately 7:30 PM local time, Air France Flight 447 took off from Rio de Janeiro bound for Paris. Several hours later the aircraft flew into an area of severe storms. About four minutes after entering the turbulent environment, the autopilot (and auto-thrust) disengaged, which handed full control of the airplane to the pilots (with the flight controls degrading to what is called alternate law). In the next four minutes, more than 70 cockpit alarms would sound before the aircraft literally stalled into the ocean with the loss of all 228 aboard: 216 passengers and 12 crew members.

The subsequent investigation(s) found that the Pitot tubes had frozen internally due to a buildup of ice crystals (a known possibility, but one that had not been communicated to the crew), causing the airspeed indications to become erroneous, which led the computer-controlled flight system to hand off control to the pilots. From the analysis of the recovered flight data recorder, the information presented to the pilots as the transfer of control occurred is believed to be that shown in Figure 1 below.

Figure 1. Electronic Centralized Aircraft Monitor (ECAM) messages.
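
As an aside for the software-minded reader: the hand-off itself follows a common pattern in redundant systems. The sketch below is hypothetical Python, not Airbus's actual logic (the tolerance, values, and function names are invented), showing how a voting scheme over redundant airspeed sources can reject all of its inputs and abruptly return control to the human, with little explanation of why.

```python
# Hypothetical sketch, not Airbus's actual logic: a voting scheme over
# redundant airspeed sources that rejects all of them when they disagree,
# forcing an abrupt hand-off to the human operator.

DISAGREEMENT_KTS = 30.0  # invented tolerance between redundant sources


def vote_airspeed(sources: list[float]) -> float | None:
    """Return a consensus airspeed, or None if the sources disagree."""
    if max(sources) - min(sources) > DISAGREEMENT_KTS:
        return None  # no trustworthy value, so automation cannot continue
    return sum(sources) / len(sources)


def autopilot_step(sources: list[float]) -> str:
    speed = vote_airspeed(sources)
    if speed is None:
        # The "automation surprise" moment: control passes to the pilots
        # with little explanation of why the data went bad.
        return "AUTOPILOT OFF: manual control (alternate law)"
    return f"holding speed {speed:.0f} kt"


# Three Pitot-derived speeds; two have iced over and read low.
print(autopilot_step([275.0, 82.0, 80.0]))
```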

This ECAM information was presented within the context of the cockpit as indicated below (see Figure 2).

Figure 2. Airbus A-330 203 cockpit and ECAM message display positioning.

Keep in mind, as you view this from the perspective of the pilots, that the aircraft was in a pitch-black environment, likely punctuated by lightning, with turbulence exacerbated by the pilots' own inputs causing the aircraft to roll. In addition, the Flight Director displays (the consoles with the crossbars on either side of the cockpit that provide feedback on the airplane's path) would flicker on and off, and this influenced the actions of both pilots that would ultimately cause the airplane to stall. Now, let's run through a "checklist" of potential issues that the pilots had to deal with, from their perspective, as they attempted to control the airplane:

System information easily understood? No (Lacking context and not always available)
System warnings of a potential condition occurring that would necessitate transfer of control? No
System explanation for automation failure and hand-off? No
Pilot mental state likely ready to accept transfer of control? No (Likely because of the lack of warning)
System alarming designed to reduce cognitive load? No (More than 70 stall and other alarms were issued, and these continued until near the very end of the flight)
System explanation for alarm shut-off? No (Note: the stall alarms stopped when the computers judged their own airspeed data to be in error, because it was not thought that an airplane could be flying that slowly; see the sketch after this list)
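
That last item deserves a closer look. The following is a hypothetical sketch (the thresholds and function names are invented; this is not the actual avionics code) of how a well-intentioned validity check can silence a stall warning precisely when the stall is deepest:

```python
# Hypothetical sketch (invented thresholds, not the actual avionics code)
# of how a validity check can silence a stall warning precisely when the
# stall is deepest.

VALID_FLOOR_KTS = 60.0  # below this, the reading is treated as impossible
STALL_AOA_DEG = 10.0    # invented stall threshold for angle of attack


def stall_warning(airspeed_kts: float, angle_of_attack_deg: float) -> bool:
    if airspeed_kts < VALID_FLOOR_KTS:
        # The reading is discarded as a sensor error, so the warning is
        # inhibited, even though a deeply stalled aircraft really can be
        # moving this slowly.
        return False
    return angle_of_attack_deg > STALL_AOA_DEG


print(stall_warning(185.0, 12.0))  # True: warning sounds
print(stall_warning(45.0, 40.0))   # False: deep stall, yet the warning stops
```

Perversely, a correct nose-down input that raised the indicated airspeed back above the validity floor would make the warning resume, suggesting to the crew that the right action was wrong.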

In the final report, one of the recommendations was that the "display logic" needed to be reviewed, especially in the context of a stall (note: while there were aural stall warnings, there were no visual indicators of a stall). That's an incredible understatement, but as we can see from the above, it was not the only problem. In essence, the pilots didn't know what data to believe, nor did they clearly understand the situation they were in. The report also suggests that the failure to respond effectively to the numerous aural warnings could have been a result of the heavy cognitive workload the pilots were experiencing (note: the report references some classic automation and human factors papers that we'll touch on again in a future blog post).
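
One classic mitigation for that kind of cognitive overload is alarm triage: deduplicate repeated messages and rank what remains, so the crew sees a short, prioritized list instead of a raw stream. A minimal sketch, with invented alarm names and priorities:

```python
# Hypothetical sketch of alarm triage: collapse repeated messages and rank
# what remains, so the crew sees a short, prioritized list instead of a raw
# stream. Alarm names and priorities here are invented.

from collections import Counter

PRIORITY = {"STALL": 0, "AUTOPILOT OFF": 1, "SPEED DISAGREE": 2}


def triage(raw_alarms: list[str], max_shown: int = 3) -> list[str]:
    counts = Counter(raw_alarms)  # collapse repeats into one entry each
    ranked = sorted(counts, key=lambda a: PRIORITY.get(a, 99))
    return [f"{alarm} (x{counts[alarm]})" for alarm in ranked[:max_shown]]


stream = ["SPEED DISAGREE"] + ["STALL"] * 54 + ["AUTOPILOT OFF"] * 2
print(triage(stream))
# ['STALL (x54)', 'AUTOPILOT OFF (x2)', 'SPEED DISAGREE (x1)']
```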

In addition, the "Crew Resource Management" or communications between the pilots was ineffective as neither knew that their actions were cancelling each other out (one was trying to cause the airplane to pitch down to gain airspeed while the other was pulling up to gain lift and the result was a cancellation of each input). This lack of knowledge of the other's actions was because the joysticks in this type of aircraft at the time were not linked so there was no physical feedback of the cancellation scenario playing out.

What seems clear is that, beyond the crew communication dysfunction, the aircraft itself failed to provide the information necessary for the pilots to understand the problem at hand and effect proper remediation. In Part 3, we'll review how the introduction of increasingly intelligent automation will create even more such challenges.