Laura Bliss is a staff writer at CityLab, covering transportation and the environment. She also authors MapLab, a biweekly newsletter about maps. Her work has appeared in the New York Times, The Atlantic, Los Angeles magazine, and beyond.
Before the fatal crash in Tempe, Uber’s self-driving test program had safety weaknesses, ex-employees say. Some weren’t avoidable, technologists believe, but some might have been.
The first time Ryan Kelley lifted his hands off the wheel of a self-driving Uber, he felt like he’d landed a role in a dress rehearsal for the future.
This was in February of 2017 in Pittsburgh, where Uber had been testing SUVs equipped with proprietary self-driving technology on public streets for about five months. Some of the vehicles picked up passengers through Uber’s regular ride-hailing app—the first time self-driving cars had been so accessible in a U.S. market.
Encouraged by his nine-year-old daughter, Kelley had left a tech support job to become an Uber “developmental vehicle operator”—a backup safety driver for the nascent robot-cars. The social promise of autonomous vehicles has always been clear: The adoption of this technology could someday prevent tens of thousands of traffic fatalities every year. “This was cutting-edge technology,” Kelley said over the phone last week. “Who doesn’t want to be part of something like that?”
Now a pedestrian has been killed by a self-driving Uber vehicle in Tempe, Arizona. Interior video footage of the crash, which occurred the night of March 18, shows that the operator of the vehicle, 44-year-old Rafaela Vasquez, was looking down with her hands away from the wheel when the car struck Elaine Herzberg, a 49-year-old woman pushing her bike across the otherwise empty seven-lane road. The car was traveling at about 40 miles per hour when it struck Herzberg; according to police reports, it made no attempts to brake before the collision.
In the wake of the fatal crash, Uber voluntarily halted all of its test programs and has been suspended from testing in Arizona by state authorities. While a formal investigation has only begun, it’s clear that the crash represents more than another pedestrian death. It’s the milestone that autonomous vehicle supporters and researchers have long dreaded, and one that highlights critical weaknesses in the development of this emerging technology.
One failure involves the machine itself. Experts in autonomous vehicle research agree that the Uber vehicle’s Light Detection and Ranging (LiDAR) sensors should have been able to see Herzberg in advance of the collision and the car should have responded accordingly.
Yet the crash raises another question that is perhaps harder to answer by any one investigation: whether humans can be expected to perform safely as backup drivers for semi-autonomous cars at this stage of testing, by Uber or any other company. Research shows that humans will always be fallible, distractable, and, perhaps, easily seduced by machines that function safely—most of the time.
Furthermore, Kelley and another former backup driver interviewed by CityLab suggested that safety drivers were rushed into vehicle testing conditions for which neither they nor the technology was fully prepared. Kelley and a Tempe-based former Uber operator (who spoke on condition of anonymity due to a non-disclosure agreement) described a work environment in which they were expected to function not unlike the machines they were minding, staying alert through long hours with little stimulus as the company advanced its goal of racking up mileage.
Both Kelley and the former driver in Tempe were dismissed from their jobs with Uber earlier this year for safety infractions: Kelley said he was let go after rolling through a stop sign while he was operating the car, which he disputes; the individual in Tempe said he was dismissed for using his phone while the vehicle was in motion.
That former driver said he had worked for Uber’s self-driving project since its launch in 2016, and expressed pride in the “thirty- to forty-thousand safe autonomous miles” he had logged. He said that he regretted no longer working there. Kelley, however, maintained that his dismissal was wrongful and expressed bitterness about it.
The two ex-operators agreed that conditions they had once worked under led to fatigue and dangerous temptations, the likes of which may have taken Vasquez’s eyes off the road and hands away from the wheel. Referring to the fatal crash, Kelley said: “We saw this coming.”
“That’s her one job”
Suspensions notwithstanding, Uber tests its developmental driverless technology on public roads in Pittsburgh, Tempe, Phoenix, San Francisco, and Toronto with about 400 backup drivers in roughly 200 vehicles. Those numbers represent a fast hiring ramp-up focused in regulation-free Arizona over the past nine months, as the company has shifted the focus of its testing.
Depending on the day, some vehicles still pick up passengers. But Uber has switched much of its autonomous fleet to simply driving in an effort to orient operators towards “accumulating miles and gathering data to help the system become more reliable,” according to a New York Times article based on leaked company documents. As of September 2017, the autonomous fleet had driven one million miles across its test cities. By December 2017, it celebrated two million miles, and “added its next million at an even faster clip, according to company documents,” the Times reported.
Uber’s self-driving technology had already been struggling before the fatal crash in Tempe, according to several published reports. An internal performance report about Uber’s driverless testing in Arizona showed that some vehicles were having trouble going more than a mile without an operator taking over the wheel or braking, as Buzzfeed’s Priya Anand reported. During the week of March 5, 2017, cars traveled an average of 0.67 miles on one loop in Tempe without human intervention, and an average of 2 miles without a “bad experience,” which is the company’s term for an incident “in which a car brakes too hard, jerks forcefully, or behaves in a way that might startle passengers,” Anand wrote.*
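To put those averages on a more familiar scale, the short Python sketch below converts "miles per event" into "events per 100 miles." It is only an illustration: the function name is made up for this example, and the inputs are the two averages reported for a single Tempe loop in a single week, not fleet-wide figures.

```python
def events_per_100_miles(miles_per_event: float) -> float:
    """Convert an average distance between events into events per 100 miles."""
    if miles_per_event <= 0:
        raise ValueError("miles_per_event must be positive")
    return 100.0 / miles_per_event


# Inputs are the reported averages for one loop in Tempe (illustrative only).
interventions = events_per_100_miles(0.67)   # operator takeovers
bad_experiences = events_per_100_miles(2.0)  # "bad experiences"

print(f"{interventions:.0f} interventions per 100 miles")
print(f"{bad_experiences:.0f} bad experiences per 100 miles")
```

On those figures, an operator on that loop would have taken over roughly 149 times per 100 miles, and the car would have produced about 50 "bad experiences" over the same distance.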
However, both former operators with whom CityLab spoke said that Vasquez, who has not been charged, appears to bear a lot of the blame for the crash. Dashboard camera footage shows that Vasquez is looking away from the road and into her lap, where her hands also appear to be. The assumption is that, like so many other distracted drivers out there, Vasquez was looking at her mobile device instead of the road.
“That was one of our biggest rules: No phone in the left seat,” said the Tempe-based former operator. (Indeed, this was the offense that led to his own dismissal.) “Her one job was to be there to monitor and ensure that the robot keeps going safely. That’s her one job that she wasn’t doing.”
According to an Uber spokesperson, safety drivers undergo a “vigorous” three-week training, testing, and certification program before they hit the road. In it, they are trained to keep hands hovering near the wheel at all times so that they can quickly take control when the car does not safely respond to dangerous road conditions or “noncompliant actors,” such as people walking in the roadway. While Uber allows operators to keep phones in their vehicles in case of emergencies, the company has a zero-tolerance policy for the use of digital devices while vehicles are in motion. The company monitors driver behavior via spot-checks of interior dash-cam footage, self-reporting, and reports from other operators.
So far, roughly a dozen operators have been fired for using cellphones on the job, an Uber spokesperson said.
Behind the wheel, a numbing testing regimen
The job of a developmental vehicle operator may sound deceptively easy—after all, the vehicle is doing most of the work. But both ex-drivers described a grueling routine behind the wheel. The full-time role requires operators to sit alert behind the wheel for 8- to 10-hour shifts, with one scheduled 30-minute lunch break. Workers are often assigned to repeat the same “loops” over the course of their shift, in order to deeply familiarize the car’s self-driving technology with portions of the city Uber has mapped.
Keeping an eye on the road for hours on end as a robot learns to drive is an exhausting job, according to the ex-employees. The solitude and monotony of the repeated routes made it difficult to maintain focus, more so than driving a long-distance truck or a taxi. “There’s no interaction with other humans,” said Kelley. “You could listen to music or sit in silence.” Between that, sitting for long stretches in a single position, and the vehicles’ frequent hard braking, the daily grind of operating the vehicles wore on his health, Kelley said.
Although operators were never given mileage goals, Kelley said that he felt pressured by some of his managers to rack up distance and forgo breaks. The Uber spokesperson said that drivers are encouraged and free to take 10- to 15-minute rests as often as needed to avoid fatigue, with no consequences.
Still, as the vehicle technology improved, the temptation to let the mind wander (or to pick up a phone) while on the road was that much greater, the ex-employees said. “An hour or two hours can go by without ever once having to take over the wheel,” said the anonymous former safety driver, who said he’d worked on the same loop in Tempe where the fatal crash occurred. “You think, maybe I could take my eyes off the road, because, if anything happens, the car is going to react.”
There were times when Kelley felt he didn’t trust himself in the car. Hours into his night shift, he said, he could feel his attention waning, his impulses relaxed by the most-of-the-time reliability of the self-driving technology. That bred a false sense of security.
The other operator agreed. “Uber is essentially asking this operator to do what a robot would do,” he said. “A robot can run loops and not get fatigued. But humans don’t do that.”
What’s more, many backup drivers work alone. When the company began testing in Pittsburgh, Uber staffed its autonomous vehicles with two operators at all times. The second person sat shotgun with a laptop to make notes and submit data for Uber engineers. But towards the end of 2017, the company moved to having many vehicle operators in the “left seat” only, working alone.
Both of the former employees CityLab spoke with said that eliminating the second human operator was premature because that person played an indirect role in maintaining safety. Not only did co-pilots provide a source of stimulation for “left seat” operators, “they would be watching the sidewalks to see if anybody is going to dart out and into traffic,” Kelley said.
An Uber spokesperson said that the second operator had been there strictly in a note-taking capacity and that the reduction was a carefully considered step in the technology’s progression towards higher levels of autonomy. She stated:
We decided to make this transition because after testing, we felt we could accomplish the task of the second person—annotating each intervention with information about what was happening around the car—by looking at our logs after the vehicle had returned to base, rather than in real time.
It was not and never has been the role of the passenger-seat operator to maintain the vehicle’s safety. That is and always has been the clear and primary responsibility of the operator behind the wheel.
Even so, Kelley said, the co-pilot had a safety benefit—an extra chance of staying alert, and better odds of spotting road dangers. From his perspective, even as the technology improved, “the car still wasn’t ready for that second set of eyes to be removed.”
When the “handoff” fails
Both former operators also expressed surprise that Uber’s self-driving technology had failed, too. Something went deeply wrong here, they said—the cars were often super-sensitive to obstructions, even non-existent ones. “Sometimes the car would brake because of steam coming up from a pothole in the ground,” the anonymous operator said.
Self-driving experts are equally perplexed. “The LiDAR should have seen the pedestrian at quite some distance,” Raj Rajkumar, who leads autonomous vehicle research at Carnegie Mellon University, wrote in an email. “All in all, this indicates pretty serious technical issues at the core of the Uber system.” Velodyne, the company that designed the LiDAR sensors used by Uber’s self-driving fleet, told the BBC it was “baffled” by the crash.
Flavio Beltran, a former Uber developmental vehicle operator, filmed the stretch of road in Tempe where the crash occurred under normal lighting conditions. The high contrast on Uber’s crash footage seems to make the road look much darker than in reality, he pointed out on Facebook, where he posted the video. The road was fully lit. Having watched the video, Rajkumar noted that the crash occurred under conditions that should have been navigable to a robot car.
Still, it’s understood that Uber’s technology is not currently capable of full autonomy—indeed, that is why backup drivers are there. Eventually, “our objective is to be able to operate [these vehicles] without anyone behind the wheel in select cities and environments,” Jeff Miller, Uber's head of business development and strategic initiatives, told Automotive News Europe in November. That stage of technology is known in the industry as Level 4 autonomy. But what Uber is testing right now is closer to Level 3, Miller said, “because there is a person behind the wheel who is ready to take over if the computer gets stumped.”
Yet implicit in such a relationship between car and human is what technologists call the “handoff problem,” or the flawed expectation that driving responsibility can be safely passed from machines to humans in a split-second. Studies have shown that human drivers rapidly lose their ability to focus on the road when a machine is doing most of the work. As Missy Cummings, the director of Duke University’s Humans and Autonomy Laboratory, told Slate, “humans are terrible babysitters of automation.”
That has led some academics to suggest that the handoff poses an unsolvable problem to developers aiming for Level 3 cars, even though they are simpler to build from a pure technology standpoint. “The notion that a human can be a reliable backup is a fallacy,” said John Leonard, a mechanical engineering professor at the Massachusetts Institute of Technology who has researched the near-unfathomable complexities of human driving, in the New York Times last year.
Others believe that safety drivers play an important role in teaching and supporting robots when their performances lag. But that job gets harder as computer pupils improve. Especially in high-speed conditions, “it’s hard for people to maintain vigilance and even some of the muscle memory that goes along with controlling a car,” said J. Christian Gerdes, a Stanford University professor of mechanical engineering who has tested and researched autonomous vehicle technology since the 1990s. As vehicles’ capabilities improve, it’s a bit like humans are expected to do more and less at the same time, because they’re being asked to be observant of any failure at any time.
In some ways, older systems of autonomy were safer, Gerdes said, because they weren’t as good. One early form of adaptive cruise control could only detect objects in motion, he recalled, which meant backup drivers were forced to keep their eyes peeled for whenever vehicles ahead came to a stop. “It was always clear when we needed to [be ready to] intervene, because the system had issues frequently enough,” he said.
There may be ways for technology itself to mitigate the handoff problem and intervene to keep humans alert in Level 3 systems. Rajkumar suggested that Uber could install in-car cameras that ensure operator eyeballs are on the road in real time. Gerdes said that, in general, in-car communication platforms could help signal to backup drivers when they need to be on extra-high alert.
An extra person might have also helped prevent the fatal Tempe crash, Rajkumar said: By his calculations of speed and distance, Herzberg’s death might have been avoided if the backup driver had seen her even a half second earlier. “No second person to provide a second set of eyes—that’s a problem I can see,” he said. Gerdes noted that part of his standard safety protocol for testing autonomous vehicles (which he does on test tracks, not public roads) is to always have a second person in the car.
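Rajkumar’s own calculation is not reproduced here. As a rough illustration of the scale involved, the Python sketch below works out how far a car covers in half a second, assuming only that the roughly 40 mph speed reported earlier held constant. The function name and the constant-speed assumption are illustrative and are not drawn from any investigation.

```python
# Exact conversion: 1609.344 metres per mile, 3600 seconds per hour.
MPH_TO_MPS = 1609.344 / 3600


def distance_covered_m(speed_mph: float, seconds: float) -> float:
    """Distance in metres travelled at a constant speed over `seconds`."""
    return speed_mph * MPH_TO_MPS * seconds


half_second_m = distance_covered_m(40, 0.5)
half_second_ft = half_second_m / 0.3048  # metres to feet

print(f"{half_second_m:.1f} m ({half_second_ft:.0f} ft) in half a second")
```

At that speed, half a second corresponds to about 8.9 meters, or roughly 29 feet, of roadway.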
An Uber spokesperson said that the company is exploring possible changes to testing operations but cannot discuss them until the cause of the Tempe crash is established.
Without a doubt, Uber and other companies find themselves in a challenging position, said Rajkumar. Autonomous vehicle developers need real-world mileage and data in order to remove humans from the driver’s seat, eventually. Without people behind the wheel of developmental vehicles, the cars would clearly be more dangerous; on the other hand, those operators, being people, will always be distracted. If society wants the promised safety benefits of fully autonomous vehicles, “the only option we have is to work through the transition, slowly and steadily, until the technology is reliable,” he said.
“Slow and steady” have not historically been traits commonly associated with Uber. The company’s hard-charging, disruptive ethos has defined the growth of its original ride-hailing services and now its race to higher levels of autonomy. Along the way, Uber has survived a series of blow-ups involving, among others, regulators, law enforcement, passenger privacy, and its competitors; last month, the company settled a high-profile lawsuit over stolen intellectual property by paying $245 million in stock to arch-rival Waymo, Alphabet’s self-driving technology subsidiary.
Speaking of which, Waymo has chosen not to develop Level 3 or semiautonomous cars at all—that is, cars that are capable of simple driving trips, such as highway cruising, but that still need a human driver to watch the road and sometimes take the wheel (think: Tesla’s Autopilot). It has been testing Level 4 autonomous cars with no backup drivers since November, and has clocked at least four million autonomous miles over at least eight years of testing, with far fewer interventions per mile than Uber.
The Uber spokesperson said that “[miles per intervention] is not a measure of the overall safety of our testing operations and shouldn’t be interpreted as such.” In a statement provided to CityLab, she added,
We believe that technology has the power to make transportation safer than ever before and recognize our responsibility to contribute to safety in our communities. So as we develop self-driving technology, safety is our primary concern every step of the way. We’re heartbroken by what happened this week, and our cars remain grounded. We continue to assist investigators in any way we can.
Yet if Uber has appeared impatient in its pursuit of self-driving development, that may be explained by the fact that the world’s largest ride-hailing company, which launched in 2011 and has been recently valued at $72 billion, is still not profitable. Eliminating the considerable overhead represented by the roughly 600,000 drivers it contracts in the U.S. alone may be the company’s best path to profitability: It could help Uber reduce costs enough to stop subsidizing the cheap, on-demand rides that helped build up its enormous market share. Uber’s former CEO, Travis Kalanick, once called the autonomous vehicle project “existential” to the company’s survival. Dara Khosrowshahi, Uber’s current chief executive, has reportedly become similarly convinced. In November, the company placed an order for 24,000 Volvo SUVs to speed up testing and readiness of autonomous cars for its broader ride-hailing market.
There is a difference, in other words, between Uber and many of its major competitors in the self-driving race: Firms like Waymo, GM, and Ford are either carmakers selling private vehicles or software companies betting that this technology will help them secure space in the automated mobility market of the future. Uber, on the other hand, may need self-driving to happen as soon as possible, at scale, to bring in revenue.
Apart from illuminating the profound difficulties of relying on human backup drivers, the killing in Tempe may also point to the high risks of any company aggressively pushing self-driving technology to market. It’s hard to see how any stage of testing or automation that requires even occasional human assistance won’t be vulnerable to the frailties of attention that experts and former drivers described. To Kelley, Uber could have done more to mitigate those weaknesses.
“I understand that mileage is important to shareholders,” said Kelley, who is now working part-time as a server. “But I think they may just be pushing too hard, and too fast.”
*CORRECTION: A previous version of this article stated that Buzzfeed first reported this information. In fact, Johana Bhuiyan of Recode did. The report also reflected data from March 2017, not March 2018.