Given the right prompt, Google’s flagship GenAI model, Gemini, will generate deceptive content about the U.S. presidential election. Ask it about a future Super Bowl game and it will invent a play-by-play. Ask about the Titan submersible implosion and it will serve up misinformation, complete with convincing-looking but false citations.
It’s a bad look for Google, and it has angered policymakers, who’ve expressed concern about how easily GenAI tools can be used to spread misinformation and mislead.
In response, Google, which cut thousands of jobs last quarter, is investing in AI safety. At least, that’s the official narrative.
Google DeepMind, the AI R&D division behind Gemini and many of Google’s recent GenAI projects, announced this morning the creation of a new organization, AI Safety and Alignment, which combines existing AI safety teams with new, specialized groups of GenAI researchers and engineers.
Beyond the job listings DeepMind has posted, Google wouldn’t disclose how many people the new organization will hire. But it did indicate that AI Safety and Alignment will include a new team focused on safety around artificial general intelligence (AGI), or hypothetical systems that can perform any task a human can.
The new team within AI Safety and Alignment will work alongside DeepMind’s London-based Scalable Alignment team, which is also exploring ways to control superintelligent AI.
Why have two teams working on the same problem? It’s a valid question, and one that requires guesswork given Google’s reluctance to elaborate. It seems notable, however, that the new team is based in the U.S. rather than abroad, close to Google HQ, at a time when the company is moving aggressively to keep pace with AI rivals while trying to project a responsible, cautious approach to AI.
Other teams in the AI Safety and Alignment organization develop and build concrete safeguards into Google’s current and in-development Gemini models. Safety is a broad remit. The organization’s near-term goals include preventing bad medical advice, ensuring child safety, and “preventing the amplification of bias and other injustices.”
Anca Dragan, a former Waymo staff research scientist and professor of computer science at UC Berkeley, will be the team’s leader.
The AI Safety and Alignment organization’s work aims to enable models to better and more robustly understand human preferences and values, “to know what they don’t know, to work with people to understand their needs and to elicit informed oversight, to be more robust against adversarial attacks, and to account for the plurality and dynamic nature of human values and viewpoints,” Dragan told Eltrys via email.
Given Waymo’s rocky driving record of late, Dragan’s consulting work on AI safety systems there may raise eyebrows.
So might her choice to divide her time between DeepMind and UC Berkeley, where she runs a lab focused on algorithms for human-AI and human-robot interaction. One might assume that issues as grave as AGI safety, and the longer-term risks the AI Safety and Alignment group plans to address, such as preventing AI from “aiding terrorism” and “destabilizing society,” demand a director’s full-time attention.
However, Dragan maintains that her UC Berkeley lab’s research and DeepMind’s are complementary.
“My lab and I are focusing on value alignment to advance AI capabilities. My Ph.D. focused on robots inferring human goals and being transparent about them, which sparked my interest in this area,” she said. “I think [DeepMind CEO] Demis Hassabis and [chief AGI scientist] Shane Legg were excited to bring me on because of this research experience and my attitude that addressing present-day concerns and catastrophic risks are not mutually exclusive—that technical mitigations often blur, and long-term work improves the present day, and vice versa.”
To say Dragan has her work cut out for her is an understatement.
Skepticism of GenAI tools is at an all-time high, particularly where deepfakes and disinformation are concerned. A YouGov poll found that 85% of Americans were very or somewhat worried about video and audio deepfakes. In a separate study from the AP-NORC Center for Public Affairs Research, over 60% of respondents said they believe AI technologies will increase the amount of false and misleading information during the 2024 U.S. election cycle.
The large enterprises that Google and its competitors hope to attract with GenAI advancements are also wary of the tech’s flaws and their ramifications.
Recently, Intel subsidiary Cnvrg.io surveyed organizations testing or implementing GenAI applications. Around a fourth of respondents cited concerns regarding GenAI compliance and privacy, dependability, high deployment costs, and a lack of technical expertise to fully use the capabilities.
In a separate Riskonnect study, almost half of executives said they worry about employees making decisions based on inaccurate information from GenAI applications.
Their worries are valid. This week, The Wall Street Journal reported that Microsoft’s Copilot suite, which is powered by GenAI models architecturally similar to Gemini, makes mistakes in meeting summaries and spreadsheet calculations. The culprit is hallucination, GenAI’s tendency to fabricate information, and many experts believe it can never be fully solved.
Dragan acknowledges how intractable the AI safety challenge is and says that DeepMind will devote more resources to this area and commit to a framework for evaluating GenAI model safety risks “soon.”
She suggested accounting for human cognitive biases in training data, producing good uncertainty estimates, adding inference-time monitoring, confirming decisions, and tracking model capabilities for potentially dangerous behavior. But that, she said, still leaves the open question of how to be certain that a model won’t misbehave some tiny percentage of the time in ways that are impossible to detect empirically but may appear at deployment.
I doubt consumers, the public, and regulators will be so understanding. I suppose it will depend on how egregious those misbehaviors are and who is harmed by them.
“Our users should hopefully experience a more and more helpful and safe model over time,” Dragan added. Indeed.