Five Things I Learned from Talking to 20+ AI Safety Experts
As a democracy activist, I am naturally allergic to the accelerating concentration of power. That’s why the AI industry keeps me on edge: arms-race rhetoric has dominated nearly every discussion, and a race-to-the-bottom mentality treats concerns over power and accountability as noise interfering with economic growth.
I’ve benefited greatly from what may be the most transformative technology of our time. But I’m equally wary of its brakeless development. This concern isn’t mine alone: in one survey, over 80 percent of respondents supported using AI to safeguard democracy and backed stronger legal frameworks to regulate it. Another study found that a majority of the public believes governments should be equipped with a suite of safety powers to manage AI’s risks.
To better understand the challenge, I immersed myself in the literature on AI safety and governance. I spoke with over 20 researchers, analysts, and academics, many of whom authored the papers I read or now lead some of the world’s most influential research institutions.
Here’s what I’ve learned.
It’s a new space. Everyone is trying to figure out what’s happening.
While the roots of generative AI reach back to early conversational programs like ELIZA in the 1960s, it wasn’t until the release of OpenAI’s ChatGPT in late 2022 that generative AI became commercially and culturally mainstream. Since then, a flood of competing tools for generating text, images, audio, and code has transformed sectors from education to journalism, even as serious concerns persist about misinformation, job displacement, and environmental costs.
Policymakers have responded, some even before the ChatGPT moment: the OECD established its AI Principles in 2019, and UNESCO adopted its “Recommendation on the Ethics of AI” in 2021. The EU moved more aggressively than most, finalising its comprehensive AI Act in 2024, a risk-based framework whose obligations phase in through 2026.
The regulatory environment today is fluid and reactive. The UK has opted for a more agile approach, relying on sector-specific regulators and soft-law principles through institutions like the Frontier AI Taskforce and the AI Safety Institute. Other countries are charting distinct paths: Japan issued voluntary safety guidelines in 2024, while China introduced interim measures focused heavily on content control and censorship. In March 2024, the UN General Assembly adopted its first global AI resolution; it is non-binding, but a symbolic step toward international consensus on AI governance.
Despite the field’s decades-long academic roots, most policy development has occurred only in the last few years. Political change also plays a significant role: AI policy under the Biden administration differed fundamentally from the direction the second Trump term is now taking. This fluidity only underscores the need for more minds, perspectives, and disciplines to join the conversation.
Each country has its own position, and those positions are constantly evolving.
Regulatory styles are shaped by political values and institutional cultures. The EU favours a horizontal, rights-based framework, banning unacceptable-risk applications such as certain forms of biometric surveillance and mandating transparency for high-risk systems. The UK, by contrast, promotes flexible, sector-led guidance, arguing that hard law may quickly become obsolete in such a fast-moving field.
Different regulatory philosophies inevitably influence one another. The EU AI Act has already become a global benchmark, prompting companies like Microsoft, Google, and Meta to adjust their operations globally. The UK’s lighter-touch approach is partly a response to this—some officials openly acknowledge they don’t need to duplicate what’s already happening "next door."
One AI researcher working for the government described the dilemma to me candidly:
“I do feel there should be more regulation here. But if the neighbouring EU is already building a strict framework, what’s the point in repeating the same thing?”
Currently, participation from AI firms in UK government safety evaluations is entirely voluntary. This leaves room for withdrawal, circumvention, or selective deployment—providing versions of models that are not representative of the actual consumer-facing products. While far from ideal, this may be a trade-off the government is willing to accept.
A lot of research is happening—but some of the most critical areas remain underfunded.
AI safety research is expanding, with growing attention to robustness, fairness, and alignment. But one foundational area remains under-resourced: machine interpretability, the ability to understand and explain why a model produces a given output. Several leading researchers told me this is the upstream issue underlying many downstream concerns. Without significant progress on interpretability, trust, oversight, and accountability are difficult to achieve.
Despite its importance, interpretability receives a fraction of the funding allocated to high-visibility areas—alignment, monitoring, or red-teaming. Meanwhile, corporate R&D budgets continue to swell, exacerbating the imbalance. As one researcher put it, the funding infrastructure itself needs to change to incentivise work on these bedrock problems.
Not enough concerted effort, not enough advocacy.
World-class papers are being written. Think tanks and academic institutions are generating cutting-edge insights into AI safety and policy. But these efforts remain fragmented, and awareness-raising campaigns are scattered. Work on existential risks rarely intersects with environmental perspectives, and research institutes often lack pathways to convert their insights into public engagement.
Stakeholder engagement, with both the frontier labs and governments, is vital. Researchers inside major AI labs often express deep concern about safety, but the incentive structures around them shift quickly. One professor recounted how, after the Trump administration (widely seen as sceptical of regulation) took office, some lab representatives dialled down their advocacy for stricter oversight almost overnight. Goodwill alone is not enough; political pressure and structural influence are needed to sustain meaningful engagement and accountability.
More non-profit and academic voices must shape the public conversation.
Industry has dominated the narrative. Executives appear on news shows and conference panels to offer visions of AI’s future. While they’re well-informed, they are also financially invested in accelerating adoption—and often in hyping its inevitability.
This dynamic forces the public to guess whether they’re hearing objective forecasts or sales pitches. Business incentives naturally steer discourse toward market optimism, not toward democracy protection or societal risks.
Non-profits and academics must be brought to the table with more power and visibility. Civil rights groups, university scholars, and public-interest technologists offer essential checks and balances. They ask the hard questions commercial players would rather avoid: How do we prevent systemic bias? What rights should citizens have over automated decisions? Who is accountable when AI systems cause harm?
Their voices aren’t just complementary—they’re essential for building a future in which AI serves democratic ends, not just corporate ones.
(I have enjoyed chatting with people concerned about AI’s impact on democracy. If you are interested too, drop me a message.)