How would we even know if AI went rogue?

26.08.2024 14:30

Vox

Congress needs to understand artificial intelligence capabilities better in order to mitigate future risks.

As the frontier of artificial intelligence advances at a breakneck pace, the US government is struggling to keep up. Working on AI policy in Washington, DC, I can tell you that before we can decide how to govern frontier AI systems, we first need to see them clearly. Right now, we’re navigating in a fog.

My role as an AI policy fellow at the Federation of American Scientists (FAS) involves developing bipartisan ideas for improving the government’s ability to analyze current and future systems. In this work, I interact with experts across government, academia, civil society, and the AI industry. What I’ve learned is that there is no broad consensus on how to manage the potential risks of breakthrough AI systems without hampering innovation. However, there is broad agreement that the US government needs better information about AI companies’ technologies and practices, and more capacity to respond to both catastrophic and more insidious risks as they arise. Without detailed knowledge of the latest AI capabilities, policymakers can’t effectively assess whether current regulations are sufficient to prevent misuses and accidents, or whether companies need to take additional steps to safeguard their systems.

When it comes to nuclear power or airline safety, the federal government demands timely information from the private companies in those industries to ensure the public’s welfare. We need the same insight into the emerging AI field. Otherwise, this information gap could leave us vulnerable to unforeseen risks to national security or lead to overly restrictive policies that stifle innovation.

Progress in Washington

Encouragingly, Congress is making gradual progress in improving the government’s ability to understand and respond to novel developments in AI. Since ChatGPT’s debut in late 2022, AI has been taken more seriously by legislators from both parties and both chambers on Capitol Hill. The House formed a bipartisan AI task force with a directive to balance innovation, national security, and safety. Senate Majority Leader Chuck Schumer (D-NY) organized a series of AI Insight Forums to collect outside input and build a foundation for AI policy. These events informed the bipartisan Senate AI working group’s AI Roadmap that outlined areas of consensus, including “development and standardization of risk testing and evaluation methodologies and mechanisms” and an AI-focused Information Sharing and Analysis Center.

Several bills have been introduced that would enhance information sharing about AI and bolster the government’s response capabilities. The Senate’s bipartisan AI Research, Innovation, and Accountability Act would require companies to submit risk assessments to the Department of Commerce before deploying AI systems that may impact critical infrastructure, criminal justice, or biometric identification. Another bipartisan bill, the VET AI Act (which FAS endorsed), proposes a system for independent evaluators to audit and verify AI companies’ compliance with established guidelines, similar to existing practices in the financial industry. Both bills cleared the Senate Commerce Committee in July and may receive a floor vote in the Senate before the 2024 election.

There has also been promising progress in other parts of the world. In May, the UK and Korean governments announced that most of the world’s leading AI companies agreed to a new set of voluntary safety commitments at the AI Seoul Summit. These pledges include identifying, assessing, and managing risks associated with developing the most advanced AI models. They draw on the Responsible Scaling Policies that companies pioneered in the past year, which provide a roadmap for future risk mitigation as AI capabilities develop. The AI developers also agreed to provide transparency on their approaches to frontier AI safety, including “sharing more detailed information which cannot be shared publicly with trusted actors, including their respective home governments.”

However, these commitments lack enforcement mechanisms and standardized reporting requirements, making it difficult to assess whether or not companies are adhering to them.

Even some industry leaders have voiced support for increased government oversight. Sam Altman, CEO of OpenAI, emphasized this point early last year in testimony before Congress, stating, “I think if this technology goes wrong, it can go quite wrong, and we want to be vocal about that. We want to work with the government to prevent that from happening.” Dario Amodei, CEO of Anthropic, has taken that sentiment one step further; after the publication of Anthropic’s Responsible Scaling Policy, he expressed his hope that governments would turn elements from the policy into “well-crafted testing and auditing regimes with accountability and oversight.”

Despite these encouraging signs from Washington and the private sector, significant gaps remain in the US government’s ability to understand and respond to rapid advancements in AI technology. Specifically, three critical areas require immediate attention: protections for independent research on AI safety, early warning systems for AI capabilities improvements, and comprehensive reporting mechanisms for real-world AI incidents. Addressing these gaps is key for protecting national security, fostering innovation, and ensuring that AI development advances the public interest.

A safe harbor for independent AI safety research

AI companies often discourage or even threaten to ban researchers who identify safety flaws from using their products, creating a chilling effect on essential independent research. This leaves the public and policymakers in the dark about possible dangers from widely used AI systems, including threats to US national security. Independent research is vital because it provides an external check on the claims made by AI developers, helping to identify risks or limitations that may not be apparent to the companies themselves.

One significant proposal to address this issue is for companies to offer legal safe harbor and financial incentives for good-faith AI safety and trustworthiness research. Congress could also offer “bug” bounties to AI safety researchers who identify vulnerabilities and extend legal protections to experts studying AI platforms, similar to those proposed for social media researchers in the Platform Accountability and Transparency Act. In an open letter earlier this year, over 350 leading researchers and advocates called on companies to provide such protections to safety researchers, but no company has yet done so.

With these protections and incentives, thousands of American researchers could be empowered to stress-test AI systems, allowing real-time assessments of AI products and systems. The US AI Safety Institute has included similar protections for AI researchers in its draft guidelines on “Managing Misuse Risk for Dual-Use Foundation Models,” and Congress should consider codifying these best practices.

An early warning system for AI capability improvements

The US government’s approach to identifying and responding to frontier AI systems’ potentially dangerous capabilities is limited, and it is unlikely to keep pace if those capabilities continue to increase rapidly. The resulting knowledge gap between industry and government leaves policymakers and security agencies unprepared to address emerging AI risks. Worse, the potential consequences of this asymmetry will compound over time as AI systems become both more risky and more widely used.

Establishing an AI early warning system would equip the government with the information it needs to get ahead of threats from artificial intelligence. Such a system would create a formalized channel for AI developers, researchers, and other relevant parties to report AI capabilities that have both civilian and military applications (such as uplift for biological weapons research or cyber offense) to the government. The Commerce Department’s Bureau of Industry and Security could serve as an information clearinghouse, receiving, triaging, and forwarding these reports to other relevant agencies.

This proactive approach would provide government stakeholders with up-to-date information about the latest AI capabilities, enabling them to assess whether current regulations are sufficient or whether new safeguards are needed. For instance, if advancements in AI systems posed an increased risk of biological weapons attacks, relevant parts of the government would be promptly alerted, allowing for a rapid response to safeguard the public’s welfare.

Reporting mechanisms for real-world AI incidents

The US government currently lacks a comprehensive understanding of adverse incidents where AI systems have caused harm, hindering its ability to identify patterns of risky use, assess government guidelines, and respond to threats effectively. This blind spot leaves policymakers ill-equipped to craft timely and informed response measures.

Establishing a voluntary national AI incident reporting hub would create a standardized channel for companies, researchers, and the public to confidentially report AI incidents, including system failures, accidents, misuse, and potential hazards. This hub would be housed at the National Institute of Standards and Technology, leveraging its existing expertise in incident reporting and standards-setting. Keeping the hub voluntary rather than mandatory would encourage collaborative industry participation.

Combining this real-world data on adverse AI incidents with forward-looking capabilities reporting and researcher protections would enable the government to develop better informed policy responses to emerging AI issues and further empower developers to better understand the threats.

The path forward

These three proposals strike a balance between oversight and innovation in AI development. By incentivizing independent research and improving government visibility into AI capabilities and incidents, they could support both safety and technological advancement. The government could foster public trust and potentially accelerate AI adoption across sectors while preventing the regulatory backlash that could follow preventable high-profile incidents. Policymakers would be able to craft targeted regulations that address specific risks — such as AI-enhanced cyber threats or potential misuse in critical infrastructure — while preserving the flexibility needed for continued innovation in fields like health care diagnostics and climate modeling.

Passing legislation in these areas requires bipartisan cooperation in Congress. Stakeholders from industry, academia, and civil society must advocate for and engage in this process, offering their expertise to refine and implement these proposals. There is a short window for action in what remains of the 118th Congress, with the potential to attach some AI transparency policies to must-pass legislation like the National Defense Authorization Act. The clock is ticking, and swift, decisive action now could set the stage for better AI governance for years to come.

Imagine a future in which our government has the tools to understand and responsibly guide AI development, and in which we can harness AI’s potential to solve grand challenges while safeguarding against risks. This future is within our grasp, but only if we act now to clear the fog and sharpen our collective vision of how AI is developed and used. By improving our shared understanding and oversight of AI, we increase our chances of steering this powerful technology toward beneficial outcomes for society.
