How would we even know if AI went rogue?

26.08.2024 14:30

Vox

Congress needs to understand artificial intelligence capabilities better in order to mitigate future risks.

As the frontier of artificial intelligence advances at a breakneck pace, the US government is struggling to keep up. Working on AI policy in Washington, DC, I can tell you that before we can decide how to govern frontier AI systems, we first need to see them clearly. Right now, we’re navigating in a fog.

My role as an AI policy fellow at the Federation of American Scientists (FAS) involves developing bipartisan ideas for improving the government’s ability to analyze current and future systems. In this work, I interact with experts across government, academia, civil society, and the AI industry. What I’ve learned is that there is no broad consensus on how to manage the potential risks of breakthrough AI systems without hampering innovation. However, there is broad agreement that the US government needs better information about AI companies’ technologies and practices, and more capacity to respond to both catastrophic and more insidious risks as they arise. Without detailed knowledge of the latest AI capabilities, policymakers can’t effectively assess whether current regulations are sufficient to prevent misuses and accidents, or whether companies need to take additional steps to safeguard their systems.

When it comes to nuclear power or airline safety, the federal government demands timely information from the private companies in those industries to ensure the public’s welfare. We need the same insight into the emerging AI field. Otherwise, this information gap could leave us vulnerable to unforeseen risks to national security or lead to overly restrictive policies that stifle innovation.

Progress in Washington

Encouragingly, Congress is making gradual progress in improving the government’s ability to understand and respond to novel developments in AI. Since ChatGPT’s debut in late 2022, AI has been taken more seriously by legislators from both parties and both chambers on Capitol Hill. The House formed a bipartisan AI task force with a directive to balance innovation, national security, and safety. Senate Majority Leader Chuck Schumer (D-NY) organized a series of AI Insight Forums to collect outside input and build a foundation for AI policy. These events informed the bipartisan Senate AI working group’s AI Roadmap that outlined areas of consensus, including “development and standardization of risk testing and evaluation methodologies and mechanisms” and an AI-focused Information Sharing and Analysis Center.

Several bills have been introduced that would enhance information sharing about AI and bolster the government’s response capabilities. The Senate’s bipartisan AI Research, Innovation, and Accountability Act would require companies to submit risk assessments to the Department of Commerce before deploying AI systems that may impact critical infrastructure, criminal justice, or biometric identification. Another bipartisan bill, the VET AI Act (which FAS endorsed), proposes a system for independent evaluators to audit and verify AI companies’ compliance with established guidelines, similar to existing practices in the financial industry. These bills cleared the Senate Commerce committee in July and may receive a floor vote in the Senate before the 2024 election.

There has also been promising progress in other parts of the world. In May, the UK and Korean governments announced that most of the world’s leading AI companies agreed to a new set of voluntary safety commitments at the AI Seoul Summit. These pledges include identifying, assessing, and managing risks associated with developing the most advanced AI models, drawing on companies’ Responsible Scaling Policies pioneered in the past year that provide a roadmap for future risk mitigation as AI capabilities develop. The AI developers also agreed to provide transparency on their approaches to frontier AI safety, including “sharing more detailed information which cannot be shared publicly with trusted actors, including their respective home governments.”

However, these commitments lack enforcement mechanisms and standardized reporting requirements, making it difficult to assess whether or not companies are adhering to them.

Even some industry leaders have voiced support for increased government oversight. Sam Altman, CEO of OpenAI, emphasized this point early last year in testimony before Congress, stating, “I think if this technology goes wrong, it can go quite wrong, and we want to be vocal about that. We want to work with the government to prevent that from happening.” Dario Amodei, CEO of Anthropic, has taken that sentiment one step further; after the publication of Anthropic’s Responsible Scaling Policy, he expressed his hope that governments would turn elements from the policy into “well-crafted testing and auditing regimes with accountability and oversight.”

Despite these encouraging signs from Washington and the private sector, significant gaps remain in the US government’s ability to understand and respond to rapid advancements in AI technology. Specifically, three critical areas require immediate attention: protections for independent research on AI safety, early warning systems for AI capability improvements, and comprehensive reporting mechanisms for real-world AI incidents. Addressing these gaps is key to protecting national security, fostering innovation, and ensuring that AI development advances the public interest.

A safe harbor for independent AI safety research

AI companies often discourage or even threaten to ban researchers who identify safety flaws from using their products, creating a chilling effect on essential independent research. This leaves the public and policymakers in the dark about possible dangers from widely used AI systems, including threats to US national security. Independent research is vital because it provides an external check on the claims made by AI developers, helping to identify risks or limitations that may not be apparent to the companies themselves.

One significant proposal to address this issue is for companies to offer legal safe harbor and financial incentives for good-faith AI safety and trustworthiness research. Congress could offer “bug” bounties to AI safety researchers who identify vulnerabilities and extend legal protections to experts studying AI platforms, similar to those proposed for social media researchers in the Platform Accountability and Transparency Act. In an open letter earlier this year, over 350 leading researchers and advocates called for companies to provide such protections to safety researchers, but no company has yet done so.

With these protections and incentives, thousands of American researchers could be empowered to stress-test AI systems, allowing real-time assessments of AI products and systems. The US AI Safety Institute has included similar protections for AI researchers in its draft guidelines on “Managing Misuse Risk for Dual-Use Foundation Models,” and Congress should consider codifying these best practices.

An early warning system for AI capability improvements

The US government’s approach to identifying and responding to frontier AI systems’ potentially dangerous capabilities is limited, and it is unlikely to keep pace if those capabilities continue to increase rapidly. This knowledge gap relative to the industry leaves policymakers and security agencies unprepared to address emerging AI risks. Worse, the potential consequences of this asymmetry will compound over time as AI systems become both riskier and more widely used.

Establishing an AI early warning system would equip the government with the information it needs to get ahead of threats from artificial intelligence. Such a system would create a formalized channel for AI developers, researchers, and other relevant parties to report AI capabilities that have both civilian and military applications (such as uplift for biological weapons research or cyber offense) to the government. The Commerce Department’s Bureau of Industry and Security could serve as an information clearinghouse, receiving, triaging, and forwarding these reports to other relevant agencies.

This proactive approach would provide government stakeholders with up-to-date information about the latest AI capabilities, enabling them to assess whether current regulations are sufficient or whether new safeguards are needed. For instance, if advancements in AI systems posed an increased risk of biological weapons attacks, relevant parts of the government would be promptly alerted, allowing for a rapid response to safeguard the public’s welfare.

Reporting mechanisms for real-world AI incidents

The US government currently lacks a comprehensive understanding of adverse incidents where AI systems have caused harm, hindering its ability to identify patterns of risky use, assess government guidelines, and respond to threats effectively. This blind spot leaves policymakers ill-equipped to craft timely and informed response measures.

Establishing a voluntary national AI incident reporting hub would create a standardized channel for companies, researchers, and the public to confidentially report AI incidents, including system failures, accidents, misuse, and potential hazards. This hub would be housed at the National Institute of Standards and Technology, leveraging the agency’s existing expertise in incident reporting and standards-setting. Because reporting would be voluntary rather than mandated, the hub would encourage collaborative industry participation.

Combining this real-world data on adverse AI incidents with forward-looking capabilities reporting and researcher protections would enable the government to develop better informed policy responses to emerging AI issues and further empower developers to better understand the threats.

The path forward

These three proposals strike a balance between oversight and innovation in AI development. By incentivizing independent research and improving government visibility into AI capabilities and incidents, they could support both safety and technological advancement. The government could foster public trust and potentially accelerate AI adoption across sectors while preventing the regulatory backlash that could follow preventable high-profile incidents. Policymakers would be able to craft targeted regulations that address specific risks — such as AI-enhanced cyber threats or potential misuse in critical infrastructure — while preserving the flexibility needed for continued innovation in fields like health care diagnostics and climate modeling.

Passing legislation in these areas requires bipartisan cooperation in Congress. Stakeholders from industry, academia, and civil society must advocate for and engage in this process, offering their expertise to refine and implement these proposals. There is a short window for action in what remains of the 118th Congress, with the potential to attach some AI transparency policies to must-pass legislation like the National Defense Authorization Act. The clock is ticking, and swift, decisive action now could set the stage for better AI governance for years to come.

Imagine a future in which our government has the tools to understand and responsibly guide AI development and a future in which we can harness AI’s potential to solve grand challenges while safeguarding against risks. This future is within our grasp — but only if we act now to clear the fog and sharpen our collective vision of how AI is developed and used. By improving our collective understanding and oversight of AI, we increase our chances of steering this powerful technology toward beneficial outcomes for society.
