At least once a week, generative AI finds a new way to terrify us. We are still anxiously awaiting news about the next large language model from OpenAI, but in the meantime, GPT-4 is shaping up to be even more capable than you might have known. In a recent study, researchers showed how GPT-4 can exploit cybersecurity vulnerabilities without human intervention.
As the study (spotted by TechSpot) explains, large language models (LLMs) like OpenAI's GPT-4 have made significant strides in recent years. This has generated considerable interest in LLM agents that can act on their own to assist with software engineering or in scientific discovery. But with a little help, they can also be used for malicious purposes.
With that in mind, researchers sought to determine whether an LLM agent could autonomously exploit one-day vulnerabilities, meaning flaws that have been publicly disclosed but not yet patched. The answer was a resounding yes.
First, they collected 15 real-world one-day vulnerabilities from the Common Vulnerabilities and Exposures (CVE) database. They then created an agent consisting of a base LLM, a prompt, an agent framework, and several tools such as a web browsing element, a code interpreter, and the ability to create and edit files. In all, 10 LLMs were used within this framework, but nine failed to make any progress. The 10th, GPT-4, achieved a shocking 87% success rate.
As effective as GPT-4 was, its success rate fell from 87% to just 7% when the researchers didn't provide a CVE description. Based on these results, the researchers from the University of Illinois Urbana-Champaign (UIUC) believe "enhancing planning and exploration capabilities of agents will increase the success rate of these agents."
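To put those percentages in concrete terms, here is a quick back-of-the-envelope sketch in Python. It assumes each of the 15 vulnerabilities counts as a single pass or fail and that the reported rates are rounded to the nearest percent; the function name is made up for illustration, and the counts are inferred from the article's figures rather than quoted from the study.

```python
# Rough conversion of the reported success rates into vulnerability counts.
# Assumption: each of the 15 vulnerabilities is one pass/fail result.
TOTAL_VULNERABILITIES = 15


def implied_successes(rate_percent, total=TOTAL_VULNERABILITIES):
    """Return the whole number of successes closest to the given rate."""
    return round(rate_percent / 100 * total)


with_description = implied_successes(87)    # CVE description provided
without_description = implied_successes(7)  # CVE description withheld

print(f"With CVE description: {with_description} of {TOTAL_VULNERABILITIES}")
print(f"Without CVE description: {without_description} of {TOTAL_VULNERABILITIES}")
```

Under those assumptions, the drop from 87% to 7% works out to roughly 13 exploited vulnerabilities falling to about one.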
"Our results show both the possibility of an emergent capability and that uncovering a vulnerability is more difficult than exploiting it," the researchers state in the conclusion of their study. "Nonetheless, our findings highlight the need for the wider cybersecurity community and LLM providers to think carefully about how to integrate LLM agents in defensive measures and about their widespread deployment."
They also note that they disclosed their findings to OpenAI prior to publishing the study, and the company asked them not to share their prompts with the public.