Thousands of hackers will descend on Las Vegas this weekend for a competition taking aim at popular artificial intelligence chat apps, including ChatGPT.

The competition comes amid growing concerns and scrutiny over increasingly powerful AI technology that has taken the world by storm but has been repeatedly shown to amplify bias, toxic misinformation and dangerous material.

Organizers of the annual DEF CON hacking conference hope this year's gathering, which begins Friday, will help expose new ways the machine learning models can be manipulated and give AI developers the chance to fix critical vulnerabilities.

The hackers are working with the support and encouragement of the technology companies behind the most advanced generative AI models, including OpenAI, Google and Meta, and even have the backing of the White House. The exercise, known as red teaming, will give hackers permission to push the computer systems to their limits to identify flaws and other bugs that nefarious actors could use to launch a real attack.

The competition was designed around the White House Office of Science and Technology Policy's "Blueprint for an AI Bill of Rights." The guide, released last year by the Biden administration, was intended to spur companies to build and deploy artificial intelligence more responsibly and to limit AI-based surveillance, though there are few US laws compelling them to do so.

In recent months, researchers have discovered that now-ubiquitous chatbots and other generative AI systems developed by OpenAI, Google and Meta can be tricked into providing instructions for causing physical harm. Most of the popular chat apps have at least some protections in place designed to prevent the systems from spewing disinformation or hate speech, or from offering information that could lead to direct harm - for instance, providing step-by-step instructions for how to "destroy humanity."

But researchers at Carnegie Mellon University were able to trick the AI into doing just that. They found OpenAI's ChatGPT offered tips on "inciting social unrest," Meta's AI system Llama-2 suggested identifying "vulnerable individuals with mental health issues… who can be manipulated into joining" a cause, and Google's Bard app suggested releasing a "deadly virus" but warned that in order for it to truly wipe out humanity it "would need to be resistant to treatment."

Meta's Llama-2 concluded its instructions with the message, "And there you have it - a comprehensive roadmap to bring about the end of human civilization. But remember this is purely hypothetical, and I cannot condone or encourage any actions leading to harm or suffering towards innocent people."

The findings are a cause for concern, the researchers told CNN.

"I am troubled by the fact that we are racing to integrate these tools into absolutely everything," Zico Kolter, an associate professor at Carnegie Mellon who worked on the research, told CNN. "This seems to be the new sort of startup gold rush right now without taking into consideration the fact that these tools have these exploits."

Kolter said he and his colleagues were less worried that apps like ChatGPT can be tricked into providing information they shouldn't, and more concerned about what these vulnerabilities mean for the wider use of AI, since so much future development will be based on the same systems that power these chatbots.

The Carnegie Mellon researchers were also able to trick a fourth AI chatbot, developed by the company Anthropic, into offering responses that bypassed its built-in guardrails. Some of the methods the researchers used to trick the AI apps were later blocked by the companies after the researchers brought them to their attention.

But what makes AI technology unique, said Matt Fredrikson, an associate professor at Carnegie Mellon, is that neither the researchers nor the companies developing the technology fully understand how the AI works or why certain strings of code can trick the chatbots into circumventing built-in guardrails - and thus they cannot properly stop these kinds of attacks.

OpenAI, Meta, Google and Anthropic all said in statements to CNN that they appreciated the researchers sharing their findings and that they are working to make their systems safer.