Hosting
Wednesday, February 5, 2025
Google search engine
HomeArtificial IntelligenceEU AI act checker exposes gaps in compliance with big tech companies'...

EU AI act checker exposes gaps in compliance with big tech companies’ AI models | World news


The test is designed in line with the text of the AI ​​law. (Photo: Shutterstock)

According to Reuters data, some of the most prominent artificial intelligence models fail to comply with European regulations in key areas such as cybersecurity resilience and discriminatory output.

The EU had long debated new AI regulations before OpenAI released ChatGPT to the public in late 2022. The record-breaking popularity and subsequent public debate over the supposed existential risks of such models spurred lawmakers to create specific rules around “general” AIs. (GPAI).

Click here to contact us via WhatsApp

Now a new tool designed by Swiss startup LatticeFlow and partners, and backed by European Union officials, has tested generative AI models developed by major tech companies like Meta and OpenAI across dozens of categories in accordance with the widespread AI law of the bloc, which will come into effect in phases over the next two years.

Assigning each model a score between 0 and 1, a leaderboard published Wednesday by LatticeFlow showed that models developed by Alibaba, Anthropic, OpenAI, Meta and Mistral all received an average score of 0.75 or higher.

However, the company’s ‘Large Language Model (LLM) Checker’ revealed the shortcomings of some models in key areas, highlighting where companies may need to dedicate resources to ensure compliance.

Companies that fail to comply with the AI ​​law face fines of 35 million euros ($38 million), or 7 percent of global annual turnover.


MIXED RESULTS

Currently, the EU is still trying to determine how the AI ​​Act’s rules around generative AI tools like ChatGPT will be enforced, convening experts to draw up a code of practice for the technology by spring 2025.

But LatticeFlow’s test, developed in collaboration with researchers from Swiss university ETH Zurich and Bulgarian research institute INSAIT, provides an early indicator of specific areas where tech companies are at risk of non-compliance.

For example, discriminatory output has been a persistent problem in the development of generative AI models, which mirror human biases in gender, race, and other areas when prompted.

When testing for discriminative output, LatticeFlow’s LLM Checker gave OpenAI’s “GPT-3.5 Turbo” a relatively low score of 0.46. For the same category, Alibaba Cloud’s “Qwen1.5 72B Chat” model received only a 0.37.

When testing for “prompt hijacking,” a type of cyberattack in which hackers disguise a malicious prompt as legitimate to extract sensitive information, the LLM Checker awarded Meta’s “Llama 2 13B Chat” model a score of 0.42. In the same category, French startup Mistral’s “8x7B Instruct” model received a score of 0.38.

“Claude 3 Opus,” a model developed by Google-backed Anthropic, received the highest average score, 0.89.

The test is designed in line with the text of the AI ​​Act and will be expanded to include further enforcement measures as they are introduced. LatticeFlow said the LLM Checker would be available for free for developers to test the compliance of their models online.

Petar Tsankov, the company’s CEO and co-founder, told Reuters that the test results were generally positive and offered companies a roadmap to help them refine their models in line with the AI ​​law.

“The EU is still working out all the compliance benchmarks, but we are already seeing some gaps in the models,” he said. “With an increased focus on optimizing compliance, we believe model providers can be well prepared to meet regulatory requirements.” Meta declined to comment. Alibaba, Anthropic, Mistral and OpenAI did not immediately respond to requests for comment.

Although the European Commission cannot verify external instruments, the agency has been kept informed throughout the development of the LLM Checker and has described it as a “first step” in putting the new laws into practice.

A spokesperson for the European Commission said: “The Commission welcomes this study and AI model evaluation platform as a first step in translating the EU AI law into technical requirements.”

(Only the headline and image of this report may have been reworked by Business Standard staff; the rest of the content is automatically generated from a syndicated feed.)

First publication: October 16, 2024 | 12:11 pm IST



Source link

RELATED ARTICLES
- Advertisment -
Google search engine

Most Popular