Anthropic has introduced a new funding program to support the creation of novel AI benchmarks, crucial for assessing the capabilities and potential risks of AI models, including its own Claude. 

Anthropic is inviting third-party organizations to submit proposals for evaluations that can effectively gauge advanced AI capabilities.

An AI (artificial intelligence) logo is pictured at the Mobile World Congress (MWC), the telecom industry's biggest annual gathering, in Barcelona on February 27, 2024. The world's biggest mobile phone fair throws open its doors in Barcelona with the sector looking to artificial intelligence to try and reverse declining sales.
(Photo: Josep Lago/AFP via Getty Images)

Anthropic's New Funding Program Targets AI Benchmarks

The program, outlined in a blog post by Anthropic, emphasizes the need for robust evaluations in the AI safety domain and highlights current limitations in existing evaluation frameworks.

According to Anthropic, the demand for high-quality assessments is surpassing available options, prompting the company to provide financial support. Anthropic's initiative seeks to elevate the standards of AI safety assessments by funding evaluations that measure a range of advanced capabilities in AI models.

The company aims to foster a broader ecosystem of assessments that can address complex challenges in AI research and development. In its announcement, Anthropic underscores the importance of these evaluations in mitigating potential risks associated with AI advancements.

Anthropic wants to support evaluations that can assess the AI Safety Levels (ASLs) defined in its Responsible Scaling Policy, covering domains such as cybersecurity and chemical, biological, radiological, and nuclear (CBRN) risks.

The initiative also prioritizes the development of metrics that can effectively gauge the autonomy and safety of AI models without the need for traditional scoring methods.

Anthropic aims to advance the field by encouraging the creation of evaluations that go beyond current benchmarks in measuring the societal impacts and potential risks associated with AI technologies.

In addition to funding evaluations, Anthropic plans to invest in tools and infrastructure that streamline the development and deployment of high-quality benchmarks. These resources are intended to facilitate more efficient testing and evaluation processes within the AI community.

Growing AI Scrutiny

Anthropic's announcement comes amid growing scrutiny of, and concern over, the ethical implications and potential risks of AI technologies. By supporting the development of rigorous evaluations, Anthropic wants to contribute to a safer and more accountable AI landscape where risks are identified and managed proactively.

The company invites interested parties to submit proposals for evaluation projects through its application process, and it commits to transparency and rigorous evaluation standards. Anthropic ultimately hopes the initiative will spur progress toward a future where thorough AI evaluation is standard industry practice.

"A robust, third-party evaluation ecosystem is essential for assessing AI capabilities and risks, but the current evaluation landscape is limited. Developing high-quality, safety-relevant evaluations remains challenging, and the demand is outpacing the supply," Anthropic wrote in the blog post.

"Our investment in these evaluations is intended to elevate the entire field of AI safety, providing valuable tools that benefit the whole ecosystem," it added.

ⓒ 2024 TECHTIMES.com All rights reserved. Do not reproduce without permission.