09.09.2025 15:21

Anthropic Develops AI Classifier to Block Weapons of Mass Destruction Requests


Anthropic has introduced an AI classifier designed to detect and block dangerous queries related to biological, chemical, and nuclear weapons technologies. In preliminary tests, the classifier identified such queries with 96% accuracy.

The classifier is also intended to filter weapons-of-mass-destruction material out of the data used during the pre-training phase of AI models.

This approach seeks to prevent chatbots from providing instructions for creating such weapons, while preserving their ability to handle safe tasks.
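Anthropic has not published the classifier itself, but the gating pattern described above can be sketched in a few lines: score each incoming query for risk, refuse it above a threshold, and pass safe queries through unchanged. The scorer, term list, and threshold below are hypothetical placeholders; a production system would use a trained model rather than keyword matching.

```python
def risk_score(query: str) -> float:
    """Hypothetical placeholder scorer; a real classifier would be a trained model."""
    flagged_terms = {"enrichment cascade", "nerve agent synthesis"}
    return 1.0 if any(term in query.lower() for term in flagged_terms) else 0.0

def gate(query: str, threshold: float = 0.5) -> str:
    """Refuse queries whose risk score meets the threshold; allow the rest."""
    if risk_score(query) >= threshold:
        return "REFUSED"
    return "ALLOWED"
```

The key design point the article describes is exactly this asymmetry: dangerous requests are blocked while ordinary queries still receive answers.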

Anthropic reiterated that safety must remain a core principle in AI development, emphasizing their commitment to responsible innovation.
