Federal agencies can test and measure the biases of the artificial intelligence models they are experimenting with through the new governmentwide AI evaluation tool, USAi, according to David Shive, ...
The National Institute of Standards and Technology (NIST), the U.S. Commerce Department agency that develops and tests tech for the U.S. government, companies and the broader public, has re-released a ...
The "Petri" tool deploys AI agents to evaluate frontier models. AI's ability to discern harm is still highly imperfect. Early tests showed Claude Sonnet 4.5 and GPT-5 to be safest. Anthropic has ...