# August 2025

## New Layout

<figure><img src="https://1029475228-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fi12bk7lo75SODuwcRCQp%2Fuploads%2FlRCNyBP98ThoygjLBYbk%2Fprobe.splx.ai_w_31_target_15%20(6).png?alt=media&#x26;token=5175e668-f048-4c4d-bf14-31c2848dbcd1" alt=""><figcaption></figcaption></figure>

The updated side navigation introduces four main modules, each designed to group related capabilities for easier access. **Available modules depend on your subscription plan**:

1. **AI Red Teaming**
   * **Probe** - Tools for automated red teaming and uncovering vulnerabilities.
   * **Remediation** - Prompt Hardening to strengthen model system prompts against misuse.
   * **Monitoring** - Log Analysis for reviewing interaction logs to identify risks and areas for improvement.
2. **AI Runtime Protection**
   * The UI for our runtime protection. Here you can configure guardrail policies, visualize and interpret results, and use the integrated playground to test and refine configurations.
3. **AI Benchmarks**
   * Standardized evaluation of AI models against SPLX probes.
4. **Settings**
   * Manage user accounts, workspaces, permissions, and organizational configurations.

## Public APIs

* [**Probe Run**](https://docs.probe.splx.ai/platform-api/api-reference/probe-run)**:** Remediation tasks (both static and dynamic, generated from AI analysis) are now included in the probe run endpoint response.
* [**Target**](https://docs.probe.splx.ai/platform-api/api-reference/target)**:**
  * New **Scores endpoint** returns the overall performance score, category scores, and totals for simulated and successful attacks (see the sketch after this list).
  * New **Target Test Run export endpoint** provides details of all test runs on a single target.
* [**Benchmarks**](https://docs.probe.splx.ai/platform-api/api-reference/benchmarks)**:** Multiple new endpoints added. All information from the platform’s benchmark feature can now be exported via API, from high-level model details down to specific test cases contributing to scores.
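
As a rough illustration, here is a minimal sketch of fetching the new Target scores over the public API. The base URL, request path, authentication header, and response field names are assumptions for illustration only; consult the linked [Target API reference](https://docs.probe.splx.ai/platform-api/api-reference/target) for the actual contract.

```python
import requests

# All values below are illustrative assumptions -- check the Target
# API reference linked above for the real base URL, paths, and auth scheme.
BASE_URL = "https://api.probe.splx.ai"   # hypothetical base URL
API_KEY = "YOUR_API_KEY"                 # hypothetical bearer token
TARGET_ID = "your-target-id"             # ID of an existing target

headers = {"Authorization": f"Bearer {API_KEY}"}

# Hypothetical Scores endpoint: returns the overall performance score,
# per-category scores, and totals for simulated and successful attacks.
resp = requests.get(
    f"{BASE_URL}/targets/{TARGET_ID}/scores",
    headers=headers,
    timeout=30,
)
resp.raise_for_status()
scores = resp.json()

# Field names are assumed; adjust to the documented response schema.
print("Overall score:", scores.get("overallScore"))
for category in scores.get("categoryScores", []):
    print(category)
```

The same pattern applies to the Probe Run, Target Test Run export, and Benchmarks endpoints: authenticate, issue a GET against the documented path, and parse the JSON payload.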

## Models Benchmarked

Our [Model Benchmarks](https://docs.probe.splx.ai/ai-benchmarks/model-benchmarks) have been extended to include results for the following models:

* Anthropic Claude 3.7 Sonnet
* Anthropic Claude Opus 4.1
* DeepSeek Chat v3.1
* Gemma 3 27B IT
* Gemma 3n E4B IT
* Gemma 3 12B IT
* Gemma 3 4B IT
* Gemini 2.5 Flash
* Gemini 2.5 Pro
* Llama 3 70B Instruct
* Llama 3 8B Instruct
* Llama 3.1 70B Instruct
* Llama 3.1 8B Instruct
* Llama 3.2 1B Instruct
* Llama 3.2 3B Instruct
* Llama 3.3 70B Instruct
* Llama 3.3 8B Instruct
* Mistral Large 2
* Mistral Medium 3
* Mistral Small 3.1
* OpenAI gpt-oss-120b
* OpenAI gpt-oss-20b
* OpenAI GPT-5
* OpenAI GPT-5 Chat
* OpenAI o3
* OpenAI o4-mini

## Improvements & Tweaks

### Custom Compliance

* The Add item action is now disabled while a new item section is already open.
* Improved naming conventions for better consistency with predefined compliances.
* The same probe can no longer be selected multiple times within a single compliance item.
* Increased contrast on separation lines for clearer distinction between sections.

### Test & Probe Run Improvements

* Test run table now also displays results for **cancelled test runs**, improving consistency, visibility, and traceability.

### Targets & Connections

* [**Azure OpenAI**](https://docs.probe.splx.ai/ai-red-teaming/probe/target/index/azure-openai): Added input fields for **extra LLM parameters** and **API version**.
