# August 2025

## New Layout

<figure><img src="https://1029475228-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fi12bk7lo75SODuwcRCQp%2Fuploads%2FlRCNyBP98ThoygjLBYbk%2Fprobe.splx.ai_w_31_target_15%20(6).png?alt=media&#x26;token=5175e668-f048-4c4d-bf14-31c2848dbcd1" alt=""><figcaption></figcaption></figure>

The updated side navigation introduces four main modules, each designed to group related capabilities for easier access. **Available modules depend on your subscription plan**:

1. **AI Red Teaming**
   * **Probe** - Tools for automated red teaming and uncovering vulnerabilities.
   * **Remediation** - Prompt Hardening to strengthen model system prompts against misuse.
   * **Monitoring** - Log Analysis for reviewing interaction logs to identify risks and areas for improvement.
2. **AI Runtime Protection**
   * The UI for our runtime protection. Here you can configure guardrail policies, visualize and interpret results, and use the integrated playground to test and refine configurations.
3. **AI Benchmarks**
   * Standardized evaluation of AI models against SPLX probes.
4. **Settings**
   * Manage user accounts, workspaces, permissions, and organizational configurations.

## Public APIs

* [**Probe Run**](https://docs.probe.splx.ai/platform-api/api-reference/probe-run)**:** Remediation tasks (both static and dynamic, generated from AI analysis) are now included in the probe run endpoint response.
* [**Target**](https://docs.probe.splx.ai/platform-api/api-reference/target)**:**
  * New **Scores endpoint** returns the overall performance score, category scores, and totals for simulated and successful attacks (see the sketch after this list).
  * New **Target Test Run export endpoint** provides details of all test runs on a single target.
* [**Benchmarks**](https://docs.probe.splx.ai/platform-api/api-reference/benchmarks)**:** Multiple new endpoints added. All information from the platform’s benchmark feature can now be exported via API, from high-level model details down to specific test cases contributing to scores.
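
As a rough illustration, here is a minimal sketch of fetching the new Target scores over the public API. The base URL, request path, authentication header, and response field names are assumptions for illustration only; consult the linked [Target API reference](https://docs.probe.splx.ai/platform-api/api-reference/target) for the actual contract.

```python
import requests

# All values below are illustrative assumptions -- check the Target
# API reference linked above for the real base URL, paths, and auth scheme.
BASE_URL = "https://api.probe.splx.ai"   # hypothetical base URL
API_KEY = "YOUR_API_KEY"                 # hypothetical bearer token
TARGET_ID = "your-target-id"             # ID of an existing target

headers = {"Authorization": f"Bearer {API_KEY}"}

# Hypothetical Scores endpoint: returns the overall performance score,
# per-category scores, and totals for simulated and successful attacks.
resp = requests.get(
    f"{BASE_URL}/targets/{TARGET_ID}/scores",
    headers=headers,
    timeout=30,
)
resp.raise_for_status()
scores = resp.json()

# Field names are assumed; adjust to the documented response schema.
print("Overall score:", scores.get("overallScore"))
for category in scores.get("categoryScores", []):
    print(category)
```

The same pattern applies to the Probe Run, Target Test Run export, and Benchmarks endpoints: authenticate, issue a GET against the documented path, and parse the JSON payload.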

## Models Benchmarked

Our [Model Benchmarks](https://docs.probe.splx.ai/ai-benchmarks/model-benchmarks) have been extended to include results for the following models:

* Anthropic Claude 3.7 Sonnet
* Anthropic Claude Opus 4.1
* DeepSeek Chat v3.1
* Gemma 3 27B IT
* Gemma 3n E4B IT
* Gemma 3 12B IT
* Gemma 3 4B IT
* Gemini 2.5 Flash
* Gemini 2.5 Pro
* Llama 3 70B Instruct
* Llama 3 8B Instruct
* Llama 3.1 70B Instruct
* Llama 3.1 8B Instruct
* Llama 3.2 1B Instruct
* Llama 3.2 3B Instruct
* Llama 3.3 70B Instruct
* Llama 3.3 8B Instruct
* Mistral Large 2
* Mistral Medium 3
* Mistral Small 3.1
* OpenAI gpt-oss-120b
* OpenAI gpt-oss-20b
* OpenAI GPT-5
* OpenAI GPT-5 Chat
* OpenAI o3
* OpenAI o4-mini

## Improvements & Tweaks

### Custom Compliance

* The Add item action is now disabled while a new item section is already open.
* Improved naming conventions for better consistency with predefined compliances.
* The same probe can no longer be selected multiple times within a single compliance item.
* Increased contrast on separation lines for clearer distinction between sections.

### Test & Probe Run Improvements

* Test run table now also displays results for **cancelled test runs**, improving consistency, visibility, and traceability.

### Targets & Connections

* [**Azure OpenAI**](https://docs.probe.splx.ai/ai-red-teaming/probe/target/index/azure-openai): Added input fields for **extra LLM parameters** and **API version**.
