August 2025

New Layout

The updated side navigation introduces four main modules, each grouping related capabilities for easier access. The modules available to you depend on your subscription plan:

  1. AI Red Teaming

    • Probe - Tools for automated red teaming and uncovering vulnerabilities.

    • Remediation - Prompt Hardening to strengthen model system prompts against misuse.

    • Monitoring - Log Analysis to review interaction logs and identify risks and areas for improvement.

  2. AI Runtime Protection

    • A dedicated UI for runtime protection. Here you can configure guardrail policies, visualize and interpret results, and use the integrated playground to test and refine configurations.

  3. AI Benchmarks

    • Standardized evaluation of AI models against SPLX probes.

  4. Settings

    • Manage user accounts, workspaces, permissions, and organizational configurations.

Public APIs

  • Probe Run: Remediation tasks (both static and dynamic, generated from AI analysis) are now included in the probe run endpoint response.

  • Target:

    • New Scores endpoint returns the overall performance score, category scores, and totals for simulated and successful attacks.

    • New Target Test Run export endpoint provides details of all test runs on a single target.

  • Benchmarks: Multiple new endpoints added. All information from the platform’s benchmark feature can now be exported via API, from high-level model details down to the specific test cases that contribute to scores (see the sketch after this list).
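
As a rough illustration, the sketch below shows how these endpoints might be called from Python. The base URL, route paths, authentication scheme, and response shapes are assumptions made for the example, not the documented contracts; consult the API reference for the real routes.

```python
# Hedged sketch: the host, route paths, and auth scheme below are assumptions
# for illustration only; see the API reference for the actual contracts.
import requests

BASE_URL = "https://api.splx.ai/v1"              # assumed host
HEADERS = {"Authorization": "Bearer <API_KEY>"}  # assumed auth scheme


def get_target_scores(target_id: str) -> dict:
    """Fetch the overall score, category scores, and attack totals for a target."""
    resp = requests.get(f"{BASE_URL}/targets/{target_id}/scores", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()


def export_benchmark(model_id: str) -> dict:
    """Export benchmark data for a model, down to individual test cases."""
    resp = requests.get(f"{BASE_URL}/benchmarks/{model_id}/export", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()


print(get_target_scores("<TARGET_ID>"))
```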

Models Benchmarked

Our Model Benchmarks have been extended to include results for the following models:

  • Anthropic Claude 3.7 Sonnet

  • Anthropic Claude Opus 4.1

  • DeepSeek Chat v3.1

  • Gemma 3 27B IT

  • Gemma 3n E4B IT

  • Gemma 3 12B IT

  • Gemma 3 4B IT

  • Gemini 2.5 Flash

  • Gemini 2.5 Pro

  • Llama 3 70B Instruct

  • Llama 3 8B Instruct

  • Llama 3.1 70B Instruct

  • Llama 3.1 8B Instruct

  • Llama 3.2 1B Instruct

  • Llama 3.2 3B Instruct

  • Llama 3.3 70B Instruct

  • Llama 3.3 8B Instruct

  • Mistral Large 2

  • Mistral Medium 3

  • Mistral Small 3.1

  • OpenAI gpt-oss-120b

  • OpenAI gpt-oss-20b

  • OpenAI GPT-5

  • OpenAI GPT-5 Chat

  • OpenAI o3

  • OpenAI o4-mini

Improvements & Tweaks

Custom Compliance

  • The "Add item" action is now disabled while a new item section is open.

  • Improved naming conventions for better consistency with predefined compliances.

  • The same probe can no longer be selected multiple times within a single compliance item.

  • Increased contrast on separation lines for clearer distinction between sections.

Test & Probe Run Improvements

  • Test run table now also displays results for cancelled test runs, improving consistency, visibility, and traceability.

Targets & Connections

  • Azure OpenAI: Added input fields for extra LLM parameters and the API version (see the sketch below).
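
For context, these fields map onto standard Azure OpenAI client settings. A minimal sketch, assuming the openai Python SDK; the endpoint, deployment name, and parameter values are placeholders:

```python
from openai import AzureOpenAI

# Hedged example: shows which client settings the new connection fields
# correspond to. All values below are placeholders, not real credentials.
client = AzureOpenAI(
    api_key="<AZURE_OPENAI_KEY>",
    api_version="2024-06-01",  # the new "API version" field
    azure_endpoint="https://<your-resource>.openai.azure.com",
)

# Extra LLM parameters (e.g. temperature, max_tokens) are passed per request.
response = client.chat.completions.create(
    model="<deployment-name>",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.2,
    max_tokens=256,
)
print(response.choices[0].message.content)
```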
