Prompt Hardening


Last updated 4 months ago

Once you identify potential risks in your target using probes, the SplxAI Platform allows you to harden the target's system prompt to strengthen its security.

You can learn more about the importance and benefits of prompt hardening, along with use case comparisons to guardrails and our benchmark, in our blog post, "System Prompt Hardening: The Backbone of Automated AI Security".

New System Prompt Hardening

To begin prompt hardening, navigate to the Prompt Hardening page in the Remediation section of the main navigation bar, and click the Harden System Prompt button in the top-right corner.

The hardening process begins by selecting the relevant probes you want to use to harden your system prompt. You can think of these as vulnerabilities you wish to protect against. The prompt hardening tool will then use the results of your probe runs to strengthen your system prompt against the identified vulnerabilities.

The table displays the probes, their categories, the last probe run on the target, and the percentage of failed test cases. This percentage serves as an indicator of where your target is most vulnerable and where there is the greatest opportunity for improvement through hardening. Once all relevant probes are selected (at least one is required), click Continue.
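To illustrate how the failed-test-case percentage can guide probe selection, here is a minimal Python sketch that ranks probes from most to least vulnerable. The `ProbeResult` class, its field names, and the sample values are all hypothetical and are not part of the SplxAI Platform; the platform shows this percentage in the table for you.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """One row of the probe table (hypothetical field names)."""
    name: str
    category: str
    total_cases: int
    failed_cases: int

    @property
    def failed_pct(self) -> float:
        # Percentage of failed test cases; 0.0 when the probe has no test cases.
        if self.total_cases == 0:
            return 0.0
        return 100.0 * self.failed_cases / self.total_cases


def rank_by_failure(probes: list[ProbeResult]) -> list[ProbeResult]:
    """Order probes from highest to lowest failed percentage."""
    return sorted(probes, key=lambda p: p.failed_pct, reverse=True)


# Illustrative data only, not real probe results.
probes = [
    ProbeResult("probe_a", "category_x", total_cases=200, failed_cases=30),
    ProbeResult("probe_b", "category_y", total_cases=50, failed_cases=20),
    ProbeResult("probe_c", "category_x", total_cases=0, failed_cases=0),
]

for probe in rank_by_failure(probes):
    print(f"{probe.name}: {probe.failed_pct:.1f}% failed")
```

Probes at the top of this ordering are likely the ones where hardening offers the most room for improvement.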

In the next step, provide your target's current system prompt and click Generate hardened system prompt to start the hardening process.

Depending on the number of selected probes and the length of the system prompt, prompt hardening may take a few minutes. The process runs in the background and is not interrupted if you continue using other features of the app.

Hardened System Prompt

The latest prompt hardening will be displayed on the Prompt Hardening page. The header provides information about the generation date and time, the probes selected for hardening, the progress of the hardening, and the remediation status.

Once you have applied the hardened system prompt to your target, you can flag the prompt hardening as Applied.

This action is not reversible.

Below, there are three sections:

  1. Current system prompt - displays the system prompt before hardening.

  2. Generated system prompt - shows the generated hardened system prompt with options to:

    1. Highlight the differences,

    2. Expand the prompt for better readability,

    3. Copy the system prompt.

  3. Actions - lists all prompt hardening actions performed on your system prompt by our tool.

    1. Example: Stressing that competitor companies should neither be mentioned nor recommended.
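If you want to review the changes outside the platform, for example after copying the generated prompt, the difference between the two prompts can be approximated with Python's standard `difflib` module. This is only a sketch and not how the platform's highlight option is implemented; the `prompt_diff` helper and the sample prompts are hypothetical.

```python
import difflib


def prompt_diff(current: str, hardened: str) -> list[str]:
    """Return unified-diff lines between two system prompts."""
    return list(
        difflib.unified_diff(
            current.splitlines(),
            hardened.splitlines(),
            fromfile="current_system_prompt",
            tofile="generated_system_prompt",
            lineterm="",
        )
    )


# Illustrative prompts only.
current = "You are a helpful assistant for Acme support."
hardened = (
    "You are a helpful assistant for Acme support.\n"
    "Never mention or recommend competitor companies."
)

for line in prompt_diff(current, hardened):
    print(line)
```

Lines prefixed with `+` exist only in the hardened prompt, and lines prefixed with `-` exist only in the original.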

Prompt Hardening History

The second tab on the prompt hardening page is History, which features a table displaying all previous prompt hardenings. The table includes information such as the generation date and time, selected probes, progress (in progress, generated, ...), and status (applied, not applied, ...).
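If you keep your own record of hardenings, for example to track which ones still need to be applied, the columns of the History table could be modeled as in the sketch below. The `HardeningRecord` class and its field names are assumptions made for illustration, and only the progress and status values named above are used; the platform may define others.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class HardeningRecord:
    """One row of the History table (hypothetical field names)."""
    generated_at: datetime
    probes: list[str]
    progress: str  # e.g. "in progress" or "generated"
    status: str    # e.g. "applied" or "not applied"


def pending_application(history: list[HardeningRecord]) -> list[HardeningRecord]:
    """Hardenings that finished generating but are not yet flagged as applied."""
    return [
        r for r in history
        if r.progress == "generated" and r.status == "not applied"
    ]


# Illustrative data only.
history = [
    HardeningRecord(datetime(2024, 12, 1, 9, 30), ["probe_a"], "generated", "applied"),
    HardeningRecord(datetime(2024, 12, 5, 14, 0), ["probe_a", "probe_b"], "generated", "not applied"),
    HardeningRecord(datetime(2024, 12, 6, 8, 15), ["probe_b"], "in progress", "not applied"),
]

for record in pending_application(history):
    print(record.generated_at.isoformat(), record.probes)
```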

System Prompt Hardening: The Backbone of Automated AI Security
Figure 1: Selecting Relevant Probes
Figure 2: Current System Prompt Input
Figure 3: Latest Prompt Hardening