Max Tokens - This parameter specifies the absolute maximum number of tokens that the model can generate and return in the response.
Welcome to the SplxAI Platform documentation.
This comprehensive guide serves as a user manual to assist you in onboarding with our platform. It offers detailed explanations of key concepts, terms, and functionalities, ensuring you have all the information you need. Whether you are new to the platform or seeking in-depth insights, this documentation is designed to support your understanding and enhance your experience with our platform.
The SplxAI Platform is designed to protect your generative AI applications from harmful activities. It tests your application using a customizable set of probes, automated red-teaming tools designed to trigger and detect specific vulnerabilities. Each probe targets a particular vulnerability, allowing for comprehensive security and safety testing. The SplxAI Platform also provides relevant mitigation strategies for detected issues.
Probes use generative AI to create attacks based on a comprehensive attack database, which is collected and constantly updated from various LLM CTFs, open-source data, and both automated and manual AI research. Each probe applies different variations and strategies depending on the target’s industry, company, and goals, in order to maximize the effectiveness of the security and safety assessment.
The documentation is divided into four main sections.
Platform - Serves as a user manual for the platform, explaining key concepts and providing a step-by-step guide for effective usage. It helps users understand and make the most out of the platform's features.
Platform API - A detailed reference for developers, including endpoints and request and response formats. This section ensures smooth integration with the platform's APIs.
Updates - A dedicated space for periodic announcements about new features, improvements, and updates, keeping you informed about the latest developments.
Links - Useful resources, such as our blog for the latest news on AI security, and the SplxAI GitHub.
The first step in adding your target to the SplxAI Platform is setting up the connection between them. With SplxAI Platform, you can observe how your application performs across various layers, from the LLM to the platform level, which simulates real user interactions.
Start by selecting your connection type based on your use case.
The following integration methods are currently supported:
API: REST API connection between your GenAI application and the SplxAI Platform.
Platform: Test runs are executed on chatbots that are accessible through external platforms (e.g., Slack, WhatsApp, Glean). Probe uses the platform’s APIs to interact with the chatbots.
LLM: Tests are executed directly on the Large Language Model.
LLM Development Platform: The SplxAI Platform connects with the APIs provided by LLM development platforms.
Once you’ve selected the appropriate connection type, a configuration tab will appear in the next step, prompting you to input the required connection details. These inputs are specific to the type of connection you’ve chosen, such as API keys, phone numbers, or endpoint URLs.
Once all the required information is entered, click the “Continue” button. A connection test between the SplxAI Platform and your target will run automatically in the background. The result of this test will be displayed in a dialog. You can proceed with the remaining configuration step once the connection test is successful.
To initiate automated testing with probes, you must first add your target. To do so, open the workspace & target drop-down menu. Once expanded, click the "Add Target +" button (Figure 1).
After clicking the button, the Add Target page will appear (Figure 2). On this page, you can choose your connection type and configure your target for automated testing.
Adding a target consists of three steps:
For detailed instructions on connecting your target with the SplxAI Platform, please proceed to the next page.
The Target of the testing is your generative AI application, specifically designed for conversational interactions. This application, whether a chatbot used internally within your organization or a public-facing platform for customer engagement, undergoes automated testing to identify potential vulnerabilities.
The testing will involve a variety of AI-generated attack scenarios to evaluate the application's resilience and security. This is achieved by a selected set of probes that perform specific security assessments.
System Prompt - Your application’s system prompt. It sets the initial instructions or context for the AI model, defining the behavior, tone, and specific guidelines the AI should follow while interacting. For best practices, refer to the .
API Key - Your Anthropic API key. It can be generated via Anthropic's web Console, in the API Keys section of your Account Settings.
Model - The Anthropic API name of the large language model your application uses. You can find the available models under the "Anthropic API" column of the "Model names" table.
For more information, you can explore the official documentation.
If you have any additional questions that are not addressed in the documentation, would like to provide feedback on your SplxAI Platform experience, or have encountered an issue that you wish to report, please join us on our official public & dedicated support channel. We encourage you to reach out at any time; our team will be happy to assist you.
For details on the connection methods and descriptions of the input fields, find your preferred connection type in the navigation bar under Connections, or on the page.
Select your connection, where your preferred connection type is chosen. You can find a more detailed explanation of each connection on the page.
Configure connection, where the connection between the SplxAI Platform and the target application is established.
Configure target, where the details and capabilities of the target are defined.
To check how to add your first target, please proceed to the page.
The final step before enabling the probes is providing the details and capabilities of your target (Figure 1). To do this, fill in all the input fields with your target’s information.
Target Name
The name of your target that will be displayed within the SplxAI Platform.
Target Environment
Field for tracking the various stages of your target (Development, Production, and Staging).
Target Description
A brief description of your application’s purpose and use case.
Language
The default language of your application. Red teaming attacks will be generated in this language.
Target Type
You can choose from four target types, depending on whether your application is publicly available or internal, and whether it uses Retrieval-Augmented Generation (RAG) or not.
Public With RAG
Public Without RAG
Private With RAG
Private Without RAG
Target type determines the default risk priorities for each probe, which are used to calculate your overall risk surface.
Rate Limit
The maximum number of requests your application can process per minute.
Parallel Requests
Turned on by default.
Toggle off to have the SplxAI Platform send requests to your target one at a time.
Modes Supported
This specifies the types of input your application can process, with text set as the default.
SplxAI Platform also supports attacks through uploaded images, voice, and documents.
If your application can handle any of these inputs and you want to test it against multimodal attacks, select the relevant modes.
After entering all the required inputs, save your target by clicking the "Save" button; you can then proceed to configure the probes.
URL
Supported Azure OpenAI endpoints (protocol and hostname)
The general format for an endpoint is: https://{your-resource-name}.openai.azure.com.
For example: https://yourcompany.openai.azure.com
The endpoint can also be found on the Keys and Endpoint page.
API Key - Your application's Dify API Key. Obtain the API key in the Dify Platform by navigating to the API Access section in the left-side menu. Here, you can manage the credentials required to access the API.
A success notification will indicate that your new target has been saved successfully. Your target will be automatically selected, and its name will appear in the targets list within the drop-down menu. The page will be displayed, allowing you to start configuring the probes.
If you need to make any changes later, you can edit the target at any time on the page.
System Prompt - Your application’s system prompt. It sets the initial instructions or context for the AI model, defining the behavior, tone, and specific guidelines the AI should follow while interacting. For best practices, refer to the .
API Key - In the Azure Portal, find a key on the Keys and Endpoint page of the Azure OpenAI resource.
Deployment Name - Configured when setting up your Azure OpenAI model. This is the unique identifier that links to your specific model deployment. Deployment names can be found in the Deployments section of your project.
For more information, you can explore the official documentation.
For more information, you can explore the official documentation.
AWS Access Key ID & AWS Secret Access Key
These can be created and accessed via the AWS Management Console.
In the Security credentials tab, you can create a new key or view existing Access Key IDs.
Note that AWS Secret Access Keys are only shown during creation.
System Prompt - Your application’s system prompt. It sets the initial instructions or context for the AI model, defining the behavior, tone, and specific guidelines the AI should follow while interacting. For best practices, refer to the .
Navigate to the IAM console, and under the Users tab, select the desired user.
For a step-by-step guide, explore the official AWS documentation on updating access keys.
AWS Region - The AWS Region where your resource is located. It can be found in the top-right corner of the AWS Management Console.
Model - Specify one of the models supported by Amazon Bedrock by entering its Model ID, which can be found on the model IDs page within the Amazon Bedrock documentation or in the console.
For more information, you can explore the official documentation.
API Key
From the navigation bar, open the Endpoints page and select the Serverless endpoints tab.
Open your endpoint from the list.
Copy the Key and insert it into the Probe integration input field.
URL - On the same serverless endpoint where you got the API key, the required URL can be found in the Target URI field. Copy this URL and insert it into the Probe integration input field.
URL
System Prompt - Your application’s system prompt. It sets the initial instructions or context for the AI model, defining the behavior, tone, and specific guidelines the AI should follow while interacting. For best practices, refer to the .
In Azure Machine Learning Studio, select the workspace.
For more information, you can explore the official documentation.
Token - Your Glean token. You can learn more about token generation in the Glean documentation.
This is the endpoint used to connect with the Glean API. Refer to the Glean documentation for full details.
When entering the URL in SplxAI Platform, use only the base part of the URL, not the full path.
Valid: https://splx.ai-v2-be.glean.com
Invalid: https://splx.ai-v2-be.glean.com/rest/api/v1/chat
For more information, you can explore the official documentation.
System Prompt - Your application’s system prompt. It sets the initial instructions or context for the AI model, defining the behavior, tone, and specific guidelines the AI should follow while interacting. For best practices, refer to the .
Token - For token-based authentication, your Hugging Face user access token can be generated on the Access Tokens tab of the Hugging Face platform.
Model - The Hugging Face Hub hosts models for a variety of machine learning tasks; choose one of the models available on the Models page.
For more information, you can explore the official documentation.
System Prompt - Your application’s system prompt. It sets the initial instructions or context for the AI model, defining the behavior, tone, and specific guidelines the AI should follow while interacting. For best practices, refer to the .
API Key - Your OpenAI API key. You can find the secret API key on the OpenAI Platform's API keys page.
Model - The OpenAI model of your choice. You can find an overview of the models on the Models page of the OpenAI Platform documentation.
For more information, you can explore the official documentation.
System Prompt - Your application’s system prompt. It sets the initial instructions or context for the AI model, defining the behavior, tone, and specific guidelines the AI should follow while interacting. For best practices, refer to the .
API Key - Your Mistral API key. It can be generated via Mistral's web console, in the API Keys section.
Model - Specify the Mistral model you intend to use by selecting the value listed in the "API Endpoints" column of the Mistral models table for your chosen model.
For more information, you can explore the official documentation.
Available connection types between SplxAI Platform and the target:
API
PLATFORM
LLM
LLM Development Platform
The Target Settings page allows you to make changes to the selected target. All fields are configurable, with the sole exception of the connection type, which cannot be edited.
Whenever you make changes to your application that require a connection update, you can do so in the “Configure Connection” tab. Once you decide to save your changes, the test connection process will begin. You’ll only be able to save your changes once the connection is successfully established.
If you need to update the target's configuration, such as modifying the base language for your chatbot, make the necessary changes and click the "Save" button.
API Key - Your OpenAI API key. You can find the secret API key on the OpenAI Platform's API keys page.
Assistant ID - The ID of your OpenAI assistant. To retrieve the ID, go to the OpenAI Platform, navigate to your project's dashboard, and open the Assistants page. The ID is displayed under the assistant's name.
For more information, you can explore the official OpenAI Platform documentation.
For more information about the connection types and target configuration, revisit the previous pages.
After connecting your target to the SplxAI Platform, you can enable specific probes to be executed during test runs. These probes help identify potential vulnerabilities in the target (Figure 1).
All probes are specifically designed to provoke and detect a particular vulnerability.
Probes are divided into four major categories.
Beside each probe, there is a toggle button that enables it. The toggle opens an optimization dialog with input fields that help you tailor the probe to your application's needs, making it domain-specific and enhancing the relevance and realism of the simulated attacks (Figure 2).
An example of probe optimization is provided below.
Clicking the "Save and Enable Probe" button stores the probe configuration and enables the probe on the given target.
To later edit the configuration of an already optimized probe, click the gear icon in the corresponding row.
Once you find a probe that you are interested in, you can click the "Details" button to view its description, including the probe category, probe ID, supported modes (text, image, voice, document), and the cost of a probe run in credits.
After connecting the target to the SplxAI Platform and selecting and configuring at least one of the available probes, you can start your first test run.
Each probe is assigned a risk priority based on the selected target type. A higher risk priority indicates that any identified vulnerabilities have a greater impact on the target’s overall risk surface, displayed on the Overview page. The default risk priority can be adjusted in the optimization modal.
For assistance with your first test run, please visit the page.
With Probe, you can run automated tests directly on your WhatsApp chatbot. To do so, you first need to set up the integration. In the “Select your integration” tab, choose WhatsApp as the integration type, and proceed to the next step to access the configuration fields.
Number
Enter the number of your WhatsApp chatbot.
Make sure to follow the international phone number format as indicated in the placeholder.
Conversation Reset Message
Specify the message that resets your chatbot’s existing conversation.
This is the message you defined to trigger your chatbot’s initial starting message and restart the conversation in the same chat.
It is highly recommended to use this option, as it allows Probe to run tests without starting a large number of new chats, thereby increasing execution speed.
Initial Messages
The initial message is the message with which your chatbot starts the conversation. This typically includes a question about the user’s conversation preferences.
The Question and Answer fields help Probe initiate the conversation with your chatbot using the preferences you select.
Example:
Question: Hi! Welcome to the SplxInsurance chatbot! Please select your language to continue the conversation: English, German, Italian.
Answer: English
If there are multiple initialization messages, for instance, if after selecting a language the user must choose a product of interest, you can add multiple question-and-answer combinations.
Once all the information is filled in, your WhatsApp connection will be tested when you click “Continue” to proceed. The test connection will send a message to your WhatsApp chatbot and mark it as passed if the chatbot responds.
After successfully establishing the connection, you can proceed to configure the target fields.
Bot ID
On the My bots page, select your desired bot.
Once on the bot’s page, click on Settings.
Copy the Bot handle value and paste it into the Bot ID input on Probe.
Bot Secret
On the bot’s page, click Edit in the Direct Line row.
Copy the Secret key and paste it into the Bot Secret input on Probe.
Once you’ve entered the required inputs, click Continue to test your connection and proceed.
To pentest your Slack bot with Probe, you’ll need to provide the bot’s ID in the Slack User Bot ID field:
On Slack's home screen, search for Apps
Under Apps, click on your bot's name to open its profile.
Open the details, find the Member ID, and click the copy button next to it.
Paste this ID into the Slack User Bot ID field within the Probe platform
Slack integration operates in two modes.
Direct message
In this mode Probe sends messages to channels without any modifications (like adding a mention tag at the beginning). When a message is sent, Probe waits for the target to respond inside the same channel.
Mention
In this mode Probe mentions the target in messages sent to the channel, and continues the conversation inside the thread. Messages inside the thread do not mention the target.
To install Slack integration to your workspace, click the Connect new workspace button. You will then be prompted to log in to your Slack workspace. Once logged in, you will be asked to authorize the Probe integration for your workspace, with a detailed overview of the permissions the app will receive upon installation.
Once finished, to view your newly added workspace, click the refresh button next to the workspace dropdown on the Probe platform.
To enable the integration to chat with your bot, you will need to create private channels. To each created channel you should then add your target and the Probe integration. You can do this by following these steps:
Create channels without adding any additional users (we recommend creating at least 4 channels)
Add your target and the Probe integration by typing the commands /invite @YourAppName and /invite @SplxAI Probe, or follow the steps from Figure 6.
Optional: Mute notifications for created channels
After creating the channels, go to the Channels section on Slack’s home screen. Open your private channels, click on Details, and locate the Channel ID. Copy this ID and paste it into the Channels field in the Probe integration tab (similar to obtaining the Slack User Bot ID).
While your Slack bot is generating its response, various loading messages may be displayed to the user (e.g., “Please wait, generating response…”). To ensure Probe ignores these messages and waits for the bot’s final response, enter the regular expression (regex) for the constant part of the loading message in the Loading Messages section. For example:
Loading message: “Please wait, generating response…”
Regex: ^Please wait, generating response.*$
You can enter multiple loading messages that should be ignored by Probe.
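As an illustration of how such patterns behave, here is a minimal Python sketch; the helper name and test strings are hypothetical, and only the regex itself comes from the example above:

import re

# Patterns like the ones entered in the Loading Messages section,
# anchored on the constant part of each loading message.
LOADING_PATTERNS = [re.compile(r"^Please wait, generating response.*$")]

def is_loading_message(text: str) -> bool:
    # True if a bot message matches any configured loading pattern.
    return any(p.match(text) for p in LOADING_PATTERNS)

print(is_loading_message("Please wait, generating response..."))  # True
print(is_loading_message("Here is your final answer."))           # False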
When you start your Probe, the Slack integration will look through all the private channels it is part of and select all channels containing targets. When the backend starts sending messages, the integration will distribute them across the created channels and wait for your bot to respond.
Microsoft Teams chatbots are Azure bots connected to Microsoft Teams. Testing is performed directly on the Azure bot, as it contains all the functionalities of the bot within Microsoft Teams. The Azure Bot integration uses the Direct Line API.
To create the integration, you need to provide the bot ID and the bot Direct Line secret from the Bot Framework. To retrieve these, navigate to your bot in the .
After you install the integration bot to the workspace, you will need to create the channels where the bots (Probe and the target) will communicate.
For each test run, you can download a shareable PDF report. This report provides details about your test target, along with an overview and summary of the test run. It includes general information such as the start time, execution time, total number of test cases, and more. Additionally, it features the combined results and a chart of all probe runs, offering a clear, one-stop insight into the target’s vulnerabilities.
The report also provides specific results for each probe run and their suggested mitigation strategies, helping you understand vulnerabilities and how to fix them.
To download the report, click the Generate Report button on the Test Run View page.
System Prompt - Your application’s system prompt. It sets the initial instructions or context for the AI model, defining the behavior, tone, and specific guidelines the AI should follow while interacting. For best practices, refer to the .
API Key - Your Gemini API key, which can be generated through Google AI Studio.
Model - Specify the Gemini model you intend to use by entering the ID found in the "Model Variant" column under the model's name in the Gemini models documentation.
For more information, you can explore the official documentation.
Each test run consists of one or more probes that create test cases and perform testing on your application. All test cases within a single probe run are associated with the specific vulnerability that the probe is designed to detect.
The probe implements various strategies and techniques to identify its target vulnerability within your conversational application. Despite these variations, all test cases share the specific domain of your chatbot and the details that you defined in the probe's optimization.
The cards of the most recent runs of each probe are displayed on the Overview page. You can access the details of a specific probe run from there. To view the details of older probe runs, navigate to the Test Run History. Open one of the older test runs that included the desired probe, and in the probes table click the arrowhead in the row corresponding to that probe.
To familiarize yourself with the detailed view of a probe run's results and to understand terms such as strategy, red teamer, and variation, please refer to the page.
The Test Run Scheduler enables you to plan and automate test runs at predefined times and frequencies, helping your team streamline evaluation workflows and reduce operational overhead. Instead of manually setting up tests every time you want to validate model behavior, scheduler allows you to set up recurring or single-run executions that run in the background.
Scheduler supports a wide range of use cases, from a daily scheduled Context Leakage probe in production to monthly full security assessments, enabling consistent, repeatable testing practices aligned with your organization's policies and release cycles.
A new Scheduled tab has been added to the Test Runs page for easy management of your upcoming test runs, giving you clear visibility into what test runs are scheduled.
Starting a new test run now includes the option to schedule it for later. The familiar flow for launching a test is unchanged, but a new "Schedule For Later" button has been added to the test run creation modal.
Go to the Test Runs or Overview page.
Click New Test Run.
Configure the test name and select probes.
Click Schedule for Later.
Choose a date, time, and frequency for execution.
Confirm with Schedule Test Run.
All scheduled test runs can be fully managed from the Scheduled tab on the Test Runs page. This page provides a centralized view of upcoming and recurring test runs, allowing you to take direct actions as needed.
Each scheduled test run includes three control icons:
Start Test Run – Manually trigger the test run immediately.
Edit Test Run – Modify the scheduled test run configuration.
Delete Test Run – Permanently remove the scheduled run.
If a test run is not currently needed, you can toggle its status to Inactive without deleting it. This is useful for temporarily pausing automated runs without losing their configuration.
Click the Edit Test Run icon to modify any existing scheduled run. The following parameters can be updated:
Test Run Name
Date and Time of Next Run
Frequency (e.g. Daily, Weekly, Monthly, Single Run)
Selected Probes (custom or predefined probes)
Once changes are made, click Save Changes to update the schedule.
The Probe Run View displays and visualizes the test data of the selected probe's run within your executed test run. This view presents the results of all executed test cases, including details such as attack strategies, variations and the messages that the probe exchanged with the target.
At the top of the page, you will find basic information about the probe's run, including the total number of test cases, the number of failed and error test cases, the execution date and time, the run's status, and a progress bar.
The Sankey diagram (Figure 1) visually depicts the connections among strategy, red teamer, variation, and the outcomes of test cases. As previously mentioned, the width of the flow between nodes in the diagram corresponds to the number of test cases, while the color coding represents the percentage of failed test cases out of the total executed.
The Probe Result Table displays all of the probe's test cases executed against your target.
The table contains:
Id: Unique identifier of the test case (attack).
Attempt: If the same attack is executed multiple times, each attempt is assigned a different number.
Result: Outcome of the executed test case.
Passed: The test did not detect the vulnerability in your application.
Failed: The attack found the targeted vulnerability.
Error: An error occurred while communicating with the target.
Both filtering and global search functionalities are available to customize the view according to your specific preferences.
The table (with any applied filters maintained) can be exported in CSV and JSON formats, allowing for easy integration with other tools and systems for further analysis or reporting.
To view detailed interactions between the probe and the target for each test case, as well as an explanation of the test case result, click the "Details" button in the corresponding row. Refer to the next page for further explanation.
URL - This is your target endpoint to which the attack messages will be sent.
POST Request Payload Sample - Here, you provide the payload (the body of the HTTP request). The payload you provide should include the following placeholders:
{message} - This placeholder represents where the Probe will insert attack messages, simulating input from a user interacting with your application.
{session_id} - This placeholder marks the location where a unique string, identifying the current conversation session, will be placed. This ensures that the request is tied to a specific session for multi-step testing.
The payload can contain additional fixed arguments if needed.
Response Path - The JSON path pointing to the message within your chatbot's API response to the given request.
HTTP Headers - Enter the key-value pairs necessary for your API Requests. Authorization headers must be included for non-public APIs.
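To make the placeholder mechanics concrete, the following Python sketch shows how a request built from these fields could behave. The endpoint URL, payload shape, header value, and response path are all hypothetical examples, not a prescribed format:

import uuid
import requests

URL = "https://api.example.com/chat"  # hypothetical target endpoint

# A payload sample containing both placeholders plus a fixed argument.
PAYLOAD_SAMPLE = {
    "session_id": "{session_id}",  # replaced with a unique conversation id
    "message": "{message}",        # replaced with each generated attack message
    "stream": False,               # fixed arguments can be included as needed
}

def send_attack(message: str, session_id: str) -> str:
    payload = {
        key: (value.replace("{message}", message)
                   .replace("{session_id}", session_id)
              if isinstance(value, str) else value)
        for key, value in PAYLOAD_SAMPLE.items()
    }
    response = requests.post(
        URL,
        json=payload,
        headers={"Authorization": "Bearer <your-token>"},  # for non-public APIs
    )
    # A Response Path of "response.message" would select this value:
    return response.json()["response"]["message"]

print(send_attack("Hello!", str(uuid.uuid4())))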
Sometimes, your application may use different endpoints to manage sessions that track messages in conversations. These endpoints can be separate from the ones used for sending messages. For example:
A session might be initiated with one request to an "open session" endpoint.
The session might then be closed with another request to a "close session" endpoint.
If your application operates this way, the Open Session and Close Session options can be toggled and configured in the integration settings. This allows you to add the appropriate requests for starting and ending sessions, ensuring the Probe can properly simulate and test multi-step conversations.
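Continuing the sketch above, an application with dedicated session endpoints could be exercised roughly like this; the endpoint paths and field names are again illustrative assumptions:

import requests

def open_session() -> str:
    # Hypothetical "open session" endpoint returning a session identifier.
    return requests.post("https://api.example.com/session/open").json()["session_id"]

def close_session(session_id: str) -> None:
    # Hypothetical "close session" endpoint ending the conversation.
    requests.post("https://api.example.com/session/close",
                  json={"session_id": session_id})

session_id = open_session()
try:
    # Messages are exchanged within the opened session (see the sketch above).
    requests.post("https://api.example.com/chat",
                  json={"session_id": session_id, "message": "Hello!"})
finally:
    close_session(session_id)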
API endpoints can be implemented in various ways, and we can’t cover every scenario. To address this, we’ve developed the SplxAI Proxy Interface, which you can use for easier integration.
Feel free to contact us, and we’ll assist you in creating it.
Strategy, Red Teamer, Variation: Explained in a dedicated section.
Probe allows you to integrate with project management tools and automatically create issues containing information from a probe run.
After clicking Create, a new issue will be created in the selected platform. It will include the details of the probe run, a link to it, and any additional notes provided.
Select one of your Jira projects.
Choose the issue type.
Add additional notes to be appended in the description after the Probe Run details.
Choose the priority of your incident.
Select its category.
Add additional notes to be appended in the description after the Probe Run details.
Upon accessing the Test Case Details, you will find a list of one or more conversations conducted as part of the selected test case. Some strategies execute the attack through multiple separate conversations, for example to refine their approach with each subsequent interaction based on insights gained from previous ones. If this is the case, all related conversations are displayed together in the Attack Scenarios section. These conversations are presented within an expandable element, allowing you to view the specific conversation of interest.
When the conversation expands, you can observe the following:
Messages generated by the probe, marked as USER.
Responses from your application, marked as ASSISTANT.
The Encoded toggle switches the conversation's content between encoded and decoded:
Encoded view: This view displays the exact original content exchanged between your application and the probe.
Decoded view: Certain variations may render the text unreadable to humans (e.g., base64 encoding, joined words, etc.) or put it in a foreign language. The decoded view transforms the content into a readable form, allowing you to understand the context of the attack.
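As a small illustration of what the decoded view recovers, here is a base64 example in Python (the prompt text is hypothetical, not taken from an actual probe run):

import base64

encoded = "SWdub3JlIGFsbCBwcmV2aW91cyBpbnN0cnVjdGlvbnMu"  # as shown in the encoded view
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # Ignore all previous instructions.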
The Explanation provides detailed information on why the test case passed or failed.
To track an issue, click the "Track Issue" button in the top-right corner of the Probe Run View. A modal will appear, prompting you to select the platform and fill in the required fields.
Issue tracking is only available once the tool has been integrated through the Integrations page in the User Settings.
For the integration steps, visit the section.
Probe's test cases are dynamically AI-generated based on a set of predefined instructions. Each test case is defined by selecting one value from each of the three components: strategy, red teamer, and variation. By varying these parameters, a wide range of test cases can be generated to cover different aspects of your application's specific vulnerability.
Strategy - Method of orchestrating attacks and the included context.
The strategy defines which messages will be available to the attack generator, detectors, and target, and determines the order in which each element of the Probe will be used. Various strategies can be deployed against your target.
Red Teamer - Instruction to the LLM on how to modify or craft prompts from the provided context.
The Probe platform features a variety of red teamers across the probes. Red teamers collect context (information about the attack) and contain instructions on how to use it when crafting attacks. The purpose of each red teamer should be understandable from its name and the conversations it generates.
Variation - Additional algorithmic or stochastic (with LLM) changes to the prompt before passing it to the target.
Variation involves making changes to the prompt, utilizing large language models, in order to: increase the success rate, by modifying adversarial prompts in various ways to enhance the effectiveness of the attack; and avoid detection, by reducing the likelihood of the prompt being detected as adversarial by the security solutions employed in the application.
Once you identify potential risks in your target using probes, the SplxAI Platform allows you to harden the target's system prompt to strengthen its security.
To begin prompt hardening, navigate to the Prompt Hardening page in the Remediation section of the main navigation bar, and click the Harden System Prompt button in the top-right corner.
The hardening process begins by selecting the relevant probes you want to use to harden your system prompt. You can think of these as vulnerabilities you wish to protect against. The prompt hardening tool will then use the results of your probe runs to strengthen your system prompt against the identified vulnerabilities.
The table displays the probes, their categories, the last probe run on the target, and the percentage of failed test cases. This percentage serves as an indicator of where your target is most vulnerable and where there is the greatest opportunity for improvement through hardening. Once all relevant probes are selected (at least one is required), click Continue.
In the next step, simply provide your target's current system prompt and click Generate hardened system prompt, which will initiate the new prompt hardening process.
The latest prompt hardening will be displayed on the Prompt Hardening page. The header provides information about the generation date and time, the probes selected for hardening, the progress of the hardening, and the remediation status.
Below, there are three sections:
Current System prompt - displaying the system prompt before hardening.
Generated system prompt - showing the generated hardened system prompt with options to:
Highlight the differences,
Expand the prompt for better readability,
Copy the system prompt.
Actions - lists all prompt hardening actions performed on your system prompt by our tool.
Example: Stressing that competitor companies should neither be mentioned nor recommended.
The second tab on the prompt hardening page is History, which features a table displaying all previous prompt hardenings. The table includes information such as the generation date and time, selected probes, progress (in progress, generated, ...), and status (applied, not applied, ...).
The User and Organization Settings section serves as a place for managing both personal and organizational configurations within the Probe platform.
To access the User and Organization Settings, click on the user icon located in the top-right corner. This will open the landing page and replace the Platform’s navigation bar with the dedicated navigation for the settings section.
In the continuation of this page, we will provide detailed explanations for the features within the User and Organization Settings section.
When you’re done editing, click Leave Settings, located near the navigation bar, to exit the settings page.
You can also find the Logout option at the bottom of the navigation bar, if you wish to log out of your account.
Besides viewing the email address you registered with, the Account Settings page allows you to update your account password directly within the Probe interface.
Click the “Generate New Token +” button.
Provide Token Details.
Token Name: Assign a meaningful name to your token.
Description: Add a brief description for clarity.
Duration: Specify the token’s expiration period.
Once generated, the token will be displayed only once. Be sure to copy and securely store it immediately, as it cannot be accessed again.
Name
Partial Token Key (not the full key for security)
Created Date
Expiration Date
Delete Token Option
To add a user to your organization, navigate to User & Organization Settings (user icon on top right corner of the platform) and follow these steps:
Create the organization (if it hasn't been created already):
Invite the User:
Enter the email address of the user you want to invite.
Click "Invite". An invitation email will be sent to the user, prompting them to create a password.
Manage User Status:
In the Users table, you can view all users and their current status.
For users who haven’t created their password yet (status: Invited), you can either:
Reinvite: Send a new invitation.
Copy Invitation Link: You can copy the link and manually send it to the user.
The invitation link expires after a period of time. If the user does not accept the invitation in time, a reinvite will be required.
If the invitation email isn't received promptly, check the spam folder.
You can view all available and installed integrations on the Integrations page within the Organization section.
Do not confuse organization integrations with target integrations, which link the Probe platform to its target.
After clicking "Install" for the Jira integration, you will be redirected to the Atlassian Authorization app. Here, the SplxAI Platform will request access to your Atlassian account. You will need to select the app you wish to integrate with, grant the necessary access permissions, and accept the privacy policy.
All tickets created by the SplxAI Platform will be displayed as if they were reported by the user who accepted the integration.
To integrate SplxAI Platform with ServiceNow, you will need the following information:
ServiceNow Instance URL
Provide the URL of your ServiceNow instance.
E.g., https://<your_instance>.service-now.com.
API Key
The API Key required for access to ServiceNow's APIs.
User ID
Your ServiceNow User ID.
All tickets created by the SplxAI Platform will be displayed as if they were reported by the user defined in the integration process. The incident will be posted in the Incident Management Module.
A Test Run is a group of executed probes performed at a specific point in time against your target. In each test run, you can choose which vulnerabilities to test by selecting one or more pre-configured probes.
A dialog box will appear, prompting you to:
Enter the name of your run.
Select one or multiple probes to be included in the run.
Your available probe credits are displayed in the header. Each selected probe deducts from your total.
The test run can have the following statuses:
Pending: The test run will start when the queue is clear.
Running: The test run is in progress.
Finished: All scheduled probes and their attacks are completed.
Canceled: The test run was canceled by the user before completion.
Error: The test run was aborted due to target misconfiguration.
The test run's progress indicates the percentage of completed attacks out of the total scheduled.
In addition to the probe results, each probe includes a tab showing a mitigation strategy to address vulnerabilities identified by the probe, which can be applied to your target application.
These steps are created based on content provided by our red team, who propose best practices they have identified through research and hands-on experience.
Each step includes a Mark as Completed button, allowing you to track whether the mitigation has been applied to a specific target.
For reference, you can view examples of mitigation strategy steps for the Context Leakage probe.
To interact with the SplxAI Platform API, all users, including those on free accounts, can generate a Personal Access Token for authentication.
Once you've obtained your personal access token, include it in the X-Api-Key header of your API requests.
Example Request:
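A minimal Python sketch of an authenticated request; the endpoint URL below is hypothetical, and only the X-Api-Key header name comes from this page:

import requests

# Substitute the actual SplxAI Platform API URL and your generated
# Personal Access Token; the path shown here is illustrative only.
response = requests.get(
    "https://api.example.com/v1/test-runs/<test_run_id>",
    headers={"X-Api-Key": "<your-personal-access-token>"},
)
print(response.status_code)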
If the X-Api-Key header is not provided, or if an invalid token is used, the API will return a 401 Unauthorized error. This response indicates that authentication is required to access the requested resource.
Example Response:
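Continuing the sketch above, a rejected request can be recognized by its status code; the error body shape in the comment is an assumption, not the platform's exact schema:

if response.status_code == 401:
    # e.g. {"statusCode": 401, "message": "Unauthorized"} -- illustrative shape
    print(response.json())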
The Test Run View displays and visualizes the data from all the probes selected for that test run. It includes an overview of the test run status, a Sankey diagram, and the probes table, which provides an overview of the results for each probe. From this page, you can cancel an ongoing test run or rerun a completed test run.
In the continuation of this page, you will find detailed explanations of the sections within the Test Run View and the data they contain.
A Sankey diagram visualizes the flow of data from one node to another, with the width of the flow representing the quantity or magnitude of the data. In this context, the data represents the test cases executed against your target.
The diagram illustrates the flow from the probe categories to specific probes visualizing the ratio of executed test cases between probes.
From each probe, the flow connects to the passed, failed, and error test case outcomes, where the width of each flow indicates the number of test cases. The passed, failed, and error nodes show the total number of corresponding test cases in the test run.
The gradient from green to red visually represents the ratio of failed to passed test cases for each node. Nodes with fewer failed test cases appear greener, while those with more failed test cases appear redder. This provides a clear visual summary of the test run results.
In the Test Run View, the Probes Table lists all probes executed in that test run. The probes are grouped by category and display the total number of executed test cases, along with the counts of passed, failed and error test cases. A progress bar is also visible.
The Test Run History provides a comprehensive list of all test runs associated with a single target, including those currently in progress. By default, test runs are sorted chronologically, with the most recent run appearing at the top.
The test run history table includes:
Test run's name.
Date and time of the execution.
Probes included in the test run.
The result, which displays the total number of passed, failed, and error test cases across all probes.
You can filter test runs by the following criteria:
Name: Search for a test run with the specific name.
Status: Filter by test run status.
Results: Filter test runs to include those with at least one passed or one failed test case.
Probes: Show test runs with at least one probe from selection.
REST OpenAPI specification.
Once you start your first probes, the Overview page will begin populating with data. The dashboard provides a quick view of your target's metrics, delivering real-time insights into recent probe runs and their outcomes.
In the top left corner of the overview, the Target Risk Surface is displayed. It shows the total number of simulated attacks and successful attacks across different vulnerabilities tested against your target. The results are aggregated from the latest available probe runs of each probe.
The semi-circle chart indicates the total risk of your target, taking into account the number of failed test cases, their severity, and their expected probability. A lower score is preferable. Toggling the chart switches it to a time series view, allowing you to track your application's risk level over time.
You can learn more about the importance and benefits of prompt hardening, along with use case comparisons to guardrails and our benchmark, in our blog post.
Navigate to the Personal Access Tokens page.
On the Personal Access Tokens page, the tokens table lists all previously created tokens with the following details:
Go to the Users page in the Organization section of the navigation bar. If you haven't already, make sure to create your organization within Probe by setting your organization’s name.
Next, navigate to the page.
Integrations work at the organization level, allowing the Probe platform to connect with various applications and tools (e.g., Jira for project management) for smoother integration with your existing workflows.
To learn how to generate an API key, visit the official ServiceNow documentation.
To view your User ID after generating the API key, click the information button next to the User field on the API key page. For more details, refer to step 2.e. in the previously linked documentation.
To initiate a new test run, click the "New Test Run +" button on the Overview or Test Runs page.
To stop a test run in progress, open the Test Run View and click the "Cancel Test" button. This will abort the test run and stop all probes. All attacks executed up to that point will remain visible and will be included in the results.
You can re-run an existing test run by clicking the "Re-Run Test" button in the top right corner of the Test Run View. A test run cannot be rerun until it is either completed or stopped.
The test run results are displayed in the Test Run View, accessible by selecting the test run from either the Overview or Test Run History page.
To learn how to obtain your personal access token, refer to the Personal Access Tokens section in the documentation.
At the top of the page, you will find the total number of test cases and the number of failed and error test cases from all included probes. The execution date and time are displayed next to the status and progress bar. A report for the test run can also be generated by clicking the Generate Report button in the top right corner.
By clicking the arrowhead on the right, you can navigate to the details for the selected probe within the test run.
The current test run .
Clicking on the "Details" button will open the Test Run View of the selected test run.
The latest test runs performed on the target are listed here, with the most recent run displayed at the top. Each entry shows the test run’s name and its current progress, represented as the percentage of completed tests out of the total scheduled. Clicking on a test run will open the Test Run View.
In this section, probes are organized by category, displaying results from their most recent execution. Each probe card provides the probe’s name, the date and time of the last run, and a summary of the results. Selecting a probe card will open the corresponding Probe Run View.
Welcome to the Probe Product Updates! Stay informed about the latest improvements, new features, and important updates to the platform.
Click on a month below to view the full details of updates for that period:
Your input is the driving force behind the updates. If you have suggestions or encounter any issues, feel free to contact us or submit feedback directly through the platform. Together, we’ll make Probe even better!
We’ve introduced a major new feature to our platform: Prompt Hardening. This feature allows you to automatically harden your system prompt with security and safety instructions, based on our red teaming knowledge and probe run results.
Subscription details have been added to the Organization Settings page for better tracking and usage planning. This includes information about your current credit balance, subscription plan, billing cycle, credit renewal date, the amount of credits per renewal, and the subscription expiration date.
You now have the ability to reinvite users and copy the invitation link for those who haven’t yet accepted the invitation to the organization.
Tooltip support has been added to the probe optimization dialogs.
You can explore the use case, see how it works, and view benchmarks in our blog post.
To get a full overview of the feature, check out the dedicated documentation page.
You now have the ability to export the Probe Result Table in both CSV and JSON formats, making it easier to integrate with other tools and systems for further analysis or reporting. The export will preserve any filters you've applied to the table.
Additional filtering options and a reset filter button have been added to the Test Run History.
Integrating CI/CD with the SplxAI Platform streamlines the process of continuously testing and securing your generative AI applications. Automating security and safety testing ensures that vulnerabilities are detected early in the development cycle, reducing the risk of their exploitation.
We provide a variety of CI/CD examples to help you integrate and automate tests using the SplxAI Platform across different platforms:
Azure DevOps
Bitbucket
GitHub
GitLab
Jenkins
Platform-Independent Examples:
Bash
You can find these examples in the GitHub repository. Explore the repository to find scripts specific to your CI/CD tool.
We've expanded our integration options with:
LLM:
Gemini
Bedrock
OpenAI Assistant
LLM Development Platform:
Dify AI
In addition to English, Probe can now test your target against attacks in the chatbot's default language. With support for over 60 languages, we provide more accurate and localized security assessments.
With Q&A Probe, you can test your application’s ability to respond correctly to questions you define. Probe will reformulate your questions, send them to your target, and check if the response content matches the correct answers you’ve set. Upload your question-answer combinations via CSV, provide a chatbot description, and let Probe handle the testing.
You can now create Jira issues directly from the Probe's result page, with the autogenerated summary of your Probe run included in the issue. This makes it easier for your team to track any vulnerabilities found through automated red teaming.
Upload file option for easier Probe configurations (e.g., for RAG precision).
Extended REST API integration with support for Open and Close Session endpoints.
We’ve added several Platform and LLM integrations that make connecting Probe with your GenAI applications seamless in just a few clicks! We now support:
Platforms:
Microsoft Teams
Slack
LLMs:
Azure OpenAI
Azure ML
Anthropic
Hugging Face
OpenAI
Mistral
You can now create custom probes for testing your specific use case, whether it’s security or safety-related. Add a description, set the related allowed and banned behaviors, and the Platform will handle the rest.
Our Red Team continues to update our probe datasets and implement new strategies and variations, ensuring your applications are tested against the latest threats.
For a detailed understanding of the Custom Probe, watch our feature video.
We’ve released our public API, which currently has the following functionalities:
Trigger a new test run.
Cancel an ongoing test run.
Get the status of a test run.
Generate a PDF report of a test run execution.
For detailed instructions on how to interact with the API, please refer to the API Reference page.
The settings page has been refactored to support new features. It’s no longer a single page, but instead divided into separate sections, each with its own page. The new structure includes:
Account Settings
Personal Access Tokens
General
Users
Integrations
Subscription
The Risk Surface section on the overview page now includes an option to display a time series chart of your target’s risk surface. This chart allows you to track how your target's risk has changed over time, reflecting the impact of any updates you’ve made on the target or new vulnerabilities discovered as a result of probe improvements.
The number of test cases that finished with Error has been added to all areas where results are displayed, ensuring better transparency and consistency.
Detection time for each test case is now included in the probe run table for more detailed insights.
Probe costs in credits are now visible in the Probe Details modal on the Probe Catalog page.
The target selection has been relocated for better distinction from the navigation bar.
Test and probe run titles are now included in the breadcrumbs, making it easier to navigate back up.
In addition, we’ve updated the platform’s UI to support API configuration. Target IDs and probe IDs are now visible within the UI. You can also directly from the platform.
To explore the available options, visit the .
Generate a PDF report for a completed Test Run. Visit the Test Run Report documentation page for more information about the report.
Test Run id.
Binary PDF data
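A minimal sketch of downloading the report, assuming a hypothetical endpoint path and the X-Api-Key header described in the authentication section; check the REST OpenAPI specification for the exact URL:

import requests

# The response body is the binary PDF data.
response = requests.get(
    "https://api.example.com/v1/test-runs/<test_run_id>/report",  # hypothetical path
    headers={"X-Api-Key": "<your-personal-access-token>"},
)
with open("test_run_report.pdf", "wb") as f:
    f.write(response.content)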
Trigger a new test run for a specified target with a predefined set of probes configured via the SplxAI Platform UI. To learn more about Test Runs and see a visual reference, visit the Test Run documentation page. A request sketch follows the parameter descriptions below.
Request payload to trigger the execution of a Test Run.
The id of the Target for which the Test Run will be triggered.
The ids of the Probes that will be used in a test run.
Name of the Test Run.
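Putting these parameters together, a trigger request might look like the following sketch; the endpoint path and exact field names are assumptions, so consult the REST OpenAPI specification for the real schema:

import requests

payload = {
    "targetId": "<target-id>",                     # the Target to run against
    "probeIds": ["<probe-id-1>", "<probe-id-2>"],  # probes configured via the UI
    "name": "Nightly security run",                # name of the Test Run
}
response = requests.post(
    "https://api.example.com/v1/test-runs",  # hypothetical endpoint path
    json=payload,
    headers={"X-Api-Key": "<your-personal-access-token>"},
)
print(response.status_code, response.json())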
Get Test Run execution status. Learn more about Test Run statuses on the Test Run documentation page.
Test Run id.
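A status check might look like this sketch, again with a hypothetical endpoint path; the possible statuses (Pending, Running, Finished, Canceled, Error) are described on the Test Run documentation page:

import requests

response = requests.get(
    "https://api.example.com/v1/test-runs/<test_run_id>/status",  # hypothetical path
    headers={"X-Api-Key": "<your-personal-access-token>"},
)
print(response.json())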