Messages
The Messages page provides a detailed log of all input and output messages processed by AI Runtime Protection. It lets you inspect each message, see which rules were applied, and review why a message was flagged or marked safe.

Message Table
The main table lists recent messages, with the following fields:
Timestamp - when the message was processed.
Direction - whether the message was Input (user) or Output (system).
Request type - whether the request was sync or async.
Latency - the evaluation time for this message.
User ID - if provided, the identifier of the end user who sent the message.
Rules Applied - the number of rules that evaluated this message.
Threats flagged - which rule(s) flagged the message (if any).
Status - whether the message was SAFE or FLAGGED.
Details - open a detailed view of the message.
You can also filter messages by Direction, Request type, Threats flagged, Status, or User ID. Use the Page size option to increase or decrease the number of results per page.
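As a rough illustration of how these filters combine, the sketch below applies them to rows held in memory. It assumes the rows have been copied into Python objects. The MessageRow class, its field names, the filter_rows helper, the AND behavior between filters, and the sample values are all hypothetical and are not part of the product's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MessageRow:
    """Hypothetical in-memory copy of one row from the message table."""
    timestamp: str
    direction: str            # "Input" or "Output"
    request_type: str         # "sync" or "async"
    status: str               # "SAFE" or "FLAGGED"
    user_id: Optional[str] = None
    threats_flagged: List[str] = field(default_factory=list)


def filter_rows(rows, direction=None, request_type=None,
                threat=None, status=None, user_id=None):
    """Keep rows that match every filter that is set (assumed AND semantics)."""
    kept = []
    for row in rows:
        if direction is not None and row.direction != direction:
            continue
        if request_type is not None and row.request_type != request_type:
            continue
        if threat is not None and threat not in row.threats_flagged:
            continue
        if status is not None and row.status != status:
            continue
        if user_id is not None and row.user_id != user_id:
            continue
        kept.append(row)
    return kept


# Made-up sample rows, for illustration only.
rows = [
    MessageRow("2025-01-01T10:00:00Z", "Input", "sync", "FLAGGED",
               "user-1", ["Off Topic"]),
    MessageRow("2025-01-01T10:00:01Z", "Output", "sync", "SAFE", "user-1"),
    MessageRow("2025-01-01T10:05:00Z", "Input", "async", "SAFE"),
]

flagged_inputs = filter_rows(rows, direction="Input", status="FLAGGED")
print([r.timestamp for r in flagged_inputs])
```

Leaving a filter unset (None) means it does not restrict the result, which mirrors leaving a filter empty in the table.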
Message Details
Click Details on a message to open the full view. This shows all metadata and the original message content.


Metadata
Metadata repeats the fields from the message table (timestamp, direction, request type, latency, user ID, rules applied, threats flagged, and status) and adds the unique message ID that identifies this specific message.
Message Content
Displays the input or output text that AI Runtime Protection evaluated.
Detection Details
This panel has two tabs:
Detected - shows only the rules that triggered on this message. For each rule you’ll see the Rule name, Threshold, Similarity, and, where applicable, topic-level results (e.g., Off Topic → Crypto Investment with its similarity score).
Safe - shows the rules that were applied but did not trigger. Each entry includes the Rule name with its Threshold and the Similarity score that fell below the threshold.
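The relationship between the two tabs can be sketched as a comparison of each rule's similarity score against its threshold. This is a minimal sketch, not the product's implementation: the RuleResult class, the split_by_tab helper, the second rule name, and all numeric values are hypothetical. The text above only states that rules on the Safe tab scored below the threshold, so treating a score exactly equal to the threshold as detected is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RuleResult:
    """Hypothetical record of one rule's evaluation of a message."""
    rule_name: str
    threshold: float
    similarity: float


def split_by_tab(results):
    """Split results into (detected, safe) lists.

    Assumption: a score equal to the threshold counts as detected.
    """
    detected = [r for r in results if r.similarity >= r.threshold]
    safe = [r for r in results if r.similarity < r.threshold]
    return detected, safe


# Made-up values; "Prompt Injection" is a hypothetical rule name.
results = [
    RuleResult("Off Topic", threshold=0.70, similarity=0.84),
    RuleResult("Prompt Injection", threshold=0.80, similarity=0.31),
]

detected, safe = split_by_tab(results)
print("Detected:", [r.rule_name for r in detected])
print("Safe:", [r.rule_name for r in safe])
```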
Usage
The Messages page is useful for:
Debugging why a specific message was flagged.
Reviewing real examples of policy violations.
Inspecting similarity scores to fine-tune thresholds.
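One way to approach threshold tuning is to note the similarity scores of messages you have reviewed by hand and check how different candidate thresholds would have treated them. The sketch below assumes you have recorded, for a single rule, each message's similarity score and your own verdict on whether it was a real violation. The sweep function, the data layout, and the numbers are hypothetical, and the "at or above threshold means flagged" comparison is an assumption.

```python
def sweep(reviewed, thresholds):
    """For each candidate threshold, count flagged, false-positive, and missed messages.

    reviewed: list of (similarity, is_real_violation) pairs from manual review.
    """
    summary = {}
    for t in thresholds:
        flagged = [(s, bad) for s, bad in reviewed if s >= t]
        false_positives = sum(1 for _, bad in flagged if not bad)
        missed = sum(1 for s, bad in reviewed if bad and s < t)
        summary[t] = {
            "flagged": len(flagged),
            "false_positives": false_positives,
            "missed": missed,
        }
    return summary


# Made-up review data for one rule: (similarity, is_real_violation).
reviewed = [(0.91, True), (0.78, True), (0.74, False), (0.52, False)]

summary = sweep(reviewed, [0.70, 0.75, 0.80])
for threshold, counts in summary.items():
    print(threshold, counts)
```

With this sample data, the middle threshold flags both real violations without flagging the benign messages, while the lowest one adds a false positive and the highest one misses a violation.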