Messages
The Messages page provides a detailed log of all input and output messages processed by a guardrail. It allows you to inspect each message, see which guards were applied, and review why a message was flagged or marked safe.

Message Table
The main table lists recent messages, with the following fields:
Timestamp - when the message was processed.
Direction - whether the message was Input (user) or Output (system).
Request type - whether the request was sync or async.
Latency - the evaluation time for this message.
User ID - if provided, the identifier of the end user who sent the message.
Guards run - the number of guards that evaluated this message.
Threats flagged - which guard(s) flagged the message (if any).
Status - whether the message was SAFE or FLAGGED.
Details - opens a detailed view of the message.
You can also filter messages by Direction, Request type, Threats flagged, Status, or User ID. Use the Page size option to increase or decrease the number of results per page.
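If you export or copy rows from the table for offline review, the same filters and paging can be reproduced in a few lines. The sketch below is illustrative only: `MessageRow`, `filter_rows`, and `page_of` are hypothetical names, the record layout simply mirrors the table columns, and the sample values are made up. It is not an official schema or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MessageRow:
    """Hypothetical record with one attribute per table column."""
    timestamp: str
    direction: str                 # "Input" or "Output"
    request_type: str              # "sync" or "async"
    latency: float                 # unit is not stated in the docs
    user_id: Optional[str]         # None when no user ID was provided
    guards_run: int
    threats_flagged: List[str] = field(default_factory=list)
    status: str = "SAFE"           # "SAFE" or "FLAGGED"


def filter_rows(rows, direction=None, request_type=None,
                threat=None, status=None, user_id=None):
    """Keep rows matching every filter that is set (None means 'any')."""
    kept = []
    for row in rows:
        if direction is not None and row.direction != direction:
            continue
        if request_type is not None and row.request_type != request_type:
            continue
        if threat is not None and threat not in row.threats_flagged:
            continue
        if status is not None and row.status != status:
            continue
        if user_id is not None and row.user_id != user_id:
            continue
        kept.append(row)
    return kept


def page_of(rows, page, page_size):
    """Return one page of rows; pages are numbered from 1."""
    start = (page - 1) * page_size
    return rows[start:start + page_size]


# Made-up sample data for illustration only.
rows = [
    MessageRow("2025-01-01T10:00:00Z", "Input", "sync", 12.0, "user-1", 3,
               ["Off Topic"], "FLAGGED"),
    MessageRow("2025-01-01T10:00:05Z", "Output", "sync", 9.5, "user-1", 3),
    MessageRow("2025-01-01T10:01:00Z", "Input", "async", 15.2, None, 2),
]

flagged_inputs = filter_rows(rows, direction="Input", status="FLAGGED")
first_page = page_of(rows, page=1, page_size=2)
```

Filters combine with AND in this sketch, so setting both Direction and Status narrows the result to rows that satisfy both.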
Message Details
Click Details on a message to open the full view. It shows the message metadata, the original message content, and the detection details for each guard that ran.

Metadata
Metadata repeats the fields from the message table (timestamp, direction, request type, latency, user ID, guards run, threats flagged, and status) and adds the unique message ID that identifies this specific message.
Message Content
Displays the actual input or output text that was evaluated by the guardrail.
Detection Details
This panel has two tabs:
Detected - shows only the guards that triggered on this message. For each guard you’ll see the Guard name, Threshold, Similarity, and, where applicable, topic-level results (e.g., Off Topic → Crypto Investment with its similarity score).
Safe - shows the guards that ran but did not trigger. Each entry includes the Guard name with its Threshold and the Similarity score that fell below the threshold.
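The split between the two tabs can be pictured as a comparison of each guard's Similarity against its Threshold. The following is a minimal sketch under stated assumptions: `GuardResult` and `split_by_tab` are hypothetical names, the numbers are invented, and treating a score exactly equal to the threshold as a trigger is an assumption (the page only states that Safe entries fell below the threshold).

```python
from dataclasses import dataclass


@dataclass
class GuardResult:
    """Hypothetical per-guard result, mirroring the fields shown in the panel."""
    guard_name: str
    threshold: float
    similarity: float


def split_by_tab(results):
    """Split guard results into (detected, safe) lists.

    Safe: similarity fell below the threshold (as described on this page).
    Detected: everything else. Counting equality as a trigger is an assumption.
    """
    detected = [r for r in results if r.similarity >= r.threshold]
    safe = [r for r in results if r.similarity < r.threshold]
    return detected, safe


# Invented example values; "Example Guard" is a placeholder name.
results = [
    GuardResult("Off Topic", threshold=0.80, similarity=0.91),
    GuardResult("Example Guard", threshold=0.75, similarity=0.40),
]

detected, safe = split_by_tab(results)
for r in detected:
    print(f"Detected: {r.guard_name} ({r.similarity} vs threshold {r.threshold})")
for r in safe:
    print(f"Safe: {r.guard_name} ({r.similarity} vs threshold {r.threshold})")
```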
Usage
The Messages page is useful for:
Debugging why a specific message was flagged.
Reviewing real examples of policy violations.
Inspecting similarity scores to fine-tune thresholds.
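For the threshold-tuning use case, one simple approach is to collect the similarity scores a guard produced on a set of messages and count how many would trigger at each candidate threshold. The sketch below assumes you have noted those scores by hand from the Detection Details panel; `flagged_at` and `sweep` are hypothetical helpers, the scores are invented, and treating "similarity at or above threshold" as a trigger is an assumption.

```python
def flagged_at(similarities, threshold):
    """Count scores that would trigger at this threshold (assumes >= triggers)."""
    return sum(1 for s in similarities if s >= threshold)


def sweep(similarities, candidates):
    """Map each candidate threshold to the number of messages it would flag."""
    return {t: flagged_at(similarities, t) for t in candidates}


# Invented similarity scores for one guard across five messages.
scores = [0.42, 0.58, 0.63, 0.77, 0.91]

counts = sweep(scores, [0.5, 0.6, 0.8])
for threshold, count in counts.items():
    print(f"threshold {threshold}: {count} of {len(scores)} messages flagged")
```

Raising the threshold flags fewer messages and lowering it flags more, so a sweep like this shows how sensitive a guard is before you change its setting.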