Messages

The Messages page provides a detailed log of all input and output messages processed by a guardrail. It allows you to inspect each message, see which guards were applied, and review why a message was flagged or marked safe.

Figure 1: Messages Page

Message Table

The main table lists recent messages, with the following fields:

  • Timestamp - when the message was processed.

  • Direction - whether the message was Input (user) or Output (system).

  • Request type - whether the request was sync or async.

  • Latency - the evaluation time for this message.

  • User ID - if provided, the identifier of the end user who sent the message.

  • Guards run - the number of guards that evaluated this message.

  • Threats flagged - which guard(s) flagged the message (if any).

  • Status - whether the message was SAFE or FLAGGED.

  • Details - opens a detailed view of the message.

You can also filter messages by Direction, Request type, Threats flagged, Status, or User ID. Use the Page size option to increase or decrease the number of results per page.

  • Messages shown are limited to the time frame you select in the calendar at the top of the page, just as on the Overview page.

  • Click the Refresh data button to fetch the latest results immediately. The Messages view also updates periodically in the background.

Message Details

Click Details on a message to open the full view. This shows all metadata and the original message content.

Figure 2: Message Details

Metadata

Metadata shows all the information already available in the message table (timestamp, direction, request type, latency, user ID, guards run, threats flagged, and status), along with the unique message ID that identifies this specific message.

Message Content

Displays the actual input or output text that was evaluated by the guardrail.

Detection Details

This panel has two tabs:

  • Detected - shows only the guards that triggered on this message. For each guard you’ll see the Guard name, Threshold, Similarity, and, where applicable, topic-level results (e.g., Off Topic → Crypto Investment with its similarity score).

  • Safe - shows the guards that ran but did not trigger. Each entry includes the Guard name with its Threshold and the Similarity score that fell below the threshold.

The thresholds shown reflect the policy configuration at the time the message was evaluated.
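
The split between the two tabs can be pictured with a short sketch. Everything in it is illustrative: `GuardResult`, `split_tabs`, and the field names are hypothetical and not part of the product's API, and the guard names and scores are made-up sample values. The sketch assumes that a guard triggers when its similarity is at or above its threshold (this page only states that Safe entries fall below the threshold) and that a message is FLAGGED when at least one guard triggers.

```python
from dataclasses import dataclass


@dataclass
class GuardResult:
    """One guard's evaluation of a message (hypothetical structure)."""
    guard: str
    threshold: float
    similarity: float


def split_tabs(results):
    """Partition guard results the way the Detected and Safe tabs do.

    Assumption: a guard triggers when similarity >= threshold.
    """
    detected = [r for r in results if r.similarity >= r.threshold]
    safe = [r for r in results if r.similarity < r.threshold]
    return detected, safe


# Made-up sample values for illustration only.
results = [
    GuardResult("Off Topic", threshold=0.70, similarity=0.83),
    GuardResult("Prompt Injection", threshold=0.80, similarity=0.12),
]

detected, safe = split_tabs(results)

# Assumption: any triggered guard marks the whole message as FLAGGED.
status = "FLAGGED" if detected else "SAFE"
```

In this sample, "Off Topic" would appear under Detected and "Prompt Injection" under Safe, so the message status would read FLAGGED.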

Usage

The Messages page is useful for:

  • Debugging why a specific message was flagged.

  • Reviewing real examples of policy violations.

  • Inspecting similarity scores to fine-tune thresholds.
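
As a rough sketch of the threshold fine-tuning use case, you could note the similarity scores from a handful of messages, mark by hand whether each one should have been flagged, and compare candidate thresholds offline. The helper `evaluate_threshold` and the sample data below are hypothetical and not provided by the product; the sketch again assumes that a guard triggers when similarity is at or above the threshold.

```python
def evaluate_threshold(samples, threshold):
    """Count misclassifications for one candidate threshold.

    samples: list of (similarity, should_flag) pairs collected by hand.
    Assumption: a guard triggers when similarity >= threshold.
    """
    false_positives = sum(
        1 for similarity, should_flag in samples
        if similarity >= threshold and not should_flag
    )
    false_negatives = sum(
        1 for similarity, should_flag in samples
        if similarity < threshold and should_flag
    )
    return false_positives, false_negatives


# Made-up scores for a single guard, labeled after manual review.
samples = [
    (0.91, True),
    (0.78, True),
    (0.74, False),
    (0.55, False),
    (0.40, False),
]

for candidate in (0.70, 0.75, 0.80):
    fp, fn = evaluate_threshold(samples, candidate)
    print(f"threshold={candidate:.2f} false_positives={fp} false_negatives={fn}")
```

With these sample values, 0.70 flags one message that should have passed, 0.80 misses one that should have been flagged, and 0.75 separates the two groups cleanly.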
