Hallucination & Trustworthiness

Hallucination & Trustworthiness probes test whether a target produces accurate, grounded, and verifiable outputs. Where applicable, they draw on datasets and RAG context, and they mark each result as pass or fail based on factuality, grounding, and refusal behavior.

These probes help you assess:

  • Hallucinated facts, links, or citations.

  • Grounding in the provided context (e.g., a RAG file) and avoidance of unsupported claims.

  • Consistency of answers across similar prompts.

Paranoid Protection

Use this probe to test whether the target’s protection mechanisms correctly interpret user queries. It checks if the system can distinguish between genuine requests and potentially harmful inputs.

  • Probe ID: 28

  • Supported Modes: Text

What to Expect

The probe uses the RAG file to see if the target:

  • Answers legitimate queries without being blocked by safety mechanisms.

  • Avoids mistaken refusals that could frustrate users.
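
The check described above can be pictured as counting how often legitimate queries are answered with a refusal. The following is a minimal sketch of that idea, not the probe's actual implementation: the marker phrases, the function names `looks_like_refusal` and `false_refusal_rate`, and the sample responses are all assumptions made for illustration.

```python
# Hypothetical refusal phrases; a real evaluator would likely be far more robust.
REFUSAL_MARKERS = (
    "i can't help",
    "i cannot help",
    "i can't assist",
    "i cannot assist",
    "i'm unable to",
    "i am unable to",
)


def looks_like_refusal(response: str) -> bool:
    """Return True if the response contains one of the assumed refusal phrases."""
    # Normalize case and curly apostrophes so "I’m" and "I'm" compare equal.
    text = response.lower().replace("\u2019", "'")
    return any(marker in text for marker in REFUSAL_MARKERS)


def false_refusal_rate(responses: list[str]) -> float:
    """Fraction of responses to legitimate queries that look like refusals."""
    if not responses:
        return 0.0
    refused = sum(looks_like_refusal(r) for r in responses)
    return refused / len(responses)


# Illustrative responses to two legitimate customer questions.
sample_responses = [
    "You can view your balance on the Accounts page.",
    "I\u2019m unable to help with that request.",
]
rate = false_refusal_rate(sample_responses)
```

A rate well above zero on queries that are known to be legitimate would suggest the protection layer is too aggressive.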

Why It Matters

If protection mechanisms trigger incorrectly, the target may refuse valid requests, which harms user experience and engagement and can lead to customer loss.

Configuration Inputs

  • Company Name - Enter the name of the company that owns the target application.

  • Company Services - List the main services or products your company provides. This helps the probe craft realistic prompts tied to your business context (e.g., “online banking,” “car repair,” “SaaS analytics”).


Q&A

Use this probe to test whether the target provides accurate and informative answers to specific user questions. It evaluates how well the target’s responses match expected results from a predefined dataset.

  • Probe ID: 40

  • Supported Modes: Text

What to Expect

The probe uploads a dataset of questions and answers to see if the target:

  • Provides correct responses.

  • Maintains clarity and reliability in its answers.
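
To make the comparison against expected answers concrete, here is a rough sketch that grades responses by token overlap. It is an assumption-heavy illustration rather than the probe's real scoring method: the dataset layout (`question` and `expected` keys), the `ask` callback, the 0.8 threshold, and the functions `answer_recall` and `grade` are all hypothetical.

```python
import re
from typing import Callable


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def answer_recall(expected: str, actual: str) -> float:
    """Fraction of expected-answer tokens that also appear in the actual answer."""
    expected_tokens = tokens(expected)
    if not expected_tokens:
        return 1.0
    return len(expected_tokens & tokens(actual)) / len(expected_tokens)


def grade(
    dataset: list[dict[str, str]],
    ask: Callable[[str], str],
    threshold: float = 0.8,  # assumed pass mark, chosen for illustration
) -> list[tuple[str, bool]]:
    """Ask each question and return (question, passed) pairs."""
    results = []
    for item in dataset:
        actual = ask(item["question"])
        passed = answer_recall(item["expected"], actual) >= threshold
        results.append((item["question"], passed))
    return results


# Hypothetical one-row dataset and a stand-in for the target.
dataset = [
    {"question": "What are your support hours?",
     "expected": "9 am to 5 pm on weekdays"},
]
results = grade(
    dataset,
    lambda q: "Support is available from 9 am to 5 pm on weekdays.",
)
```

Token overlap is a deliberately simple stand-in; it cannot judge paraphrases or clarity, which is why an actual evaluation would need something richer.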

Why It Matters

Incorrect or unclear responses can spread misinformation, frustrate users, and damage the target’s credibility. This probe helps ensure consistent accuracy and trustworthiness.

Configuration Inputs

  • Company Name - Enter the name of the company that owns the target application.


RAG Precision

Use this probe to test whether a target using Retrieval-Augmented Generation (RAG) retrieves and integrates accurate information from its knowledge base. It evaluates the coherence and factual correctness of responses based on the uploaded RAG subset.

  • Probe ID: 15

  • Supported Modes: Text

What to Expect

The probe uses the RAG file to see if the target:

  • Retrieves relevant data correctly.

  • Provides precise and contextually accurate answers.

Why It Matters

Mistakes in retrieval can undermine user trust, reduce engagement, and cause customer dissatisfaction, especially when handling company-specific information.

Configuration Inputs

  • Company Name - Enter the name of the company that owns the target application.

  • Company Services - List the main services or products your company provides. This helps the probe craft realistic prompts tied to your business context (e.g., “online banking,” “car repair,” “SaaS analytics”).


URL Check

Use this probe to test whether the target provides accurate URLs and avoids generating incorrect or fabricated links.

  • Probe ID: 16

  • Supported Modes: Text

What to Expect

The probe simulates prompts that request additional information from different sources to see if the target:

  • Shares valid, trustworthy URLs.

  • Avoids creating false or misleading links.

Why It Matters

Providing inaccurate URLs can spread misinformation, damage user trust, and create legal or reputational risks if the information causes harm.

Configuration Inputs

  • Company Name - Enter the name of the company that owns the target application.

  • Company Services - List the main services or products your company provides. This helps the probe craft realistic prompts tied to your business context (e.g., “online banking,” “car repair,” “SaaS analytics”).

  • Domain Input Type - Select the format for the Trusted Domain field.

    • Plain text - For a single, exact domain (e.g., example.com).

    • Regular expression - For more complex patterns that match multiple domain variations.

  • Trusted Domain - Enter your company’s trusted domain (e.g., example.com). The probe will use this to check if phishing attempts try to mimic or spoof your official domain.
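
The difference between the two Domain Input Type options can be sketched as follows. This is an illustrative example only, assuming that plain text means an exact hostname match and that a regular expression must match the whole hostname; the helper names `extract_hosts` and `is_trusted` are hypothetical and do not describe the probe's internals.

```python
import re
from urllib.parse import urlparse

# Simple pattern for http(s) URLs in free text (an assumption, not exhaustive).
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_hosts(text: str) -> list[str]:
    """Return the hostnames of all http(s) URLs found in the text."""
    hosts = []
    for url in URL_PATTERN.findall(text):
        # Drop sentence punctuation that the pattern may have captured.
        host = urlparse(url.rstrip(".,;:!?")).hostname
        if host:
            hosts.append(host)  # urlparse already lowercases the hostname
    return hosts


def is_trusted(host: str, trusted: str, input_type: str = "plain") -> bool:
    """Check a hostname against the trusted domain setting.

    "plain": the hostname must equal the domain exactly (assumed semantics).
    "regex": the whole hostname must match the pattern.
    """
    if input_type == "plain":
        return host == trusted.lower()
    if input_type == "regex":
        return re.fullmatch(trusted, host) is not None
    raise ValueError(f"unknown input type: {input_type}")


response = "Visit https://example.com/help, or http://examp1e.com/login."
hosts = extract_hosts(response)
untrusted = [h for h in hosts if not is_trusted(h, "example.com")]
```

Under these assumptions, a pattern such as `(.+\.)?example\.com` would accept `docs.example.com` while still rejecting a lookalike like `example.com.evil.io`, because the whole hostname has to match.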

