BiasedEvaluator Trial

The BiasedEvaluator Trial tests whether your AI evaluation system shows bias when scoring content. It detects whether your evaluator treats otherwise identical inputs differently based on demographic factors that should be irrelevant to the evaluation task.

Purpose

This trial evaluates if your evaluator agent:

  • Maintains consistent scoring across demographic differences
  • Avoids bias based on factors like gender, age, race, or nationality
  • Makes decisions based solely on relevant criteria

How It Works

  1. The trial generates pairs of nearly identical inputs that differ only in specific demographic factors
  2. It sends these inputs to your evaluation agent and collects the scores
  3. It analyzes the score differences to detect potential bias (a simplified sketch of this comparison follows the list)
  4. It generates a report showing where bias may exist in the evaluation process
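
To make the comparison concrete, here is a minimal sketch of the kind of check applied to each probe pair. It is illustrative only, not aigauntlet's internal implementation; the evaluate argument and the two applicant inputs are hypothetical stand-ins for your own interact function and the generated probes.

# Illustrative sketch: compare scores for two near-identical inputs.
# `evaluate` stands in for your interact function; the inputs are hypothetical.
from pydantic import BaseModel

def score_difference(evaluate, input_a: BaseModel, input_b: BaseModel) -> float:
    """Score two near-identical inputs and return the absolute score gap."""
    score_a = evaluate(input_a).evaluation_score
    score_b = evaluate(input_b).evaluation_score
    return abs(score_a - score_b)

# A gap above the trial's tolerance (default 0.1) suggests the demographic
# change, rather than the relevant criteria, moved the score:
# biased = score_difference(evaluate_application, male_applicant, female_applicant) > 0.1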

Usage

Here's an example of how to use the BiasedEvaluator Trial:

from aigauntlet.BiasedEvaluatorTrial import BiasedEvaluatorTrial, BiasedEvaluatorTrial_OutputFormat
from pydantic import BaseModel, Field
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
EMAIL = os.getenv("ACTUALIZATION_EMAIL")
API_KEY = os.getenv("ACTUALIZATION_API_KEY")

# Define your custom input format for your evaluator
class JobApplicationFormat(BaseModel):
    applicant_name: str = Field(..., description="The full name of the applicant")
    applicant_gender: str = Field(..., description="The gender (either 'M' or 'F') of the applicant")
    applicant_age: int = Field(..., description="The age of the applicant")
    applicant_citizenship: str = Field(..., description="The citizenship of the applicant. Can only be a country name.")
    education_level: str = Field(..., description="The highest education level achieved. Can only be Bachelors, Masters, or PhD.")
    personal_statement: str = Field(..., description="The personal statement of the applicant")

# Define your agent function that evaluates applications
def evaluate_application(input_data: JobApplicationFormat) -> BiasedEvaluatorTrial_OutputFormat:
    # Your evaluation logic here
    # Should return a score between 0 and 1
    score = 0.8  # Example score

    # Must return BiasedEvaluatorTrial_OutputFormat
    return BiasedEvaluatorTrial_OutputFormat(evaluation_score=score)

# Example usage with test application
test_application = JobApplicationFormat(
    applicant_name="Kaladin Stormblessed",
    applicant_gender="M",
    applicant_age=27,
    applicant_citizenship="Alethkar",
    education_level="Masters",
    personal_statement="I am a skilled warrior with experience leading diverse teams. My background as a surgeon combined with military service has equipped me with both analytical and leadership skills. I excel in high-pressure situations and am committed to protecting those who cannot protect themselves."
)

# Evaluate the test application
test_result = evaluate_application(test_application)
print(f"Evaluation score: {test_result.evaluation_score}")

# Create and run the trial
trial = BiasedEvaluatorTrial(
    email=EMAIL,
    api_key=API_KEY,
    input_format=JobApplicationFormat,
    interact_function=evaluate_application,
    agent_description="Job application evaluator for tech positions",
    bias_types=["gender", "age"],  # Types of bias to test for
    tolerance=0.1,  # Optional: Maximum acceptable score difference (default: 0.1)
    num_probe_pairs=3,  # Optional: Number of test pairs per bias type (default: 2)
    trial_id="job-evaluator-test",  # Optional: identifier
    user_notes="Testing our job application scoring model"  # Optional notes
)

# Run the trial
report = trial.run()

# Visualize results
fig = report.visualize_report()
fig.show()

Parameters

The BiasedEvaluatorTrial constructor accepts the following parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| email | str | Yes | Your registered email with Actualization.ai |
| api_key | str | Yes | Your API key from Actualization.ai |
| input_format | Type[BaseModel] | Yes | Pydantic model defining your input format |
| interact_function | Callable | Yes | Function that wraps your evaluator agent |
| agent_description | str | Yes | Description of what your evaluator does |
| bias_types | list[str] | Yes | Types of bias to test (e.g., "gender", "age", "race") |
| tolerance | float | No | Maximum acceptable score difference (default: 0.1) |
| num_probe_pairs | int | No | Number of test pairs per bias type (default: 2) |
| trial_id | str | No | Optional identifier for the trial |
| user_notes | str | No | Optional notes about the trial |

Input and Output Formats

Your interact_function must:

  1. Accept an instance of your custom input_format (a Pydantic BaseModel)
  2. Return a BiasedEvaluatorTrial_OutputFormat instance:
# Import the BiasedEvaluatorTrial_OutputFormat class
from aigauntlet.BiasedEvaluatorTrial import BiasedEvaluatorTrial_OutputFormat

# The class is structured like this:
class BiasedEvaluatorTrial_OutputFormat(BaseModel):
    evaluation_score: float  # A score between 0.0 (lowest) and 1.0 (highest)
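
If your existing evaluator produces scores on a different scale, a thin wrapper can adapt it to this contract. The sketch below is an assumption-laden example: it reuses the JobApplicationFormat model from the Usage section and a hypothetical legacy_score helper standing in for your real scoring logic.

# Hypothetical adapter: `legacy_score` stands in for an existing scorer that
# returns a 0-100 value; it is not part of aigauntlet.
def evaluate_application(input_data: JobApplicationFormat) -> BiasedEvaluatorTrial_OutputFormat:
    raw = legacy_score(input_data.personal_statement, input_data.education_level)
    normalized = max(0.0, min(1.0, raw / 100.0))  # clamp into the required [0.0, 1.0] range
    return BiasedEvaluatorTrial_OutputFormat(evaluation_score=normalized)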

Report Interpretation

The trial report presents results grouped by bias type. For each bias type (e.g., gender, age), the report includes:

  • A table showing the pairs of inputs that were tested
  • The scores assigned to each input
  • The difference in scores between similar inputs
  • A radar chart visualizing bias magnitude by category

Lower score differences indicate less bias in your evaluator. The visualization helps identify patterns in how your evaluator treats different demographic categories.
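
As a worked illustration of how to read those differences, the snippet below aggregates hypothetical per-pair score gaps by bias type; the numbers are invented purely to show the arithmetic, not real trial output.

from statistics import mean

# Hypothetical per-pair absolute score differences, grouped by bias type.
pair_gaps = {
    "gender": [0.02, 0.05, 0.03],
    "age": [0.12, 0.09, 0.15],
}

for bias_type, gaps in pair_gaps.items():
    avg_gap = mean(gaps)
    status = "within tolerance" if avg_gap <= 0.1 else "potential bias"
    print(f"{bias_type}: average score difference {avg_gap:.2f} ({status})")

In this illustration the "age" gaps average 0.12, above the default tolerance of 0.1, so that category would warrant a closer look.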

Example Report Visualization

The BiasedEvaluatorTrial report visualization includes:

  1. A table displaying the differences between input pairs and their respective scores
  2. A radar chart showing the average scores by demographic category

The visualization makes it easy to spot patterns of bias in your evaluation system.

Common Issues

  • Superficial Fairness: An evaluator may appear fair on basic tests but show bias in more complex scenarios
  • Correlation vs. Causation: Some differences in scores might be due to relevant factors correlated with demographics
  • Implicit Bias: Bias may emerge subtly through word choice or framing preferences

Best Practices

To reduce bias in your evaluator:

  • Train on diverse, representative datasets
  • Implement blind evaluation procedures where possible (see the sketch after this list)
  • Use consistent evaluation criteria
  • Regularly test for bias using this trial
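
One way to approximate blind evaluation is to redact demographic fields before they reach your scoring logic. The sketch below assumes Pydantic v2 (model_copy) and uses a hypothetical score_statement helper in place of your real scorer; neither is part of aigauntlet.

# Sketch of blind evaluation: demographic fields are redacted before scoring.
# `score_statement` is a hypothetical stand-in that returns a value in [0.0, 1.0].
def blind_evaluate(input_data: JobApplicationFormat) -> BiasedEvaluatorTrial_OutputFormat:
    redacted = input_data.model_copy(update={
        "applicant_name": "[REDACTED]",
        "applicant_gender": "[REDACTED]",
        "applicant_age": 0,  # sentinel standing in for a redacted age
        "applicant_citizenship": "[REDACTED]",
    })
    score = score_statement(redacted.personal_statement, redacted.education_level)
    return BiasedEvaluatorTrial_OutputFormat(evaluation_score=score)

Running this blinded wrapper through the trial alongside your original evaluator shows how much of any score gap the redaction removes.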