
Scenario Creation Pipeline

YuLan-OneSim provides a systematic approach to creating simulation scenarios through an automated pipeline that transforms natural language descriptions into executable simulations using specialized AI agents.

Overview

The scenario creation process consists of six main stages, each powered by a dedicated AI agent and building on the output of the previous stage to produce a complete simulation environment:

  1. ODD Protocol Generation - Interactive dialogue using ODDAgent to capture scenario requirements
  2. Agent Type Extraction - ProfileAgent identifies and defines participant types
  3. Workflow Generation - WorkflowAgent creates agent interaction patterns and directed graphs
  4. Code Generation - CodeAgent automatically synthesizes executable agent behaviors
  5. Data Generation - ProfileAgent creates agent profiles, relationships, and environment data
  6. Metrics Generation - MetricAgent defines monitoring and evaluation criteria

Let's explore each stage using a labor market simulation as our running example.

Stage 1: ODD Protocol Generation

Interactive Dialogue with ODDAgent

The ODDAgent class facilitates an interactive conversation that progressively builds a comprehensive ODD (Overview, Design concepts, Details) protocol document. Users provide natural language descriptions, and the agent asks clarifying questions until a complete specification is achieved.

ODDAgent Core Implementation

class ODDAgent(AgentBase):
    def __init__(self, model_config_name: str, sys_prompt: str = ''):
        # Initialize with flexible ODD structure
        self.scene_info = {
            "domain": "",
            "scene_name": "",
            "odd_protocol": {
                "overview": {},
                "design_concepts": {},
                "details": {}
            }
        }

    def process_user_input(self, user_input: str) -> Dict[str, Any]:
        # Update scene_info based on user input
        response = self.update_scene_info(user_input)

        if response.get('end_conversation', False):
            return {
                "domain": self.scene_info.get("domain", ""),
                "scene_name": self.scene_info.get("scene_name", ""),
                "odd_protocol": self.scene_info.get("odd_protocol", {}),
                "clarification_question": '',
                "is_complete": True
            }

        # Generate clarification questions for missing information
        clarification_result = self.generate_clarification_questions()
        return {
            "domain": self.scene_info.get("domain", ""),
            "scene_name": self.scene_info.get("scene_name", ""),
            "odd_protocol": self.scene_info.get("odd_protocol", {}),
            "clarification_question": clarification_result.get('question', ''),
            "is_complete": clarification_result.get('is_complete', False)
        }
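The dialogue loop itself is driven by the caller: feed in user text, surface the clarification question, and repeat until the protocol is complete. A minimal usage sketch (the constructor argument value is an assumption, not taken from the framework's actual entry point):

# Hedged sketch: the model config name is an assumption.
odd_agent = ODDAgent(model_config_name="my_llm_config")

user_input = "I want to simulate a job market where job seekers look for employment."
while True:
    result = odd_agent.process_user_input(user_input)
    if result["is_complete"]:
        break
    # The agent still needs information; surface its clarification question.
    print(result["clarification_question"])
    user_input = input("> ")

print(result["odd_protocol"])  # completed ODD protocol as a dict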

Example: Labor Market ODD Protocol

Starting with a user description: "I want to simulate a job market where job seekers look for employment and employers recruit candidates through various channels"

The ODDAgent progressively builds this into a comprehensive protocol:

{
  "domain": "Economics",
  "scene_name": "labor_market_simulation",
  "odd_protocol": {
    "overview": {
      "system_goal": "Simulate the job seeking and recruitment process in the labor market to study information asymmetry, signaling, matching efficiency, bias, and the impact of different recruitment strategies",
      "agent_types": "Job_Seeker, Employer, Recruitment_Channel with specific attributes and behaviors for labor market interactions",
      "environment_description": "Labor market with varying tightness, economic conditions, regulatory environment, and skill demand trends"
    },
    "design_concepts": {
      "interaction_patterns": "Job seekers and employers interact through recruitment channels, with job seekers applying for jobs and employers screening candidates",
      "communication_protocols": "Information exchange through recruitment channels about job postings, applications, and negotiations",
      "decision_mechanisms": "Job seekers evaluate opportunities based on skill match and preferences; employers make hiring decisions based on candidate evaluations"
    },
    "details": {
      "agent_behaviors": "Job seekers set goals, evaluate applications, negotiate salaries; Employers post vacancies, screen candidates, conduct interviews",
      "decision_algorithms": "Matching algorithms based on skills, preferences, and market conditions",
      "specific_constraints": "Market tightness, economic conditions, bias in screening, social network effects"
    }
  }
}

Stage 2: Agent Type Extraction

ProfileAgent Identifies Participant Types

The ProfileAgent analyzes the completed ODD protocol to identify distinct agent types and their population distributions.

Agent Type Generation Process

class ProfileAgent(AgentBase):
    def generate_agent_types(self, description):
        prompt = (
            f"Given the following description: {description}, identify or infer relevant agent types "
            "and return them as a JSON object where keys are PascalCase agent type names and values are short descriptions..."
        )

        response = self.model(formatted_prompt)
        res = parser.parse(response)
        agent_types = res.parsed
        return agent_types

    def assign_agent_portraits(self, agent_types_dict):
        # Assign social role portraits (1-5) to agent categories
        # 1=Government Official, 2=Researcher, 3=Worker, 4=Business, 5=Citizen
        ...

Example Output: Labor Market Agent Types

{
  "agent_types": {
    "JobSeeker": "An agent representing individuals seeking employment, responsible for setting job goals, evaluating job applications, negotiating salaries, and deciding on job offers",
    "Employer": "An agent representing companies offering employment, responsible for posting job vacancies, screening candidates, conducting interviews, and making hiring decisions",
    "RecruitmentChannel": "An agent facilitating interactions between job seekers and employers, responsible for disseminating job postings, filtering applications, and matching candidates"
  },
  "portrait": {
    "JobSeeker": 3,
    "Employer": 4,
    "RecruitmentChannel": 1
  }
}
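Chaining the two steps might look like the sketch below; the constructor argument is an assumption, and the description would normally be the ODD protocol produced in Stage 1:

# Hedged sketch: the model config name is an assumption.
profile_agent = ProfileAgent(model_config_name="my_llm_config")

description = "Simulate a labor market where job seekers apply to employers via recruitment channels."
agent_types = profile_agent.generate_agent_types(description)
# e.g. {"JobSeeker": "...", "Employer": "...", "RecruitmentChannel": "..."}

portraits = profile_agent.assign_agent_portraits(agent_types)
# e.g. {"JobSeeker": 3, "Employer": 4, "RecruitmentChannel": 1}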

Stage 3: Workflow Generation

WorkflowAgent Creates Interaction Graphs

The WorkflowAgent transforms the ODD protocol and agent types into a directed graph of actions and events that defines how agents interact.

Workflow Extraction Process

class WorkflowAgent(AgentBase):
    def extract_workflow(self, description: str, agent_types: list) -> dict:
        # Extract agents, actions, and events from description
        prompt = f"""
        Workflow Extraction Task
        Description: {description}
        Agent Types: {', '.join(agent_types)}

        Extract detailed workflow including:
        - Agent actions with conditions and types (OR/AND/XOR)
        - Events flowing between actions
        - StartEvents and terminal events
        """

        response = self.model(formatted_prompt)
        data = parser.parse(response).parsed

        # Process and validate workflow structure
        self.build_topology_graph()
        return data

    def generate_workflow(self, description: str, agent_types: list):
        # Complete workflow generation with validation and enhancement
        data = self.extract_workflow(description, agent_types)
        action_requirements = self.enhance_actions_with_requirements(...)
        system_data_model = self.derive_data_model_from_actions(...)
        G = self.build_topology_graph()
        return self.actions, self.events, system_data_model, G
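A call that ties the pieces together might look like the following sketch; the constructor argument is an assumption, and the return values mirror generate_workflow above:

# Hedged sketch: the model config name is an assumption.
workflow_agent = WorkflowAgent(model_config_name="my_llm_config")

odd_description = "Job seekers apply to employers through recruitment channels; employers screen and hire."
actions, events, system_data_model, graph = workflow_agent.generate_workflow(
    description=odd_description,
    agent_types=["JobSeeker", "Employer", "RecruitmentChannel"],
)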

Example: Labor Market Workflow Components

Actions with Conditions and Types:

{
  "JobSeeker": [
    {
      "id": 1,
      "name": "enter_market",
      "condition": null,
      "type": "OR",
      "description": "Job seeker enters the labor market, setting initial parameters"
    },
    {
      "id": 2,
      "name": "evaluate_job_applications",
      "condition": "Job postings received and job seeker is actively searching",
      "type": "OR",
      "description": "Evaluates job applications to determine best fit"
    },
    {
      "id": 3,
      "name": "negotiate_salaries",
      "condition": "Job offer received",
      "type": "AND",
      "description": "Engages in salary negotiations with employers"
    }
  ]
}

Events Connecting Actions:

{
  "1": {
    "event_name": "JobMarketEntryEvent",
    "from_agent_type": "JobSeeker",
    "from_action_name": "enter_market",
    "to_agent_type": "RecruitmentChannel",
    "to_action_name": "distribute_job_postings",
    "fields": [
      {"name": "skills", "type": "list"},
      {"name": "experience", "type": "int"},
      {"name": "job_preferences", "type": "list"}
    ]
  }
}

System Data Model:

{
  "environment": {
    "variables": [
      {"name": "application_cost", "type": "float"},
      {"name": "distributed_jobs", "type": "list"}
    ]
  },
  "agents": {
    "JobSeeker": {
      "variables": [
        {"name": "skills", "type": "list"},
        {"name": "market_status", "type": "str"},
        {"name": "applications_submitted", "type": "list"}
      ]
    }
  }
}
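Internally, build_topology_graph turns these pieces into a directed graph whose nodes are agent actions and whose edges are events; the exact implementation is not shown in this excerpt. A minimal sketch of the idea using networkx:

import networkx as nx

# Hedged sketch: one event from the example above, keyed by event id.
events = {
    "1": {
        "event_name": "JobMarketEntryEvent",
        "from_agent_type": "JobSeeker",
        "from_action_name": "enter_market",
        "to_agent_type": "RecruitmentChannel",
        "to_action_name": "distribute_job_postings",
    },
}

# Node = "AgentType.action", edge = the event that connects two actions.
G = nx.DiGraph()
for event in events.values():
    src = f"{event['from_agent_type']}.{event['from_action_name']}"
    dst = f"{event['to_agent_type']}.{event['to_action_name']}"
    G.add_edge(src, dst, event=event["event_name"])

print(list(G.edges(data=True)))
# [('JobSeeker.enter_market', 'RecruitmentChannel.distribute_job_postings',
#   {'event': 'JobMarketEntryEvent'})]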

Stage 4: Code Generation

CodeAgent Synthesizes Executable Behaviors

The CodeAgent generates Python code for agent classes and event definitions based on the workflow specification.

Multi-Phase Code Generation

class CodeAgent(AgentBase):
    def generate_code_phased(self, description: str, actions: Dict, events: Dict,
                             env_path: str, status_dict: Dict, max_iterations: int = 3):
        # Phase 1: Generate initial code
        agent_code_dict, event_code = self.generate_initial_code(...)

        # Phase 2: Validate and fix code
        verification_results = self.check_code(...)
        if has_issues:
            agent_code_dict, event_code = self.fix_code(...)

        # Save and structure code
        self.save_phased_code(agent_code_dict, event_code, env_path, ...)

    def generate_handler_code(self, description: str, agent_type: str,
                              action_info: Dict, incoming_events: List,
                              outgoing_events: List) -> str:
        # Generate specific handler methods for each action
        thinking_prompt = f"""Analyze how to handle action '{action_info['name']}'..."""
        handler_prompt = f"""Generate handler method with decision-making logic..."""

        response = self.call_llm(handler_prompt)
        return extracted_handler_code

Example: Generated Agent Code

JobSeeker Agent Class:

class JobSeeker(GeneralAgent):
    def __init__(self, sys_prompt=None, model_config_name=None, ...):
        super().__init__(...)
        self.register_event("StartEvent", "enter_market")
        self.register_event("JobPostingEvent", "evaluate_job_applications")

    async def enter_market(self, event: Event) -> List[Event]:
        # Extract agent profile information
        skills = self.profile.get_data("skills", [])
        education = self.profile.get_data("education", "")
        experience = self.profile.get_data("experience", 0)

        # Update market status
        self.profile.update_data("market_status", "active")

        # Generate reaction using LLM for decision making
        instruction = """Based on your profile, determine your job search strategy..."""
        observation = f"Skills: {skills}, Experience: {experience}"

        result = await self.generate_reaction(instruction, observation)
        target_ids = result.get('target_ids', [])

        # Create job market entry event
        events = []
        for target_id in target_ids:
            entry_event = JobMarketEntryEvent(
                self.profile_id, target_id,
                skills=skills, education=education, experience=experience
            )
            events.append(entry_event)

        return events

Event Classes:

class JobMarketEntryEvent(Event):
    def __init__(self, from_agent_id: str, to_agent_id: str,
                 skills: List = [], education: str = "", experience: int = 0):
        super().__init__(from_agent_id, to_agent_id)
        self.skills = skills
        self.education = education
        self.experience = experience
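The register_event calls above imply a name-based routing table: when an event arrives, the agent looks up the handler registered for that event class and awaits it. A simplified, self-contained sketch of that dispatch pattern (not the actual GeneralAgent implementation):

import asyncio
from typing import List


class Event:
    """Minimal stand-in for the framework's Event base class."""
    def __init__(self, from_agent_id: str, to_agent_id: str):
        self.from_agent_id = from_agent_id
        self.to_agent_id = to_agent_id


class StartEvent(Event):
    pass


class ToyAgent:
    """Toy dispatcher: maps event class names to handler coroutine names."""
    def __init__(self):
        self._handlers = {}

    def register_event(self, event_name: str, handler_name: str):
        self._handlers[event_name] = handler_name

    async def handle(self, event: Event) -> List[Event]:
        handler = getattr(self, self._handlers[type(event).__name__])
        return await handler(event)


class ToyJobSeeker(ToyAgent):
    def __init__(self):
        super().__init__()
        self.register_event("StartEvent", "enter_market")

    async def enter_market(self, event: Event) -> List[Event]:
        print("entering the labor market")
        return []  # downstream events would be returned here


asyncio.run(ToyJobSeeker().handle(StartEvent("env", "seeker_1")))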

Stage 5: Data Generation

ProfileAgent Creates Agent Profiles and Relationships

The ProfileAgent generates individual agent profiles, relationship networks, and environment data based on customizable schemas.

Profile Schema Generation

class ProfileAgent(AgentBase):
    def generate_profile_schema(self, scenario_description, agent_name, agent_data_model):
        prompt = f"""Generate Profile Schema for {agent_name} based on:
        - Scenario: {scenario_description}
        - Data Model: {agent_data_model}

        Schema should include:
        - Static attributes (sampling: "llm" or "random")
        - Dynamic variables (sampling: "default")
        - Meaningful default values for simulation start
        """

        response = self.model(formatted_prompt)
        schema = parser.parse(response).parsed
        return schema

Example: JobSeeker Profile Schema

{
  "name": {
    "type": "str",
    "default": "John Smith",
    "private": false,
    "sampling": "llm",
    "description": "The agent's full name"
  },
  "skills": {
    "type": "list",
    "default": ["communication", "problem_solving"],
    "private": false,
    "sampling": "llm",
    "description": "Professional skills possessed"
  },
  "experience": {
    "type": "int",
    "default": 3,
    "private": false,
    "sampling": "random",
    "range": [0, 20]
  },
  "market_status": {
    "type": "str",
    "default": "seeking",
    "private": false,
    "sampling": "default"
  },
  "applications_submitted": {
    "type": "list",
    "default": [{"job_id": "initial", "status": "pending"}],
    "private": true,
    "sampling": "default"
  }
}
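Each field's sampling mode controls how its value is filled when profiles are generated: "default" keeps the schema default, "random" draws from the declared range, and "llm" asks the model for a plausible value. A simplified sketch of how such a schema could be sampled (not the actual AgentFactory logic):

import random

def sample_field(field: dict):
    """Pick a value for one schema field based on its sampling mode (sketch)."""
    mode = field.get("sampling", "default")
    if mode == "random" and field.get("type") == "int":
        low, high = field.get("range", [0, 10])
        return random.randint(low, high)  # e.g. years of experience
    # "llm" fields would be filled by prompting the model; fall back to the default here.
    return field["default"]

schema = {
    "experience": {"type": "int", "default": 3, "sampling": "random", "range": [0, 20]},
    "market_status": {"type": "str", "default": "seeking", "sampling": "default"},
}
profile = {name: sample_field(spec) for name, spec in schema.items()}
print(profile)  # e.g. {'experience': 7, 'market_status': 'seeking'}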

Multi-Layer Data Generation

# Generate agent profiles
profiles = AgentFactory.generate_profiles(
    agent_type="JobSeeker",
    schema=schema,
    model=model,
    num_profiles=60
)

# Generate relationship networks
relationships = profile_agent.generate_relationship_schema(agent_types, actions, events)

# Generate environment data
env_data = profile_agent.generate_env_data(env_data_model, description)

Example: Generated Environment Data

{
  "application_cost": 25.50,
  "market_tightness": 0.73,
  "economic_conditions": "stable_growth",
  "skill_demand_trends": ["ai_skills", "remote_work", "digital_literacy"]
}

Stage 6: Metrics Generation

MetricAgent Defines Monitoring Systems

The MetricAgent creates comprehensive monitoring systems for simulation evaluation and analysis.

Metrics Generation Process

class MetricAgent(AgentBase):
    def generate_metrics(self, scenario_description: str, agent_types: List[str],
                         system_data_model: Dict, num_metrics: int = 3) -> List[Dict]:
        prompt = f"""Generate monitoring metrics for:
        Scenario: {scenario_description}
        Agent Types: {agent_types}
        Data Model: {system_data_model}

        Focus on:
        - Behavioral patterns
        - System efficiency
        - Emergent phenomena
        """

        response = self.model(formatted_prompt)
        metrics = parser.parse(response).parsed.get("metrics", [])
        return self.validate_metrics(metrics, system_data_model)

    def generate_calculation_function(self, metric_def: Dict) -> str:
        # Generate robust calculation functions with error handling
        prompt = f"""Generate calculation function for metric: {metric_def['name']}
        With safe handling of None values, empty lists, type errors..."""

        response = self.model(formatted_prompt)
        function_code = self.code_parser.parse(response).parsed
        return function_code

Example: Labor Market Metrics

Average Job Seeker Experience:

{
  "name": "average_job_seeker_experience",
  "description": "Measures average years of experience among job seekers",
  "visualization_type": "bar",
  "variables": [
    {
      "name": "experience",
      "source_type": "agent",
      "agent_type": "JobSeeker",
      "is_list": true
    }
  ]
}

Generated Calculation Function:

def average_job_seeker_experience(data: Dict[str, Any]) -> Any:
    """Calculate average experience of job seekers with robust error handling"""
    try:
        experience_data = safe_list(safe_get(data, 'experience', []))
        if not experience_data:
            return {"No Data": 0}

        valid_experiences = [safe_number(exp) for exp in experience_data if exp is not None]
        average_exp = safe_avg(valid_experiences, 0)

        return {"Average Experience": round(average_exp, 1)}
    except Exception as e:
        log_metric_error('average_job_seeker_experience', e)
        return {"Error": 0}
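The generated function leans on helper utilities (safe_get, safe_list, safe_number, safe_avg, log_metric_error) that are not shown in this excerpt. Minimal sketches of what such helpers could look like:

import logging
from typing import Any, Dict, List

def safe_get(data: Dict[str, Any], key: str, default: Any = None) -> Any:
    """Fetch a key from a dict, tolerating non-dict inputs."""
    return data.get(key, default) if isinstance(data, dict) else default

def safe_list(value: Any) -> List:
    """Coerce a value to a list, treating None as empty."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

def safe_number(value: Any, default: float = 0.0) -> float:
    """Convert a value to float, falling back to a default on failure."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default

def safe_avg(values: List, default: float = 0.0) -> float:
    """Average a list of numbers, returning a default for empty input."""
    nums = [safe_number(v) for v in values]
    return sum(nums) / len(nums) if nums else default

def log_metric_error(metric_name: str, error: Exception) -> None:
    """Record a metric calculation failure without interrupting the simulation."""
    logging.getLogger(__name__).warning("Metric %s failed: %s", metric_name, error)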

Best Practices

Quality Assurance

Each stage includes validation mechanisms:

  • ODDAgent: Completeness checking and clarification questions
  • WorkflowAgent: Structural validation and connectivity verification
  • CodeAgent: Syntax checking, code review, and iterative fixing
  • ProfileAgent: Schema validation and data consistency checks
  • MetricAgent: Variable validation and calculation testing

Scenario Design Guidelines

Effective ODD Descriptions:

  • Provide specific, detailed scenarios rather than abstract concepts
  • Include concrete examples of agent behaviors and interactions
  • Specify measurable outcomes and success criteria

Agent Modeling Best Practices:

  • Balance behavioral complexity with computational efficiency
  • Design meaningful interactions that serve scenario objectives
  • Include realistic personality traits and decision factors

Code Quality Assurance:

  • Review generated handlers for logical consistency
  • Test action flows with representative inputs
  • Verify event routing and data propagation

Data Validation Standards:

  • Ensure profile schemas capture essential characteristics
  • Validate relationship networks for realistic connectivity
  • Check environment data for scenario consistency

The scenario creation pipeline in YuLan-OneSim transforms high-level scenario descriptions into complete, executable simulations through systematic application of specialized AI agents, ensuring consistency while reducing manual effort for complex agent-based modeling.