
Report Generation

This module is the final step in the AI Social Researcher workflow. It is responsible for reading the data and plots generated by a completed simulation experiment and automatically writing a comprehensive, professional research report.


Prerequisites

Before running report generation, you need a scene that was either created by the "Environment Design" module or configured manually. The scene directory should contain the following (item 2 is optional):

  1. Scene Information File:

    • Path: src/envs/<scene_name>/scene_info.json
    • Content: Contains the ODD protocol and metadata for the experiment. The analysis module relies on this file to understand the experiment's methodology, which feeds the "Methodology" section of the report.
  2. Metrics Calculation Code (Optional but Recommended):

    • Path: src/envs/<scene_name>/code/metrics/metrics.py
    • Content: Contains Python functions used to calculate the core metrics of the simulation. The DataAnalysisAgent reads this file to understand the meaning of the metrics.
  3. Metric Plots:

    • Path: src/envs/<scene_name>/metrics_plots/
    • Content: Stores various metric plots (e.g., .png files) generated during the simulation. These plots will be automatically embedded into the "Results" section of the final report.
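
Before starting a run, it can help to confirm that these files are in place. The following is a minimal sketch, not part of the project: the function name check_scene_prerequisites and the envs_root parameter are hypothetical, and only the paths listed above are taken from this page.

```python
from pathlib import Path


def check_scene_prerequisites(scene_name: str, envs_root: str = "src/envs") -> dict:
    """Report which of the documented prerequisite paths exist for a scene.

    Hypothetical helper for illustration; it only inspects the file system.
    """
    scene_dir = Path(envs_root) / scene_name
    plots_dir = scene_dir / "metrics_plots"
    return {
        # Required: ODD protocol and experiment metadata.
        "scene_info": (scene_dir / "scene_info.json").is_file(),
        # Optional but recommended: metric calculation code.
        "metrics_code": (scene_dir / "code" / "metrics" / "metrics.py").is_file(),
        # Names of the .png plots that could be embedded in the report.
        "plots": sorted(p.name for p in plots_dir.glob("*.png")) if plots_dir.is_dir() else [],
    }
```

Calling check_scene_prerequisites("<scene_name>") from the repository root returns a dictionary whose "scene_info" entry should be True and whose "plots" list should be non-empty before a report is generated.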

Core Logic and Process

The report generation functionality is driven by the main script src/researcher/report_generation.py. The script exposes each task as a subcommand and delegates the actual work to LLM agents.

Key Agents and Processes

  • DataAnalysisAgent

    • Responsibility: Comprehensively analyzes all input information (ODD protocol, metrics code, plots) to extract key findings, trends, and insights from the simulation results.
    • Output: analysis_result.md, a detailed textual analysis of the simulation results.
  • OutlineWritingAgent

    • Responsibility: Builds a report outline that conforms to academic standards based on the content of analysis_result.md.
    • Output: report_outline.md, which defines the chapter structure of the report (e.g., Introduction, Related Work, Methods, Results, Discussion, Conclusion).
  • Writing Process

    • Handled by the src/researcher/report_generation/writing_process/generate_full_report.py module.
    • Responsibility: Following the generated outline, it calls the LLM to fill in the content section by section, integrating all text, code snippets, and plot references into a .tex file. Finally, it calls a LaTeX compiler to generate the PDF.
  • ReviewerAgent (Optional)

    • Responsibility: Reviews the generated .tex report draft, checking for logic, clarity, fluency, and completeness.
    • Output: Suggestions for revising the report (in JSON format), to be used in the next iteration.

Workflow and Outputs

  1. Specify Scene: The user tells the script which experiment to generate a report for using the --scene_name argument.
  2. Execute Subcommand: The user selects a subcommand (e.g., full) to start the entire process.
  3. Analysis and Outline: The DataAnalysisAgent and OutlineWritingAgent run first to generate the analysis and outline files.
  4. Writing and Compilation: The system writes a draft of the report (.tex) based on the outline and compiles it into a PDF.
  5. Iterative Optimization (Optional): If enabled, the ReviewerAgent will review the report, and the system will then revise it based on the feedback and recompile. This process can be repeated multiple times.
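  6. Final Output: The report files (.tex, PDF, and any review JSON) are saved in the src/envs/<scene_name>/research/report/ directory; the analysis and outline files are saved one level up, in src/envs/<scene_name>/research/.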
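
The control flow of steps 3 to 5 can be sketched as a small function. This is an illustrative outline under stated assumptions, not the project's implementation: the four stages are passed in as plain callables standing in for the agents and the writing process, and the names run_report_pipeline and review_iterations are hypothetical.

```python
from typing import Callable, List, Optional


def run_report_pipeline(
    analyze: Callable[[], str],
    outline: Callable[[str], str],
    write: Callable[[str, Optional[dict]], str],
    review: Callable[[str], dict],
    review_iterations: int = 0,
) -> dict:
    """Run analysis, outline, writing, and optional review rounds in order."""
    analysis = analyze()                  # stands in for DataAnalysisAgent
    report_outline = outline(analysis)    # stands in for OutlineWritingAgent
    draft = write(report_outline, None)   # first draft, no feedback yet
    reviews: List[dict] = []
    for _ in range(review_iterations):
        feedback = review(draft)          # stands in for ReviewerAgent
        reviews.append(feedback)
        draft = write(report_outline, feedback)  # revise using the feedback
    return {
        "analysis": analysis,
        "outline": report_outline,
        "draft": draft,
        "reviews": reviews,
    }
```

With review_iterations left at 0 the review loop is skipped entirely, which mirrors the optional nature of step 5.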

Main Input:

  • A ready-to-use scene directory (src/envs/<scene_name>/)

Main Output Files:

  • src/envs/<scene_name>/research/analysis_result.md: The data analysis report.
  • src/envs/<scene_name>/research/report_outline.md: The outline for the final report.
  • src/envs/<scene_name>/research/report/simulation_report_final.pdf: The final PDF research report.
  • src/envs/<scene_name>/research/report/simulation_report.tex: The LaTeX source code for the final report.
  • src/envs/<scene_name>/research/report/review_*.json: (If enabled) The review comments from each iteration.
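
For scripts that post-process a run, the output locations above can be collected in one place. The helper below is a hypothetical convenience (the name expected_outputs and the envs_root parameter are assumptions); it only builds the paths listed above and does not check that the files exist.

```python
from pathlib import Path


def expected_outputs(scene_name: str, envs_root: str = "src/envs") -> dict:
    """Build the documented output paths for a scene (hypothetical helper)."""
    research = Path(envs_root) / scene_name / "research"
    report = research / "report"
    return {
        "analysis": research / "analysis_result.md",
        "outline": research / "report_outline.md",
        "pdf": report / "simulation_report_final.pdf",
        "tex": report / "simulation_report.tex",
        # Review files are only written when the review step is enabled.
        "reviews_dir": report,
        "reviews_pattern": "review_*.json",
    }
```

The review files can then be listed with sorted(paths["reviews_dir"].glob(paths["reviews_pattern"])), where paths is the returned dictionary.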

How to Use (Subcommands)

The report_generation.py script uses subcommands to perform specific tasks. All subcommands require the --scene_name argument.

  • analyze: Only performs data analysis.
  • outline: Only generates the report outline (depends on analysis results).
  • report: Writes and compiles the report (can be iterative), skipping the analysis and outline steps by default.
  • full: The most commonly used command. Executes the full process from analysis and outline to the final report.
  • review: Performs a single review of an existing .tex report file.
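
To illustrate how a subcommand interface of this shape is typically wired up, here is a minimal argparse sketch. It is not the actual parser in report_generation.py: only the five subcommand names and the --scene_name argument come from this page, the build_parser function is hypothetical, and the real script may define further arguments that are not shown here.

```python
import argparse

# Subcommand names and summaries as listed above.
SUBCOMMANDS = {
    "analyze": "Only perform data analysis.",
    "outline": "Only generate the report outline.",
    "report": "Write and compile the report.",
    "full": "Run analysis, outline, and report in sequence.",
    "review": "Review an existing .tex report once.",
}


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser with one sub-parser per subcommand."""
    parser = argparse.ArgumentParser(prog="report_generation.py")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, summary in SUBCOMMANDS.items():
        sub = subparsers.add_parser(name, help=summary)
        # Every subcommand requires the scene to operate on.
        sub.add_argument("--scene_name", required=True, help="Name of the scene directory")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Would run '{args.command}' for scene '{args.scene_name}'")
```

Under this sketch, parsing the arguments full --scene_name <scene_name> yields command set to "full" and scene_name set to the given scene.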

To see detailed usage and arguments for all subcommands, please go to Usage Examples.


Documentation for YuLan-OneSim - A Next Generation Social Simulator with LLMs