Report Generation
This module is the final step in the AI Social Researcher workflow. It is responsible for reading the data and plots generated by a completed simulation experiment and automatically writing a comprehensive, professional research report.
Prerequisites
Before running report generation, you must have a scene that was either created by the "Environment Design" module or configured manually. This scene directory must contain the following:
- Scene Information File:
  - Path: src/envs/<scene_name>/scene_info.json
  - Content: Contains the ODD protocol and metadata for the experiment. This is crucial for the analysis module to understand the "Methodology" section.
- Metrics Calculation Code (Optional but Recommended):
  - Path: src/envs/<scene_name>/code/metrics/metrics.py
  - Content: Contains Python functions used to calculate the core metrics of the simulation. The DataAnalysisAgent reads this file to understand the meaning of the metrics.
- Metric Plots:
  - Path: src/envs/<scene_name>/metrics_plots/
  - Content: Stores various metric plots (e.g., .png files) generated during the simulation. These plots will be automatically embedded into the "Results" section of the final report.
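The prerequisites above can be checked before starting a run. The following is a minimal sketch, not part of the project: the function name check_scene_prerequisites is hypothetical, and it assumes that a missing metrics_plots/ directory should count as an error while a missing metrics.py only warrants a warning.

```python
from pathlib import Path


def check_scene_prerequisites(scene_dir):
    """Return (errors, warnings) for a scene directory laid out as described above.

    Hypothetical helper: the error/warning split is an assumption of this sketch.
    """
    scene_dir = Path(scene_dir)
    errors, warnings = [], []

    # Scene information file holding the ODD protocol and metadata.
    if not (scene_dir / "scene_info.json").is_file():
        errors.append("missing scene_info.json")

    # Metrics calculation code is optional but recommended, so only warn.
    if not (scene_dir / "code" / "metrics" / "metrics.py").is_file():
        warnings.append("missing code/metrics/metrics.py")

    # Metric plots: assumed here to be .png files directly inside metrics_plots/.
    plots_dir = scene_dir / "metrics_plots"
    if not plots_dir.is_dir():
        errors.append("missing metrics_plots/")
    elif not any(plots_dir.glob("*.png")):
        warnings.append("metrics_plots/ contains no .png files")

    return errors, warnings
```

Calling check_scene_prerequisites("src/envs/<scene_name>") with a real scene name returns two lists of messages; an empty errors list suggests the scene is ready for report generation.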
Core Logic and Process
Report generation is driven by the main script src/researcher/report_generation.py. The script exposes its tasks as a series of subcommands and delegates the work internally to LLM agents.
Key Agents and Processes
- DataAnalysisAgent
  - Responsibility: Comprehensively analyzes all input information (ODD protocol, metrics code, plots) to extract key findings, trends, and insights from the simulation results.
  - Output: analysis_result.md, a detailed textual analysis of the simulation results.
- OutlineWritingAgent
  - Responsibility: Builds a report outline that conforms to academic standards based on the content of analysis_result.md.
  - Output: report_outline.md, which defines the chapter structure of the report (e.g., Introduction, Related Work, Methods, Results, Discussion, Conclusion).
- Writing Process
  - Handled by the src/researcher/report_generation/writing_process/generate_full_report.py module.
  - Responsibility: Following the generated outline, it calls the LLM to fill in the content section by section, integrating all text, code snippets, and plot references into a .tex file. Finally, it calls a LaTeX compiler to generate the PDF.
- ReviewerAgent (Optional)
  - Responsibility: Reviews the generated .tex report draft, checking for logic, clarity, fluency, and completeness.
  - Output: Suggestions for revising the report (in JSON format), to be used in the next iteration.
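The compilation step at the end of the Writing Process can be pictured as a call to an external LaTeX compiler. The sketch below is illustrative only: the source does not say which compiler or options the project uses, so pdflatex, the chosen flags, the two-pass default, and both function names are assumptions.

```python
import subprocess
from pathlib import Path


def build_latex_command(tex_path, engine="pdflatex"):
    """Build the argument list for compiling tex_path into its own directory.

    Assumption: pdflatex is the engine; the project may use a different one.
    """
    tex_path = Path(tex_path)
    return [
        engine,
        "-interaction=nonstopmode",  # do not stop to prompt for input on errors
        "-halt-on-error",            # abort at the first error
        f"-output-directory={tex_path.parent}",
        str(tex_path),
    ]


def compile_report(tex_path, engine="pdflatex", runs=2):
    """Run the compiler; a second pass is commonly needed to resolve cross-references."""
    for _ in range(runs):
        # check=True raises CalledProcessError if the compiler exits with an error.
        subprocess.run(build_latex_command(tex_path, engine), check=True)
```

Separating command construction from execution makes the command easy to inspect or log without a LaTeX installation. Any renaming of the resulting PDF is left to the project and is not shown here.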
Workflow and Outputs
- Specify Scene: The user tells the script which experiment to generate a report for using the --scene_name argument.
- Execute Subcommand: The user selects a subcommand (e.g., full) to start the entire process.
- Analysis and Outline: The DataAnalysisAgent and OutlineWritingAgent run first to generate the analysis and outline files.
- Writing and Compilation: The system writes a draft of the report (.tex) based on the outline and compiles it into a PDF.
- Iterative Optimization (Optional): If enabled, the ReviewerAgent will review the report, and the system will then revise it based on the feedback and recompile. This process can be repeated multiple times.
- Final Output: All report-related files are saved in the src/envs/<scene_name>/research/report/ directory.
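The order of the steps above, including the optional review loop, can be sketched as a small orchestration function. This is a simplified illustration rather than the project's implementation: run_full_workflow and its parameters are hypothetical, and each stage is passed in as a plain callable standing in for the corresponding agent or module.

```python
def run_full_workflow(analyze, write_outline, write_and_compile,
                      review=None, max_iterations=0):
    """Run the stages in the documented order.

    All arguments are caller-supplied callables (hypothetical stand-ins):
      analyze()                         -> analysis
      write_outline(analysis)           -> outline
      write_and_compile(outline, feedback=None) -> draft
      review(draft)                     -> feedback
    """
    # Analysis and outline always run first.
    analysis = analyze()
    outline = write_outline(analysis)

    # First draft is written without reviewer feedback.
    draft = write_and_compile(outline, feedback=None)

    # Optional iterative optimization: review, revise, recompile.
    reviews = []
    if review is not None:
        for _ in range(max_iterations):
            feedback = review(draft)
            reviews.append(feedback)
            draft = write_and_compile(outline, feedback=feedback)

    return draft, reviews
```

With review left as None or max_iterations set to 0, the function stops after the first compiled draft, which corresponds to running the workflow without iterative optimization.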
Main Input:
- A ready-to-use scene directory (src/envs/<scene_name>/)
Main Output Files:
- src/envs/<scene_name>/research/analysis_result.md: The data analysis report.
- src/envs/<scene_name>/research/report_outline.md: The outline for the final report.
- src/envs/<scene_name>/research/report/simulation_report_final.pdf: The final PDF research report.
- src/envs/<scene_name>/research/report/simulation_report.tex: The LaTeX source code for the final report.
- src/envs/<scene_name>/research/report/review_*.json: (If enabled) The review comments from each iteration.
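To see which of these outputs a run has produced so far, the paths can be assembled and checked in a few lines. The sketch below uses only the file names listed above; the helper names expected_outputs and summarize_outputs and the root parameter are hypothetical.

```python
from pathlib import Path


def expected_outputs(scene_name, root="src/envs"):
    """Return the documented output paths for a scene (hypothetical helper)."""
    research = Path(root) / scene_name / "research"
    report = research / "report"
    return {
        "analysis": research / "analysis_result.md",
        "outline": research / "report_outline.md",
        "pdf": report / "simulation_report_final.pdf",
        "tex": report / "simulation_report.tex",
    }


def summarize_outputs(scene_name, root="src/envs"):
    """Map each expected output to whether it exists, and list review files found."""
    paths = expected_outputs(scene_name, root)
    status = {name: path.is_file() for name, path in paths.items()}

    # Review comments are only present when iterative review was enabled.
    report_dir = Path(root) / scene_name / "research" / "report"
    reviews = sorted(report_dir.glob("review_*.json"))
    return status, reviews
```

The review files are returned as paths only; their JSON structure is not described in this document, so the sketch does not attempt to parse them.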
How to Use (Subcommands)
The report_generation.py script uses subcommands to perform specific tasks. All subcommands require the --scene_name argument.
- analyze: Only performs data analysis.
- outline: Only generates the report outline (depends on analysis results).
- report: Writes and compiles the report (can be iterative), skipping the analysis and outline steps by default.
- full: The most commonly used command. Executes the full process from analysis and outline to the final report.
- review: Performs a single review of an existing .tex report file.
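A subcommand interface like the one listed above is commonly built with argparse. The following sketch only mirrors what this page states (the five subcommand names and the required --scene_name argument); it is not the actual parser in report_generation.py, which likely defines further arguments, and build_parser is a hypothetical name.

```python
import argparse

# Subcommand names and summaries taken from the list above.
SUBCOMMANDS = {
    "analyze": "Only perform data analysis.",
    "outline": "Only generate the report outline.",
    "report": "Write and compile the report.",
    "full": "Run analysis, outline, and report in sequence.",
    "review": "Review an existing .tex report once.",
}


def build_parser():
    """Build a parser with one subparser per subcommand (illustrative sketch)."""
    parser = argparse.ArgumentParser(prog="report_generation.py")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, help_text in SUBCOMMANDS.items():
        sub = subparsers.add_parser(name, help=help_text)
        # Every subcommand requires the scene name.
        sub.add_argument("--scene_name", required=True,
                         help="Name of the scene under src/envs/")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Would run '{args.command}' for scene '{args.scene_name}'")
```

In this sketch, parsing ["full", "--scene_name", "my_scene"] yields args.command == "full" and args.scene_name == "my_scene", where my_scene is a placeholder scene name.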
To see detailed usage and arguments for all subcommands, please go to Usage Examples.
Documentation for YuLan-OneSim - A Next Generation Social Simulator with LLMs