CLI Command Reference
Command-line tool usage instructions for YuLan-OneSim.
Examples
Quick Start
# Run a simulation with default settings
yulan-onesim-cli --config config/config.json --model_config config/model_config.json --mode single --env labor_market_matching_process
Distributed Mode Example
1. Start the Master node
yulan-onesim-cli --config config/config.json --model_config config/model_config.json --mode master --expected_workers 2 --env labor_market_matching_process
2. Start the Worker node (assuming the master is at 192.168.1.100:50051)
yulan-onesim-cli --config config/config.json --model_config config/model_config.json --mode worker --master_address 192.168.1.100 --master_port 50051 --env labor_market_matching_process
Alternatively, you can launch the master and all workers in a single step:
3. Start distributed mode in one command
# Use the distributed script to launch master + 2 workers in one step
bash scripts/distributed/distributed.sh \
--address 127.0.0.1 \
--port 10051 \
--workers 2 \
--config config/config.json \
--model config/model_config.json
Note:
- When running in distributed mode, start the master node first, then the worker nodes.
- --master_address and --master_port must match the actual master node's address and port.
Configuration Options
Argument | Type | Required | Default | Description |
---|---|---|---|---|
--config | str | Yes | - | Path to the configuration file. |
--model_config | str | Yes | - | Path to the model configuration file. |
--env | str | No | None | Name of the simulation environment. |
--mode | str | No | single | Operating mode: single, master, or worker. |
--master_address | str | No | localhost | Address of the master node. |
--master_port | int | No | 50051 | Master node port. |
--worker_address | str | No | None | Worker node address (for master mode). |
--worker_port | int | No | 0 | Worker node port (0 for auto-assign). |
--node_id | str | No | auto | Node identifier (generated if not provided). |
--expected_workers | int | No | 1 | Number of worker nodes to wait for (for master mode). |
--enable_db | flag | No | False | Enable the database component. |
--enable_observation | flag | No | False | Enable the observation system. |
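For example, a single-node run that also enables the database and observation components can combine these flags (the environment name is illustrative):

# Single-node run with the database and observation components enabled
yulan-onesim-cli \
  --config config/config.json \
  --model_config config/model_config.json \
  --mode single \
  --env labor_market_matching_process \
  --enable_db \
  --enable_observation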
Script Explanations
Script | Main Function Description |
---|---|
scripts/run.sh | Quickly run a simulation (typically single-node / for testing) |
scripts/distributed/distributed.sh | Launch distributed simulation environment (auto-manage master/worker) |
scripts/distributed/kill_distributed.sh | Terminate distributed simulation processes |
scripts/model/launch_llm.sh | Launch the LLM (large language model) service |
scripts/model/kill_llm.sh | Terminate the LLM service |
scripts/model/launch_all_llm.sh | Launch all LLM services |
scripts/model/embedding_vllm_setup.sh | Launch/configure the embedding vLLM service |
scripts/run.sh
Usage
bash scripts/run.sh
Process overview
- Launch the simulation directly with default settings.
Use cases
- Rapid local iteration on the full simulation flow.
scripts/distributed/distributed.sh
Usage
bash scripts/distributed/distributed.sh [options]
Optional variables
- --address, -a: Master address (default 127.0.0.1)
- --port, -p: Master port (default 10051)
- --workers, -w: Number of worker nodes (default 2)
- --config, -c: Path to the global configuration file
- --model, -m: Path to the model configuration file
- --help, -h: Show help message
Process overview
- Parse command-line arguments to override defaults.
- Print a startup summary (master address, port, config paths, worker count).
- Create the logs/ directory if missing.
- Launch the master node.
- Wait briefly (sleep 2) for the master node to initialize.
- Loop through workers (0 to NUM_WORKERS-1) and launch each one.
- Print process IDs for master and workers, and suggest monitoring commands (a simplified sketch of this sequence follows below).
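A minimal sketch of the launch sequence described above, assuming the CLI flags shown earlier on this page; the actual script handles more options and error checking:

# Simplified sketch of scripts/distributed/distributed.sh (illustrative, not the actual script)
MASTER_ADDRESS=${MASTER_ADDRESS:-127.0.0.1}
MASTER_PORT=${MASTER_PORT:-10051}
NUM_WORKERS=${NUM_WORKERS:-2}
CONFIG=config/config.json
MODEL_CONFIG=config/model_config.json
ENV_NAME=labor_market_matching_process

mkdir -p logs   # create the logs/ directory if missing

# Launch the master node in the background
yulan-onesim-cli --config "$CONFIG" --model_config "$MODEL_CONFIG" \
  --mode master --expected_workers "$NUM_WORKERS" \
  --master_port "$MASTER_PORT" --env "$ENV_NAME" > logs/master.log 2>&1 &
echo "Master PID: $!"

sleep 2   # give the master time to initialize

# Launch the worker nodes
for i in $(seq 0 $((NUM_WORKERS - 1))); do
  yulan-onesim-cli --config "$CONFIG" --model_config "$MODEL_CONFIG" \
    --mode worker --master_address "$MASTER_ADDRESS" --master_port "$MASTER_PORT" \
    --env "$ENV_NAME" > "logs/worker_$i.log" 2>&1 &
  echo "Worker $i PID: $!"
done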
Use cases
- Running large-scale simulations requiring parallel computation.
scripts/distributed/kill_distributed.sh
Usage
bash scripts/distributed/kill_distributed.sh [options]
Optional variables
- --graceful, -g: Use SIGTERM instead of SIGKILL for graceful termination (default: SIGKILL)
- --show-only, -s: List processes without killing them (default: false)
- --help, -h: Show help message
Process overview
- Parse command-line arguments (--graceful, --show-only, --help).
- Locate OneSim master and worker processes using pgrep.
- Display the found processes with ps.
- Count the total processes; if --show-only, report the counts and exit.
- Prompt the user for confirmation before killing processes.
- Send the specified signal (SIGKILL or SIGTERM) to each worker, then to the master process.
- Report the number of killed processes.
- Pause briefly and verify that no OneSim processes remain (see the sketch below).
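A rough illustration of the termination logic above; the process-name patterns passed to pgrep are assumptions, not the script's actual patterns:

# Simplified sketch of the kill sequence (illustrative)
SIGNAL=KILL   # or TERM when --graceful is used
MASTER_PIDS=$(pgrep -f "yulan-onesim-cli.*--mode master" || true)
WORKER_PIDS=$(pgrep -f "yulan-onesim-cli.*--mode worker" || true)

ALL_PIDS="$WORKER_PIDS $MASTER_PIDS"
[ -z "${ALL_PIDS// /}" ] && { echo "No OneSim processes found."; exit 0; }

ps -fp $ALL_PIDS   # show what was found before killing

# Kill the workers first, then the master
for pid in $ALL_PIDS; do
  kill -s "$SIGNAL" "$pid" 2>/dev/null && echo "Killed $pid"
done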
Use cases
- Quickly terminate all running OneSim simulations in bulk.
- Perform a dry-run to review processes before killing.
scripts/model/launch_llm.sh
Usage
bash scripts/model/launch_llm.sh <port> <gpu_id> <model_path> [lora_dir]
Required arguments
- <port>: Port to host the vLLM API server on
- <gpu_id>: CUDA device ID for CUDA_VISIBLE_DEVICES
- <model_path>: Path to the LLM model directory or file
Optional variables
- [lora_dir]: Optional directory for LoRA adapters
Process overview
- Exit on errors (set -e).
- Validate that at least three positional arguments are provided; exit with usage if not.
- Assign input arguments to port, gpuid, model_path, and optional lora_dir.
- Export environment variables:
  - CUDA_VISIBLE_DEVICES set to gpuid.
  - VLLM_ATTENTION_BACKEND set to XFORMERS for memory-efficient inference.
- Determine the script directory (current_dir), and define LOG_FILE and PID_FILE in that directory.
- Append startup metadata (timestamp, port, GPU ID, model path, LoRA directory if provided) to llms.log.
- Check port availability using lsof; abort with an error if the port is in use.
- Construct the vLLM server command (python -m vllm.entrypoints.openai.api_server) with flags for model, port, dtype, pipeline parallelism, caching, decoding backend, tokenizer mode, seed, request logging, multiprocessing, and GPU memory utilization.
- If lora_dir is provided or an adapter_config.json exists in model_path, enable LoRA flags in the command.
- Launch the constructed command in the background, redirecting output to the log file; capture the launcher PID.
- Sleep for 5 seconds to allow the server to initialize.
- Identify the actual server PID by searching for the child process of the launcher.
- Record the server PID in llms.pid.
- Log a successful startup message with port and PID (a simplified sketch follows below).
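A minimal sketch of this flow; the real script passes additional vLLM flags (pipeline parallelism, caching, decoding backend, tokenizer mode, etc.), and the exact flag set depends on the vLLM version, so treat the options below as illustrative:

# Simplified sketch of launch_llm.sh (illustrative, not the actual script)
port=$1; gpuid=$2; model_path=$3; lora_dir=$4

export CUDA_VISIBLE_DEVICES=$gpuid
export VLLM_ATTENTION_BACKEND=XFORMERS

current_dir=$(dirname "$0")
LOG_FILE="$current_dir/llms.log"
PID_FILE="$current_dir/llms.pid"

# Refuse to start if the port is already taken
if lsof -i :"$port" > /dev/null 2>&1; then
  echo "Port $port is already in use"; exit 1
fi

cmd="python -m vllm.entrypoints.openai.api_server --model $model_path --port $port"
cmd="$cmd --dtype auto --gpu-memory-utilization 0.9 --disable-log-requests"

# Enable LoRA when an adapter directory is supplied or bundled with the model
if [ -n "$lora_dir" ] || [ -f "$model_path/adapter_config.json" ]; then
  cmd="$cmd --enable-lora --lora-modules adapter=${lora_dir:-$model_path}"
fi

nohup bash -c "$cmd" >> "$LOG_FILE" 2>&1 &
echo $! >> "$PID_FILE"   # the real script resolves the child (server) PID before recording it
echo "vLLM server starting on port $port"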
Use cases
- Deploy a vLLM-based API server on a specific GPU for development or testing.
- Integrate optional LoRA adapters for fine-tuning experiments.
scripts/model/kill_llm.sh
Usage
bash scripts/model/kill_llm.sh
Process overview
- Exit immediately on any error (set -e).
- Determine the script directory (current_dir), and define PID_FILE and LOG_FILE within it.
- If the PID file does not exist, report and exit (nothing to stop).
- Read each PID from llms.pid: if the process is running, send SIGKILL to terminate it; report any PID that is not found.
- Remove the PID file and log file to clean up.
- Confirm that all vLLM servers have been stopped (see the sketch below).
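A rough illustration of the shutdown logic, using the file names from the overview above:

# Simplified sketch of kill_llm.sh (illustrative)
current_dir=$(dirname "$0")
PID_FILE="$current_dir/llms.pid"
LOG_FILE="$current_dir/llms.log"

[ -f "$PID_FILE" ] || { echo "No PID file found; nothing to stop."; exit 0; }

while read -r pid; do
  if kill -0 "$pid" 2>/dev/null; then
    kill -9 "$pid" && echo "Stopped vLLM server with PID $pid"
  else
    echo "PID $pid not found"
  fi
done < "$PID_FILE"

rm -f "$PID_FILE" "$LOG_FILE"   # clean up
echo "All vLLM servers stopped."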
Use cases
- Clean up after testing or development runs of vLLM servers.
scripts/model/launch_all_llm.sh
Usage
bash scripts/model/launch_all_llm.sh
Process overview
- Determine the script directory (script_dir).
- Define the arrays:
  - port_list: list of ports to use (e.g., 9881 to 9888).
  - gpu_list: corresponding GPU IDs for each server (0 to 7).
  - model_path: path to the pretrained model directory.
- Loop over each index of port_list: extract port and gpu_id by index, then invoke launch_llm.sh with port, gpu_id, and model_path in the background.
- Use wait to block until all background server processes have started.
- Print a confirmation message once all LLM API servers are running (a simplified sketch follows below).
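A minimal sketch of the fan-out loop above; the ports, GPU IDs, and model path shown here are illustrative:

# Simplified sketch of launch_all_llm.sh (illustrative)
script_dir=$(dirname "$0")
port_list=(9881 9882 9883 9884 9885 9886 9887 9888)
gpu_list=(0 1 2 3 4 5 6 7)
model_path=/path/to/pretrained/model

for i in "${!port_list[@]}"; do
  port=${port_list[$i]}
  gpu_id=${gpu_list[$i]}
  bash "$script_dir/launch_llm.sh" "$port" "$gpu_id" "$model_path" &
done

wait   # block until every background launcher has finished starting its server
echo "All LLM API servers are running."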
Use cases
- Concurrently launch a fleet of LLM API servers across multiple GPUs and ports.
scripts/model/embedding_vllm_setup.sh
Usage
bash scripts/model/embedding_vllm_setup.sh [-m model_name_or_path] [-p port]
Optional variables
- -m: Model name or path (default: openai/gpt-3.5-turbo)
- -p: Port to host the embedding API server (default: 9890)
Process overview
- Exit on errors (set -e).
- Parse flags using getopts: -m for model_name_or_path, -p for port.
- Export CUDA_VISIBLE_DEVICES=7 to use GPU 7.
- Determine the script directory and set LOG_FILE and PID file paths.
- Append startup metadata (timestamp, model path, port, GPU ID) to the log file.
- Check port availability with lsof; abort if the port is in use.
- Construct the vLLM embedding server command (python3 -m vllm.entrypoints.openai.api_server) with flags for the embedding task, dtype, parallelism, remote code trust, seed, request logging, multiprocessing, and GPU memory utilization.
- Launch the server in the background, logging output to vllm_embedding.log; capture the launcher PID.
- Wait briefly (sleep 5) for initialization.
- Identify the actual server PID and write it to vllm_embedding.pid.
- Log and echo a success message with port and PID (a simplified sketch follows below).
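A rough sketch of this setup flow; the defaults follow the overview above, but the exact vLLM flags (including how the embedding task is selected) vary by vLLM version and may differ from the real script:

# Simplified sketch of embedding_vllm_setup.sh (illustrative)
model_name_or_path="openai/gpt-3.5-turbo"
port=9890
while getopts "m:p:" opt; do
  case $opt in
    m) model_name_or_path=$OPTARG ;;
    p) port=$OPTARG ;;
  esac
done

export CUDA_VISIBLE_DEVICES=7   # pin the embedding server to GPU 7

current_dir=$(dirname "$0")
LOG_FILE="$current_dir/vllm_embedding.log"
PID_FILE="$current_dir/vllm_embedding.pid"

# Abort if the port is already taken
if lsof -i :"$port" > /dev/null 2>&1; then
  echo "Port $port is already in use"; exit 1
fi

nohup python3 -m vllm.entrypoints.openai.api_server \
  --model "$model_name_or_path" \
  --port "$port" \
  --task embedding \
  --trust-remote-code \
  --gpu-memory-utilization 0.9 >> "$LOG_FILE" 2>&1 &

echo $! > "$PID_FILE"   # the real script resolves the actual server PID
echo "Embedding server started on port $port (PID $(cat "$PID_FILE"))"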
Use cases
- Serve embedding requests via the vLLM API for downstream vectorization tasks.
- Experiment with different LLM-backed embedding models on a dedicated GPU.